
Implement REBUILD_SEARCH_DOCUMENT cron worker #438

@Syndesi

Description

There are two security-related facts which form the foundation of this ticket:

  • The security model of Ember Nexus is based on graphs.
  • Elasticsearch, the primary search database, works in individual documents and has no native knowledge of graphs.

For Elasticsearch to enforce access control, it therefore needs the security data in another form. We already implemented the properties _groupsWithSearchAccess and _usersWithSearchAccess, which serialize the graph security model into per-node attributes for Elasticsearch. Until now, however, these were only calculated when new elements were created. This left several other scenarios uncovered:

  • Adding or removing users and groups did not propagate permission changes to Elasticsearch.
  • New relations, which should grant existing users access to already existing nodes, did not trigger an update of those nodes' search documents.
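
To illustrate why stale access lists matter, here is a minimal sketch of how a search request could be restricted with the two properties. Only the property names _groupsWithSearchAccess and _usersWithSearchAccess come from this ticket. The function names are hypothetical, and the sketch assumes both properties are indexed as arrays of exact-match identifiers; the actual query used by Ember Nexus may look different.

```python
def build_access_filter(user_id, group_ids):
    """Match documents that list the user or at least one of the user's groups."""
    return {
        "bool": {
            "should": [
                {"term": {"_usersWithSearchAccess": user_id}},
                {"terms": {"_groupsWithSearchAccess": list(group_ids)}},
            ],
            "minimum_should_match": 1,
        }
    }


def wrap_query(user_query, user_id, group_ids):
    """Combine the caller's query with the access filter (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                # Filter context: restricts results without affecting scoring.
                "filter": [build_access_filter(user_id, group_ids)],
            }
        }
    }


request_body = wrap_query({"match_all": {}}, "user-1", ["group-a", "group-b"])
```

Because the filter only sees the serialized lists, any permission change in the graph stays invisible to search until the affected documents are rebuilt.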

This ticket aims to implement this long-awaited feature.
The following general aspects need to be kept in mind:

  • Calculation of access lists can be somewhat expensive.
    -> Do not recompute this inside HTTP request handlers.
    -> Avoid exponential avalanches and recursive loops.
  • Loops can exist in the graph, but they must not be followed. Otherwise processing never terminates.
  • Many situations require access checks, but only a few actually end in changed permissions. Make the checks themselves faster, if possible.
  • If permission lists actually change, then this will most likely trigger permission updates down the tree as well.
  • The default limit of how many elements are checked for access changes should be high enough to not cause issues in regular deployments, but low enough to avoid prolonged congestion. A limit of 100-1000 elements per command iteration is likely a good start. A "burst mode" for working through a backlog of elements may be a good idea as well.
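
The limit and the "burst mode" idea from the list above could be sketched as follows. Everything here is an assumption for illustration: the function name, the default of 500 (picked from the 100-1000 range), the burst_factor parameter, and the in-memory deque that stands in for the real queue.

```python
from collections import deque


def run_iteration(queue, process, limit=500, burst=False, burst_factor=10):
    """Handle at most `limit` elements (or `limit * burst_factor` in burst mode).

    Elements that `process` appends to the queue during the run count against
    the same budget, so one change cannot trigger an unbounded avalanche
    within a single command execution.
    """
    budget = limit * burst_factor if burst else limit
    handled = 0
    while queue and handled < budget:
        element_id = queue.popleft()
        process(element_id)
        handled += 1
    return handled


backlog = deque(range(1200))
handled = run_iteration(backlog, lambda element_id: None, limit=500)
# 500 elements were handled, 700 remain for the next cron execution.
```

Whatever remains in the queue is picked up by the next execution, which keeps a single run short even when a backlog has built up.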

As of now, the general way of implementation will likely look something like this:

  • If node: Recompute access lists. If different, persist them and add children to stack (i.e. RabbitMQ queue).
  • If relation: Recompute access lists of its start and end nodes. If at least one of them is different, persist them and add the end node's children to the stack.
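
The two rules above can be sketched as follows. This is a simplified model, not the Ember Nexus implementation: the Graph class, its fields, and the rule that access is inherited from all ancestors are assumptions, and a deque stands in for the RabbitMQ queue. Access lists are recomputed from the graph itself with a visited set, so loops are not followed, and children are only enqueued when the persisted list actually changed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Graph:
    edges: dict = field(default_factory=dict)   # parent id -> set of child ids
    direct: dict = field(default_factory=dict)  # node id -> users granted directly
    stored: dict = field(default_factory=dict)  # node id -> persisted access list

    def parents_of(self, node):
        return [p for p, kids in self.edges.items() if node in kids]

    def compute_access(self, node):
        """Collect users from the node and its ancestors, visiting each node once."""
        seen, todo, users = {node}, deque([node]), set()
        while todo:
            current = todo.popleft()
            users |= self.direct.get(current, set())
            for parent in self.parents_of(current):
                if parent not in seen:  # loops are not followed
                    seen.add(parent)
                    todo.append(parent)
        return users

    def refresh(self, node):
        """Recompute and persist the access list; return True if it changed."""
        fresh = self.compute_access(node)
        if fresh == self.stored.get(node, set()):
            return False
        self.stored[node] = fresh
        return True


def process_node(graph, node, queue):
    if graph.refresh(node):
        queue.extend(sorted(graph.edges.get(node, ())))


def process_relation(graph, start, end, queue):
    # Evaluate both sides before combining, so neither refresh is skipped.
    changed_start = graph.refresh(start)
    changed_end = graph.refresh(end)
    if changed_start or changed_end:
        queue.extend(sorted(graph.edges.get(end, ())))


def drain(graph, queue):
    while queue:
        process_node(graph, queue.popleft(), queue)


# Small loop (a <-> b) with one extra child c below b.
graph = Graph(edges={"a": {"b"}, "b": {"a", "c"}}, direct={"a": {"alice"}})
drain(graph, deque(["a"]))
```

In this model the run terminates on loops because a node whose list did not change enqueues nothing. Recomputing from the graph, instead of merging the parents' stored lists, also means a removed user does not survive inside a loop.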

Acceptance criteria:

  • Create cron command to rebuild search document permissions.
  • Think about renaming this process.
  • Add cron command to cron tab list.
  • Make limits, e.g. number of elements to be checked per execution, configurable.
  • Document command.
  • Add unit tests, where possible.
  • Add feature tests.
    • Simple case for a single node.
    • Simple case for a single relation.
    • Simple case for a new user being added directly.
    • Simple case for an existing user being removed.
    • Simple case for a new group being added.
    • Simple case for an existing group being removed.
    • Simple case for a nested group being added.
    • Simple case for a nested group being removed.
    • Advanced case with a small loop (2 nodes).
    • Advanced case with a bigger loop (3 nodes).
    • Advanced case with a big loop (7 nodes).

Metadata

Assignees

No one assigned

    Labels

    Feature: Introducing new capabilities.

    Type

    No type

    Projects

    Status

    In Progress

    Relationships

    None yet

    Development

    No branches or pull requests
