Tracking Self-Referential FK Order #297

@bjester

Target branch: release-0.9.x

Current behavior

The current deserialization behavior recurses into 'children' models when it encounters a model with a self-referential foreign key (FK). This contributes to its monolithic nature and its memory consumption by requiring the algorithm to track more data in memory to ensure comprehensive deserialization of store data.

The serialization process tracks the FK value for self-referential models in a field called _self_ref_fk. This allows it to look up those FK records using that value.

Additionally, the Morango registry, which tracks the syncable models defined by any apps using Morango, orders them according to their cross-model FK dependencies.

Desired behavior

The deserialization algorithm would be better suited to the streaming architecture if store records were ordered more intelligently before they enter the pipeline. Combined with the Morango registry's existing ordering of syncable models, this should produce a predictable order in which store records can be deserialized without broken FKs.

Enhancements

Data Model

This specifically addresses self-referential FK ordering within the same model (e.g., parent-child relationships in hierarchical data). The registry handles cross-model dependencies, but self-referential FKs need special ordering.

  • A new nullable field _self_ref_order of type Integer should be added to AbstractStore
  • The new field should be greater than or equal to 0 unless null
  • Django model migrations should be created to add the new field to Store and Buffer
  • The SyncableModel should have a new tuple attribute morango_ordering which should hold one or more field expressions in the format that Django uses for order_by() (similar to Meta.ordering)
  • The SyncableModelRegistry.get_model_querysets should order its querysets according to morango_ordering if set; nulls should always be last
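
A minimal sketch of the "nulls last" ordering described above, written in plain Python rather than against the Django ORM so it runs standalone. The StoreRecord dataclass and the order_records helper are hypothetical stand-ins, not Morango APIs. In a real queryset the same effect could presumably be expressed with Django's F("_self_ref_order").asc(nulls_last=True).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreRecord:
    # Hypothetical stand-in for a Store row; only the fields needed here.
    id: str
    _self_ref_fk: Optional[str] = None
    _self_ref_order: Optional[int] = None


def order_records(records: List[StoreRecord]) -> List[StoreRecord]:
    # Sort ascending by _self_ref_order, with null (None) orders last.
    # The first tuple element sorts non-null before null, so None is
    # never compared against an int.
    return sorted(
        records,
        key=lambda r: (r._self_ref_order is None, r._self_ref_order or 0),
    )


records = [
    StoreRecord(id="child", _self_ref_fk="root", _self_ref_order=1),
    StoreRecord(id="unordered"),
    StoreRecord(id="root", _self_ref_order=0),
]
ordered_ids = [r.id for r in order_records(records)]
```

With this ordering, a parent (lower order) always enters the pipeline before its children, and records without an order are processed last.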

Serialization

Update StoreUpdate transform in serialize.py to set _self_ref_order during Store record creation and updates.

  • The _handle_store_create method should be updated:
    • if the model is not self-referential, the order field should be null
    • if _self_ref_fk is None and the model is self-referential, the order field should be 0
    • otherwise, it should query for the parent's order (the Store._self_ref_order for id=_self_ref_fk)
  • The _handle_store_update method should be updated:
    • to detect if the _self_ref_fk has changed, and if so, update the order like _handle_store_create
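
The rules above could be sketched roughly as follows. This is plain Python, not the actual StoreUpdate code. The function names and the parent_orders mapping (standing in for a Store query by id) are hypothetical. Two points are assumptions the issue does not state: that a child's order is its parent's order plus one, and that a missing parent yields a null order.

```python
from typing import Dict, Optional


def compute_self_ref_order(
    is_self_referential: bool,
    self_ref_fk: Optional[str],
    parent_orders: Dict[str, Optional[int]],
) -> Optional[int]:
    # Non-self-referential models carry no order.
    if not is_self_referential:
        return None
    # A self-referential record without a parent is a root.
    if self_ref_fk is None:
        return 0
    # parent_orders stands in for looking up Store._self_ref_order
    # where id = _self_ref_fk.
    parent_order = parent_orders.get(self_ref_fk)
    if parent_order is None:
        # Assumption: the issue does not say what to do when the parent
        # is missing or unordered; this sketch leaves the order null.
        return None
    # Assumption: children sort one level after their parent.
    return parent_order + 1


def order_after_update(
    old_fk: Optional[str],
    new_fk: Optional[str],
    current_order: Optional[int],
    parent_orders: Dict[str, Optional[int]],
) -> Optional[int]:
    # Only recompute when the self-referential FK actually changed.
    if old_fk == new_fk:
        return current_order
    return compute_self_ref_order(True, new_fk, parent_orders)
```

For example, with parent_orders = {"root": 0, "mid": 1}, moving a record from parent "root" to parent "mid" changes its order from 1 to 2, while an update that leaves _self_ref_fk untouched keeps the stored order.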

Compatibility

Morango should communicate during a sync whether it supports ordering with _self_ref_order. Later on, this will allow Morango to determine whether it can fully streamline the sync.

  • A new capability SELF_REF_ORDER should exist in constants/capabilities.py
  • The capability should be provided by default during a sync
  • If the client device sends the capability, the server should trust and expect _self_ref_order in the sent data
  • If the client does not send the capability, the server should compute the order during the dequeue stage
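
One way the fallback could work, sketched in plain Python under stated assumptions. The string value of SELF_REF_ORDER, both function names, and the id-to-parent mapping are hypothetical. Treating a record whose parent is outside the batch as a root is also an assumption of this sketch, not something the issue specifies.

```python
from typing import Dict, Iterable, Optional

# Hypothetical value; the issue only names the constant.
SELF_REF_ORDER = "SELF_REF_ORDER"


def needs_server_side_order(client_capabilities: Iterable[str]) -> bool:
    # Without the capability, the server cannot trust _self_ref_order
    # in the sent data and has to compute it itself.
    return SELF_REF_ORDER not in set(client_capabilities)


def compute_orders(parent_by_id: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Derive an order (depth) per record from an id -> parent id mapping."""
    orders: Dict[str, int] = {}
    for record_id in parent_by_id:
        chain = []
        seen = set()
        current: Optional[str] = record_id
        # Walk up until we hit a root, an already-resolved ancestor,
        # or a parent that is not part of this batch.
        while current is not None and current not in orders and current in parent_by_id:
            if current in seen:
                raise ValueError("cycle in self-referential FKs")
            seen.add(current)
            chain.append(current)
            current = parent_by_id[current]
        # Start from the resolved ancestor's order, or from -1 so that
        # the topmost record in the chain gets order 0.
        base = orders[current] if current in orders else -1
        for depth, rid in enumerate(reversed(chain), start=1):
            orders[rid] = base + depth
    return orders


orders = compute_orders({"c": "b", "b": "a", "a": None})
```

Here "a" is the root, "b" its child, and "c" its grandchild, so the computed orders increase along the chain regardless of the order in which the records arrive.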

Transmission

Value add

Provides more certainty to the pipeline that we process store records in the order of their relational dependency in the DB.

Possible tradeoffs

  • Morango defines other fields on SyncableModel for consumers to override, which could be better to have in the model Meta. For example, you can define ordering in the model Meta, but we're putting morango_ordering on the model itself.
  • Morango could let the app models provide the ordering information through a defined API, but backwards compatibility would still need logic to determine it manually. Therefore always leveraging that logic seems sensible, unless we let go of backwards compatibility.
  • We're adding yet another capability. While it isn't a concern at the moment, growth of the capabilities should be monitored since they're sent in an HTTP header, which has inconsistent size limits depending on the server.

AI usage

AI was used for brainstorming approaches.
