
Merkle/Causal structure #38

@gritzko


RON ops certainly have some redundancy. The op specifier consists of four UUIDs: type, object, event and location ids, which may be derived from each other in many cases.

Indeed, having the object id, we may learn the type; having the location id we may learn the object (as location ids are globally unique).
This redundancy may lead to contradictions. Indeed, we may specify a wrong type for the object or a wrong object for the location (or vice versa).

While a detailed four-component specifier makes it easy to process an op, it also makes it difficult to handle a RON database as a Merkle structure.

My solution is to continue the Causal Tree line of thinking all the way forward. Namely, make object and type ids 100% derived. Use two versions of the protocol: Open RON (two-UUID specifier, event and ref) and Closed RON (four-UUID specifier for client's convenience). Having the full history of events, a peer/server may transitively close ops, deriving the closed form for the clients (who don't have the full history).
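To illustrate the closing step, here is a minimal sketch of how a peer holding the full history might derive the four-UUID form from two-UUID ops. It rests on assumptions that the proposal does not spell out: the log is already in causal order, the object id is taken to be the event id of the creation op, and the type is taken to be the first value atom of that op. The names `OpenOp`, `ClosedOp` and `close_ops` are hypothetical, and UUIDs are shown as plain strings.

```python
from typing import NamedTuple, Optional


class OpenOp(NamedTuple):
    event: str           # the op's own event UUID
    ref: Optional[str]   # event UUID of a past op; None stands in for the null reference
    values: tuple        # value atoms


class ClosedOp(NamedTuple):
    type: str            # derived, never supplied by the sender
    object: str          # derived, never supplied by the sender
    event: str
    ref: Optional[str]
    values: tuple


def close_ops(log):
    """Derive four-UUID ops from a causally ordered list of two-UUID ops."""
    origin = {}   # event id -> (type id, object id) of the op's object
    closed = []
    for op in log:
        if op.ref is None:
            # Assumption: a creation op names the type in its first atom,
            # and its own event id serves as the object id.
            type_id, object_id = op.values[0], op.event
        else:
            # Inherit type and object from the referenced op.
            # A KeyError here means the reference is not in the known history.
            type_id, object_id = origin[op.ref]
        origin[op.event] = (type_id, object_id)
        closed.append(ClosedOp(type_id, object_id, op.event, op.ref, op.values))
    return closed


log = [
    OpenOp("1+A", None, ("lww",)),
    OpenOp("2+A", "1+A", ("x", 1)),
    OpenOp("3+B", "2+A", ("y", 2)),
]
closed = close_ops(log)
```

Because type and object are looked up rather than sent, an op cannot claim a type or object that contradicts its reference; the worst case is a reference to an unknown event, which is detectable.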

This way, the protocol is much less vulnerable to data inconsistencies. Also, the Merkle structure of the event graph becomes self-evident. Last but not least, a RON log gets a simpler key-value structure which is much easier to store and navigate.

Formally, the new rules are:

  • a RON op has two UUIDs (own id and a reference) and zero or more value atoms (ints, UUIDs, strings and floats);
  • both UUIDs are RON event UUIDs, essentially Lamport timestamps;
  • the reference points to some past op of the same object, thus describing the new op's place of application;
  • an op with a null reference is an object creation op; its value must mention the object's type (i.e. RDT UUID);
  • an op with an empty value is an ack op, which is only significant in the context of the Causal/Merkle graph; it modifies no object.
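The rules above can be sketched as a small data type plus a classifier. This is only an illustration: the names `Op` and `classify` are hypothetical, UUIDs are represented as plain strings, and `None` stands in for the null reference. The rules do not say what an op with both a null reference and an empty value means, so the sketch assumes it is invalid, since a creation op must mention the type.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Atom = Union[int, float, str]   # UUID atoms are represented as strings here


@dataclass(frozen=True)
class Op:
    id: str                        # own event UUID
    ref: Optional[str]             # referenced event UUID, or None for null
    values: Tuple[Atom, ...] = ()  # zero or more value atoms


def classify(op: Op) -> str:
    """Return 'create', 'ack' or 'regular' according to the rules above."""
    if op.ref is None:
        # Creation op: its value must mention the object's type.
        if not op.values:
            raise ValueError("creation op must mention the object's type")
        return "create"
    if not op.values:
        # Empty value: an ack op, modifies no object.
        return "ack"
    return "regular"


kinds = [
    classify(Op("1+A", None, ("rga",))),
    classify(Op("2+A", "1+A", ("a",))),
    classify(Op("3+B", "2+A")),
]
```

Here `kinds` ends up as `["create", "regular", "ack"]`. Note that under these rules an empty value is reserved for acks, which is why an rga remove op can no longer have an empty value.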

This way, ops form an orderly tree. The existing types and algorithms need to be corrected in two cases (at least):

  • lww can no longer mention the field name in the reference UUID
  • the rga remove op has to work differently (at the very least, it cannot have its value empty)

Otherwise, the overall structure of the protocol stays the same; all the reducer/RDT logic stays the same, as reducers did not read type/object ids anyway. The existing grammar stays valid.

In a follow-up, I'll describe how the Merkle tree is defined and used in this context.
