55 changes: 52 additions & 3 deletions docs/lqp.md
@@ -48,15 +48,64 @@ LQP clients send `Transaction`s that the engine executes.

Write := Define(fragment::Fragment)
| Undefine(fragment_id::FragmentId)
| Context(relations::RelationId[])
| Snapshot(mappings::SnapshotMapping[], prefix::String[])

Read := Demand(relation_id::RelationId)
| Output(name::String, relation_id::RelationId)
-| Export(config::ExportCSVConfig)
-| WhatIf(branch::String, epochs::Epoch[])
+| Export(config::ExportConfig)
+| WhatIf(branch::String, epoch::Epoch)
| Abort(name::String, relation_id::RelationId)

Transactions are structured into one or more epochs, which correspond to observable states
of the installed program. This allows users to execute a sequence of steps in a single
transaction. Within an epoch, writes execute before reads. Multiple writes or multiple reads
-can be performed concurrently and in any order. Of special note are the WhatIf operations,
+can be performed concurrently and in any order. Of special note are the `WhatIf` operations,
which allow executing an epoch in a throwaway clone of the runtime state.
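
The ordering rule above can be illustrated with a minimal Python sketch. It is not the engine's implementation: the `Epoch` dataclass, `run_transaction`, and the dictionary standing in for runtime state are hypothetical names chosen for illustration, and writes are simplified to "set relation to rows".

```python
from dataclasses import dataclass, field


@dataclass
class Epoch:
    # Hypothetical stand-ins: a write is a (relation_name, rows) pair,
    # a read is the name of a relation whose contents should be returned.
    writes: list = field(default_factory=list)
    reads: list = field(default_factory=list)


def run_transaction(state, epochs):
    """Run epochs in order; within each epoch apply every write before any read."""
    observed = []
    for epoch in epochs:
        for name, rows in epoch.writes:
            state[name] = rows
        # Reads see the state produced by all writes of the same epoch.
        observed.append({name: state.get(name) for name in epoch.reads})
    return observed


state = {}
results = run_transaction(
    state,
    [
        Epoch(writes=[("r", [1, 2])], reads=["r"]),
        Epoch(writes=[("r", [3])], reads=["r"]),
    ],
)
```

Each entry in `results` corresponds to one observable state, so a client can step a program through several states inside a single transaction.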

## Execution Model

Transaction execution proceeds in two passes. First, the _simulator_ runs the transaction
against a transient copy of the runtime state to validate it and minimize it (e.g. dropping
writes whose effects are clobbered by later writes). Then the _driver_ executes the
validated, minimized transaction against the actual runtime. If the simulator detects invalid
state at any point, the transaction is aborted and errors are returned.
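
As a rough sketch of the two passes, assume the runtime state is just a set of installed fragment ids and that undefining an unknown fragment counts as invalid state. Both are assumptions made for this example, as are the names `simulate` and `drive`.

```python
def simulate(installed, writes):
    """Validate writes on a scratch copy and drop writes clobbered by later ones."""
    scratch = set(installed)  # transient copy; the real state is not touched
    last_write = {}
    for index, (op, fragment_id) in enumerate(writes):
        if op == "undefine" and fragment_id not in scratch:
            # Assumed validity rule, for illustration only.
            raise ValueError(f"cannot undefine unknown fragment {fragment_id!r}")
        if op == "define":
            scratch.add(fragment_id)
        else:
            scratch.discard(fragment_id)
        last_write[fragment_id] = index
    # Keep only the final write per fragment; earlier ones are clobbered.
    return [w for i, w in enumerate(writes) if last_write[w[1]] == i]


def drive(installed, writes):
    """Apply an already validated, minimized write list to the actual state."""
    for op, fragment_id in writes:
        if op == "define":
            installed.add(fragment_id)
        else:
            installed.discard(fragment_id)


installed = {"f1"}
plan = simulate(installed, [("undefine", "f1"), ("define", "f1"), ("define", "f2")])
drive(installed, plan)
```

Here the `undefine` of `f1` is dropped because a later `define` clobbers it. If `simulate` raises, `drive` is never called and the actual state stays untouched.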

## Write Operations

`Define` installs a fragment and its declarations into the execution graph. `Undefine`
removes a fragment. `Context` declares which relations should be jointly optimized — more
context gives the optimizer more reuse opportunities but increases planning time. `Snapshot`
materializes derived relations into durable EDB (base) relations, associating new relation
values with stable identities over time.

## Read Operations

`Demand` triggers computation of a relation without returning its contents — useful for
warming caches. `Output` computes and returns a relation's contents under a human-readable
name. `Export` writes data to external storage (CSV or Iceberg). `WhatIf` runs a speculative
epoch on a transient fork; writes don't persist, reads observe the modified state. `Abort`
enforces integrity constraints: the transaction fails if the referenced relation is non-empty.
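
The semantics of `WhatIf` and `Abort` can be sketched as follows. This is an illustration under simplifying assumptions: state is a dictionary from relation id to rows, and `what_if`, `abort_if_nonempty`, and `TransactionAborted` are hypothetical names, not part of the protocol.

```python
import copy


class TransactionAborted(Exception):
    """Raised when an Abort read finds a non-empty relation."""


def what_if(state, writes, reads):
    """Apply writes to a throwaway fork and answer reads from that fork."""
    fork = copy.deepcopy(state)  # the caller's state is never modified
    fork.update(writes)
    return {relation_id: fork.get(relation_id, []) for relation_id in reads}


def abort_if_nonempty(state, name, relation_id):
    """Fail the transaction if the constraint relation holds any rows."""
    rows = state.get(relation_id, [])
    if rows:
        raise TransactionAborted(f"{name}: {len(rows)} violating row(s)")


state = {"orders": [1, 2], "violations": []}
speculative = what_if(state, {"orders": [1, 2, 3]}, ["orders"])
abort_if_nonempty(state, "no_violations", "violations")  # empty, so no error
```

The speculative read sees the extra order, while `state` itself still holds the original two rows.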

## Types

All types are primitive and aligned with the Apache Iceberg type system. The engine uses
type information for equality, ordering, promotion, and algebraic properties of operations.
Overloading must be handled by higher-level compilers. See the `Type` message in
`logic.proto` for the full list.

## External Data

`Data` declarations describe external sources (CSV, Iceberg, BeTree) without eagerly
ingesting them — data is loaded lazily when first demanded. `EDB` declares durable
engine-managed base relations (the result of `Snapshot` operations). `CSVData` and
`IcebergData` describe how to read from those respective formats, with column-to-relation
mappings via `GNFColumn`.
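
Lazy loading on first demand might look like the sketch below. The class `LazyCSVRelation` and its `demand` method are hypothetical, and an in-memory string stands in for an external CSV source; the engine's actual loading path is not described here.

```python
import csv
import io


class LazyCSVRelation:
    """Describes a CSV source without reading it until first demanded."""

    def __init__(self, open_source):
        self._open_source = open_source  # callable returning a text stream
        self._rows = None
        self.load_count = 0

    def demand(self):
        # Read the source only once; later demands reuse the loaded rows.
        if self._rows is None:
            with self._open_source() as handle:
                self._rows = list(csv.reader(handle))
            self.load_count += 1
        return self._rows


relation = LazyCSVRelation(lambda: io.StringIO("1,alice\n2,bob\n"))
```

Constructing `relation` reads nothing; the source is opened only when `relation.demand()` is first called.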

## Protobuf Specification

The proto files in `../proto/relationalai/lqp/v1/` are the authoritative specification:

- `logic.proto` — Declarations, formulas, types, values, and external data sources
- `fragments.proto` — Content-addressable compilation units and debug info
- `transactions.proto` — Transaction structure, write/read operations, and export config
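
To make "content-addressable" concrete, the sketch below derives a fragment id from serialized declarations. The choice of SHA-256, the length-prefixed framing, and the function names are assumptions for illustration; the proto files do not prescribe a hash function.

```python
import hashlib


def fragment_id(serialized_declarations):
    """Hash the declarations into a 32-byte content-based identifier."""
    digest = hashlib.sha256()
    for declaration in serialized_declarations:
        # Length-prefix each declaration so boundaries affect the hash.
        digest.update(len(declaration).to_bytes(8, "big"))
        digest.update(declaration)
    return digest.digest()


def fragments_to_send(local_fragments, installed_ids):
    """Return only the fragments whose ids the engine does not already have."""
    return {
        fid: declarations
        for fid, declarations in local_fragments.items()
        if fid not in installed_ids
    }


first = fragment_id([b"def p", b"def q"])
second = fragment_id([b"def r"])
pending = fragments_to_send(
    {first: [b"def p", b"def q"], second: [b"def r"]}, installed_ids={first}
)
```

Because the id depends only on content, an unchanged fragment hashes to the same id and is filtered out, so only `second` remains in `pending`.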
26 changes: 25 additions & 1 deletion proto/relationalai/lqp/v1/fragments.proto
@@ -1,3 +1,17 @@
// Logical Query Protocol — Fragments
//
// Fragments are the unit of incremental compilation and installation. The full
// execution graph is partitioned into content-addressable fragments that can be
// defined, redefined, and undefined independently.
//
// Instead of resending the entire program on every transaction, a client can
// use the Sync mechanism (see transactions.proto) to reconcile its expected
// set of fragments with the engine's installed state. The engine identifies
// fragments by their content hash, so unchanged fragments are never resent.
//
// The granularity of fragments is chosen by the compiler: one per source file,
// per module, per definition, or even a single fragment for the entire program.

syntax = "proto3";

option go_package = "github.com/RelationalAI/logical-query-protocol/sdks/go/src/lqp/v1";
@@ -6,17 +20,27 @@ package relationalai.lqp.v1;

import "relationalai/lqp/v1/logic.proto";

// A content-addressable unit of the execution graph containing one or more
// declarations. Each declaration can only belong to a single fragment.
// Fragments are installed/removed via Define/Undefine write operations.
message Fragment {
  FragmentId id = 1;
  repeated Declaration declarations = 2;
  // Optional human-readable name mappings for debugging and logging.
  DebugInfo debug_info = 3;
}

// Maps opaque RelationIds to human-readable names for use in logs,
// error messages, and debugging tools. The ids and orig_names arrays
// are parallel (same length, matched by index).
message DebugInfo {
  repeated RelationId ids = 1;
  repeated string orig_names = 2;
}

// Content-based identifier for a fragment. Typically a hash of the
// fragment's declarations, enabling deduplication and cache-friendly
// synchronization.
message FragmentId {
-  bytes id = 1; // Variable-length identifier, up to 256 bits (32 bytes) or less
+  bytes id = 1; // Variable-length identifier, up to 256 bits (32 bytes)
}