This document defines the scope of the ANDB v1 prototype.
The main rule is:
v1 is a validation prototype, not a production-ready database platform.
The purpose of this document is to keep the repository aligned around a narrow but meaningful validation target.
v1 is a framework-level research prototype for an agent-native database. It is intended to validate that the following end-to-end thesis is workable:
- events can be ingested as the source of state change
- canonical objects can be materialized from events
- retrieval can operate over object-centric projections
- graph expansion can assemble structured evidence
- query responses can return more than isolated top-k chunks
v1 is not:
- a production-grade distributed database
- a full cloud-native control plane
- a complete governance engine
- a full tensor execution runtime
- a finalized enterprise integration platform
- a complete conflict resolution system
- the final future-proof API for all later versions
Every major implementation choice should be checked against that distinction.
The repository should use v1 to answer these questions:
Can memory, state, event, and artifact be modeled as canonical objects instead of plain rows or raw chunk records?
Can event-driven flow produce retrieval-ready object views?
Can retrieval combine multiple signals over canonical objects rather than over chunks only?
Can relation expansion assemble a minimal evidence subgraph?
Can the response expose structured evidence with provenance and version hints?
If these are validated, v1 succeeds.
v1 must support:
- event input
- event envelope decoding
- write-first event persistence into the event backbone
- ingest acknowledgment
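The ingest requirements above can be sketched as follows. This is a minimal illustration, not the repository's actual code: `EventEnvelope`, `EventLog`, `decode_envelope`, `ingest`, and every field name are hypothetical, and a JSON envelope is assumed only for the sake of the example.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical envelope shape; field names are illustrative only."""
    event_id: str
    event_type: str
    payload: dict[str, Any]


@dataclass
class EventLog:
    """In-memory, append-only stand-in for the event backbone."""
    entries: list[EventEnvelope] = field(default_factory=list)

    def append(self, event: EventEnvelope) -> int:
        self.entries.append(event)
        return len(self.entries) - 1  # offset of the persisted event


def decode_envelope(raw: str) -> EventEnvelope:
    """Decode a JSON envelope and reject inputs with missing fields."""
    data = json.loads(raw)
    missing = {"event_id", "event_type", "payload"} - set(data)
    if missing:
        raise ValueError(f"envelope missing fields: {sorted(missing)}")
    return EventEnvelope(data["event_id"], data["event_type"], data["payload"])


def ingest(raw: str, log: EventLog) -> dict[str, Any]:
    """Write-first ingest: persist the event, then acknowledge."""
    event = decode_envelope(raw)
    offset = log.append(event)  # persisted before any projection work runs
    return {"status": "accepted", "event_id": event.event_id, "offset": offset}


log = EventLog()
ack = ingest(
    '{"event_id": "evt-1", "event_type": "note", "payload": {"text": "hello"}}',
    log,
)
```

The acknowledgment is built only after the append returns, which is the property "write-first" is meant to capture here.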
Current anchors:
v1 must support the concept and contract for generating at least:
- Event
- Memory
- State
- Artifact
Current reality:
- Event is explicitly ingested
- retrieval projection currently derives a memory-like object directly from event text
- full dedicated materialization workers are still to be expanded
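As an illustration of the bootstrap path described above, the sketch below derives a memory-like object directly from event text. The `Event` and `MemoryObject` types, the `project_memory` function, and the `mem:` id prefix are assumptions made for this example and are not taken from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event: an id plus free text."""
    event_id: str
    text: str


@dataclass(frozen=True)
class MemoryObject:
    """Hypothetical canonical object derived from an event."""
    object_id: str
    object_type: str
    content: str
    source_event_id: str  # provenance link back to the originating event


def project_memory(event: Event) -> MemoryObject:
    """Derive a memory-like object directly from event text."""
    return MemoryObject(
        object_id=f"mem:{event.event_id}",
        object_type="memory",
        content=event.text.strip(),
        source_event_id=event.event_id,
    )


memory = project_memory(Event(event_id="evt-1", text="  user prefers dark mode "))
```

A dedicated materialization worker could later replace `project_memory` without changing the object shape, as long as the link to the source event is kept.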
v1 must support at least:
- version fields on mutable objects
- mutation-event linkage
- the presence of object-version metadata in the response contract
This does not require a complete logical-time or publication model.
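A shallow version of these requirements might look like the sketch below, where each mutation bumps a counter and records the event that caused it. `StateObject`, `apply_mutation`, and the field names are hypothetical, and a plain integer counter is assumed in place of any logical-time model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StateObject:
    """Hypothetical mutable-object snapshot with shallow version metadata."""
    object_id: str
    value: str
    version: int
    mutated_by_event_id: str  # mutation-event linkage


def apply_mutation(current: StateObject, new_value: str, event_id: str) -> StateObject:
    """Return a new snapshot; the previous snapshot is left untouched."""
    return replace(
        current,
        value=new_value,
        version=current.version + 1,
        mutated_by_event_id=event_id,
    )


first = StateObject("state:task-7", "open", 1, "evt-10")
second = apply_mutation(first, "done", "evt-11")
```

Because snapshots are immutable, the response contract can report `version` and `mutated_by_event_id` for whichever snapshot was actually retrieved.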
v1 must support:
- retrieval over object-derived representations
- metadata-aware filtering hooks
- candidate return suitable for later graph expansion
The long-term target includes dense, sparse, and filter-based retrieval. The current implementation is lighter, but the contract should move in that direction.
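To make the lighter current shape concrete, here is a minimal sketch of retrieval over object-derived text with pluggable metadata filter hooks. The names `retrieve` and `FilterHook`, the record layout, and the term-overlap score are all assumptions for illustration; the score is a deliberately rough stand-in for dense or sparse retrieval.

```python
from typing import Any, Callable

ObjectRecord = dict[str, Any]
FilterHook = Callable[[ObjectRecord], bool]


def retrieve(
    query: str,
    objects: list[ObjectRecord],
    filters: list[FilterHook],
    top_k: int = 3,
) -> list[dict[str, Any]]:
    """Return scored candidates whose ids can seed later graph expansion."""
    query_terms = set(query.lower().split())
    candidates = []
    for obj in objects:
        # Metadata-aware hooks run before scoring; every hook must accept.
        if not all(hook(obj) for hook in filters):
            continue
        overlap = len(query_terms & set(obj["text"].lower().split()))
        if overlap > 0:
            candidates.append({"object_id": obj["object_id"], "score": overlap})
    candidates.sort(key=lambda c: c["score"], reverse=True)
    return candidates[:top_k]


objects = [
    {"object_id": "mem:1", "type": "memory", "text": "user prefers dark mode"},
    {"object_id": "mem:2", "type": "memory", "text": "user asked about dark roast coffee"},
    {"object_id": "art:1", "type": "artifact", "text": "dark mode design spec"},
]


def only_memory(obj: ObjectRecord) -> bool:
    return obj["type"] == "memory"


results = retrieve("dark mode", objects, [only_memory])
```

Candidates carry object ids rather than chunk text, which keeps the output usable as seeds for relation expansion.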
v1 must support:
- typed edges in the canonical model
- relation-aware response structure
- at least a constrained 1-hop or 2-hop expansion design
Full graph execution can remain shallow during bootstrap, but the schema and flow cannot ignore relations.
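A constrained expansion of this kind can be sketched as a bounded breadth-first walk over typed edges. The `Edge` type, the `expand` function, the edge type names, and the choice to follow only outgoing edges are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical typed edge between two object ids."""
    src: str
    dst: str
    edge_type: str


def expand(
    seeds: list[str], edges: list[Edge], max_hops: int = 1
) -> tuple[set[str], list[Edge]]:
    """Follow outgoing edges from the seeds for at most max_hops hops."""
    visited = set(seeds)
    frontier = set(seeds)
    collected: list[Edge] = []
    for _ in range(max_hops):
        next_frontier: set[str] = set()
        for edge in edges:
            if edge.src in frontier:
                if edge not in collected:
                    collected.append(edge)
                if edge.dst not in visited:
                    next_frontier.add(edge.dst)
        visited |= next_frontier
        frontier = next_frontier
    return visited, collected


edges = [
    Edge("mem:1", "evt:1", "derived_from"),
    Edge("evt:1", "art:1", "produced"),
    Edge("mem:2", "evt:2", "derived_from"),
]
one_hop_nodes, one_hop_edges = expand(["mem:1"], edges, max_hops=1)
two_hop_nodes, two_hop_edges = expand(["mem:1"], edges, max_hops=2)
```

The returned node set and edge list together form the minimal evidence subgraph; the hop bound keeps the walk shallow during bootstrap.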
v1 must assemble or at least meaningfully scaffold a structured local evidence package from retrieval results.
At minimum, the response path should preserve:
- object identity
- edges category
- provenance category
- versions category
- proof trace category
The query response must move toward returning:
- objects
- edges
- provenance
- version hints
- applied filters
- proof trace notes
Even if some fields are still simplified in the current runtime, the contract should remain evidence-oriented.
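One way to keep the contract evidence-oriented while the runtime is still simplified is to fix the response categories up front and let them start empty. The `QueryResponse` type and its field names below are hypothetical and mirror the list above; they are not the project's actual schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class QueryResponse:
    """Hypothetical response shape; every category exists even when empty."""
    objects: list[dict[str, Any]] = field(default_factory=list)
    edges: list[dict[str, Any]] = field(default_factory=list)
    provenance: list[dict[str, Any]] = field(default_factory=list)
    versions: dict[str, int] = field(default_factory=dict)  # version hints
    applied_filters: list[str] = field(default_factory=list)
    proof_trace: list[str] = field(default_factory=list)  # explanatory notes


response = QueryResponse(
    objects=[{"object_id": "mem:evt-1", "type": "memory"}],
    edges=[{"src": "mem:evt-1", "dst": "evt-1", "type": "derived_from"}],
    provenance=[{"object_id": "mem:evt-1", "source_event_id": "evt-1"}],
    versions={"mem:evt-1": 1},
    applied_filters=["type == memory"],
    proof_trace=[
        "retrieved mem:evt-1 by term overlap",
        "expanded 1 hop via derived_from",
    ],
)
payload = asdict(response)
```

A bootstrap runtime could return `QueryResponse()` with only `objects` filled in and still honor the same shape, so clients would not need to change as the other categories become richer.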
v1 must support:
- mock data ingest
- runnable query demo
- basic testability
- baseline-oriented benchmark planning
Current anchors:
The following can remain minimal:
Basic scope fields can exist without a full governance engine.
Policy references and policy coordinators can exist without complete enforcement.
Artifacts can be linked simply without external federation.
Proof trace can be explanatory and shallow rather than formally complete.
visible_time and logical_ts can exist as contract fields before the full runtime semantics are implemented.
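The "field now, semantics later" approach could look like the sketch below. Only `visible_time` and `logical_ts` come from the text above; `ObjectHeader`, `scope`, `policy_ref`, `is_visible`, and the rule that a missing `visible_time` means "visible" are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectHeader:
    """Hypothetical header carrying contract fields the runtime may not enforce yet."""
    object_id: str
    scope: Optional[str] = None         # present without a governance engine
    policy_ref: Optional[str] = None    # referenced, not enforced
    visible_time: Optional[float] = None
    logical_ts: Optional[int] = None


def is_visible(header: ObjectHeader, now: float) -> bool:
    """Approximate visibility: a missing visible_time is treated as visible."""
    return header.visible_time is None or header.visible_time <= now


legacy = ObjectHeader(object_id="mem:1")
scheduled = ObjectHeader(object_id="mem:2", visible_time=200.0, logical_ts=7)
```

Objects written before the semantics exist keep working because every optional field defaults to `None`.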
Do not build:
- TSO-grade global semantics beyond current lightweight support
- full visibility publication engine
- bounded staleness engine
- complete time-travel query model
Do not build:
- full ACL engine
- full TTL enforcement engine
- quarantine workflow engine
- production audit pipeline
Do not build:
- fact arbitration engine
- shared plan merge runtime
- CRDT-style merge engine
Do not build:
- generalized subtensor execution
- tensor-native storage engine
- full tensor operator runtime
Do not build:
- elastic worker autoscaling
- production scheduler framework
- HA deployment architecture
Do not build:
- large connector suites
- cross-system policy federation
- enterprise orchestration stack
The following are explicitly acceptable in v1:
- in-process workers instead of independent services
- in-memory or lightweight storage
- shallow graph store behavior
- rough scoring
- proof trace as execution notes
- shallow versioning
- approximate visibility filtering
- bootstrap response objects that are less rich than the final target
These are acceptable only if the architecture stays extensible.
v1 is successful if the repository can demonstrate:
- a collaborator can ingest mock events through the public API
- the system can project or materialize canonical-object-oriented records
- a query can retrieve candidate objects from the data plane
- the response preserves evidence-oriented structure
- the docs, code, and tests agree on the main flow
- benchmark work can compare ANDB-style response with a simpler baseline
No new top-level feature should be added unless it directly improves the v1 validation loop.
Any feature requiring major infrastructure should be challenged.
If a capability can be represented as a field now and implemented later, prefer that path.
Correct abstraction is more important than complete functionality in v1.
Later versions may extend v1 with:
- policy-aware retrieval
- rollback/time-travel query
- visibility-aware retrieval
- share contracts
- conflict merge
- tensor slicing and aggregation
- distributed scaling
- external enterprise connectors
These should layer onto the v1 skeleton rather than forcing v1 to absorb them prematurely.
The scope of v1 is deliberately narrow in runtime ambition but strong in abstraction.
Its job is to prove that the following system is real and implementable:
event-driven + object-centric + retrieval-aware + graph-assembled + structured-evidence-returning
Anything that does not directly serve that loop should be postponed.