Context
The current per-project catalog files (catalog/<project>.json) declare FKs and relationships only against tables within their own project. Each project is parsed in isolation, so cross-project lineage is invisible to downstream consumers.
This was discovered while implementing the Data Governance Explorer's lineage rendering on the AISPEG site (slice ui-insight/AISPEG#59 / epic ui-insight/AISPEG#53). The audit walked all five vendored catalogs:
| Metric | Value |
| --- | --- |
| Total tables | 71 |
| Distinct table names | 67 |
| Tables that share a name across projects | 4 (AllowedValues, ActivityLog, Document, users) |
| FK strings whose target table exists only in another project | 0 |
| Relationships whose target exists only in another project | 0 |
The same-named tables across projects are pattern-shared (each project owns its own AllowedValues, etc.), not pointers to a shared table. Nothing in the catalogs states that audit-dashboard.AuditReport references the institutional Personnel registry, even though that dependency exists in practice.
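For reference, the counts above could be reproduced with a check along these lines. This is a minimal sketch, not the script that was actually run. The Table and Column field names, the auditCatalogs function, and the two-project sample data are assumptions for illustration; only tables[] and the <TableName>.<column> FK string format come from the catalogs.

```typescript
// Assumed catalog shape: only tables[] and the "<TableName>.<column>"
// foreign_key format come from the catalogs; other names are illustrative.
interface Column { name: string; type: string; foreign_key?: string }
interface Table { name: string; columns: Column[] }
interface Catalog { tables: Table[] }

interface AuditResult {
  totalTables: number;
  distinctTableNames: number;
  sharedTableNames: string[];
  crossProjectOnlyFks: number;
}

function auditCatalogs(catalogs: Record<string, Catalog>): AuditResult {
  // Map each table name to the projects that define it.
  const owners: Record<string, string[]> = Object.create(null);
  let totalTables = 0;
  for (const project of Object.keys(catalogs)) {
    const catalog = catalogs[project];
    if (!catalog) continue;
    for (const table of catalog.tables) {
      totalTables += 1;
      const list = owners[table.name] || (owners[table.name] = []);
      list.push(project);
    }
  }

  // Count FK strings whose target table is defined elsewhere but not locally.
  let crossProjectOnlyFks = 0;
  for (const project of Object.keys(catalogs)) {
    const catalog = catalogs[project];
    if (!catalog) continue;
    for (const table of catalog.tables) {
      for (const column of table.columns) {
        const fk = column.foreign_key;
        if (!fk) continue;
        const dot = fk.indexOf(".");
        const target = dot === -1 ? fk : fk.slice(0, dot);
        const projects = owners[target];
        if (projects && projects.indexOf(project) === -1) crossProjectOnlyFks += 1;
      }
    }
  }

  const names = Object.keys(owners);
  return {
    totalTables,
    distinctTableNames: names.length,
    sharedTableNames: names.filter((n) => (owners[n] || []).length > 1),
    crossProjectOnlyFks,
  };
}

// Hypothetical two-project sample, not the real vendored catalogs.
const auditSample: Record<string, Catalog> = {
  "audit-dashboard": {
    tables: [
      { name: "AllowedValues", columns: [] },
      { name: "AuditReport", columns: [
        { name: "Lead_Personnel_ID", type: "Integer", foreign_key: "Personnel.Personnel_ID" },
      ] },
    ],
  },
  openera: {
    tables: [
      { name: "AllowedValues", columns: [] },
      { name: "Personnel", columns: [] },
    ],
  },
};

const auditResult = auditCatalogs(auditSample);
```

In the hypothetical sample the bare FK string points at a table that only openera defines, which is exactly the case the real audit found zero times.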
Why this matters
Stakeholders ask questions like:
- "What other AI4RA projects depend on Personnel?"
- "If we change the canonical UDM Document shape, who is downstream?"
- "Which projects share a single source of truth versus duplicating the entity?"
These are unanswerable from today's catalog. The downstream lineage renderer (AISPEG#59) ends up empty until an upstream representation exists.
Proposal
Two viable representations — pick one or combine.
Option A — extend FK strings with an explicit project prefix
Allow foreign_key / relationship target to take the form
<otherproject>:<TableName>.<column> when the target lives in another
project's catalog. Bare <TableName>.<column> continues to mean
"resolves inside this catalog". Example:
{ "name": "Lead_Personnel_ID", "type": "Integer", "foreign_key": "openera:Personnel.Personnel_ID" }
Pros: minimal schema change; round-trips through the existing parser with
a small extension; preserves the column-level FK granularity.
Cons: requires every project to know the slug of every other project; no
top-level overview of inter-project dependencies.
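The parser extension for Option A could be as small as the sketch below. It assumes the prefix is separated by a single colon and that table and column names contain neither ":" nor "."; the FkTarget type and parseForeignKey name are hypothetical, not part of the existing parser.

```typescript
// Hypothetical result type; field names are illustrative.
interface FkTarget {
  project: string | null; // null means "resolves inside this catalog"
  table: string;
  column: string;
}

// Accepts "<TableName>.<column>" or "<otherproject>:<TableName>.<column>".
// Returns null when the string matches neither form.
function parseForeignKey(fk: string): FkTarget | null {
  const text = fk.trim();
  const colon = text.indexOf(":");
  const project = colon >= 0 ? text.slice(0, colon) : null;
  const rest = colon >= 0 ? text.slice(colon + 1) : text;

  // An explicit prefix must not be empty, and only one colon is allowed.
  if (project !== null && project.length === 0) return null;
  if (rest.indexOf(":") !== -1) return null;

  // Exactly one dot, with a non-empty table and column on either side.
  const dot = rest.indexOf(".");
  if (dot <= 0 || dot === rest.length - 1) return null;
  const table = rest.slice(0, dot);
  const column = rest.slice(dot + 1);
  if (column.indexOf(".") !== -1) return null;

  return { project, table, column };
}

const prefixedTarget = parseForeignKey("openera:Personnel.Personnel_ID");
const localTarget = parseForeignKey("Personnel.Personnel_ID");
const malformedTarget = parseForeignKey("Personnel");
```

Bare strings keep their current meaning because a missing prefix yields project: null rather than an error.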
Option B — add a top-level cross_project_references array
Add to each catalog/<project>.json:
"cross_project_references": [
  {
    "source_table": "AuditReport",
    "source_column": "Lead_Personnel_ID",
    "target_project": "openera",
    "target_table": "Personnel",
    "target_column": "Personnel_ID",
    "kind": "foreign-key"
  }
]
Pros: zero impact on the existing tables[] schema; a single read gives
the full outbound dependency graph for a project; supports
relationship-level entries (set kind: "declared-relationship" and omit
source_column).
Cons: introduces a second source of truth — the per-column FK string and
the cross-reference array can disagree.
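To make the "single read" point concrete, here is a sketch of how a consumer might type the Option B entries and group them into an outbound view. The field names mirror the JSON above; ReferenceKind lists only the two kinds mentioned in this issue, and outboundByProject is a hypothetical helper.

```typescript
// Kinds named in this issue; more may be needed in practice.
type ReferenceKind = "foreign-key" | "declared-relationship";

interface CrossProjectReference {
  source_table: string;
  source_column?: string; // omitted for relationship-level entries
  target_project: string;
  target_table: string;
  target_column: string;
  kind: ReferenceKind;
}

// Hypothetical helper: group outbound edges by the project they point at.
function outboundByProject(refs: CrossProjectReference[]): Record<string, string[]> {
  const result: Record<string, string[]> = Object.create(null);
  for (const ref of refs) {
    const edge = ref.source_table + " -> " + ref.target_table;
    const list = result[ref.target_project] || (result[ref.target_project] = []);
    // Several columns can point at the same table; list each table pair once.
    if (list.indexOf(edge) === -1) list.push(edge);
  }
  return result;
}

const auditDashboardRefs: CrossProjectReference[] = [
  {
    source_table: "AuditReport",
    source_column: "Lead_Personnel_ID",
    target_project: "openera",
    target_table: "Personnel",
    target_column: "Personnel_ID",
    kind: "foreign-key",
  },
];

const outbound = outboundByProject(auditDashboardRefs);
```

No tables[] traversal is needed here, which is the main ergonomic difference from Option A.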
Option C (recommended) — Option B as the canonical surface, Option A as syntactic sugar that the build script expands
The build script in ui-insight/AISPEG/scripts/build-governance-catalog.ts
would parse both forms and emit a flat lineage list. The catalog author
chooses whichever is more ergonomic per project.
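A rough sketch of the expansion step, assuming the shapes below. This is not the current contents of build-governance-catalog.ts. The Table and Column field names, LineageEdge, buildLineage, and dependentsOf are all hypothetical; only tables[], foreign_key, and cross_project_references come from this proposal.

```typescript
interface Column { name: string; type: string; foreign_key?: string }
interface Table { name: string; columns: Column[] }
interface CrossProjectReference {
  source_table: string;
  source_column?: string;
  target_project: string;
  target_table: string;
  target_column: string;
  kind: "foreign-key" | "declared-relationship";
}
interface Catalog {
  tables: Table[];
  cross_project_references?: CrossProjectReference[];
}
interface LineageEdge extends CrossProjectReference { source_project: string }

// Expand one prefixed FK string (Option A) into an Option B entry.
// Returns null for bare "<TableName>.<column>" strings, which stay local.
function expandPrefixedFk(table: string, column: Column): CrossProjectReference | null {
  const fk = column.foreign_key;
  if (!fk) return null;
  const colon = fk.indexOf(":");
  const dot = fk.lastIndexOf(".");
  if (colon <= 0 || dot <= colon + 1 || dot === fk.length - 1) return null;
  return {
    source_table: table,
    source_column: column.name,
    target_project: fk.slice(0, colon),
    target_table: fk.slice(colon + 1, dot),
    target_column: fk.slice(dot + 1),
    kind: "foreign-key",
  };
}

// Merge explicit entries with expanded FK strings into one flat list,
// dropping entries that both forms declare identically.
function buildLineage(catalogs: Record<string, Catalog>): LineageEdge[] {
  const edges: LineageEdge[] = [];
  const seen: Record<string, boolean> = {};
  for (const project of Object.keys(catalogs)) {
    const catalog = catalogs[project];
    if (!catalog) continue;
    const refs = (catalog.cross_project_references || []).slice();
    for (const table of catalog.tables) {
      for (const column of table.columns) {
        const expanded = expandPrefixedFk(table.name, column);
        if (expanded) refs.push(expanded);
      }
    }
    for (const ref of refs) {
      const key = [project, ref.source_table, ref.source_column || "",
        ref.target_project, ref.target_table, ref.target_column, ref.kind].join("|");
      if (seen[key]) continue;
      seen[key] = true;
      edges.push({ source_project: project, ...ref });
    }
  }
  return edges;
}

// Answers "what other projects depend on <table>?" from the flat list.
function dependentsOf(edges: LineageEdge[], project: string, table: string): string[] {
  const result: string[] = [];
  for (const e of edges) {
    if (e.target_project === project && e.target_table === table &&
        result.indexOf(e.source_project) === -1) {
      result.push(e.source_project);
    }
  }
  return result;
}

// Hypothetical catalogs declaring the same dependency in both forms.
const lineageSample: Record<string, Catalog> = {
  "audit-dashboard": {
    tables: [{ name: "AuditReport", columns: [
      { name: "Lead_Personnel_ID", type: "Integer", foreign_key: "openera:Personnel.Personnel_ID" },
    ] }],
    cross_project_references: [{
      source_table: "AuditReport", source_column: "Lead_Personnel_ID",
      target_project: "openera", target_table: "Personnel",
      target_column: "Personnel_ID", kind: "foreign-key",
    }],
  },
  openera: { tables: [{ name: "Personnel", columns: [
    { name: "Personnel_ID", type: "Integer" },
  ] }] },
};

const lineage = buildLineage(lineageSample);
const personnelDependents = dependentsOf(lineage, "openera", "Personnel");
```

Deduplicating on the full edge key means a dependency declared in both forms collapses to one edge, which softens the "second source of truth" concern from Option B without resolving genuinely conflicting entries.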
Acceptance criteria
Related