feat: full 33-table cBioPortal corpus sign-off (blocked on ingest) #72

@deanban

Problem

The source-semantic-hardening change was validated on a 12-table dev slice, but the proposal target is the full 33-table cBioPortal corpus. Full-corpus sign-off, along with the holdout-vs-dev-slice bias check in tasks.md §8.8, is still pending.

Currently blocked: only 12 of 33 tables have landed in Databricks. The remaining 21 cannot land until the cbioportal-omop-data-bridge runbook is completed, so evaluation cannot proceed yet.
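
A minimal sketch of how the ingest gate could be checked before kicking off evaluation. The table names are hypothetical placeholders (the real 33-table list is not given in this issue), and the landed list is assumed to come from whatever catalog listing the team already uses in Databricks.

```python
def ingest_gap(expected, landed):
    """Return the expected tables that have not landed yet, sorted by name."""
    return sorted(set(expected) - set(landed))


def ready_for_eval(expected, landed):
    """True only when every expected table has landed."""
    return not ingest_gap(expected, landed)


# Hypothetical placeholder names; substitute the real corpus table list.
expected = [f"table_{i:02d}" for i in range(1, 34)]
# Assumption: 12 tables landed, matching the state described above.
landed = expected[:12]

missing = ingest_gap(expected, landed)
print(f"{len(landed)} of {len(expected)} landed, {len(missing)} missing")
print("ready for full-corpus eval:", ready_for_eval(expected, landed))
```

A set difference keeps the check independent of table ordering, so it can be rerun as the runbook lands tables in any sequence.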

Proposed solution

Once ingest is unblocked:

  1. Complete the cbioportal-omop-data-bridge runbook to land all 33 tables.
  2. Run the full build + sema eval telemetry dump on the 33-table corpus.
  3. Run the holdout-vs-dev-slice bias comparison (tasks.md §8.8) to confirm that the few-shot library generalizes.
  4. Sign off OpenSpec tasks §10.1 and §10.4.
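
One possible shape for the step 3 comparison, sketched under assumptions: the per-table score, the `tolerance` threshold, and the function name `slice_bias` are all hypothetical, since tasks.md §8.8 is not reproduced here and may define the check differently.

```python
from statistics import mean


def slice_bias(dev_scores, holdout_scores, tolerance=0.05):
    """Compare mean per-table scores between the dev slice and the holdout.

    dev_scores / holdout_scores: dicts mapping table name -> eval score.
    tolerance: hypothetical maximum acceptable absolute gap in means.
    """
    dev_mean = mean(dev_scores.values())
    holdout_mean = mean(holdout_scores.values())
    gap = dev_mean - holdout_mean
    return {
        "dev_mean": dev_mean,
        "holdout_mean": holdout_mean,
        "gap": gap,
        "within_tolerance": abs(gap) <= tolerance,
    }


# Synthetic illustration values only, not real evaluation results.
dev = {"dev_a": 0.9, "dev_b": 0.8}
holdout = {"holdout_a": 0.7, "holdout_b": 0.8}

report = slice_bias(dev, holdout)
print(report)
```

A positive gap means the dev slice scores higher than the holdout, which would suggest the few-shot library is tuned to the 12 dev tables rather than generalizing to the remaining 21.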

Alternatives considered

  • Declaring the 12-table dev-slice results sufficient: rejected, because the proposal target is the full corpus and we want the holdout check before archiving the change.

Not closed by #63; this issue tracks post-merge follow-up work.

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
