Describe the bug
Running the staged L2 pipeline against the 12-table dev slice produced 52 regressions where previously-generated has_alias assertions disappeared. Root cause: the 12 Stage B few-shot examples all had empty synonyms: [] fields, which the LLM treated as a positive signal that synonyms should be omitted.
To reproduce
- Run the L2 build against the dev slice at a revision before commit 783266d.
- Diff assertions against a prior run.
- Observe 52 missing has_alias assertions concentrated on columns with obvious synonyms (e.g. PATIENT_ID ↔ patient).
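The diff step can be sketched roughly as follows. This is a minimal illustration, not the project's tooling: the assertion shape (dicts with `subject`, `predicate`, `object` keys) and the function names are assumptions made for the example.

```python
# Hypothetical assertion shape: {"subject": ..., "predicate": ..., "object": ...}.
# The real L2 output format may differ; adapt the key names as needed.

def assertion_keys(assertions, predicate):
    """Collect (subject, object) pairs for one predicate."""
    return {
        (a["subject"], a["object"])
        for a in assertions
        if a["predicate"] == predicate
    }

def missing_assertions(prior, current, predicate="has_alias"):
    """Return pairs present in the prior run but absent from the current run."""
    return sorted(assertion_keys(prior, predicate) - assertion_keys(current, predicate))

prior_run = [
    {"subject": "PATIENT_ID", "predicate": "has_alias", "object": "patient"},
    {"subject": "PATIENT_ID", "predicate": "has_type", "object": "identifier"},
]
current_run = [
    {"subject": "PATIENT_ID", "predicate": "has_type", "object": "identifier"},
]

print(missing_assertions(prior_run, current_run))  # [('PATIENT_ID', 'patient')]
```

Comparing on (subject, object) pairs rather than on raw output text keeps the diff insensitive to ordering changes between runs.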
Expected behavior
Few-shot examples should demonstrate realistic synonym output, not empty arrays. The LLM should emit has_alias whenever a plausible synonym exists.
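One way to keep this expectation from regressing is a small guard over the example library. The sketch below assumes each few-shot example is a dict with `column` and `synonyms` keys; that shape and the function name are hypothetical, not taken from src/sema/engine/.

```python
def examples_without_synonyms(examples):
    """Return the column names of examples whose synonyms list is empty or missing."""
    return [e["column"] for e in examples if not e.get("synonyms")]

# Illustrative data only; not the actual Stage B examples.
few_shot_examples = [
    {"column": "PATIENT_ID", "synonyms": ["patient"]},
    {"column": "ADMIT_DT", "synonyms": []},
]

print(examples_without_synonyms(few_shot_examples))  # ['ADMIT_DT']
```

A check like this could run in CI and fail when it returns a non-empty list, on the assumption that every example is meant to demonstrate synonym output.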
Environment
- Affects staged L2 builds prior to commit 783266d.
- src/sema/engine/ (Stage B few-shot example library).
Fix: populated all 12 Stage B examples with realistic synonyms and switched the prompt to compact JSON to save tokens while making the synonym field visually prominent. Fixed in commit 783266d. Closed by #63.
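As an illustration of the compact JSON rendering mentioned in the fix, the sketch below serializes one example without indentation or padding. The field names and the helper are assumptions for the example; the actual prompt template in commit 783266d may look different.

```python
import json

def render_example(example):
    """Serialize a few-shot example as compact JSON (no spaces after separators)."""
    return json.dumps(example, separators=(",", ":"), ensure_ascii=False)

# Hypothetical example shape with a populated synonyms field.
example = {"column": "PATIENT_ID", "synonyms": ["patient", "patient identifier"]}

print(render_example(example))
# {"column":"PATIENT_ID","synonyms":["patient","patient identifier"]}
```

Compared with indented output, the compact form drops whitespace tokens and keeps the synonyms field on the same line as the column it belongs to.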