feat: generic few-shot base layer with domain pack composition #76

@deanban

Problem

few_shot.py was healthcare-only. For every other industry, format_examples() returned an empty result, so prompts ran effectively zero-shot: unless the workspace happened to be healthcare, there was no teaching signal for Stage A entity identification, Stage B property classification, or Stage C value decoding.

Proposed solution

Introduce a generic base layer that composes with domain-specific overlays:

  • New few_shot_generic.py with industry-agnostic archetypes: 5 Stage A examples (event-stream, transaction, dimension, bridge, and wide tables), 8 Stage B examples (covering identifier, temporal, numeric, categorical, boolean, ordinal, and free-text columns), and 4 Stage C examples (status, Y/N, prefix-encoded, and ordinal decoding).
  • Split existing healthcare examples into few_shot_healthcare.py.
  • Rewrite few_shot.py as a thin registry plus compose_examples(), which prepends the generic examples to any domain-specific overlay.
  • format_examples(None, stage) now returns the generic examples instead of an empty result.

Every prompt gets a teaching signal regardless of industry, and healthcare still gets its specialized overlay on top.
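
As a rough illustration of the composition described above, here is a minimal sketch. Only the names compose_examples() and format_examples() come from this issue. The argument order for compose_examples(), the dict-of-lists layout, the DOMAIN_REGISTRY name, the "A"/"B"/"C" stage keys, and the placeholder example strings are all assumptions made for the sketch, not the actual contents of few_shot_generic.py or few_shot_healthcare.py.

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for few_shot_generic.py and few_shot_healthcare.py.
# The real example payloads are not shown in this issue; these are placeholders.
GENERIC_EXAMPLES: Dict[str, List[str]] = {
    "A": ["generic A: event-stream table", "generic A: bridge table"],
    "B": ["generic B: identifier column"],
    "C": ["generic C: Y/N decoding"],
}

HEALTHCARE_EXAMPLES: Dict[str, List[str]] = {
    "A": ["healthcare A: placeholder overlay example"],
    "B": ["healthcare B: placeholder overlay example"],
}

# Thin registry (assumed shape): domain name -> overlay examples keyed by stage.
DOMAIN_REGISTRY: Dict[str, Dict[str, List[str]]] = {
    "healthcare": HEALTHCARE_EXAMPLES,
}


def compose_examples(domain: Optional[str], stage: str) -> List[str]:
    """Return the generic examples for a stage, followed by any domain overlay."""
    generic = GENERIC_EXAMPLES.get(stage, [])
    # Unknown domains and None both fall back to an empty overlay.
    overlay = DOMAIN_REGISTRY.get(domain, {}).get(stage, []) if domain else []
    return [*generic, *overlay]


def format_examples(domain: Optional[str], stage: str) -> str:
    """Join the composed examples into one prompt-ready block of text."""
    return "\n\n".join(compose_examples(domain, stage))
```

With this shape, format_examples(None, "A") and format_examples("retail", "A") both yield the generic Stage A examples, while format_examples("healthcare", "A") yields the same generic examples followed by the healthcare overlay.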

Alternatives considered

  • Keep zero-shot for non-healthcare domains — rejected; measurably worse Stage A/B quality on non-healthcare data.
  • One monolithic example list with no domain axis — rejected; dilutes healthcare-specific examples and bloats every prompt.

Metadata

Labels: enhancement