Skip to content

fix(public): de-hardcode wave 1 + retrieval-profiles spec + evidence roadmap#201

Merged
mbachaud merged 2 commits into
masterfrom
fix/public-dehardcode-wave1
Jun 10, 2026
Merged

fix(public): de-hardcode wave 1 + retrieval-profiles spec + evidence roadmap#201
mbachaud merged 2 commits into
masterfrom
fix/public-dehardcode-wave1

Conversation

@mbachaud

Copy link
Copy Markdown
Owner

Output of the 2026-06-09 research → examine → document loop (3 parallel research agents: issues/PRs evidence audit, public-product hardcoding audit, retrieval-tuning evidence study).

De-hardcode wave 1 (runtime retrieval code)

  • Tagger: owner-project vocabulary (BigEd/BookKeeper/CosmicTasha/ModuleHub/Dr. Ders/FleetDB/SwiftWing21 + matching tech terms) removed from _PROJECT_ENTITIES/_TECH_TERMS — these drive the tag_exact tier, so every public user's corpus was being tagged against the owner's project names. Operators extend via HELIX_TAGGER_EXTRA_ENTITIES ("Label:Text" pairs) / HELIX_TAGGER_EXTRA_TERMS.
  • lexical_rescue: removed the product's self-boost (helix query + helix-context path +2.0, /helix.toml +2.0) and the owner's /_worktrees/ −1.0 layout penalty; replaced with a generic +2.0 when a query term (len≥4) names a path segment/filename stem. Structural bias toward the product's own repo is gone (helix/acme now score identically).
  • helix.toml template: owner-project synonym rows and the personal mem_sync.watch_dirs path scrubbed.
  • Dashboard placeholders: F:\path\to\new.genome.dbpath/to/new.genome.db.
  • bridge.py How-to-Use renders the instance helix_base_url (was hardcoded :11437); installer service message threads the real --port.

Docs (same loop)

Test plan

  • 16 new tests (test_public_dehardcode.py): owner-vocab absence, env extension, generic-vs-helix score equality, template neutrality
  • Updated fixtures in density-gate / shard-router / bench-needles / installer tests
  • 157 passed across touched suites on Windows; 114 passed + compileall clean in sandbox

mbachaud and others added 2 commits June 9, 2026 17:46
…ing/config, generic path-affinity rescue

Public-product readiness pass. The shipped defaults baked the original
author's private project vocabulary and machine paths into runtime
retrieval surfaces. That matters beyond cosmetics: CpuTagger's
EntityRuler patterns and _TECH_TERMS become promoter tags at ingest,
and promoter tags drive the Tier-1 tag_exact retrieval tier
(tag_exact_weight = 3.0) — so any public deployment was spending its
strongest lexical tier matching another person's project names.
Likewise lexical_rescue's path bonus hardwired +2.0 boosts for
"helix-context" paths and "/helix.toml", biasing source rescue toward
the product's own repository on every corpus.

Changes:

* tagger.py — removed owner-project entries from _PROJECT_ENTITIES
  (BigEd, BigEd CC, BookKeeper, CosmicTasha, two-brain, ModuleHub,
  Dr. Ders, FleetDB, fleet.toml as PRODUCT; SwiftWing21 as ORG) and
  from _TECH_TERMS (biged, bookkeeper, cosmictasha, modulehub,
  fleetdb, "dr ders"). The product's own stack (helix, agentome,
  scorerift, splade, ...) and generic IR/NLP vocabulary stay. New
  module-level _extend_vocabulary_from_env() reads
  HELIX_TAGGER_EXTRA_ENTITIES (comma-separated "Label:Text" pairs)
  and HELIX_TAGGER_EXTRA_TERMS (comma-separated lowercased terms)
  once at import so operators re-add deployment vocabulary without
  forking; malformed pairs are skipped with a warning. Comments note
  the previously owner-specific defaults removed for public release.

* retrieval/lexical_rescue.py — dropped the hardwired helix-in-query
  + "helix-context"-in-path +2.0, the "/helix.toml" +2.0, and the
  owner-specific "/_worktrees/" -1.0 penalty. Replaced with one
  corpus-neutral heuristic: +2.0 (once) when any query term of
  len >= 4 names a whole path segment or filename stem of the
  candidate — "acme" boosts /acme/ and acme.toml exactly the way
  "helix" boosts helix.toml. Existing generic scores (config-ext
  +1.5/+1.0, substring agreement +0.4, tests-path -0.75) unchanged.
  Module docstring now states the full scoring contract.

* helix.toml — [synonyms]: removed owner rows biged / bookkeeper /
  cosmictasha / fleet, dropped "scorerift" from the scoring row and
  "biged" from the agents row; generic software vocabulary kept.
  [mem_sync]: personal watch_dirs path replaced with an empty list
  plus an example comment.

* launcher templates (dashboard.html, components/database_panel.html)
  — placeholder F:\path\to\new.genome.db -> path/to/new.genome.db.

* bridge.py — generated SHARED_CONTEXT.md "How to Use" line now
  renders self.helix_base_url instead of hardcoding
  http://127.0.0.1:11437.

* launcher/installer.py + launcher/app.py — install_service() grew a
  defaulted port param (11438) used in the printed dashboard URL,
  threaded from the CLI's existing --port flag through
  _handle_service_command.

Tests: new tests/test_public_dehardcode.py (16 tests) pins (a) no
owner vocabulary in tagger patterns/terms, (b) env-var extension via
the importlib.reload pattern, (c) the generic path-segment boost,
structural score equality between helix-context and acme-context
paths, worktrees penalty removed, tests penalty kept, and end-to-end
rescue preference for a query-term path match, (d) helix.toml parses
with no owner synonym keys and empty watch_dirs. Owner-named fixtures
replaced with generic equivalents in test_density_gate.py,
test_shard_router.py, test_bench_needles.py;
test_launcher_installer.py call-through assertion updated to expect
port=11438.

Out of scope, noted for wave 2: com.swiftwing21.helix-launcher plist
id (installer + deploy templates + tests pin it), "swiftwing" org
fixtures in test_server/test_mcp_server, biged_skills empirical-curve
comments in test_abstain_tier.py / pipeline/tier_logic.py, and
docstring examples in knowledge_store.py / storage/ddl.py.
…-steps roadmap

From the 2026-06-09 research/examine/document loop (3 parallel research
agents over issues+PRs, hardcoding audit, and tuning-evidence study).
Answers the profiles question: one pipeline cannot serve all workloads,
but only ~6 knobs are corpus-sensitive — design = auto-calibration +
classifier-owned query-time adaptation + 3 small corpus profiles.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant