feat(skillopt): implement SkillOpt skill-document optimizer #75
Draft
wesleysimplicio wants to merge 1 commit into
Add the SkillOpt loop (Rollout -> Reflect -> Edit -> Gate) from https://microsoft.github.io/SkillOpt/ as a tool for this skill ecosystem. The skill markdown is the only trainable artifact; the target model stays frozen and edits are accepted only when they improve a held-out task split, with a rejected-edit buffer providing negative feedback and an edit budget acting as the textual learning rate.

- scripts/skillopt/engine.js: deterministic, dependency-free engine (optimize, reflect, applyEdits, evaluateSplit) with a pluggable scorer so real LLM adapters can replace the default heuristic.
- bin/skillopt.js + cli.js subcommand: optimize a SKILL.md against a task suite, emit best_skill.md, an optional report, and a content-addressed receipt under .catalog/receipts/.
- .skills/skillopt: skill manifest plus runnable example fixtures.
- Unit tests (engine + CLI) and a Playwright CLI e2e with evidence.

https://claude.ai/code/session_01MuMv2kN3x5s6UwjXMap2ZP
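The loop described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/skillopt/engine.js`: the function names echo the ones listed above, but their signatures, the suite shape (`train` / `holdout` arrays of tasks with an `expect` list), the phrase-matching heuristic scorer, and the add-only edit ops are all assumptions made for the example.

```javascript
// Hypothetical sketch of a Rollout -> Reflect -> Edit -> Gate loop.
// Signatures and data shapes are assumptions, not the PR's actual API.

// Assumed default heuristic: fraction of expected phrases present in the skill text.
function heuristicScorer(skill, task) {
  if (task.expect.length === 0) return 1;
  const hits = task.expect.filter((phrase) => skill.includes(phrase)).length;
  return hits / task.expect.length;
}

// Rollout: mean score of the skill over one task split.
function evaluateSplit(skill, tasks, scorer) {
  if (tasks.length === 0) return 0;
  const total = tasks.reduce((sum, task) => sum + scorer(skill, task), 0);
  return total / tasks.length;
}

// Reflect: propose one "add" op per expected phrase the skill is missing.
function reflect(skill, tasks) {
  const ops = [];
  for (const task of tasks) {
    for (const phrase of task.expect) {
      const alreadyProposed = ops.some((o) => o.text === phrase);
      if (!skill.includes(phrase) && !alreadyProposed) {
        ops.push({ op: 'add', text: phrase });
      }
    }
  }
  return ops;
}

// Edit: apply ops to the document (only "add" is modelled in this sketch).
function applyEdits(skill, ops) {
  return ops.reduce((doc, o) => (o.op === 'add' ? `${doc}\n- ${o.text}` : doc), skill);
}

function optimize(skill, suite, { rounds = 3, budget = 1, scorer = heuristicScorer } = {}) {
  // Gate on the held-out split when it exists, otherwise fall back to train.
  const usedHoldout = suite.holdout.length > 0;
  const gateTasks = usedHoldout ? suite.holdout : suite.train;
  const rejected = new Set(); // rejected-edit buffer (negative feedback)
  let best = skill;
  let bestGate = evaluateSplit(best, gateTasks, scorer);

  for (let round = 0; round < rounds; round++) {
    const ops = reflect(best, suite.train)
      .filter((o) => !rejected.has(JSON.stringify(o))) // skip ops rejected earlier
      .slice(0, budget); // edit budget acts as the textual learning rate
    if (ops.length === 0) break;

    const candidate = applyEdits(best, ops);
    const candidateGate = evaluateSplit(candidate, gateTasks, scorer);
    if (candidateGate > bestGate) {
      best = candidate; // strict improvement: accept
      bestGate = candidateGate;
    } else {
      ops.forEach((o) => rejected.add(JSON.stringify(o))); // remember the failure
    }
  }
  return { best, bestGate, usedHoldout };
}

const suite = {
  train: [{ expect: ['cite sources', 'use metric units'] }],
  holdout: [{ expect: ['cite sources'] }],
};
const result = optimize('# Skill\n- use metric units', suite, { rounds: 2, budget: 1 });
console.log(result.bestGate, result.usedHoldout); // 1 true
```

Because the scorer is passed in as an option, swapping the heuristic for a function that calls a real model would leave the loop itself unchanged, which is the property the commit message highlights.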
Summary
Implements SkillOpt (Microsoft Research, Executive Strategy for Self-Evolving Agent Skills) as a tool for this repo's `.skills/` ecosystem. SkillOpt treats a natural-language skill document as the only trainable artifact and optimizes it for a frozen target model through a four-stage loop (Rollout -> Reflect -> Edit -> Gate):

- … `budget` add/delete/replace ops (the textual learning rate), skipping any op already in the rejected-edit buffer.

Output is `best_skill.md`, plus an optional run report and a content-addressed receipt under `.catalog/receipts/` (matching the repo's receipt schema).

What's included

- `scripts/skillopt/engine.js`: deterministic, dependency-free engine (`optimize`, `reflect`, `applyEdits`, `evaluateSplit`). The rollout scorer is pluggable (`opts.scorer`) so a real LLM adapter can replace the default heuristic without touching the loop. Runs fully offline.
- `bin/skillopt.js` + `cli.js` `skillopt` subcommand: `npx ... skillopt --suite <suite.json> [--skill ...] [--out best_skill.md] [--report ...] [--rounds N] [--budget N] [--no-receipt] [--json]`.
- `.skills/skillopt/SKILL.md`: skill manifest, plus runnable `example.skill.md` / `example.suite.json`.
- `tests/unit/skillopt.test.js` (engine + CLI, 15 tests) and `tests/e2e/skillopt.spec.ts` (Playwright CLI e2e with evidence attachments).
- `package.json` bin, README "Companion tooling", `CHANGELOG.md`, `.skills/README.md`, and a `.gitignore` whitelist for `scripts/skillopt/`.

Design notes

- … `best` (acceptance requires `candidateGate > bestGate`). When no `holdout` tasks exist the gate falls back to the train split and reports `usedHoldout: false`.
- … `2` cleanly on bad input. Regex used in `replace` edits is escaped (no ReDoS / injection).

Test plan

- `npm test`: 57 pass / 5 pre-existing skips, 0 fail
- `npm run lint`: 0 errors
- `npx playwright test --project=chromium`: 10 pass / 1 pre-existing skip (incl. new `skillopt.spec.ts`)
- … `0.5 -> 1`, `EXIT_SIGNAL: true`, `best_skill.md` gains the missing directives
- … `best_skill.md` diff workflow reads well

https://claude.ai/code/session_01MuMv2kN3x5s6UwjXMap2ZP
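The design notes say the regex used in `replace` edits is escaped. One common way to do that in JavaScript is sketched below; it is an illustration under assumptions, not the PR's code. `escapeRegExp` and `applyReplace` are hypothetical helper names, and the real edit-op shape in `engine.js` may differ.

```javascript
// Escape every regex metacharacter so the search text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical replace op: swap every literal occurrence of `find` in `doc`.
function applyReplace(doc, find, replacement) {
  const pattern = new RegExp(escapeRegExp(find), 'g');
  // A replacer function keeps sequences such as `$&` in `replacement` literal.
  return doc.replace(pattern, () => replacement);
}

// `(a+)+` would be a backtracking-prone pattern if compiled unescaped;
// here it is only ever treated as plain text.
const doc = 'Run (a+)+ checks. Cost: $5.';
console.log(applyReplace(doc, '(a+)+', 'all')); // Run all checks. Cost: $5.
```

Escaping the search text means a skill edit can never smuggle in a pattern of its own, and returning the replacement from a function avoids the separate pitfall of `$`-sequences being expanded by `String.prototype.replace`.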
Generated by Claude Code