v0.2.0 — Distributed Prompting, Hardened
What changed
This is the first public-quality release of talking-cli. The project has gone through a full test-driven development cycle (Phase 0 through H), emerging with a hardened methodology, reproducible evidence, and a self-auditing tool.
Methodology
- Distributed Prompting methodology formalized in PHILOSOPHY.md (four channels, four rules, budget, five anti-patterns)
- Prompt-On-Call as the concrete implementation pattern
- Adversarial Case Study documenting four known failure modes with mitigations
Evidence
- 2x2 ablation benchmark on GLM-5.1 (15 curated tasks): 67% fewer tokens and a 26pp higher pass rate versus baseline
- MCP ecosystem audit: 4 Anthropic servers, 68 scenarios, 0/68 returned actionable guidance
- MCP-Atlas public corpus: 10 tasks adapted from the real sample_tasks.csv (CC-BY-4.0)
- SkillsBench independent validation: comprehensive skills at P99.5 degrade performance by 2.9pp
Tool
- audit: H1-H4 heuristics for skill directories (ruleset v1.0.0)
- audit-mcp: M1-M4 heuristics for MCP servers (static + deep runtime mode)
- init: Scaffold a new skill directory with passing templates
- optimize: Generate optimization plans with --apply auto-fix (usage sketch below)
- Self-audit score: 100/100 (CI-enforced, badge in README)
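
As a rough illustration of how these commands fit together, here is a hedged usage sketch: the subcommand names and the --apply flag appear in these notes, but the binary name (talking-cli) and the target paths are assumptions.

```sh
# Sketch only: subcommands and --apply come from the release notes;
# the binary name and paths are assumed.
talking-cli init my-skill              # scaffold a new skill directory
talking-cli audit my-skill             # score it against the H1-H4 heuristics
talking-cli optimize my-skill --apply  # generate a plan and auto-fix it
```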
Security
- SECURITY.md with threat model documentation
- --no-spawn flag for safe static-only MCP analysis (example below)
- --deep warning in README
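
A minimal sketch of the two analysis modes, assuming audit-mcp takes a server path as its argument; only the subcommand and the two flags come from these notes.

```sh
# Static-only analysis: does not spawn the audited MCP server process.
talking-cli audit-mcp ./some-mcp-server --no-spawn

# Deep runtime mode launches the server; heed the README warning before
# pointing --deep at untrusted code.
talking-cli audit-mcp ./some-mcp-server --deep
```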
Infrastructure
- GitHub Actions CI workflow (self-audit on every PR, fails below a score of 80; gate sketch after this list)
- Anti-bloat regression test (SKILL.md <=150 lines, persona count <=2)
- Heuristic ruleset versioning (v1.0.0)
- Publication-standard benchmark reports (PROVEN/SUCCESS/PARTIAL/FAILURE verdict tiers)
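
A hedged sketch of the CI gate as a shell step, not the committed workflow: the 80-point threshold and the 150-line SKILL.md limit come from these notes, while the score-parsing command is hypothetical since the audit output format is not documented here.

```sh
#!/usr/bin/env sh
set -e

# Anti-bloat regression check: SKILL.md must stay at or under 150 lines.
[ "$(wc -l < SKILL.md)" -le 150 ] || { echo "SKILL.md exceeds 150 lines"; exit 1; }

# Self-audit gate: fail the build below 80/100.
# Hypothetical parsing; the real audit output format may differ.
SCORE=$(talking-cli audit . | grep -oE '[0-9]+' | head -n 1)
[ "$SCORE" -ge 80 ] || { echo "self-audit score $SCORE is below 80"; exit 1; }
```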
Stats
- 36 test files, 295 tests passing
- Self-audit: 100/100 (H1=100, H2=100, H3=100, H4=100)
- Ruleset version: 1.0.0