Releases: DrDexter6000/talking-cli

v0.2.0 — Distributed Prompting, Hardened

27 Apr 16:22

What changed

This is the first public-quality release of talking-cli. The project has completed a full test-driven development cycle (Phase 0 through H), emerging with a hardened methodology, reproducible evidence, and a self-auditing tool.

Methodology

  • Distributed Prompting methodology formalized in PHILOSOPHY.md (four channels, four rules, budget, five anti-patterns)
  • Prompt-On-Call as the concrete implementation pattern
  • Adversarial Case Study documenting four known failure modes with mitigations

Evidence

  • 2x2 ablation benchmark on GLM-5.1 (15 curated tasks): -67% tokens, +26pp pass rate
  • MCP ecosystem audit: 4 Anthropic servers, 68 scenarios, 0/68 returned actionable guidance
  • MCP-Atlas public corpus: 10 tasks adapted from real sample_tasks.csv (CC-BY-4.0)
  • SkillsBench independent validation: comprehensive skills at P99.5 degrade pass rate by 2.9pp

Tool

  • audit: H1-H4 heuristics for skill directories (ruleset v1.0.0)
  • audit-mcp: M1-M4 heuristics for MCP servers (static + deep runtime mode)
  • init: Scaffold a new skill directory with passing templates
  • optimize: Generate optimization plans with --apply auto-fix
  • Self-audit score: 100/100 (CI-enforced, badge in README)
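
A typical workflow across the four subcommands above might look like the following sketch. The command and flag names are from this release; the target paths are placeholders:

```shell
# Illustrative workflow; ./my-skill and ./server are placeholder paths.
talking-cli init ./my-skill              # scaffold a skill directory with passing templates
talking-cli audit ./my-skill             # run H1-H4 heuristics (ruleset v1.0.0)
talking-cli optimize ./my-skill --apply  # generate an optimization plan and auto-fix
talking-cli audit-mcp ./server           # run M1-M4 heuristics against an MCP server
```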

Security

  • SECURITY.md with threat model documentation
  • --no-spawn flag for safe static-only MCP analysis
  • --deep warning in README
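
The two analysis modes above correspond to two invocations (flags are from this release; paths are placeholders):

```shell
# Safe default: static-only analysis, no server process is spawned.
talking-cli audit-mcp --no-spawn ./untrusted-server

# Deep runtime mode actually runs the server; per the README warning,
# use it only on servers you trust.
talking-cli audit-mcp --deep ./trusted-server
```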

Infrastructure

  • GitHub Actions CI workflow (self-audit on every PR, fails below 80)
  • Anti-bloat regression test (SKILL.md <=150 lines, persona count <=2)
  • Heuristic ruleset versioning (v1.0.0)
  • Publication-standard benchmark reports (PROVEN/SUCCESS/PARTIAL/FAILURE verdict tiers)
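
As a rough sketch, the CI gate described above could be wired up as follows. The workflow name, job layout, and the `--fail-below` flag are assumptions for illustration; the 80 threshold and the run-on-every-PR behavior are from this release:

```yaml
# Hypothetical workflow sketch; assumes the audit command exits non-zero
# when the self-audit score falls below the given threshold.
name: self-audit
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: talking-cli audit . --fail-below 80  # flag name is hypothetical
```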

Stats

  • 36 test files, 295 tests passing
  • Self-audit: 100/100 (H1=100, H2=100, H3=100, H4=100)
  • Ruleset version: 1.0.0