The auto-pilot recovery layer for agents. A zero-runtime-dependency TypeScript library and CLI that scores output confidence, detects ungrounded claims with GSAR-style typed grounding, monitors behavioral drift, and decides a recovery action the moment an agent leaves the expected path.
An agent started inventing product IDs that do not exist. It took three days to notice, because the output looked plausible, and by then customers had the problem. AutoMend exists to catch that the moment it happens, and to act.
AutoMend sits around an agent step. You feed it the signals you already have (a self-reported confidence, claims classified against your evidence, behavioral metrics, a tool result), and it turns detect, decide, act into one deterministic, auditable loop: score the output, detect the failure mode, decide a recovery (rollback, retry with adjusted guardrails, escalate, or ask a human), run it through your executors, and seal every decision in a tamper-evident audit trail.
Honest by design. AutoMend is a deterministic policy and scoring engine, not a model. It does not call an LLM, and it does not decide on its own whether a claim is true. It scores the classifications and signals you supply, makes the recovery decision repeatable and auditable, and ships heuristic detectors as a starting point you can replace with a real matcher. Zero required runtime dependencies, node-free core, ESM + CJS, SLSA provenance on every release.
```shell
pnpm add @takk/automend
# or: npm install @takk/automend
# or: yarn add @takk/automend
# or: bun add @takk/automend
```

The core has zero required runtime dependencies. Optional peers are sibling @takk packages, installed only if you bridge to them.
```typescript
import { createAutoMend } from '@takk/automend';

const automend = createAutoMend();

const report = await automend.guard({
  // confidence signals you already measure, each in [0, 1]
  confidence: [
    { name: 'self-reported', value: 0.32 },
    { name: 'evidence-coverage', value: 0.4, weight: 2 },
  ],
  // claims classified against your evidence (by a model, a matcher, or a human)
  claims: [
    { id: 'c1', type: 'grounded', evidenceType: 'observed' },
    { id: 'c2', type: 'contradicted', evidenceType: 'observed' },
  ],
  // recovery actions are your callbacks; AutoMend decides which one to run
  executors: {
    retry: () => regenerateWithGuardrails(),
    escalate: () => pageTheOncall(),
    askHuman: () => openReviewTask(),
  },
});

if (!report.healthy) {
  console.log(report.decision?.strategy); // e.g. "escalate"
}
```

`guard` scores the confidence, runs the GSAR grounding math over your claims, picks the most severe issue, decides a recovery strategy under a safe-mode policy, runs the matching executor, and records every step in the audit trail.
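The example above passes weighted signals to `guard`, but this README does not spell out how they are combined. As a rough illustration only, the sketch below assumes a weighted arithmetic mean with a default weight of 1. `Signal` and `aggregateConfidence` are hypothetical names, not AutoMend exports, and the library's actual aggregation may differ.

```typescript
// Hypothetical shape for illustration; not an AutoMend type.
interface Signal {
  name: string;
  value: number; // expected in [0, 1]
  weight?: number; // assumption: defaults to 1 when omitted
}

// Assumption: signals are combined as a weighted arithmetic mean.
function aggregateConfidence(signals: Signal[]): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const signal of signals) {
    const weight = signal.weight ?? 1;
    // Skip signals that cannot contribute a meaningful value.
    if (!(weight > 0) || Number.isNaN(signal.value)) continue;
    // Clamp each value into [0, 1] before weighting.
    weightedSum += weight * Math.min(1, Math.max(0, signal.value));
    totalWeight += weight;
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}

const score = aggregateConfidence([
  { name: 'self-reported', value: 0.32 },
  { name: 'evidence-coverage', value: 0.4, weight: 2 },
]);

// (0.32 * 1 + 0.4 * 2) / 3
console.log(score.toFixed(3)); // prints "0.373"
```

Under this assumption the heavier `evidence-coverage` signal pulls the result toward 0.4, which is the point of giving the signals you trust more a larger weight.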
The detectors are heuristic and dependency-free. They turn raw output into the classifications the scorers consume, so the loop works before you wire in anything smarter.
```typescript
import { classifyClaims, detectLoop, detectCorruption, loopIssue } from '@takk/automend/detectors';

// lexical grounding matcher: text + evidence -> classified claims
const claims = classifyClaims(
  [{ id: 'c1', text: 'the order shipped on monday' }],
  ['observation: the order shipped on monday from the warehouse'],
); // -> [{ id: 'c1', type: 'grounded', ... }]

// loop / recursion detector over step fingerprints
const loop = detectLoop(['search', 'search', 'search']); // -> { looping: true, ... }

// feed a detected loop straight into guard
await automend.guard({ issues: [loopIssue(loop)].filter(Boolean), executors });

// output corruption (empty, control-character-laden, or validator-failing)
detectCorruption('{not json', {
  validate: (t) => {
    try {
      JSON.parse(t);
      return true;
    } catch {
      return false;
    }
  },
});
```

Swap any detector for a real natural-language inference model or an embedding matcher when you need higher fidelity; the rest of the loop is unchanged.
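To make "lexical heuristic" concrete, here is a minimal sketch of a token-overlap matcher. It is not the implementation behind `classifyClaims`: the function name `classifyByOverlap`, the 0.8 threshold, and the two-way `grounded`/`ungrounded` result are all assumptions chosen for illustration.

```typescript
type ClaimType = 'grounded' | 'ungrounded';

interface TextClaim {
  id: string;
  text: string;
}

interface ClassifiedClaim {
  id: string;
  type: ClaimType;
  overlap: number; // best fraction of claim tokens found in one evidence item
}

// Lowercase and split on anything that is not an ASCII letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Hypothetical matcher: a claim counts as grounded when enough of its
// tokens appear in a single evidence string (threshold is an assumption).
function classifyByOverlap(
  claims: TextClaim[],
  evidence: string[],
  threshold = 0.8,
): ClassifiedClaim[] {
  const evidenceSets = evidence.map((item) => new Set(tokenize(item)));
  return claims.map((claim): ClassifiedClaim => {
    const tokens = tokenize(claim.text);
    let best = 0;
    for (const set of evidenceSets) {
      const hits = tokens.filter((token) => set.has(token)).length;
      const overlap = tokens.length === 0 ? 0 : hits / tokens.length;
      best = Math.max(best, overlap);
    }
    return {
      id: claim.id,
      type: best >= threshold ? 'grounded' : 'ungrounded',
      overlap: best,
    };
  });
}

const results = classifyByOverlap(
  [
    { id: 'c1', text: 'the order shipped on monday' },
    { id: 'c2', text: 'product ID 9931 is in stock' },
  ],
  ['observation: the order shipped on monday from the warehouse'],
);
console.log(results.map((r) => `${r.id}:${r.type}`).join(' ')); // prints "c1:grounded c2:ungrounded"
```

A matcher like this only sees shared words. It cannot tell a supported claim from one that reuses the same words to say the opposite, which is the gap a natural-language inference model or embedding matcher is meant to close.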
Every detection, decision, and outcome is recorded append-only and sealed with a SHA-256 hash chain via the Web Crypto API. This supports the record-keeping (automatic event logging) requirement that regulations such as EU AI Act Article 12 place on high-risk systems. It is an integrity seal, not a digital signature: it proves the log was not altered after sealing, not who produced it.
```typescript
import { createAuditLog } from '@takk/automend/audit';

const log = createAuditLog({ id: 'run-42' });
log.append('detection', 'low confidence', { score: 0.31 });
log.append('decision', 'escalate');

const seal = await log.seal(); // { algorithm: 'sha-256', root: '...', count: 2 }
(await log.verify(seal)).valid; // true; flips to false if any entry is altered
```

The `createAutoMend` facade wires this in automatically: read `automend.audit` for the live log.
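For readers unfamiliar with hash chains, the sketch below shows the general technique with the Web Crypto API: each link hashes the previous link together with the next entry, so changing any entry changes the final root. The `Entry` shape, the helper names, and the use of `JSON.stringify` as the serialization are assumptions for illustration; AutoMend's actual seal format may differ.

```typescript
// Hypothetical entry shape for illustration; not AutoMend's record format.
interface Entry {
  kind: string;
  message: string;
  data?: unknown;
}

// SHA-256 of a UTF-8 string, hex-encoded, using the global Web Crypto API.
async function sha256Hex(input: string): Promise<string> {
  const bytes = new TextEncoder().encode(input);
  const digest = await crypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, '0')).join('');
}

// Each link hashes the previous link plus the serialized entry.
// Assumption: JSON.stringify is stable enough here; a production log
// would need a canonical serialization.
async function chainRoot(entries: Entry[]): Promise<string> {
  let previous = '';
  for (const entry of entries) {
    previous = await sha256Hex(previous + JSON.stringify(entry));
  }
  return previous;
}

async function demo(): Promise<{ intact: boolean; tampered: boolean }> {
  const entries: Entry[] = [
    { kind: 'detection', message: 'low confidence', data: { score: 0.31 } },
    { kind: 'decision', message: 'escalate' },
  ];
  const root = await chainRoot(entries);

  // Recomputing over the same entries reproduces the root.
  const intact = (await chainRoot(entries)) === root;

  // Altering the first entry changes every later link, and so the root.
  const altered = entries.map((e, i) => (i === 0 ? { ...e, data: { score: 0.99 } } : e));
  const tampered = (await chainRoot(altered)) !== root;

  return { intact, tampered };
}

demo().then((result) => console.log(result)); // { intact: true, tampered: true }
```

Because the root depends on every entry in order, a verifier only needs the entries and the sealed root to detect an edit, a deletion, or a reordering.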
Each module is importable from its own subpath, or from the root.
| Subpath | What it does |
|---|---|
| `@takk/automend` | `createAutoMend().guard()`, the unified detect, decide, act loop |
| `@takk/automend/confidence` | Aggregate weighted signals into a confidence and a verdict |
| `@takk/automend/grounding` | GSAR typed grounding: four-way claims, asymmetric scoring, three-tier decision |
| `@takk/automend/detectors` | Heuristic grounding matcher, loop detector, corruption detector |
| `@takk/automend/drift` | Welford baseline and z-score drift detection |
| `@takk/automend/recovery` | Ordered recovery policy, decide and run, safe mode, retry budget |
| `@takk/automend/escalation` | Immutable, content-addressed escalation records |
| `@takk/automend/audit` | Append-only audit log with a SHA-256 seal |
| `@takk/automend/interceptors` | `guardStep` to wrap any function, deterministic clock |
| `@takk/automend/mcp` | Turn a failed MCP tool call into a recovery trigger |
| `@takk/automend/edge` | The full node-free core for edge runtimes and the browser |
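The drift module is described above as a Welford baseline with z-score detection. As a sketch of that general technique, and not of the module's actual API, the class below keeps a running mean and variance in one pass and flags a new sample whose z-score exceeds a threshold. The class name, method names, and the threshold of 3 are assumptions.

```typescript
// Hypothetical class for illustration; not the @takk/automend/drift API.
class WelfordBaseline {
  private n = 0;
  private mu = 0;
  private m2 = 0; // running sum of squared deviations from the mean

  // Welford's online update: one pass, no stored samples.
  update(x: number): void {
    this.n += 1;
    const delta = x - this.mu;
    this.mu += delta / this.n;
    this.m2 += delta * (x - this.mu);
  }

  get mean(): number {
    return this.mu;
  }

  // Sample standard deviation; 0 until there are at least two samples.
  get stdDev(): number {
    return this.n < 2 ? 0 : Math.sqrt(this.m2 / (this.n - 1));
  }

  // Distance of x from the baseline mean, in standard deviations.
  zScore(x: number): number {
    const sd = this.stdDev;
    return sd === 0 ? 0 : (x - this.mu) / sd;
  }

  // Assumption: |z| above the threshold counts as drift.
  isDrift(x: number, threshold = 3): boolean {
    return Math.abs(this.zScore(x)) > threshold;
  }
}

const baseline = new WelfordBaseline();
for (const latencyMs of [100, 102, 98, 101, 99]) {
  baseline.update(latencyMs);
}

console.log(baseline.mean); // 100
console.log(baseline.isDrift(101)); // false: well within the baseline spread
console.log(baseline.isDrift(110)); // true: more than 6 standard deviations out
```

The appeal of Welford's method for an in-line monitor is that it needs constant memory per metric and stays numerically stable as samples accumulate.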
decideRecovery evaluates an ordered policy, first matching rule wins, then applies two safety overrides:
- A `retry` whose attempts reached `maxRetries` becomes an `escalate` (no infinite loops).
- In `safeMode` (default on), any automatic strategy on a `high` or `critical` issue becomes an `escalate` (no silent auto-acting on serious failures).
```typescript
import { decideRecovery, DEFAULT_RECOVERY_POLICY } from '@takk/automend/recovery';

const decision = decideRecovery(
  { kind: 'contradicted', severity: 'high' },
  DEFAULT_RECOVERY_POLICY,
);
// decision.strategy === 'escalate' (safe mode escalates high-severity issues)
```

The default policy: contradictions roll back, ungrounded and low-confidence outputs retry, high-severity drift escalates, tool errors retry, loops roll back, everything else escalates.
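The interaction between first-match-wins and the two safety overrides can be sketched in a few lines. This is a standalone illustration, not the library's `decideRecovery`: the `Rule` shape, the issue kind strings other than `contradicted`, the `low` and `medium` severities, the default `maxRetries` of 2, and the choice to treat `rollback` and `retry` as the "automatic" strategies are all assumptions.

```typescript
type Strategy = 'rollback' | 'retry' | 'escalate' | 'askHuman';
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface Issue {
  kind: string;
  severity: Severity;
  attempts?: number; // retries already spent on this issue
}

// Hypothetical rule shape: a predicate plus the strategy it selects.
interface Rule {
  matches: (issue: Issue) => boolean;
  strategy: Strategy;
}

interface Options {
  maxRetries: number;
  safeMode: boolean;
}

// Assumption: these are the strategies that act without a human.
const AUTOMATIC: ReadonlySet<Strategy> = new Set<Strategy>(['rollback', 'retry']);

function decide(
  issue: Issue,
  policy: Rule[],
  options: Options = { maxRetries: 2, safeMode: true },
): Strategy {
  // Ordered policy: the first matching rule wins; nothing matched means escalate.
  let strategy: Strategy = policy.find((rule) => rule.matches(issue))?.strategy ?? 'escalate';

  // Override 1: an exhausted retry budget becomes an escalation.
  if (strategy === 'retry' && (issue.attempts ?? 0) >= options.maxRetries) {
    strategy = 'escalate';
  }

  // Override 2: in safe mode, serious issues never get an automatic action.
  const serious = issue.severity === 'high' || issue.severity === 'critical';
  if (options.safeMode && serious && AUTOMATIC.has(strategy)) {
    strategy = 'escalate';
  }
  return strategy;
}

// A policy mirroring part of the default described above (kind names assumed).
const policy: Rule[] = [
  { matches: (i) => i.kind === 'contradicted', strategy: 'rollback' },
  { matches: (i) => i.kind === 'ungrounded' || i.kind === 'low-confidence', strategy: 'retry' },
  { matches: (i) => i.kind === 'tool-error', strategy: 'retry' },
  { matches: (i) => i.kind === 'loop', strategy: 'rollback' },
];

console.log(decide({ kind: 'contradicted', severity: 'high' }, policy)); // "escalate"
console.log(decide({ kind: 'contradicted', severity: 'low' }, policy)); // "rollback"
console.log(decide({ kind: 'ungrounded', severity: 'low', attempts: 2 }, policy)); // "escalate"
```

Applying the overrides after rule matching keeps the policy table simple: rules state the preferred action, and the safety checks are enforced in one place regardless of which rule fired.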
```shell
# score confidence signals from a JSON file
npx automend score signals.json   # { "signals": [{ "name": "self", "value": 0.9 }] }

# assess typed grounding for a claims file
npx automend assess claims.json   # { "claims": [{ "id": "1", "type": "grounded" }] }

# inspect and verify an audit log
npx automend inspect audit-log.json
npx automend verify audit-log.json audit-seal.json
```

Exit codes follow sysexits: 0 ok, 1 verify failed, 64 usage error, 65 bad data, 66 unreadable input.
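If you call the CLI from a script or CI job, the exit codes listed above can be turned into readable outcomes. The helper below is a hypothetical wrapper-side sketch, not part of AutoMend; it only encodes the five codes this README documents and treats anything else as unexpected.

```typescript
// The five documented exit codes and their meanings.
const EXIT_MEANINGS: Record<number, string> = {
  0: 'ok',
  1: 'verify failed',
  64: 'usage error',
  65: 'bad data',
  66: 'unreadable input',
};

// Hypothetical helper: describe an exit code reported by the CLI.
function describeExit(code: number): string {
  return EXIT_MEANINGS[code] ?? `unexpected exit code ${code}`;
}

// Only code 1 means the audit log failed verification; 64 to 66 mean
// the command could not be run as asked.
function isIntegrityFailure(code: number): boolean {
  return code === 1;
}

console.log(describeExit(65)); // "bad data"
console.log(isIntegrityFailure(1)); // true
```

Separating "verification failed" from "the command was misused" lets a pipeline page someone for the first case and simply fail the build for the second.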
| | AutoMend | Post-hoc eval (LangSmith, Braintrust, Langfuse) | Guardrail libraries |
|---|---|---|---|
| When it runs | In-line, real time | After the run | In-line |
| Acts on failure | Yes, recovery orchestration | No, analysis only | Blocks, but no recovery |
| Audit trail | Tamper-evident SHA-256 seal | Hosted logs | Varies |
| Runtime dependencies | Zero | Hosted SDK | Varies |
| Calls a model | No | Yes (judges) | Sometimes |
AutoMend is the deterministic decision and audit layer. It does not replace your evals or your model-based judges; it turns their signals into a repeatable, auditable recovery decision in the hot path.
- AutoMend does not detect hallucinations by itself. It scores the classifications you supply; the built-in detectors are lexical heuristics, not a model.
- The audit seal is tamper-evident, not a digital signature; pair it with your own signing for non-repudiation.
- "Self-healing" means AutoMend automates the detect, decide, act loop you wire up. The repair actions are your executors.
- Determinism is by policy: same inputs, same decision. AutoMend cannot make a sampled model deterministic.
- 112 tests across 14 suites, all passing under Vitest 4, green on Node 20, 22, and 24.
- Coverage: lines 93.8%, statements 93.7%, functions 98.7%, branches 88.0%.
- Lint clean under Biome 2.
- Typecheck clean under TypeScript 6 in maximum strict mode (`exactOptionalPropertyTypes`, `useUnknownInCatchVariables`, `noUncheckedIndexedAccess`).
- `publint` clean, `are-the-types-wrong` clean across all eleven entry points.
- Zero runtime dependencies; the core entry point is about 4.2 kB brotli, enforced by `size-limit`.
- A distribution smoke test spawns the built CLI as a single Node process and exercises the ESM and CJS artifacts.
- Published with `--provenance` (SLSA attestation by GitHub Actions).
See SPEC.md for the formal specification, public surface, and stability promise.
Does it actually heal the agent?
It automates the loop: it detects the failure, decides a recovery, and calls the executor you provided (retry, rollback, escalate, ask a human). The repair action is your code; AutoMend makes the decision deterministic and auditable.
Does it detect hallucinations on its own?
No. It scores the grounding classifications you supply. The built-in `classifyClaims` is a lexical heuristic to get you started; for production fidelity, classify with a real natural-language inference model and feed AutoMend the result.
Does it call a model or the network?
Never. AutoMend makes zero outbound calls. It is a deterministic, in-process engine.
Does it work in Cloudflare Workers, Vercel Edge, Bun, Deno, or the browser?
Yes. The core and the `./edge` entry point are node-free; the audit seal uses Web Crypto SHA-256, available in all of them.
How is this different from post-hoc evaluation tools?
Those analyze runs after the fact. AutoMend runs in-line and acts on the failure with a recovery decision, recorded in a tamper-evident audit trail.
What is on the roadmap?
Streaming capture, OpenTelemetry and observability exporters, Ed25519 signing on the audit seal, native bridges to sibling @takk packages, and input-matched recovery for concurrent fan-out. All additive; the 1.0.0 API is stable.
See `.github/CONTRIBUTING.md` for the contributor guide. For substantive proposals, open a GitHub issue first; trivial fixes can go straight to a PR. All commits require DCO sign-off (`git commit -s`). Non-trivial contributions are governed by the Contributor License Agreement.
- Issues and feature requests. Open a GitHub issue at `davccavalcante/automend/issues`. Include the package version, a minimal reproduction, expected vs actual behaviour, and the relevant audit entries where applicable.
- Security disclosures. Do NOT open public issues for vulnerabilities. Follow the responsible-disclosure flow in `SECURITY.md`, contact `davcavalcante@proton.me` (or `say@takk.ag`) with the `[SECURITY]` prefix.
- Code of Conduct. This project follows the Contributor Covenant 2.1. Participation in any AutoMend space (issues, PRs, discussions) implies agreement.
- Contributions. All non-trivial contributions go through the Contributor License Agreement. Tests, lint, typecheck, and build must be green before review (`pnpm verify`).
Created by David C Cavalcante, davcavalcante@proton.me (preferred), say@takk.ag (Takk relay), linkedin.com/in/hellodav, x.com/davccavalcante, takk.ag
AutoMend is the reliability tier of a broader portfolio of NPM packages targeting Massive Intelligence (IM) and non-human entity (NHE) infrastructure for 2026-2030, built at Takk Innovate Studio.
The architectural philosophy behind AutoMend, separating detection, decision, and audit into composable, independently-governed layers, echoes the author's research frameworks:
- MAIC (Massive Artificial Intelligence Consciousness), the universe, the framework: a systemic intelligence framework to coordinate, supervise, and govern large-scale Massive Intelligence (IM) ecosystems, providing global context awareness, alignment, and orchestration across models, agents, and decision layers.
- HIM (Hybrid Entity Intelligence Model), the spirit, the model: a hybrid intelligence layer that integrates Massive Intelligence (IM) systems with human-defined logic, rules, and strategic intent, interpreting objectives and structuring decision-making before and after execution.
- NHE (Noumenal Higher-order Entity), the reincarnated body, the agent: a non-human entity with a defined functional identity and operational agency within an intelligence ecosystem, operating through coordinated layers while maintaining a non-anthropomorphic identity.
These frameworks are published independently of AutoMend and are separate works:
- Research papers: The Soul of the Machine, Beyond Consciousness in LLMs, The Cave of Silence.
- PhilPapers profile: David Cortes Cavalcante.
- Hugging Face: TeleologyHI.
- GitHub: davccavalcante, Takk8IS.
Join the journey as the portfolio continues to ship Massive Intelligence (IM) infrastructure. Your support is the cornerstone of this work.
- Sponsor on GitHub: github.com/sponsors/davccavalcante
- USDT (TRC-20): `TS1vuhMAhFpbd7y68cu5ZtP9PsXVmZWmeh`
AutoMend runs entirely inside your own process and infrastructure. It makes no outbound calls, collects no telemetry, and ships no analytics. See PRIVACY.md for the full data-handling notice, including how the audit trail records only what you hand it.
Licensed under the Apache License 2.0. See LICENSE for the full text and NOTICE for attribution and third-party component licenses. You may use, modify, and distribute the code under the terms of that license, including its patent grant and attribution requirements.
