BitmapAsset/mythos-harness

# MYTHOS Harness

MYTHOS Harness — name the target, tag the uncertainty, finish or revert.

One file. Four rules. In an 8-task pilot, Opus 4.7 + MYTHOS Harness beat base Opus 4.7 on 7 of 8 tasks. No SDK, no runtime — paste one Markdown file into your coding agent's system prompt.

MYTHOS Harness is a prompt doctrine you paste into a coding agent's system prompt. Four short rules push the agent to name the target before acting, tag claims as known / inferred / guessed, resist unrequested abstraction, and report what was actually run before declaring done.

It ships no model, no runtime, and no SDK — just a single Markdown file. The clean proof artifact: Opus 4.7 + MYTHOS Harness won 7 of 8 tasks (22.88 avg) vs base Opus 4.7 at 1 of 8 (21.00 avg) in an 8-task run (see benchmarks/). The average-score gap is modest; the task-win flip is the louder signal. Small-sample and setup-dependent — re-run on your own suite before relying on it.

*Figure: benchmark summary — base Opus 4.7 at 1 of 8 task wins, Opus 4.7 + MYTHOS Harness at 7 of 8, with average score bars at 21.00 and 22.88. Core 8-task run: same task family, same single-judge setup, same environment. Bronze = Opus 4.7 + MYTHOS Harness; slate = base Opus 4.7.*


## Why you might want this

Most coding-agent failures in practice are not reasoning failures. They are:

  • Acting on the wrong target (wrong file, wrong caller, wrong symptom).
  • Upgrading a guessed claim into a confident one by repeating it.
  • Building an abstraction the code did not need.
  • Declaring done on unverified output.

MYTHOS Harness is four short sections that push back on exactly those four patterns. No config, no knobs. Copy the file, paste it in, done.


## Quick start (30 seconds)

```bash
# Claude Code
cp mythos-harness.md ~/.claude/rules/

# Codex CLI, Hermes, OpenClaw, or any custom harness with a system-prompt file
cat mythos-harness.md >> your-system-prompt.md
```

That is the entire install. No SDK, no runtime, no lockfile, no build step. If it does not improve the next coding task you care about, remove the file and you are back where you started.


## Install

The public doctrine file is `mythos-harness.md`.

### Claude Code

Drop the file into ~/.claude/rules/ (or your project's rules directory). It will load alongside your other rules. No restart required for most setups.

```bash
cp mythos-harness.md ~/.claude/rules/
```

### Codex CLI / generic agent harnesses

Append the contents to your system prompt, or load it as an additional rule file if your harness supports rule layering.

### Custom orchestrator

Treat it as a system-prompt suffix. It is additive — it assumes your base prompt already covers tool use, safety posture, and working style.
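As a sketch, the suffix pattern is just concatenation at prompt-build time. The function name, paths, and the commented client call below are hypothetical, for illustration only — nothing here is a shipped API:

```python
from pathlib import Path

def build_system_prompt(base_prompt: str, doctrine_path: str) -> str:
    """Append the MYTHOS Harness doctrine as a system-prompt suffix.

    The doctrine is additive, so it goes after the base prompt, which is
    assumed to already cover tool use, safety posture, and working style.
    """
    doctrine = Path(doctrine_path).read_text(encoding="utf-8")
    return base_prompt.rstrip() + "\n\n" + doctrine

# Hypothetical usage -- the client object and its call are placeholders:
# system_prompt = build_system_prompt(BASE_PROMPT, "mythos-harness.md")
# client.chat(system=system_prompt, messages=[...])
```

Keeping the doctrine last means it layers on top of your base rules rather than replacing them, which matches the "additive, not prescriptive" design below.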

No install script, no dependency tree, no lockfile. That is intentional.


## What it contains

Four short sections:

  1. **Strategic depth** — name the second-order effect, surface cheaper reframes, state the tradeoff you are accepting.
  2. **Epistemic honesty** — tag claims as known / inferred / guessed; unverified symbols stay unverified until checked this session.
  3. **Abstraction restraint** — prefer duplication over an abstraction you cannot name precisely; new interfaces need two real callers.
  4. **Clean execution** — finish or revert; before declaring done, list what was run, what was observed, what was not checked.
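Rules 2 and 4 compose into a closing report. A hypothetical example of what that can look like in an agent's final message — the layout is illustrative; the doctrine mandates the tags and the run report, not this exact format:

```text
Done report:
- Ran: pytest tests/test_parser.py -> 12 passed            [known]
- The CLI entry point imports the same tokenizer, so the
  fix likely applies there too                             [inferred]
- Windows line endings should be unaffected                [guessed]
- Not checked: behavior on Python 3.9
```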

Full text: `mythos-harness.md`.

## How the four rules reshape agent behavior

*Figure: before/after chart of how MYTHOS Harness changes coding-agent behavior across target selection, uncertainty tagging, abstraction restraint, and execution discipline.*

The doctrine does not tell the agent what to build. It constrains how the agent reasons before and after the edit. That is why a four-rule file can move a benchmark: the failures it targets are behavioral, not intellectual.


## Measured results (summary)

There are two different benchmark stories in this package:

  1. Core harness benchmarks — the clean evidence for MYTHOS Harness itself versus base Opus 4.7.
  2. Combined-workflow pilot — a broader A1–A6 experiment involving base models, harnessed models, and dual-model executor/judge workflows.

The core harness benchmark is the main proof artifact for the doctrine itself.

Full writeups are in the benchmarks/ directory.

Core harness benchmarks — Opus 4.7 + MYTHOS Harness vs base Opus 4.7 (the main story):

| Condition | Total score | Avg / task | Task wins |
| --- | --- | --- | --- |
| Opus 4.7 base | 168 | 21.00 | 1 / 8 |
| Opus 4.7 + MYTHOS Harness | 183 | 22.88 | 7 / 8 |
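As a quick arithmetic check, the per-task averages follow directly from the totals over the 8 tasks:

```python
tasks = 8
base_total, harness_total = 168, 183

print(base_total / tasks)               # 21.0
print(round(harness_total / tasks, 2))  # 22.88 (exactly 22.875 before rounding)
```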

Combined-workflow pilot (separate context, not the core harness proof):

| Arm | Configuration | Avg | Wins |
| --- | --- | --- | --- |
| A1 | Opus 4.7 base | 21.875 | 1 |
| A2 | GPT-5.4 base | 15.812 | 0 |
| A3 | Opus 4.7 + MYTHOS Harness | 23.062 | 5 |
| A4 | GPT-5.4 + MYTHOS Harness | 16.312 | 0 |
| A5 | GPT-5.4 + MYTHOS Harness executor + Opus judge | 22.312 | 2 |
| A6 | Opus executor + GPT-5.4 judge | 22.875 | 4 |

Honest read of the numbers:

  • The core harness benchmarks are the strongest evidence that MYTHOS Harness improves Opus 4.7 on the measured task sets.
  • The combined-workflow pilot is useful context, but it is a broader systems experiment, not the cleanest proof artifact for the harness itself.
  • A dual-model executor+judge arrangement (A6) beat plain base Opus, but did not beat Opus 4.7 + MYTHOS Harness in this pilot.
  • The win is real but modest. This package supports a narrow claim of observed gains for Opus 4.7 + MYTHOS Harness in these small-sample runs, not a statistical or state-of-the-art claim.
  • Re-run it on your own suite if you want stronger confidence.

## Works with Claude Code, Codex, Hermes, OpenClaw, and custom agent harnesses

MYTHOS Harness is a plain Markdown rule file. It is naturally compatible with:

  • Claude Code and other Anthropic-style coding-agent workflows
  • Codex CLI and OpenAI-style coding harnesses
  • Hermes and OpenClaw style orchestrators that layer additive rule files
  • Custom orchestrators running Opus, Sonnet, GPT-class models, or other capable coding agents

It is not tied to one model family. Opus 4.7 is simply where the strongest measured evidence lives right now.

*Figure: MYTHOS Harness representation art — a named target vector entering a bronze reticle, uncertainty tags, and a grounded execution baseline.*


## What this is not

  • Not a model. This repository ships a prompt doctrine, not Anthropic's Mythos Preview model. See DISCLAIMER.md for the naming clarification.
  • Not an evaluation framework. It is four pages of constraints. Measurement lived in a separate pilot harness.
  • Not a replacement for a good system prompt. It is additive. If your base prompt is weak, MYTHOS will not fix that.
  • Small-sample pilot evidence only. The package summarizes two harness-favoring benchmark runs in benchmarks/CORE-HARNESS-BENCHMARKS.md, but there is still no statistical significance test, no public reproducibility package, and no third-party replication.
  • Not affiliated with Anthropic, OpenAI, or any other company. See DISCLAIMER.md.

## Design principles

  • Small surface. One file, four sections. If you cannot read it in two minutes, it is too big.
  • Additive, not prescriptive. It does not override your existing rules on tool use, safety, or voice.
  • Honest about epistemics. "Haven't checked X" beats "X should work." That rule applies to this README too.
  • Forkable. MIT licensed. Rewrite the sections in your own voice if that fits your team better.

## License

MIT.

Chosen over Apache-2.0 because:

  • The contribution is text and methodology, not code with patent surface.
  • MIT is shorter, universally understood, and maximally permissive for forking and embedding in proprietary prompts.
  • We explicitly want people to copy, rewrite, and re-ship this doctrine inside their own agent frameworks without ceremony.

If you need Apache-2.0-style patent protection for your use case, fork and relicense your derivative — MIT allows it.


## Fork, rewrite, ship

The fastest way to improve this is to fork it, rewrite one section in your own words, run it against a task suite you care about, and publish what you saw. If you change the doctrine in a way that clearly helps, we want to read it.

Issues, PRs, and "here is what broke on my setup" notes are all welcome.


## Files in this package