Build software with AI agents that read your BDD spec, write tests first, then code to pass them — all while journaling their progress. No config needed; just set your API key and let the agent evolve your project on schedule.
- Evolve - the agent finds uncovered scenarios, writes tests, implements code, and commits, all automatically via a GitHub Actions cron job.
- Orchestrate - for larger projects, spawn multiple agents in parallel, ordered by an AI orchestrator.
- Interactive mode - run evolution sessions interactively with Claude Code, guiding the agent in real time.
- BAADD supports any LLM provider with an API, local or hosted: just set the corresponding environment variable.
- BAADD auto-detects your provider from environment variables. Set one API key and run — no config needed.
| Provider | Environment Variable | Default Model | Notes |
|---|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-5 | Highest priority; native tool use |
| OpenAI | OPENAI_API_KEY | gpt-4o | |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile | Fast inference |
| Alibaba / Qwen | DASHSCOPE_API_KEY | qwen-max | OpenAI-compatible endpoint |
| Moonshot / Kimi | MOONSHOT_API_KEY | moonshot-v1-8k | OpenAI-compatible endpoint |
| Ollama | OLLAMA_HOST | (pass --model) | Local models, no API key required |
| Custom Provider | CUSTOM_API_KEY, CUSTOM_BASE_URL | (user-defined) | Custom integrations |
Provider priority (first key found wins): ANTHROPIC_API_KEY > MOONSHOT_API_KEY > DASHSCOPE_API_KEY > OPENAI_API_KEY > GROQ_API_KEY > OLLAMA_HOST
Override the model at any time with --model <name> or force a provider with --provider <name>.
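The priority rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in scripts/agent.py: the function name detect_provider and the lowercase provider labels are assumptions, and the Custom Provider is left out because the priority list above does not rank it.

```python
import os
from typing import Mapping, Optional

# Priority order as documented above; the first variable that is set wins.
# The provider labels on the left are hypothetical names for illustration.
PROVIDER_PRIORITY = [
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("moonshot", "MOONSHOT_API_KEY"),
    ("dashscope", "DASHSCOPE_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("groq", "GROQ_API_KEY"),
    ("ollama", "OLLAMA_HOST"),
]


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose variable is set and non-empty, else None."""
    env = os.environ if env is None else env
    for provider, variable in PROVIDER_PRIORITY:
        if env.get(variable):
            return provider
    return None
```

With both OPENAI_API_KEY and MOONSHOT_API_KEY set, this sketch picks Moonshot, which matches the documented order; passing --provider would bypass the lookup entirely.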
BAADD is a meta-framework / template — not a library you install, but a pattern you adopt. You bring your spec; BAADD brings the agent, the loop, and the rules that keep it honest.
The three parts that make it work:
graph TD
subgraph BAADD["BAADD — the meta-framework"]
BDD["📄 BDD Spec Format\n(BDD.md + frontmatter)\nWhat to build"]
LOOP["🔄 Evolve Loop\n(scripts/ + GitHub Actions)\nHow to build it"]
CONTRACT["📜 Agent Behaviour Contract\n(IDENTITY.md)\nHow the agent must behave"]
end
BDD -->|"parsed by"| LOOP
CONTRACT -->|"governs"| LOOP
LOOP -->|"implements scenarios from"| BDD
LOOP -->|"journals progress to"| BDD
style BDD fill:#a855f7,color:#fff,stroke:none
style LOOP fill:#3b82f6,color:#fff,stroke:none
style CONTRACT fill:#6366f1,color:#fff,stroke:none
style BAADD fill:#1e1e2e,color:#cdd6f4,stroke:#45475a
- BDD Spec Format: BDD.md with YAML frontmatter declaring language, build/test commands, and Gherkin scenarios. This is the only file you edit.
- Evolve Loop: the scripts/ + GitHub Actions cron that drives the agent: find uncovered scenario → write test → write code → commit.
- Agent Behaviour Contract: IDENTITY.md, the agent's constitution. It defines what the agent is allowed to do, what it must never do, and how it measures progress.
You write the spec. The agent writes the code.
Run evolve in your terminal and Claude Code will read the spec, pick the next uncovered scenario, write the test, implement it, and commit — then ask if you want to continue.
flowchart TD
A([📝 You write BDD.md]) --> B[GitHub Actions\ncron every 8h]
B --> C{Uncovered or\nfailing scenarios?}
C -- No --> D([✅ Nothing to do])
C -- Yes --> E[Agent reads\nBDD.md]
E --> F[Writes tests first]
F --> G[Writes code to\nmake tests pass]
G --> H{Tests pass and\ncoverage maintained?}
H -- No --> G
H -- Yes --> I[Agent commits]
I --> J([📓 Journals session])
I --> A
style A fill:#a855f7,color:#fff,stroke:none
style D fill:#22c55e,color:#fff,stroke:none
style I fill:#3b82f6,color:#fff,stroke:none
style J fill:#6366f1,color:#fff,stroke:none
- You write BDD.md: features, scenarios, given/when/then
- A GitHub Actions cron job fires every 8 hours
- The AI agent reads BDD.md, finds uncovered or failing scenarios
- It writes tests first, then writes code to make them pass
- It commits only when tests pass and BDD coverage is maintained
- It journals what it did and responds to GitHub issues
The agent never builds anything that isn't in BDD.md.
mkdir my-project && cd my-project
curl -fsSL https://raw.githubusercontent.com/dweng0/BAADD/main/install.sh | bash

This downloads all framework files, creates a BDD.md template, and initialises a git repo. A .BAADD manifest is written to track the framework version. Run the same command again at any time to update.
Edit the frontmatter at the top of BDD.md:
---
language: typescript # rust | python | go | node | typescript | java
framework: react-vite # or: none, express, django, etc. (informational)
build_cmd: npm run build
test_cmd: npm test
lint_cmd: npm run lint
fmt_cmd: npm run format
birth_date: 2026-01-01 # project start date (used for day counter)
---

Then write your features and scenarios below the frontmatter.
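To make the frontmatter format concrete, here is a minimal sketch of reading those flat key/value pairs in Python without a YAML dependency. It is an assumption-laden illustration, not the implementation in scripts/parse_bdd_config.py: the function name parse_frontmatter is hypothetical, and it handles only single-line `key: value` entries with optional trailing comments.

```python
from typing import Dict


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    config: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: the Gherkin spec starts after this
        if line.lstrip().startswith("#"):
            continue  # whole-line comment
        # Drop trailing comments written as " # ..." and skip blank lines.
        line = line.split(" #", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config
```

Calling parse_frontmatter on the example above would give a dictionary with keys such as language, build_cmd, and test_cmd, which is all a runner needs to know how to build and test the project.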
BAADD.yml controls how the agent runs — parallel workers, token limits, timeouts. The defaults work out of the box, but you can tune them:
# BAADD.yml — HOW to build (BDD.md defines WHAT to build)
orchestration:
  max_parallel_agents: 3 # workers per orchestrator run
  model_orchestrator: claude-haiku-4-5-20251001 # model for scenario ordering
agent:
  max_iterations: 75 # tool-call rounds per session
  wrap_up_at: 70 # when to inject wrap-up reminder
  max_tokens_per_response: 8192
  tool_output_limit: 12000 # max chars per tool result
  session_timeout: 3600 # seconds
  context_window_limit: 100000 # auto-trims older results past this
  default_model: claude-haiku-4-5-20251001

This file is not overwritten on framework updates.
In your GitHub repo: Settings → Secrets and variables → Actions → New repository secret
| Name | Value |
|---|---|
| ANTHROPIC_API_KEY | your sk-ant-... key |
pip install anthropic
ANTHROPIC_API_KEY=sk-... ./scripts/evolve.sh

Push to GitHub. The workflow runs automatically every 8 hours via cron.
Trigger manually: Actions tab → Evolution → Run workflow.
# Initialize BAADD in a new project
curl -fsSL https://raw.githubusercontent.com/dweng0/BAADD/main/install.sh | bash
# Update BAADD framework to latest version (preserves your journals)
./install.sh --update
# Pin to a specific BAADD version
./install.sh --version v1.2.0

# Run one evolution session manually (tests one scenario)
ANTHROPIC_API_KEY=sk-... ./scripts/evolve.sh
# Check BDD scenario coverage
python3 scripts/check_bdd_coverage.py BDD.md
# Run your project's build command (from BDD.md)
python3 -m py_compile pyken.py # or whatever your build_cmd is
# Run your project's tests (from BDD.md)
.venv/bin/pytest tests/ # or whatever your test_cmd is

# Run the orchestrator: orders scenarios with AI, spawns parallel agents
ANTHROPIC_API_KEY=sk-... python3 scripts/orchestrate.py
# Preview the plan without executing
python3 scripts/orchestrate.py --dry-run
# Override max parallel agents
python3 scripts/orchestrate.py --max-agents 2

# Run evolution sessions interactively with Claude Code
/evolve

# Dry run: see which tests would get markers
python3 scripts/add_bdd_markers.py BDD.md
# Apply markers to existing test files
python3 scripts/add_bdd_markers.py BDD.md --apply

Tests should include a BDD marker comment linking them to their scenario:

# BDD: Successful login
def test_successful_login():
    ...

// BDD: Successful login
test('successful login', () => {
  // ...
});

The coverage checker uses these markers for exact matching, falling back to heuristic name matching for unmarked tests.
# See what the agent will work on next
cat JOURNAL_INDEX.md
# Read full session logs
cat JOURNAL.md
# Check framework version
cat .BAADD | python3 -c "import sys,json; print(json.load(sys.stdin)['version'])"

| File | Purpose |
|---|---|
| BDD.md | The spec: edit this to drive development |
| BAADD.yml | Agent/orchestrator config: edit to tune behaviour |
| IDENTITY.md | Agent constitution: do not modify |
| JOURNAL.md | Agent's full session logs (auto-written) |
| JOURNAL_INDEX.md | One-line-per-session summary index (auto-generated) |
| LEARNINGS.md | Agent's cached research (auto-written) |
| BDD_STATUS.md | Scenario coverage status (auto-generated) |
| scripts/evolve.sh | Main evolution loop: do not modify |
| scripts/orchestrate.py | Parallel orchestrator: AI-ordered, multi-agent |
| scripts/agent.py | The AI agent runner |
| scripts/check_bdd_coverage.py | Scenario coverage checker (supports BDD markers) |
| scripts/add_bdd_markers.py | Upgrade tool that adds BDD markers to existing tests |
| scripts/parse_bdd_config.py | BDD.md frontmatter parser |
| scripts/parse_BAADD_config.py | BAADD.yml config parser |
| scripts/setup_env.sh | Language-aware toolchain installer |
Feature: User authentication
As a user
I want to log in with my email and password
So that I can access my account
Scenario: Successful login
Given I am on the login page
When I enter valid email and password
Then I am redirected to my dashboard
Scenario: Wrong password
Given I am on the login page
When I enter a valid email but wrong password
Then I see "Invalid email or password"

Keep scenarios:
- Specific: one behaviour per scenario
- Observable: the Then clause must be testable
- Independent: each scenario stands alone
If you have Claude Code installed, you can run evolution sessions interactively instead of waiting for the cron:
> evolve
Claude Code will read the spec, pick the next uncovered scenario, write the test, implement it, and commit — then ask if you want to continue. This uses the same workflow as the GitHub cron but lets you guide the session in real time.
The orchestrator applies deterministic checks before merging any worktree branch to prevent the SE agent from causing unintended damage.
Before merging, the orchestrator diffs the worktree branch against the merge base and rejects any branch that deletes an existing file unless the PM explicitly approved the deletion in PLAN.md.
The PM declares approved deletions in section 5 of PLAN.md:
## 5. Files to delete (optional)
- scripts/old_helper.py: superseded by the new unified loader in this scenario

Any file deleted by the SE that is not listed here causes the work to be thrown away. The PM is instructed to populate this section only when the scenario genuinely requires removing a file.
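The deletion guard described above reduces to a set difference between the files a branch deletes and the files PLAN.md approves. The sketch below illustrates that idea only; it is not the orchestrator's code. The function names are hypothetical, it assumes each bullet in section 5 starts with a path that contains no spaces, and it assumes the list of deleted paths has already been obtained, for example from `git diff --diff-filter=D --name-only <merge-base> <branch>`.

```python
from typing import Iterable, List, Set


def approved_deletions(plan_text: str) -> Set[str]:
    """Collect paths listed as bullets under the '## 5.' heading of PLAN.md."""
    approved: Set[str] = set()
    in_section = False
    for line in plan_text.splitlines():
        if line.startswith("## "):
            # Entering or leaving section 5, depending on the heading number.
            in_section = line.startswith("## 5.")
            continue
        if in_section and line.lstrip().startswith("- "):
            tokens = line.lstrip()[2:].split()
            if tokens:
                # The path is the first token; strip a trailing colon if present.
                approved.add(tokens[0].rstrip(":"))
    return approved


def unapproved_deletions(deleted: Iterable[str], plan_text: str) -> List[str]:
    """Return deleted paths that PLAN.md does not approve, sorted for stable output."""
    return sorted(set(deleted) - approved_deletions(plan_text))
```

A non-empty result from unapproved_deletions would correspond to the case where the branch is rejected and the work is thrown away.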
- Truncation detection: the current guard catches git rm-style deletions but not files that are overwritten with empty or near-empty content. A secondary check using git diff --numstat to flag files where all lines were removed and nothing was added would cover this case. The threshold for "suspiciously large removal" is context-dependent, which is why this has not been added yet.
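As a rough illustration of the proposed secondary check, the following sketch parses `git diff --numstat` output (tab-separated added, removed, path) and flags files that only lost lines. It is a hypothetical starting point, not part of BAADD: the function name and the min_removed threshold are assumptions, and the right threshold remains the open question noted above.

```python
from typing import List


def suspicious_truncations(numstat: str, min_removed: int = 1) -> List[str]:
    """Flag paths where lines were only removed, given `git diff --numstat` output."""
    flagged: List[str] = []
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # not a numstat row
        added, removed, path = parts
        # Binary files are reported as "-" for both counts; skip them.
        if not (added.isdigit() and removed.isdigit()):
            continue
        if int(added) == 0 and int(removed) >= min_removed:
            flagged.append(path)
    return flagged
```

Two caveats apply to this sketch. Fully deleted files also show zero added lines, so the result would need to be filtered against the deletion list. And numstat alone does not say how many lines remain in a file, so confirming that all lines were removed would need an extra check on the file's content in the branch.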
Label issues with agent-input to have the agent pick them up.
If an issue proposes a new feature, the agent will add it to BDD.md as a Scenario before implementing it.
- BAADD Framework: https://github.com/dweng0/BAADD
- Documentation: See the CLAUDE.md file in your project for detailed guidance
- GitHub Issues: Use the agent-input label to task the AI agent