Adversarial Firebird workload generator — when "it's slow" isn't a useful enough answer.
- What's inside
- Requirements
- Install
- Configuration
- Usage
- Scenarios
- Determinism contract
- Firebird 5 engine pathologies forge surfaces
- Known limitations
- Testing
- Related projects
- Tech stack
- Credits & license
A Python package and CLI that produces realistic, on-purpose-pathological Firebird v5 workloads and records a ground-truth NDJSON intent stream that tragach (or any other observer) can be correlated against. Five subcommands:
forge init— creates the database, applies all DDL (14 named domains, 12 tables, 14 sequences, 1 trigger, 6 procedures + 4 SUSPEND-style generator procedures).forge load— Faker-driven, seed-reproducible population.--mode mixedadds a dirty phase (UPDATE/INSERT/DELETE/CREATE_ORDER interleaved across workers) so the resulting baseline has dead version chains, GC backlog, and fragmented free space — what real production looks like.forge run SCENARIO— drives one of 15 scenarios against the loaded DB, bounded by--duration(1m–1h, default 5m, hard-exit at deadline), with NDJSON intent + outcome recording.forge replay FILE.ndjson— re-issues everyexecuterecord from a recording, optionally preserving the original inter-op timing or firing as fast as possible.python -m forge.correlate forge.ndjson tragach.ndjson— joins forge's intent stream against tragach's observation stream by SQL-hash + timestamp window; reports coverage, accuracy, and false positives.
SPECS.md is the authoritative work order. CLAUDE.md is the operating manual that constrains how the project evolves.
- Linux (tested on Debian 13, Ubuntu 22.04 / 24.04)
- Python 3.11+ (tested on 3.13)
- Firebird v5 SuperServer reachable at
localhost:3050(override viaFORGE_DB_HOST/FORGE_DB_PORT) sudois needed once to create/opt/firebird-v5/databases/and chown it to the firebird user; after that tragach-forge runs as an unprivileged user and talks to Firebird over TCP
From PyPI (planned — see releases for now)
git clone https://github.com/ZlatanOmerovic/tragach-forge
cd tragach-forge
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"The package installs a forge console script in the venv. Pure Python — no compilation, no per-distro wheels.
sudo install -d -o firebird -g firebird -m 0750 /opt/firebird-v5/databasesOnly the firebird user can write to /opt/firebird-v5/databases/, which is correct — tragach-forge talks to Firebird over TCP, not the filesystem.
Connection config comes from environment variables so passwords never land on the command line or in NDJSON recordings:
| Variable | Default | Notes |
|---|---|---|
FORGE_DB_HOST |
localhost |
Firebird host |
FORGE_DB_PORT |
3050 |
|
FORGE_DB_USER |
SYSDBA |
|
FORGE_DB_PASSWORD |
(none — prompted) | Required; prompts interactively if unset |
FORGE_DB_PATH |
/opt/firebird-v5/databases/forge.fdb |
Overridden by CLI --database |
FIREBIRD_LOCK |
auto: /tmp/forge-firebird-<uid>/ |
Set automatically by forge.cli if not already set |
The auto-set of FIREBIRD_LOCK matters: Firebird's client library coordinates locally via /tmp/firebird (0770 firebird:firebird) even for pure-TCP connections, so unprivileged users hit "Permission denied" without this redirect.
export FORGE_DB_PASSWORD='your-sysdba-password'
.venv/bin/forge initCreates the database file, applies the DDL in order (000_domains.sql → 001_tables.sql → 002_sequences.sql → 003_trigger.sql → 004_procedures.sql → 005_generators.sql), and exits. --force drops and recreates an existing database. --database PATH overrides the default location.
The clean load (deterministic inserts only):
.venv/bin/forge load --seed 42Defaults: 100 warehouses, 10 000 products, 1 000 000 inventory rows, 5 000 customer notes. Add --workers 4 to parallelise the INVENTORY phase. The 1M-row load takes ~10 min serial on a dev VM — row generation (Faker) dominates, so parallelism mostly helps for very large --inventory values.
The mixed-mode load (clean inserts then a deterministic dirty phase of UPDATE/INSERT/DELETE across tables, plus CREATE_ORDER ops that populate ORDERS / ORDER_ITEMS / SHIPMENTS so the complex-query scenarios have real transactional data to read against):
.venv/bin/forge load --seed 42 --mode mixed --dirty-ops 20000 --workers 4--dirty-workers N overrides the worker count for the dirty phase (defaults to --workers).
.venv/bin/forge run mixed-chaos --seed 42 --duration 2m --workers 8 \
--record runs/forge-intent.ndjsonDrives N concurrent workers against the populated DB for the requested duration, records every intent + outcome to NDJSON, and exits at the deadline. Hard time bound: daemon threads + sys.exit after a 5 s grace window — no waiting around for in-flight iterations to finish.
--duration accepts Ns / Nm / Nh or a bare number of seconds, must be within 1m–1h inclusive, default 5m. Values outside the range are rejected at CLI parse time.
--workers N runs N independent worker threads each opening its own Firebird connection. Per-worker RNG is seeded as seed + worker_id, so two runs at the same seed produce the same plan stream (timing and outcomes naturally vary with concurrent DB state).
NDJSON schema: pairs of execute (intent) and outcome (result) records joined by intent_id (ULID). One JSON object per line. Schema stable from v0.1 — see SPECS.md §9.
| Field | Type | On which op | Notes |
|---|---|---|---|
ts |
float | execute, outcome | Seconds-since-epoch with microsecond precision |
worker |
integer | execute, outcome | Worker id (0..N−1) |
scenario |
string | execute, outcome | Matches --scenario argument |
op |
string | execute, outcome | One of execute, outcome (commit reserved for future) |
intent_id |
string (ULID) | execute, outcome | 26-char ULID joining the pair |
sql |
string | execute | Parametrised SQL |
params |
array | execute | Bound parameters (in JSON-serialisable form) |
expected_class |
string | execute | Tragach-target tag (e.g. complex-join, hot-update) |
kind |
string | execute | Scenario-internal sub-op label (esp. useful for chaos) |
template |
string | execute | Specific SQL template the worker picked |
duration_us |
integer | outcome | Wall-clock for execute() call |
rows_affected |
integer | outcome | Fetched rows for SELECT, affected rows for DML |
success |
boolean | outcome | false on exception |
error |
string | outcome | "{ExceptionClass}: {message}", only when success=false |
Sample pair:
{"ts":1715607781.421023,"worker":7,"scenario":"hot-update-contention","op":"execute",
"sql":"UPDATE WAREHOUSES SET LAST_ACTIVITY_AT = ? WHERE WAREHOUSE_ID = ?",
"params":["2026-05-13T01:23:01.045123",42],"expected_class":"hot-update",
"intent_id":"01KRFDR8YSV2Q09T8HJAYDMDCX","kind":"hot-update","template":"target_wh=42"}
{"ts":1715607781.523711,"worker":7,"scenario":"hot-update-contention","op":"outcome",
"intent_id":"01KRFDR8YSV2Q09T8HJAYDMDCX","duration_us":102688,"rows_affected":1,"success":true}If a worker's connection dies (Firebird 5 SuperServer crashes are not uncommon under heavy mixed concurrent writes — see engine pathologies), the worker drops the dead handle and reconnects up to 5 times before bailing. The run continues with the workers that still have live connections.
forge run --help lists all 15 registered scenarios.
.venv/bin/forge replay runs/forge-intent.ndjson
.venv/bin/forge replay runs/forge-intent.ndjson --asap
.venv/bin/forge replay runs/forge-intent.ndjson --workers 4Reads the NDJSON, dispatches each execute record against a fresh connection. Default mode preserves the original inter-op timing deltas; --asap fires straight through; --workers N repartitions events across N threads instead of using the worker numbering from the recording.
Records whose sql isn't directly executable (mixed-chaos uses a sentinel "EXEC CREATE_ORDER inline" for its structured 3-statement create-order op) are skipped and counted; the runner reports the skip total at the end.
python -m forge.correlate runs/forge-intent.ndjson runs/tragach-slowquery.ndjson
python -m forge.correlate runs/forge-intent.ndjson runs/tragach-slowquery.ndjson \
--window-ms 100 --diff-output runs/diff.jsonlOutputs:
- Coverage — % of forge intents that found a tragach match.
- Accuracy — mean / p50 / p99 of |forge.duration_us − tragach.duration_us|, per scenario.
- False positives — tragach events with no matching forge intent.
The join key is (SQL-text hash, ts within ±50 ms). Because tragach observes the server side and doesn't natively know about tragach-forge's client-side worker numbering, worker_id is not used in the primary match. Per-event JSON Lines diffs go to --diff-output FILE.
See SPECS.md §10 for the assumed tragach NDJSON schema. The current tragach --json output uses RFC3339 ts strings and emits separate prepare_us / execute_us instead of a combined duration_us; an adapter step may be needed until the two schemas converge.
15 scenarios, each targeting a distinct Firebird engine pathology that tragach should observe.
| Scenario | What it stresses |
|---|---|
steady-oltp |
Baseline OLTP: SP_CREATE_ORDER in tight loop, predictable latency |
bulk-import |
Write-heavy bulk insert into INVENTORY, long execute spans |
hot-update-contention |
Many workers updating the same warehouse row → MVCC version churn |
cold-scan |
gfix cache-flush then full-table scan with unindexed predicate |
indexed-vs-unindexed |
Side-by-side equivalent queries, one indexed and one not — μs vs ms |
long-transaction |
Each worker holds a long read-committed transaction open, also doing short writes → GC pressure |
mixed-chaos |
Production-shaped weighted mix combining DML and the heavy SELECT scenarios |
pathological-like |
Single LIKE '%foo%' against unindexed BLOB SUB_TYPE TEXT |
complex-joins |
6-table joins with mixed indexed/unindexed predicates, all join types |
slow-aggregate |
GROUP BY on function-of-unindexed-column → big hash/sort |
correlated-subquery |
Pathological per-row correlated subqueries that the optimizer won't flatten |
recursive-graph |
WITH RECURSIVE walk of CARRIER_ROUTES, hop-bounded |
window-functions |
Window-aggregation paths (RANK / PARTITION BY / running totals) |
heavy-grouping |
Multi-key GROUP BY + HAVING + ORDER BY across joined tables |
psql-stream |
SUSPEND-style generator procedures, indexed vs unindexed streams |
Full per-scenario rationale, expected tragach signature, and SQL templates: SPECS.md §7.
Two runs of forge init --force && forge load --seed N against the same Firebird produce byte-identical row content for every populated table, regardless of --workers or --dirty-workers. Verified on both clean and mixed modes.
The mechanism: row content is generated serially from a single seeded random.Random + Faker + numpy.random.default_rng, and the mixed-mode dirty op stream is pre-generated then partitioned across workers by crc32(table, row_id) % workers — same-row ops always land on the same worker, so per-row order is preserved and cross-worker order doesn't change the final state.
For forge run, the intent stream is byte-identical at the same seed (each worker's random.Random(seed + worker_id) produces the same plan sequence). Outcomes vary by definition because they reflect concurrent DB state and wall-clock timing.
Building tragach-forge against Firebird 5 SuperServer surfaced several engine behaviors that shape what the loader and scenarios can rely on. These aren't tragach-forge bugs — they're real engine pathologies, exactly the kind of thing tragach is supposed to surface. Documented here so users don't chase them.
- No PostgreSQL-style multi-row
INSERT VALUES. Firebird 5 rejectsINSERT INTO T VALUES (...), (...), ...withToken unknown - ,. EXECUTE BLOCKwith parameterised multi-INSERT crashes the server at scale. Server logs showfirebird terminated abnormally (-1), fbguard restarts. Not viable on this install. The loader usescursor.executemanyinstead (prepared statement reused, row-by-row on the wire).forge run mixed-chaosreliably crashes Firebird 5 SuperServer under concurrent mixed writes (≥4 workers, ~45 % writes, ~55 % reads). Errors look likepthread_mutex_lock failed -Invalid argument -invalid statement handle. tragach-forge workers detect dead connections and reconnect up to 5 times each; the run still completes within--duration. The crashes are the signal, not a forge bug.- Cursor close after a mixed-DML workload sometimes crashes the server.
SELECT * FROM INVENTORY ORDER BY pkafter a mixed-mode load fetches rows fine but crashes Firebird during cursor cleanup. Workaround in test scripts:os._exit(0)before destructors run. CURRENT_TIMESTAMPinside an UPDATE that hot-rows from many concurrent workers is one of the easier crash triggers we found.mixed-chaosandhot-update-contentionpass the timestamp as a Python-side param instead, which doesn't fix the broader instability but removes one reliably-triggering crash path so other pathologies have a chance to surface.overwrite=Trueagainst a non-existent file in firebird-driver 2.0.3 silently no-ops.forge initprobes for existence first and only passesoverwritewhen needed.
Captured in detail in SPECS.md §8.1 / §7.7 / §13.
- Firebird v5 SuperServer only. Classic / SuperClassic and Firebird v3 / v4 / v6 are not supported. v6 isn't shipped stable yet anyway. Sequence-grabbing inside parallel workers (mixed-chaos) would behave differently under Classic — out of scope for v0.1.
- Linux only. The
FIREBIRD_LOCKauto-redirect (/tmp/forge-firebird-<uid>/) is Linux-specific, andcold-scancallsgfixvia subprocess assuming a POSIX shell. - Python 3.11+. No earlier-version compatibility; uses match/case, parameterised generics, etc.
forge correlateis schema-aspirational. The current tragach--jsonoutput uses RFC3339tsstrings and emitsprepare_us+execute_usseparately; the correlator assumes float-secondstsand a combinedduration_us. A small adapter step (or a future tragach--ndjson-schema=forge-compatflag) closes the gap.forge replayskips structured-payload ops. Mixed-chaos'screate-orderrecords a sentinel SQL string (EXEC CREATE_ORDER inline) because its execute path is three statements driven from a structured payload. Replay skips those and reports the count.- The Firebird-side crashes documented above reduce throughput and bound how long a
mixed-chaosrun sustains before workers reconnect. That's the engine, not tragach-forge — see the pathologies section.
.venv/bin/pytest tests/The suite runs without a live Firebird — pure-Python unit coverage of:
test_schema.py— DDL file ordering,SET TERMstatement splittertest_recorder.py— NDJSON record shape, ULID intent_ids, thread-safe writestest_correlate.py— synthetic forge + tragach NDJSON in, expected coverage / accuracy / false-positive numbers outtest_scenarios.py— all 15 scenarios register,plan()returns required shape, mixed-chaos covers expected kinds,SP_CREATE_ORDERplan always emits 17 paramstest_loader.py— seed-driven row generators are byte-identical across twoSeeded.from_seed(N)callstest_duration_parser.py—forge run --durationaccepts1m–1hand rejects anything outside
End-to-end checks against the live DB are in the usage walkthrough above. The forge load determinism contract (intent-byte-identical at the same seed across same-mode runs) is verified live, not in pytest.
eBPF observability tool for Firebird v5 — peer project, same author. Where tragach-forge generates the workload, tragach watches it from the outside via uprobes / kernel tracepoints. Two binaries:
tragach-slowquery— engine-level DSQL statement tracing vialibEngine13.souprobes (DSQL_prepare,DSQL_execute, cursor open/fetch). Reportsprepare_us+execute_usper statement, the SQL text, and the attachment it ran on.tragach-iowait— kernel-level off-CPU profiling. Buckets thread blocked-time into block I/O / futex / scheduler delay / other, with representative stacks per bucket.
The natural pairing: run forge run mixed-chaos --record forge-intent.ndjson and tragach-slowquery --json > tragach.ndjson simultaneously, then join them with python -m forge.correlate. Forge knows what it intended; tragach reports what it observed; the correlator surfaces the gap.
tragach-forge has incidentally discovered that Firebird 5 SuperServer reliably crashes under its mixed-chaos workload at ≥4 concurrent writers. The crashes are the signal, exactly the kind of engine pathology tragach is built to surface.
You don't have to run tragach to use tragach-forge — point any observability tool at it, or just run forge to populate / churn a Firebird database for local development.
Runtime (Python 3.11+):
- firebird-driver
≥2.0,<3.0— sync DB-API 2.0 client, official Firebird Foundation–maintained - Typer
≥0.12— CLI framework - Pydantic
≥2.7v2 — config models withSecretStrfor credentials - Rich
≥13.7— interactive progress and console (auto-disables when piped) - Faker
≥25.0— synthetic data - NumPy
≥1.26— Zipf distribution for product popularity in the inventory loader - python-ulid
≥2.0,<4.0— intent IDs in the NDJSON recorder
Dev:
- pytest
≥8.0— test suite - pytest-asyncio
≥0.23— currently unused; reserved for future asyncio scenarios
Firebird-side:
- 14 named domains, 12 tables, 14 sequences, 1 trigger, 6 stored procedures, 4 SUSPEND-style generator procedures — all SCREAMING_SNAKE_CASE per the project's identifier convention.
- Bulk-insert path: prepared
executemany(the only stable batch shape Firebird 5 gave us — see SPECS.md §8.1 for the EXECUTE BLOCK / multi-row VALUES dead ends).
Build & CI:
- GitHub Actions matrix across Ubuntu 22.04 / 24.04 and Debian 12 / 13 (containers for the Debian legs), with Python 3.11 / 3.12 / 3.13
- Tag-triggered release workflow that builds an sdist + wheel and attaches them to a GitHub release
- Pure-Python: one wheel works everywhere Python 3.11+ does
Built by Zlatan Omerović — @ZlatanOmerovic · zlomerovic@hotmail.com.
Pair-programmed with Claude (Anthropic) as a coding collaborator — including the schema design, the mixed-chaos op partition, the determinism contract, the NDJSON intent / outcome split, the live-debugging of the Firebird 5 SuperServer crashes, and most of the prose you're reading right now.
Released under the Apache License 2.0.