fec-analyst

AI political intelligence analyst. Paste a candidate, PAC, donor, or industry — get the brief a senior opposition researcher would spend a week on.

Built on the OpenFEC public API. TypeScript on Bun. MIT licensed. No paywalls, no scraping, no proprietary data.


Why this exists

  • OpenSecrets killed their public API in April 2025.
  • OpenFEC has beautiful raw data but requires code to use.
  • Quorum charges $30K/seat.
  • Nothing sat in the middle: no AI-native agent that turns "here's a candidate name" into a real research brief with cited figures.

fec-analyst fills that gap. Ten skills, each an endpoint a senior political finance analyst would hand-build. Four surfaces (CLI, MCP server, natural-language ask, static site) over a common skill contract. An entity-memory layer that makes the second query better than the first.

Thirty-second demo

$ fec-analyst candidate "Jon Tester" --cycle=2024

TESTER, R. JON: 2024 cycle brief

Democrat candidate for U.S. Senate, MT. FEC candidate ID: S6MT00162.

  • Raised $93,570,123 and spent $95,689,258 this cycle (burn rate 102.3%).
  • Ended the period with $765,624 cash on hand and $0 in debts.
  • Small-dollar (<$200) donations account for 54.6% of Schedule A receipts; $2,000+ large-dollar buckets account for 20.2%.
  • 89.7% of itemized individual contributions came from outside MT.
  • Largest non-noise occupation cluster: ATTORNEY ($2,991,769).

Full brief: examples/candidate-tester-2024.md.

The brief a consultant charges $5K for, in one command

$ fec-analyst darkmoney --committee-id=C00865444 --cycle=2024

For WINSENATE (the Senate Majority PAC's independent-expenditure affiliate) in 2024:

Total IE                 $626,311,588
Oppose-spending share    92.3%
Largest oppose target    Bernie Moreno (R-OH), $144,531,469
Largest support target   Colin Allred (D-TX), $20,178,908
Biggest ad vendor        Waterfront Strategies, $458M (73% of spend)
October 2024 alone       $384M

Full brief: examples/darkmoney-winsenate-2024.md.

The "holy shit" moment

$ fec-analyst bundler "Jon Tester" --cycle=2024

Surfaces employer clusters that match the textbook bundling pattern (≥10 distinct donors, ≥50% at the individual max, ≥50% of the money in a 14-day window):

BLACKROCK: textbook bundler signal

16 distinct donors, 82.4% at ≥ 90% of the $3,300 cycle limit. 14 of 16 contributions on 2024-05-02 or 2024-05-06, a two-day window consistent with a fundraiser event.

Full brief: examples/bundler-tester-2024.md.

Every flag is a pattern match, not a legal finding. Scoring heuristic and thresholds are printed in the output, declared in source, and applied uniformly across every committee regardless of party.
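To make the three thresholds concrete, here is a minimal sketch of how one employer cluster could be scored. It is not the code in src/skills/: the `Contribution` shape, the `scoreCluster` name, measuring the near-max share per contribution, and reading "near max" as ≥ 90% of the limit are all assumptions made for illustration.

```typescript
// Hypothetical shapes and names; only the threshold values come from the README.
interface Contribution {
  donor: string;  // normalized donor name
  amount: number; // USD
  date: string;   // ISO date, e.g. "2024-05-02"
}

const THRESHOLDS = {
  minDonors: 10,        // at least 10 distinct donors
  nearMaxShare: 0.5,    // at least 50% of contributions near the individual max
  windowShare: 0.5,     // at least 50% of the dollars inside one window
  windowDays: 14,
  nearMaxFraction: 0.9, // assumption: "near max" means >= 90% of the limit
} as const;

const DAY_MS = 86_400_000;

function scoreCluster(rows: Contribution[], limit: number) {
  const donors = new Set(rows.map((r) => r.donor)).size;

  const nearMaxCount = rows.filter(
    (r) => r.amount >= THRESHOLDS.nearMaxFraction * limit,
  ).length;
  const nearMaxShare = rows.length > 0 ? nearMaxCount / rows.length : 0;

  // The densest window always starts on some contribution date,
  // so it is enough to try each row's date as a window start.
  const total = rows.reduce((sum, r) => sum + r.amount, 0);
  let bestWindowDollars = 0;
  for (const start of rows) {
    const t0 = Date.parse(start.date);
    const t1 = t0 + THRESHOLDS.windowDays * DAY_MS;
    let dollars = 0;
    for (const r of rows) {
      const t = Date.parse(r.date);
      if (t >= t0 && t < t1) dollars += r.amount;
    }
    bestWindowDollars = Math.max(bestWindowDollars, dollars);
  }
  const windowShare = total > 0 ? bestWindowDollars / total : 0;

  const flagged =
    donors >= THRESHOLDS.minDonors &&
    nearMaxShare >= THRESHOLDS.nearMaxShare &&
    windowShare >= THRESHOLDS.windowShare;

  return { donors, nearMaxShare, windowShare, flagged };
}
```

Keeping the thresholds in one exported constant is what would let the output print the exact values a cluster was evaluated against.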


Skills

analysis:
  candidate      Full finance brief for one candidate
  committee      Committee / PAC brief with hybrid-super-PAC routing flag
  donor          Giving history for an individual donor
  race           Side-by-side comparison of every candidate in a race
  geo            Donor geography (zip + state concentration)
  diff           Cycle-over-cycle diff of a stored entity
  bundler        Employer clusters matching the textbook bundler pattern
  anomaly        Threshold-based concentration scorecard
  darkmoney      Super-PAC IE traced to its candidate targets
  industry       Giving pattern for one or more employer keywords

natural language:
  ask            "who raised more in the MT Senate race?" → correct skill call

interfaces:
  mcp            MCP stdio server (drops into Claude Code / Desktop / Cursor)

automation:
  watch          Scheduled refresh + change-detection digest
  render-site    Zero-dep static HTML site from stored briefs

memory:
  recall         Stored briefs + annotations for an entity (no network)
  annotate       Attach a free-text note that carries into future recalls

setup:
  init           Interactive first-run setup
  config         Show the resolved configuration
  ping           Sanity-check the OpenFEC key

fec-analyst --help is the full reference.


How it works

This section exists because the tool is a systems-design exercise as much as a product. If you're evaluating the code, read this.

The skill contract

Every skill is a pure async function over the OpenFecClient. Its return type is the same across every skill:

interface Brief<TData> {
  skill: string;              // "candidate-brief", "dark-money-trace", ...
  schema_version: number;     // bumped on breaking changes
  entity: EntityId;           // { kind, id, display }
  cycle: number;              // 2024, 2022, ...
  generated_at: string;       // UTC ISO timestamp
  data: TData;                // typed, skill-specific structured payload
  citations: Citation[];      // every claim in `markdown` links to one of these
  markdown: string;           // rendered view of `data`
}

The data block is the contract. The markdown is derivative — rendered from data at skill time and stored alongside it. A downstream consumer (MCP client, web page, diff engine) can work with either.

Citations are load-bearing: no narrative figure is allowed without a Citation whose URL points to a page on fec.gov that a reader can spot-check in two clicks. Hallucinated numbers are unshippable.
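One way to enforce that rule mechanically is a guard that runs before a brief is emitted. The sketch below assumes a `Citation` carries a `label` and a `url` (the real shape lives in src/core/ and may differ); `isFecUrl` and `assertCitable` are hypothetical names.

```typescript
// Hypothetical Citation shape; the project's actual type may differ.
interface Citation {
  label: string;
  url: string;
}

// True only for https URLs on fec.gov or one of its subdomains.
function isFecUrl(raw: string): boolean {
  try {
    const { protocol, hostname } = new URL(raw);
    return (
      protocol === "https:" &&
      (hostname === "fec.gov" || hostname.endsWith(".fec.gov"))
    );
  } catch {
    return false; // not a parseable URL
  }
}

// Throws if a brief has no citations or cites anything off fec.gov.
function assertCitable(citations: Citation[]): void {
  if (citations.length === 0) {
    throw new Error("brief has no citations");
  }
  const bad = citations.filter((c) => !isFecUrl(c.url));
  if (bad.length > 0) {
    throw new Error(`non-FEC citation(s): ${bad.map((c) => c.url).join(", ")}`);
  }
}
```

A check like this only proves that a source URL exists and points at the right host. Tying each figure in `markdown` to a specific citation would need a convention the README does not spell out.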

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Surfaces                                                    │
│   CLI  ·  MCP server  ·  ask (LLM)  ·  render-site  ·  watch │
└───────────────┬──────────────────────────────────────────────┘
                │  (each surface composes skills; skills don't
                │   know which surface invoked them)
┌───────────────▼──────────────────────────────────────────────┐
│  Skills (src/skills/)                                        │
│   candidate-brief     committee-brief     donor-profile      │
│   race-comparison     geo-flow            cycle-diff         │
│   bundler-detection   anomaly-scan        dark-money-trace   │
│   industry-influence                                         │
│                                                              │
│  Each skill: (client, input) → Brief<TData>                  │
│              (pure async function, no global state)          │
└───────────────┬──────────────────────────────────────────────┘
                │
┌───────────────▼──────────────────────────────────────────────┐
│  Core (src/core/)                                            │
│   OpenFecClient       — disk-cached, rate-limited,           │
│                         adaptive-cooldown + 429 retry        │
│   endpoint helpers    — zod-typed thin wrappers over the     │
│                         OpenFEC paths we consume             │
│   config              — ~/.fec-analyst/config.json           │
│                         + env-var escape hatch               │
│   types               — Brief, EntityId, Citation, money     │
│                         formatting                           │
└───────────────┬─────────────────┬────────────────────────────┘
                │                 │
                ▼                 ▼
┌─────────────────────────┐  ┌─────────────────────────────────┐
│  OpenFEC                │  │  Entity memory (src/db/)        │
│  api.data.gov gateway   │  │  bun:sqlite                     │
│  Schedule A / B / E,    │  │   entities                      │
│  candidates, committees │  │   briefs (JSON envelopes)       │
│  (60 req/min/key)       │  │   annotations                   │
└─────────────────────────┘  │   watchlist                     │
                             └─────────────────────────────────┘

Data flow: one skill call, end to end

fec-analyst candidate "Jon Tester" --cycle=2024
  │
  ▼
src/cli/candidate.ts  (argument parse + flag validation)
  │   resolveConfig()                ← reads ~/.fec-analyst/config.json
  │   new OpenFecClient(...)
  │   openDb({ dataDir })
  │
  ▼
src/skills/candidate-brief.ts
  │
  ├─→ searchCandidates                →  GET /candidates/search?q=Jon+Tester&cycle=2024
  ├─→ getCandidateTotals              →  GET /candidate/S6MT00162/totals?cycle=2024
  ├─→ getCandidateCommittees          →  GET /candidate/S6MT00162/committees?designation=P
  └─→ (parallel, Promise.all)
        getScheduleABySize            →  GET /schedules/schedule_a/by_size
        getScheduleAByState           →  GET /schedules/schedule_a/by_state
        getScheduleAByOccupation      →  GET /schedules/schedule_a/by_occupation
        getScheduleAByEmployer        →  GET /schedules/schedule_a/by_employer

  ↓ derive
  CandidateBriefData {
    topline, contribution_mix, size_buckets,
    geography, top_occupations, top_employers,
    small_dollar_share, max_out_share
  }

  ↓ render
  Brief<CandidateBriefData> { data, citations, markdown, … }
  │
  ▼
CLI persists:   upsertEntity()  +  saveBrief()
CLI emits:      markdown → stdout (or --write path)

Every OpenFEC call is cached on disk under ~/.fec-analyst/cache/<sha256>.json with a 24-hour TTL. A second run of the same brief within that window issues zero API calls.
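As a rough sketch of that cache, assuming the key is derived the way the rate-limit section describes (path plus sorted query, API key excluded): `cacheKey` and `isFresh` are hypothetical helper names, and `api_key` is assumed to be the query parameter that carries the key.

```typescript
import { createHash } from "node:crypto";

const TTL_MS = 24 * 60 * 60 * 1000; // 24-hour TTL

// Canonicalize path + query so parameter order does not change the key.
// The API key is dropped so different keys share the same cached payloads.
function cacheKey(
  path: string,
  query: Record<string, string | number>,
): string {
  const canonical = Object.keys(query)
    .filter((k) => k !== "api_key") // assumption: parameter name for the key
    .sort()
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(String(query[k]))}`)
    .join("&");
  return createHash("sha256").update(`${path}?${canonical}`).digest("hex");
}

// A cached entry is usable while it is younger than the TTL.
function isFresh(storedAtMs: number, nowMs: number): boolean {
  return nowMs - storedAtMs < TTL_MS;
}
```

Sorting the keys makes `?cycle=2024&q=Tester` and `?q=Tester&cycle=2024` resolve to the same file, which is what lets a repeated brief hit the cache regardless of how the caller assembled the query.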

Rate-limit engineering

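api.data.gov (the gateway in front of OpenFEC) enforces 60 requests per minute per key. That is a tighter burst limit than the documented 1,000/hour quota suggests: an hourly quota alone would allow far more than 60 calls in any single minute.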

The client coordinates four mechanisms:

  1. Steady-state pacer. Every call waits for a slot at rateLimitRps (default 0.8 rps, about 48 calls/min, which leaves headroom under the 60/min ceiling).
  2. Disk cache. 24-hour TTL on every response body, keyed by sha256(path + sorted_query). The API key is excluded from the cache key — different keys get the same cached payloads.
  3. Proactive cooldown. When a response returns X-RateLimit-Remaining ≤ 2, all subsequent in-flight calls await a 65-second gate before acquiring a pacer slot. This prevents concurrent tool-use loops from tipping the bucket past zero.
  4. 429 retry. On Too Many Requests, arm a 65-second cooldown (honoring X-RateLimit-Reset or Retry-After when present) and try again, up to 3 attempts in total. Non-429 failures fail fast: those are real errors, not pacing issues.

// src/core/openfec-client.ts (illustrative)
const maxAttempts = 3;
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
  await this.awaitCooldown();        // honor any active back-off
  await this.limiter.acquire();      // wait for a steady-state slot
  const res = await fetch(url);      // request options omitted in this sketch

  const remaining = Number(res.headers.get("x-ratelimit-remaining"));
  if (Number.isFinite(remaining) && remaining <= 2) this.setCooldown(65);

  if (res.status === 429 && attempt < maxAttempts) {
    this.setCooldown(readResetSeconds(res) ?? 65);
    continue;
  }
  // … parse + cache + return
}

Discovered, not documented: api.data.gov's /developers docs advertise a 1,000/hour quota. Live observation showed X-RateLimit-Limit: 60 per-minute. I found this running race-comparison for MT Senate 2024 (~35 concurrent calls through one key) and getting a 429 on call ~57. The retry logic + proactive cooldown were added in response.
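Two small pieces of that logic can be shown in isolation. The sketch below covers the pacer arithmetic and the Retry-After half of a `readResetSeconds`-style helper. The function names are hypothetical, and X-RateLimit-Reset is deliberately left out because its unit is not stated here.

```typescript
// Minimum spacing between calls for a target rate: 0.8 rps gives 1250 ms.
function intervalMs(rps: number): number {
  return 1000 / rps;
}

// Retry-After is either delta-seconds ("65") or an HTTP-date (RFC 9110).
// Returns undefined when the header is absent or unparseable, so the
// caller can fall back to a fixed cooldown such as 65 seconds.
function retryAfterSeconds(
  value: string | null,
  nowMs: number,
): number | undefined {
  if (value === null) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed);
  const when = Date.parse(trimmed);
  if (Number.isNaN(when)) return undefined;
  return Math.max(0, Math.ceil((when - nowMs) / 1000));
}
```

With these, the fallback in the retry loop would read `retryAfterSeconds(res.headers.get("retry-after"), Date.now()) ?? 65`.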

Typed API contracts, live-verified

Every OpenFEC endpoint fec-analyst uses has a zod schema in src/core/openfec-endpoints.ts. I verified each schema against real 2024 responses before wiring it into a skill — not by reading the OpenFEC OpenAPI spec, which is inaccurate in places.

Field-naming discoveries worth calling out:

  • /candidate/{id}/totals uses last_cash_on_hand_end_period and last_debts_owed_by_committee — the non-prefixed variants don't exist. /committee/{id}/totals follows the same last_-prefix convention.
  • /candidate/{id}/committees omits candidate_id on each row (it's in the path), so my initial schema with candidate_id: z.string() failed. Fixed to optional.
  • /schedules/schedule_e/by_candidate/ returns 422 Must include "candidate_id" or "office" when called with only committee_id — it's not an aggregation endpoint for a single spender, it's for slicing by target. For per-committee IE rollup, you paginate /schedules/schedule_e directly. This drove the dark-money-trace skill design.
  • /candidates/search?cycle=2024 returns every candidate whose 2-year filing window overlaps 2024 — including retired senators like Max Baucus (last active 2014). The election_year=2024 filter is what restricts to candidates actually on the ballot. Caught this when my first race-comparison for MT Senate 2024 returned Baucus, Bullock, and Daines (none of whom ran that cycle) while missing Tester and Sheehy.

Each discovery is documented inline where the schema / filter lives.
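The last discovery is the easiest to get wrong, so here is a sketch of a ballot-only search URL. It assumes the public base URL `https://api.open.fec.gov/v1` and that `state` and `office` are accepted as filters alongside `election_year`. `ballotSearchUrl` and `RaceQuery` are hypothetical names, not the project's helpers.

```typescript
const OPENFEC_BASE = "https://api.open.fec.gov/v1"; // assumed public base URL

interface RaceQuery {
  state: string;          // two-letter postal code, e.g. "MT"
  office: "H" | "S" | "P"; // House, Senate, President
  electionYear: number;
}

// Filters on election_year rather than cycle, so only candidates
// actually on the ballot that year come back.
function ballotSearchUrl(race: RaceQuery, apiKey: string): string {
  const params = new URLSearchParams({
    api_key: apiKey,
    election_year: String(race.electionYear),
    office: race.office,
    state: race.state,
  });
  return `${OPENFEC_BASE}/candidates/search?${params.toString()}`;
}
```

For example, `ballotSearchUrl({ state: "MT", office: "S", electionYear: 2024 }, key)` carries `election_year=2024` and no `cycle` parameter, which is the distinction that kept Baucus, Bullock, and Daines out of the MT Senate comparison.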

Entity-memory layer

bun:sqlite at ~/.fec-analyst/data/fec-analyst.db.

entities       (entity_id PK, kind, fec_id, display, metadata,
                first_seen, last_seen)
briefs         (brief_id PK, entity_id FK, skill, cycle, schema_version,
                envelope_json, markdown,
                UNIQUE(entity_id, skill, cycle))
annotations    (annotation_id PK, entity_id FK, created_at, note)
watchlist      (entity_id PK, cycles JSON, added_at)

  • Briefs overwrite on (entity, skill, cycle) conflict. The JSON envelope is the stable record; the markdown is derivative. A schema-version bump flags old envelopes for regeneration.
  • Statement caching is load-bearing. The DbClient caches prepared statements by SQL text for the lifetime of the process. Without this, bun:sqlite's Statement GC interacts badly with re-preparing the same SQL in tight loops — it manifests as "closed database" errors a few iterations in.
  • Schema is portable. Standard SQL that also runs on libsql/Turso. Moving to a networked engine later is a transport swap, not a rewrite.

The compounding payoff: running fec-analyst race --state=MT --office=S --cycle=2024 fetches every candidate's brief once. Subsequent diff, recall, or anomaly calls on any of those candidates are free. watch reuses the same store to detect changes.
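Two of the points above, overwrite-on-conflict and statement caching, fit in a short sketch. The SQL uses the column names from the schema listing, but the statement itself is a guess at how the overwrite could be written, and `StatementCache` is a hypothetical stand-in for the DbClient's cache. The `prepare` function is injected so the example runs without a database.

```typescript
// Upsert matching UNIQUE(entity_id, skill, cycle). Assumes brief_id is
// assigned by the database. Not the project's actual SQL.
const UPSERT_BRIEF = `
  INSERT INTO briefs (entity_id, skill, cycle, schema_version, envelope_json, markdown)
  VALUES (?, ?, ?, ?, ?, ?)
  ON CONFLICT (entity_id, skill, cycle) DO UPDATE SET
    schema_version = excluded.schema_version,
    envelope_json  = excluded.envelope_json,
    markdown       = excluded.markdown
`;

// Memoizes prepared statements by SQL text for the life of the process,
// so a tight loop never re-prepares the same statement.
class StatementCache<S> {
  private readonly statements = new Map<string, S>();
  private readonly prepare: (sql: string) => S;

  constructor(prepare: (sql: string) => S) {
    this.prepare = prepare;
  }

  get(sql: string): S {
    let stmt = this.statements.get(sql);
    if (stmt === undefined) {
      stmt = this.prepare(sql);
      this.statements.set(sql, stmt);
    }
    return stmt;
  }
}
```

Under bun:sqlite, the injected function would presumably be something like `(sql) => db.prepare(sql)`. Holding the result in the map also keeps a live reference to each statement, which is one plausible reason the caching avoids the GC-related errors described above.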

Composition patterns

Two skills are higher-order — they compose other skills rather than calling the OpenFEC API directly:

  • cycle-diff loads the stored brief for the source skill (candidate-brief or committee-brief) at both cycles. If either side is missing, it auto-fetches and persists it, then computes structural numeric deltas and renders the diff table. Second run of the same diff is free.
  • race-comparison resolves every candidate on the ballot via /candidates/search?election_year=YYYY, then calls candidate-brief for each. Uses cached briefs when they exist; falls through to live fetch otherwise. Running a race-comparison populates every candidate's memory as a side effect.
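"Structural numeric deltas" can be read as: walk both `data` payloads in parallel and report every numeric leaf that changed. A minimal sketch under that reading follows. The `Json` and `Delta` types and the `numericDeltas` name are assumptions, and arrays are skipped for simplicity.

```typescript
type Json = number | string | boolean | null | Json[] | { [key: string]: Json };

interface Delta {
  path: string;       // dotted path, e.g. "topline.raised"
  before: number;
  after: number;
  change: number;     // after - before
  pct: number | null; // percent change, null when before is 0
}

function isObject(v: Json): v is { [key: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Recursively compares two payloads and returns changed numeric leaves.
// Keys present on only one side, non-numeric leaves, and arrays are ignored.
function numericDeltas(a: Json, b: Json, path = ""): Delta[] {
  if (typeof a === "number" && typeof b === "number") {
    if (a === b) return [];
    const change = b - a;
    const pct = a === 0 ? null : (change / Math.abs(a)) * 100;
    return [{ path, before: a, after: b, change, pct }];
  }
  if (isObject(a) && isObject(b)) {
    return Object.keys(a)
      .filter((k) => k in b)
      .flatMap((k) => numericDeltas(a[k], b[k], path ? `${path}.${k}` : k));
  }
  return [];
}
```

Because the diff works on the stored `data` blocks rather than on markdown, it keeps working when the rendered template changes between versions.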

The ask orchestrator

src/agents/ask.ts wraps every skill as an Anthropic tool and runs the tool-use loop.

User question  →  Claude picks a skill  →  skill returns Brief (markdown + data)
                                       ↓
                           Claude synthesizes a 5-bullet answer with source URLs
                                       ↓
                                 final markdown to stdout

Implementation details that matter:

  • Prompt caching via cache_control: { type: "ephemeral" } on the system block. The ~3K-token tool schema and behavioral rules are cached across turns (the ephemeral cache is short-lived and refreshes on each hit), so follow-up questions pay full input-token cost only for the user's message plus prior turns.
  • Strict rules in the system prompt: neutrality, every figure cites a filing, prefer memory over fresh calls (recall_entity before running a full brief), concise synthesis over transcription.
  • Today's date is interpolated so "this cycle", "last cycle", "next cycle" resolve correctly against the current calendar.
  • Bounded iteration (5 tool calls default) with a prompt-level nudge not to chain more than 3 without an analytical reason.
  • Tool registry is shared in spirit with the MCP server — same tool names, same input schemas, same dispatch functions — so a user who learns the tool set in one surface can use it in the other.
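The first and third bullets interact: if the date is interpolated into the same block that carries `cache_control`, the cached prefix changes every day. One way to keep both, sketched below, is to put the stable rules in the cached block and the date in a second, uncached block. This is an illustration of the request shape, not the contents of src/agents/ask.ts; `buildSystem` is a hypothetical name.

```typescript
// Shape of a text block in the Messages API `system` array.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Stable rules go first and carry the cache breakpoint; the date follows
// in its own block so a new day does not invalidate the cached prefix.
function buildSystem(rules: string, today: Date): SystemTextBlock[] {
  const isoDay = today.toISOString().slice(0, 10); // UTC calendar date
  return [
    { type: "text", text: rules, cache_control: { type: "ephemeral" } },
    { type: "text", text: `Today's date is ${isoDay}.` },
  ];
}
```

The returned array would be passed as the `system` field of the request. Whether the project splits the blocks this way is not stated above.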

Example session (full transcript):

$ fec-analyst ask "Is anything unusual about Jon Tester's 2024 fundraising?" --verbose --stats
[ask] tool_use: anomaly_scan({"candidate_query":"Jon Tester","cycle":2024})

## Jon Tester 2024 — Four concentration indicators above threshold
…
[ask] 1 tool call(s); 2501 input / 868 output tokens;
      2955 cache read / 2955 cache create.

One tool call, correct pick, 5-bullet synthesis with sources.

The MCP surface

src/mcp/server.ts exposes all 13 tools (10 skills + recall_entity, annotate_entity, resolve_config) over MCP stdio. Drops into Claude Code, Claude Desktop, Cursor, or any MCP client.

$ bun run src/mcp/server.ts
# OR
$ fec-analyst mcp

Each tool returns two content blocks: the rendered markdown (for clients that show text), and the structured JSON envelope in a fenced code block (for clients that want to consume the data). Errors come back as isError: true with a readable message — the protocol layer never sees an uncaught exception.

Lazy OpenFecClient + DB singletons keep tools/list requests fast; the first tools/call pays the init cost.
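The two-block result and the error path can be sketched without the MCP SDK, since a tool result is a plain object with a `content` array and an optional `isError` flag. `briefToResult` and `safeCall` are hypothetical names. The real handlers are presumably async and would `await` inside the `try`. A synchronous version is shown to keep the sketch short.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextContent[];
  isError?: boolean;
}

const FENCE = "`".repeat(3); // built at runtime to avoid a literal fence here

// Block 1: rendered markdown. Block 2: the full JSON envelope, fenced.
function briefToResult(brief: { markdown: string }): ToolResult {
  return {
    content: [
      { type: "text", text: brief.markdown },
      {
        type: "text",
        text: `${FENCE}json\n${JSON.stringify(brief, null, 2)}\n${FENCE}`,
      },
    ],
  };
}

// Converts any thrown value into an isError result with a readable message,
// so nothing uncaught reaches the protocol layer.
function safeCall(run: () => ToolResult): ToolResult {
  try {
    return run();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: [{ type: "text", text: message }], isError: true };
  }
}
```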

Neutrality, by construction

Neutrality isn't a suggestion in the docs — it's enforced by how the code is shaped:

  • Fixed, declared thresholds. bundler-detection and anomaly-scan use numeric thresholds in source. Every output prints the threshold it was evaluated against. The same ≥ 50% near-max threshold applies to every committee.
  • No party variables. No skill branches on party. Per-party labels come from the committee's own FEC filing (party_full), never from fec-analyst's code.
  • Support / oppose comes from the filer. dark-money-trace reports what each committee filed in the support_oppose_indicator column. The tool does not infer intent.
  • Citations, not assertions. Every table row, every figure, has a source URL. A reader can spot-check any number in two clicks.
  • Committed bipartisan examples. /examples ships with matched pairs: Tester/Sheehy, SMP/SLF, Yass/Hoffman. The brief template is identical; differences come from the filings.

What's deferred (on purpose)

Each gap is documented inline where it'd matter:

  • Congress.gov legislation-timing correlation for industry-influence — pairing donation dates with bill/vote dates would materially improve the narrative. Needs a second data source.
  • Zip-to-congressional-district mapping for geo-flow — would enable real in-district vs out-of-district ratios for House races. Needs the Census CD relationship file.
  • Full schedule_e pagination with memo-subtotal handling for very large super PACs — dark-money-trace caps at 3,000 transactions by default.
  • FEC bulk-data loader for cycles pre-dating the processed API coverage.
  • State-level filings (OpenStates, Follow The Money).

None of these block v1.0.


Install

Requires Bun ≥ 1.1.

git clone https://github.com/toddegray/fec-analyst.git
cd fec-analyst
bun install
bun run src/cli.ts init

init walks you through getting a free OpenFEC key from api.data.gov — takes 30 seconds, no credit card.

An Anthropic API key is optional — required only for the ask orchestrator. Every other skill runs without one.

Config lives at ~/.fec-analyst/config.json; env vars (FEC_ANALYST_API_KEY, ANTHROPIC_API_KEY, etc.) override the file.
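The precedence rule is simple enough to sketch: environment first, file second, fail loudly if the OpenFEC key is missing from both. The environment variable names are the ones listed above. The file field names (`apiKey`, `anthropicApiKey`) and the function name are assumptions, since the config.json layout is not shown here.

```typescript
// Hypothetical file layout; the real config.json fields may differ.
interface FileConfig {
  apiKey?: string;
  anthropicApiKey?: string;
}

interface ResolvedConfig {
  apiKey: string;
  anthropicApiKey?: string; // optional: only the ask orchestrator needs it
}

// Env vars override the file. The environment is passed in so the
// function stays testable; callers would hand it process.env.
function resolveConfigSketch(
  file: FileConfig,
  env: Record<string, string | undefined>,
): ResolvedConfig {
  const apiKey = env["FEC_ANALYST_API_KEY"] || file.apiKey;
  if (!apiKey) {
    throw new Error(
      "No OpenFEC key found: run init or set FEC_ANALYST_API_KEY",
    );
  }
  const anthropicApiKey = env["ANTHROPIC_API_KEY"] || file.anthropicApiKey;
  return anthropicApiKey ? { apiKey, anthropicApiKey } : { apiKey };
}
```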

Examples

See examples/README.md for the full catalog.

Architecture deep-dive

See docs/ARCHITECTURE.md for the layer-by-layer reference.

License

MIT. See LICENSE.
