sebastianfoerste/README.md

Sebastian Förste

I build review-gated legal AI systems: evaluation harnesses, supervised workflows, and legal operating layers that make outputs cited, testable, and safe for human approval.

German-qualified lawyer, former NLP data scientist, and partner at gunnercooke. My work sits between legal engineering, product engineering, and AI governance: structured intake, deterministic checks, visible source provenance, human-approved outputs, and audit trails.

These are public-safe prototypes on synthetic data only — not client work, not production systems, not legal advice.

Start here

If you look at one repository, look at contract-review-eval-harness — an offline, deterministic evaluation harness for AI contract review. It scores model output against a hand-authored answer set and catches the failures lawyers actually care about: missed clauses, wrong risk severity, unsupported citations, and fabricated text.

git clone https://github.com/sebastianfoerste/contract-review-eval-harness
cd contract-review-eval-harness && make install && make test && make demo

In the sample run it catches a fabricated citation and marks the output for rejection. The thesis: legal AI quality should be measured, not asserted. A second idea runs alongside it: legal AI becomes useful when judgment is structured before the model acts, measured after it acts, and blocked until a human approves consequential use.
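
The scoring idea can be illustrated with a minimal sketch. This is not the harness's actual code: the `Finding` fields, the `evaluate` function, the verbatim-quote grounding rule, and the verdict labels are all assumptions made for illustration, and the contract text is synthetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    clause: str    # hypothetical clause label, e.g. "liability_cap"
    severity: str  # "low", "medium", or "high"
    quote: str     # contract text the finding claims to cite


def evaluate(expected: list[Finding], predicted: list[Finding], contract_text: str) -> dict:
    """Score predicted findings against a hand-authored answer set."""
    expected_by_clause = {f.clause: f for f in expected}
    predicted_by_clause = {f.clause: f for f in predicted}

    # Clauses in the answer set that the model never flagged.
    missed = sorted(set(expected_by_clause) - set(predicted_by_clause))
    # Clauses the model found but rated at a different severity.
    wrong_severity = sorted(
        clause
        for clause, finding in predicted_by_clause.items()
        if clause in expected_by_clause
        and finding.severity != expected_by_clause[clause].severity
    )
    # Assumed grounding rule: a quote must appear verbatim in the contract.
    fabricated = sorted(
        f.clause for f in predicted if not f.quote or f.quote not in contract_text
    )

    return {
        "missed_clauses": missed,
        "wrong_severity": wrong_severity,
        "fabricated_citations": fabricated,
        # Any fabricated citation rejects the output; everything else still
        # goes to a human reviewer rather than being auto-approved.
        "verdict": "reject" if fabricated else "needs_review",
    }


contract = (
    "The Supplier's liability is capped at EUR 100,000. "
    "Either party may terminate on 30 days' notice."
)
expected = [
    Finding("liability_cap", "high", "liability is capped at EUR 100,000"),
    Finding("termination", "medium", "terminate on 30 days' notice"),
]
predicted = [
    Finding("liability_cap", "medium", "liability is capped at EUR 100,000"),
    Finding("indemnity", "high", "The Supplier shall indemnify the Customer"),
]
report = evaluate(expected, predicted, contract)
print(report)
```

Because the comparison is plain set and string logic, the same inputs always produce the same report, which is what makes an offline run repeatable. A real harness would likely need fuzzier quote matching than a verbatim substring test.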

Portfolio map

| Layer | The question it answers | Repository |
| --- | --- | --- |
| Evaluation | How do we know the legal AI output is any good? | contract-review-eval-harness |
| Supervised workflow | How do we keep agentic legal work accountable? | legal-ops-agent |
| Legal operating layer | How does a GC scale intake, routing, approvals, and reporting? | legal-function-operating-system, ai-saas-legal-ops-starter-kit |
| Domain checks | Can regulation become cited, reviewable first-pass checks? | dpa-and-data-transfer-review, eu-ai-act-classifier, micar-whitepaper-linter, dora-third-party-register-and-resilience-workbench |
| Adoption | How does legal AI move from demo to daily use? | legal-ai-workshop-kit, legal-ai-adoption-dashboard |

Suggested paths

Pick the path that matches why you're here; each one lists three repositories:

  • Legal Engineer / Legal AI product: contract-review-eval-harness · legal-ops-agent · dpa-and-data-transfer-review
  • AI SaaS General Counsel / Product Counsel: ai-saas-legal-ops-starter-kit · legal-function-operating-system · dpa-and-data-transfer-review
  • AI governance / model evaluation: eu-ai-act-classifier · contract-review-eval-harness · legal-ops-agent
  • Adoption / enablement / solutions: legal-ai-workshop-kit · legal-ai-adoption-dashboard · legal-ops-agent
  • Financial regulation / crypto / MiCAR: micar-whitepaper-linter · MiCAR-Authorization-Co-Pilot · eu-financial-reg-horizon-scanner

How I think about legal AI

Useful legal AI is not about generating text. The questions I build around: is intake structured before drafting begins? Are assumptions, sources, and gaps visible? Can a user see what is draft, checked, approved, or blocked? Can quality be tested, not asserted? Can the workflow make a lawyer faster without pretending judgment has disappeared? That is why these projects lean on deterministic checks, evaluation scripts, explicit review states, blocked exports, and audit trails — not just prompts.
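
As a rough illustration of explicit review states, a blocked export, and an audit trail, here is a minimal sketch. The class names, the four state labels (taken from the wording above), and the single `has_source` check are assumptions for illustration, not code from any of the repositories.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Callable


class ReviewState(Enum):
    DRAFT = "draft"
    CHECKED = "checked"
    APPROVED = "approved"
    BLOCKED = "blocked"


class ExportBlocked(Exception):
    """Raised when an output is exported without human approval."""


@dataclass
class Output:
    text: str
    state: ReviewState = ReviewState.DRAFT
    # Each entry is (UTC timestamp, actor, event).
    audit_trail: list[tuple[str, str, str]] = field(default_factory=list)

    def _log(self, actor: str, event: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append((stamp, actor, event))

    def run_checks(self, checks: dict[str, Callable[[str], bool]]) -> list[str]:
        """Run deterministic checks; any failure blocks the output."""
        failed = [name for name, check in checks.items() if not check(self.text)]
        self.state = ReviewState.BLOCKED if failed else ReviewState.CHECKED
        self._log("system", f"checks failed: {failed}" if failed else "checks passed")
        return failed

    def approve(self, reviewer: str) -> None:
        """Only a checked output can be approved, and only by a named reviewer."""
        if self.state is not ReviewState.CHECKED:
            raise ValueError(f"cannot approve from state {self.state.value!r}")
        self.state = ReviewState.APPROVED
        self._log(reviewer, "approved")

    def export(self) -> str:
        """Refuse to release anything that a human has not approved."""
        if self.state is not ReviewState.APPROVED:
            self._log("system", "export blocked")
            raise ExportBlocked(f"export blocked in state {self.state.value!r}")
        self._log("system", "exported")
        return self.text


draft = Output("Clause 7 caps liability at EUR 100,000 [source: clause 7].")
checks = {"has_source": lambda text: "[source:" in text}
draft.run_checks(checks)

try:
    draft.export()  # checked but not yet approved
    was_blocked = False
except ExportBlocked:
    was_blocked = True

draft.approve("reviewer@example.com")
exported = draft.export()
events = [event for _, _, event in draft.audit_trail]
print(events)
```

The point of the design is that the blocked export is enforced in code and recorded in the trail, so the gate does not depend on a prompt instruction or on reviewer discipline.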

For hiring teams

Most relevant for: legal AI product work · AI deployment in legal/regulated environments · legal engineering · AI governance and model evaluation · SaaS legal operations · privacy, financial-regulation, and product-counsel workflows. The strongest proof points are the evaluation harness, the approval-gated supervised workflow, the cited regulatory checks, and the legal operating system.

Background

Partner at gunnercooke in Germany, advising on AI, SaaS, crypto, capital markets, payments, and EU financial regulation. German-qualified lawyer, admitted 2012; trained at Hengeler Mueller, Freshfields Bruckhaus Deringer, and Cleary Gottlieb. Earlier, data scientist at Dudenverlag building Python NLP pipelines.

Languages: German (native), English (fluent), French (professional working knowledge).

Public-safe statement

Synthetic examples only. No client data, no privileged material, no confidential negotiation history, no candidate data, no personal data. Public outputs are draft and review artifacts; they are not legal advice.

Contact

LinkedIn · GitHub

Pinned

  1. contract-review-eval-harness (Public)

    Evaluation harness for legal AI contract review. Measures expected answer set coverage, citation grounding, and hallucination counts.

    Python

  2. legal-ops-agent (Public)

    Supervised legal-operations workflow for typed intake, deterministic risk triage, reviewer routing and human-approved outputs.

    Python

  3. legal-ai-adoption-dashboard (Public)

    Legal AI adoption dashboard to monitor utilization, blockers, re-engagement, and product feedback.

    TypeScript

  4. legal-ai-workshop-kit (Public)

    Enablement artifacts for Legal AI: partner briefings, associate hands-on, adoption questionnaires, and workflow discovery.

    HTML

  5. ai-saas-legal-ops-starter-kit (Public)

    Public-safe legal operating layer for AI SaaS: contract intake, DPA triage, AI vendor review, launch governance and approval-gated risk reporting.

    TypeScript

  6. legal-function-operating-system (Public)

    Deterministic legal function operating system — routing, SLAs, approval matrix, escalation, and a board-ready operations pack. Synthetic data only.

    Python