Self-hosted, multi-user interview prep grounded in your code.
Upload a project + resume → get a personalized mentor and interviewer that read your repo, ground every claim in real code, and remember what you've practiced. Bring your own LLM key, run locally, and keep your data on your machine.
Quickstart · Features · Architecture · Configuration · Roadmap · Contributing
Most interview-prep tools throw generic LeetCode questions at you. Real interviews are about your projects, your resume, and the trade-offs you actually made.
Open Interview turns your codebase into the curriculum:
- **Reads your repo.** A LangGraph mentor browses files, runs `grep`, and reads source, Cursor / Claude Code style, so answers cite real files.
- **Grounds your resume.** Claims are matched against project evidence so the interviewer can probe what's actually true.
- **Remembers everything.** Three layers of memory (working / episodic / long-term) keep gaps and strengths sticky across sessions.
- **Voice practice.** Talk through answers with recorded STT and live interview audio built in.
- **Local-first.** Bring your own OpenAI / Anthropic / Ollama / vLLM key. Multi-tenant from day one; your data never leaves your machine.
```sh
git clone https://github.com/your-org/open-interview.git
cd open-interview
make setup                 # venv, editable installs, speaker deps, npm install
cp config/env.dev.example .env
make infra                 # postgres + redis + chroma in Docker (background)
make all                   # core + gateway + realtime + workers + web (overmind / honcho)
```

| Surface | URL |
|---|---|
| Web app | http://localhost:5173 |
| Core API docs | http://localhost:8000/docs |
| Gateway docs | http://localhost:9100/docs |
> **Heads up:** `make all` requires a process manager (`brew install overmind` recommended, or `pip install honcho`). Don't have one? Run each service in its own terminal: `make core`, `make gateway`, `make realtime`, `make workers`, `make web`.
Add your provider key in Settings → API keys once the UI is up. From there, upload a project zip + resume, then start a Mentor or Interviewer session.
Live voice mode uses OpenAI Realtime transcription plus local speaker verification. If an existing `.venv` predates this feature, run `make setup-speaker` once before starting `make realtime` or `make all`.
| Capability | What it does |
|---|---|
| Mentor | Conversational coach that has read your code. The agent uses read-only `ProjectFs` tools (`list_dir` / `read_file` / `grep` / `tree`). |
| Interviewer | Adaptive interviewer that asks questions sourced from your own project. Per-turn evaluation, gap-aware question picker, and a final rubric with strengths, weaknesses, and suggested practice. |
| QA bank | Per-(project × position × level) bank of 25 evidence-backed questions: planner → 7 sharded generators → merger. Auto-generated on ingest, regeneratable on demand. |
| Memory | Working (last-N messages), episodic (per-session summary + entities), long-term (per-user vector store of gaps, strengths, preferences). Distilled at session end. |
| GenAI gateway | OpenAI-compatible internal gateway that routes logical models (`chat-strong`, `chat-fast`, …) to any provider. |
| Voice | Recorded voice answers use browser-native audio capture. |
| Project ingestion | Zip upload → AST chunking → embeddings → hierarchical summaries → Mermaid diagrams. Files browsable, summaries queryable, all evidence linkable. |
| Resume grounding | PDF / docx / md upload, parsed into identity, skills, experience timeline, projects, and claims, with each claim linked to project code with confidence scores. |
| WhatsApp | Pair your personal number (no business API) by scanning a QR in Settings → Messaging, then interact from your phone via chat commands. |
| Messenger SDK | Channel SDK modeled after openclaw. Adding WeChat / Telegram / Slack is a new plugin. |
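The ingestion pipeline above starts with AST chunking. A minimal sketch of that step using Python's stdlib `ast` module; the real chunker's interface is not shown in this README, so the function and field names here are illustrative:

```python
import ast


def chunk_python_source(source: str, path: str) -> list[dict]:
    """Split a Python file into top-level function/class chunks (illustrative)."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "path": path,
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the node, ready for embedding.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


src = "def add(a, b):\n    return a + b\n\nclass Greeter:\n    def hi(self):\n        return 'hi'\n"
chunks = chunk_python_source(src, "example.py")
```

Chunking on AST boundaries (rather than fixed-size windows) keeps each embedded chunk a complete, citable unit of code.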
```
┌──────────────┐      ┌─────────────────────────────────────┐
│  React SPA   │      │   WhatsApp (your personal number)   │
└──────┬───────┘      └────────────┬────────────────────────┘
       │ HTTPS + JWT               │ Baileys (WhatsApp Web)
       │                           ▼
       │              ┌──────────────────────────────────────┐
       │              │   whatsapp_bridge (Node + Fastify)   │
       │              │   • multi-account socket manager     │
       │              │   • echo cache · HMAC webhook poster │
       │              └─────────────────┬────────────────────┘
       │                                │ HTTP + HMAC
       ▼                                ▼
┌───────────────────────────────────────────────────────────┐
│                     FastAPI Core API                      │
│                                                           │
│  api/    → auth, projects, resumes, qa, mentor,           │
│            interviewer, audio, messaging                  │
│  domain/ → projects (chunker, embedder, summarizer)       │
│            resumes (parser, claim grounder)               │
│            qa (planner, shard generator, merger)          │
│            mentor (LangGraph + ProjectFs tools)           │
│            interviewer (LangGraph + picker + evaluator)   │
│            memory (recall + distiller)                    │
│            messengers (kernel · sdk · plugins/whatsapp)   │
│  infra/  → Postgres · Redis · Chroma · BlobStorage        │
└──────┬─────────────┬───────────────┬──────────────────────┘
       │             │               │
       ▼             ▼               ▼
┌─────────────┐ ┌─────────┐ ┌──────────────────┐
│   GenAI     │ │   arq   │ │   Vector Store   │
│   Gateway   │ │ workers │ │   (Chroma /      │
│             │ └─────────┘ │    in-memory)    │
│ • chat      │             └──────────────────┘
│ • chat/stream
│ • embed
│ • transcribe
│ • realtime session minting
│ • rate-limit + usage logs
└──────┬──────┘
       │ OpenAI-compatible
       ▼
┌───────────────────────────────────────────────────────────┐
│ OpenAI · Anthropic (adapter) · Ollama · vLLM · OpenRouter │
└───────────────────────────────────────────────────────────┘
```
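The bridge posts inbound messages to the core as HMAC-signed webhooks. A hedged sketch of what that signing and verification could look like; the header name, encoding, and exact scheme are assumptions for illustration, not the project's documented protocol:

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    # Hex-encoded HMAC-SHA256 over the raw request body (scheme assumed).
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body: bytes, signature: str, secret: bytes) -> bool:
    # compare_digest avoids timing side channels on the comparison.
    return hmac.compare_digest(sign(body, secret), signature)


secret = b"change-me-internal-only"
body = b'{"event": "message", "chat_id": "123"}'
sig = sign(body, secret)
```

Signing the raw body (before JSON parsing) is the key design point: any re-serialization on either side would change the bytes and break verification.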
Live interviewer audio runs through a separate realtime gateway:
```mermaid
sequenceDiagram
  participant Browser as React live chat
  participant RT as realtime_gateway
  participant GW as GenAI gateway
  participant OA as OpenAI Realtime
  participant Core as Core interviewer
  Browser->>RT: WS hello + config + PCM16 chunks
  RT->>Browser: calibration_required
  Browser->>RT: calibration_start / PCM16 / calibration_commit
  RT->>RT: enroll session-local speaker profile
  RT->>GW: mint realtime transcription session
  GW->>OA: POST /realtime/transcription_sessions
  OA-->>GW: ephemeral client_secret
  GW-->>RT: ws_url + client_secret
  RT->>OA: provider WS, append PCM16
  OA-->>RT: speech_started / speech_stopped / transcript completed
  RT->>RT: defer commit while speech or transcript items are outstanding
  RT->>RT: speaker + confidence gates
  RT->>Core: accepted primary-speaker transcript
  Core-->>RT: interviewer SSE tokens
  RT-->>Browser: final_transcript + assistant_token + assistant_done
```
```
open-interview/
├── frontend/app/             # Vite + React SPA
├── backend/
│   ├── services/
│   │   ├── core/             # FastAPI app (api / domain / infra)
│   │   ├── gateway/          # GenAI gateway (provider router + rate limiter)
│   │   ├── realtime_gateway/ # live audio WS, Realtime STT, speaker gate
│   │   ├── workers/          # arq background jobs
│   │   └── whatsapp_bridge/  # Node sidecar (Fastify + Baileys), port 9300
│   └── libs/                 # shared schemas, db models, storage, logging
├── infra/                    # Docker Compose, Dockerfiles
├── config/                   # models.yaml, tiers.yaml, env profiles
├── docs/                     # design doc + TODO.md
├── scripts/                  # one-off scripts
└── data/                     # gitignored: blob storage in dev
```
| Layer | Tools |
|---|---|
| Frontend | React 18, Vite, TypeScript, Tailwind, shadcn/ui, lucide-react, react-markdown, react-router |
| Core API | FastAPI, SQLAlchemy 2 (async), pydantic v2, LangGraph, httpx, argon2, PyJWT |
| Realtime gateway | FastAPI WebSockets, OpenAI Realtime transcription, SpeechBrain ECAPA speaker verification |
| Persistence | Postgres (prod) / SQLite (tests), Redis, Chroma vector DB |
| Workers | arq (Redis-backed task queue) |
| Messenger bridge | Node 20, TypeScript, Fastify, Baileys (@whiskeysockets/baileys), vitest |
| LLM | OpenAI-compatible: gpt-4o, gpt-4o-mini, gpt-4o-mini-transcribe, text-embedding-3-small (defaults; override in config/models.yaml) |
| Tooling | pytest, ruff, mypy, Docker Compose, overmind / honcho |
Environment lives in `.env` (start from `config/env.dev.example`). Highlights:
```bash
# Database / cache
DATABASE_URL=postgresql+asyncpg://openinterview:openinterview@localhost:55432/openinterview
REDIS_URL=redis://localhost:56379/0
CHROMA_URL=http://localhost:8001

# Auth
JWT_SECRET=change-me-32-bytes-min
JWT_ACCESS_TTL_S=900
JWT_REFRESH_TTL_S=2592000

# Encryption (AES-GCM at rest for user keys)
OPENINTERVIEW_MASTER_KEY=change-me-32-bytes-min

# Internal gateway
GATEWAY_URL=http://localhost:9100
GATEWAY_SERVICE_TOKEN=change-me-internal-only

# Live interviewer audio
REALTIME_PUBLIC_URL=ws://localhost:9200/ws/interview
OPENINTERVIEW_REALTIME_SECRET=change-me-32-bytes-min
REALTIME_INTERNAL_TOKEN=change-me-internal-only
REALTIME_MAX_TURN_SECONDS=90
REALTIME_TRANSCRIPTION_MODEL=gpt-4o-transcribe
REALTIME_NOISE_REDUCTION=near_field
REALTIME_TURN_DETECTION=semantic_vad
REALTIME_SPEAKER_VERIFIER_BACKEND=speechbrain
REALTIME_SPEAKER_THRESHOLD=0.25
REALTIME_CALIBRATION_SECONDS=5.0
REALTIME_MIN_TURN_AUDIO_MS=300
REALTIME_MIN_TRANSCRIPT_CONFIDENCE=0.35
REALTIME_VAD_EAGERNESS=low
REALTIME_TURN_COMMIT_DELAY_MS=1200

# Optional local-only WAV/JSON dumps for live-audio debugging.
REALTIME_DEBUG_AUDIO_DIR=
```

Logical names decouple agent code from providers:
```yaml
chat:
  default: chat-strong
  models:
    chat-strong: { provider: openai, endpoint: https://api.openai.com/v1, model_id: gpt-4o }
    chat-fast: { provider: openai, endpoint: https://api.openai.com/v1, model_id: gpt-4o-mini }
    chat-local: { provider: ollama, endpoint: http://ollama:11434/v1, model_id: qwen2.5-coder:32b }
embedding:
  default: embed-default
  models:
    embed-default: { provider: openai, endpoint: https://api.openai.com/v1, model_id: text-embedding-3-small }
transcription:
  default: stt-default
  models:
    stt-default: { provider: openai, endpoint: https://api.openai.com/v1, model_id: gpt-4o-mini-transcribe }
    stt-whisper: { provider: openai, endpoint: https://api.openai.com/v1, model_id: whisper-1 }
voice-analysis:
  default: voice-analysis-default
  models:
    voice-analysis-default: { provider: openai, endpoint: https://api.openai.com/v1, model_id: gpt-4o-audio-preview }
```

Swap any line for your own provider (Ollama, vLLM, OpenRouter, Together) as long as it speaks the OpenAI API.
`models.yaml` is the server-wide fallback. Any user can override it per role from the UI:
| Role | Used by | Discovery |
|---|---|---|
| Chat / LLM | Mentor, mock interviewer, project explainer | live /v1/models |
| Embeddings | Project search, long-term memory recall | live /v1/models |
| Transcription | Speech-to-text in voice mode | live /v1/models |
| Voice analysis | Multimodal delivery scoring | live /v1/models |
Picks are stored in `user_model_preferences` and transparently injected into every gateway call via a `ProviderOverride`, so the agents stay unaware. The Settings UI filters the catalogue to recommended models per role (with a "Show all" escape hatch), preselects a sane default for the picked provider, and gates Save behind a Test round-trip.
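Conceptually, the override is a per-user layer consulted before the server default on each call. A simplified sketch; `ProviderOverride` is named in the text above, but the field names and lookup shape here are assumptions:

```python
from dataclasses import dataclass


@dataclass
class ProviderOverride:
    """One user's pick for one role, as stored per user (fields illustrative)."""
    role: str
    provider: str
    model_id: str


def effective_model(defaults: dict[str, dict], overrides: list[ProviderOverride], role: str) -> dict:
    """Server default for a role, unless the user pinned their own model."""
    for o in overrides:
        if o.role == role:
            return {"provider": o.provider, "model_id": o.model_id}
    return defaults[role]


defaults = {"chat": {"provider": "openai", "model_id": "gpt-4o"}}
picks = [ProviderOverride(role="chat", provider="ollama", model_id="qwen2.5-coder:32b")]
```

The agent code calls the gateway the same way either path is taken, which is what keeps it "unaware" of user preferences.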
Reasoning models (OpenAI o-series, gpt-5) are detected automatically: the gateway sends `max_completion_tokens` (with a high floor for hidden reasoning tokens) instead of `max_tokens`, drops the unsupported `temperature`, and auto-retries once on the well-known "unsupported parameter" and "output limit reached" errors.
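That parameter translation can be sketched as a small request rewrite. The detection heuristic and the floor value below are illustrative, not the gateway's exact rules:

```python
def adapt_params(model_id: str, params: dict) -> dict:
    """Rewrite sampling params for reasoning models (heuristic is illustrative)."""
    out = dict(params)
    is_reasoning = model_id.startswith(("o1", "o3", "o4", "gpt-5"))
    if is_reasoning:
        if "max_tokens" in out:
            # Reasoning models spend hidden thinking tokens from the same
            # budget, so keep a high floor when translating the limit.
            out["max_completion_tokens"] = max(out.pop("max_tokens"), 4096)
        out.pop("temperature", None)  # unsupported on these models
    return out


adapted = adapt_params("gpt-5", {"max_tokens": 512, "temperature": 0.2})
```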
```sh
make test                                              # full backend suite
.venv/bin/python -m pytest backend/services/core/tests -q
cd frontend/app && npx tsc --noEmit && npx vite build
```

Current coverage includes focused suites for core agent flows, gateway model routing and realtime session minting, realtime live-session turn aggregation and speaker gating, plus a typecheck-clean Vite build.
| Milestone | Status | What |
|---|---|---|
| M0 Foundations | ✅ | FastAPI skeleton, multi-tenant auth (argon2 + JWT + refresh), per-user data isolation |
| M1 GenAI Gateway | ✅ | OpenAI-compatible provider, BYO + shared modes, rate limiting, usage logging |
| M2 Ingestion | ✅ | Project zip → chunk → embed → summarize → diagram; resume parse → claim ground |
| M3 QA generation | ✅ | Planner → 7 sharded generators → merger; auto on ingest, regenerable |
| M4 Mentor mode | ✅ | LangGraph + 3-layer memory + ProjectFs tools (list_dir / read_file / grep / tree) |
| M5 Interviewer mode | ✅ | Gap-aware picker, per-turn eval, final rubric, evaluation page |
| M6 Memory | ✅ | Working / episodic / long-term, distilled at session end, vector-recalled |
| Audio | ✅ | Voice input in chat + multimodal delivery scoring rolled into evaluation |
| M7 Messenger SDK | ✅ | Plugin SDK + WhatsApp via Baileys sidecar (groups, commands, rate-limit, echo suppression) |
| M8 Common KB | 🚧 | Applied-AI question bank + system-design primer seeds |
| M9 Per-user model picks | ✅ | Settings → Models with live /v1/models discovery, per-role override, Test before Save; reasoning-model param auto-translation |
| M10 Public resources | 🚧 | LeetCode-tagged questions, system design primer, Designing Data-Intensive Apps notes; see docs/TODO.md |
| M11 WhatsApp DMs | 🚧 | Self-DM and direct-message inbound; currently surfaced as "Unavailable"; see docs/TODO.md |
| M12 WeChat | 🚧 | Same SDK; new plugin; see docs/TODO.md |
| M13 Hardening | 🚧 | Observability, export / wipe, real arq offload of QA generation |
This is a personal/learning project at the moment. Issues, PRs, and design discussions are very welcome; see `docs/requirements-and-system-design.md` for the full spec before opening a substantive PR.
```sh
make setup   # one-time
make test    # before pushing
make fmt     # ruff (placeholder)
```

Coding conventions: clean architecture (api / domain / infra), interfaces over implementations, small modules, tests next to the thing they test.
Apache 2.0. The license file still needs to be added before distribution.