GrillKit · vitchenkokir · Jun 12, 2026 · Jun 11, 2026 · Jun 12, 2026 · Jun 12, 2026
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,62 +8,26 @@ Work in progress is accumulated under `[Unreleased]`; on release, that section b
 
 ### Added
 
-- **Session results hub** — completed interviews redirect to `/interview/{id}/results` with overall evaluation and per-section summary cards linking to dedicated review pages
-- **Theory review page** — `/interview/{id}/theory` shows section feedback and full Q&A chat history with per-round scores after session completion
-- **Coding review page** — `/interview/{id}/coding` shows section feedback and an accordion of coding tasks with final submit, test summary, and per-round feedback on one page
-- **Coding section evaluator** — `CodingEvaluatorService.evaluate_section()` prefetches `coding_sections.section_feedback` when the coding phase completes and before session completion
-- **Coding interview UI** — separate coding panel with Monaco editor (CDN), Run (`POST /coding/run`), Submit (`WS /coding/ws`), run output with test progress, `sessionStorage` drafts, and phase switch between theory and coding by `session_mode`
-- **CodingEvaluatorService** — AI scoring for coding submit with run history and hidden test context in prompts; `follow_up_mode: code | explanation`; hidden test failures cap score at 3
-- **Coding Run API** — `POST /interview/{id}/coding/run` executes public tests via Judge0 and persists `CodeRunAttempt`; `GET /interview/{id}/coding/state` returns current task, progress, and run history; `WS /interview/{id}/coding/ws` accepts submit and streams `feedback`
-- **Judge0 coding runner** — `CodingRunnerService` executes public tests and compile-only checks via `Judge0Client`; Python harness wraps candidate code for entrypoint tasks; setup blocks coding when Judge0 is unhealthy (`CODING_ENABLED` + health probe)
-- **Judge0 Docker profile** — `docker compose --profile coding up` starts Judge0 CE (server, worker, Postgres, Redis); `deploy/judge0.conf` and env vars `JUDGE0_URL`, `JUDGE0_AUTH_TOKEN`
-- **Coding setup and planning** — all four `session_mode` options on setup when coding is available; `GET /setup/coding-options` and `GET /setup/coding-available`; `app/coding/services/planning.py` picks tasks from `data/coding/`; `SessionCreationService` creates coding sections via `CodingSectionCreationService`
-- **Dashboard session mode badge** — history rows show Theory, Coding, or Theory+Coding from `session_mode`
-- **`app/theory/` module scaffold** — domain (`TheorySection`, `TheoryTask`), repositories, read schemas, and `theory_sections` table with backfill from existing interviews
-- **Theory section tasks** — `answers.theory_section_id` links tasks to sections; theory repository loads full aggregate; interview creation dual-writes theory section rows
-- **Theory submission services** — answer processing, navigation, timer, and evaluation persistence moved to `app/theory/services/`; WebSocket and audio API use `TheorySubmissionService`
-- **Theory API routes** — canonical `POST /interview/{id}/theory/audio-answer` and `WS /interview/{id}/theory/ws`; legacy `/audio-answer` and `/ws` delegate with deprecation log; interview page uses new paths
-- **Theory evaluator** — `app/theory/services/evaluator/` with `TheoryEvaluatorService`; per-task evaluation used by theory submission; `InterviewEvaluatorService` remains a compat alias
-- **Session creation split** — `SessionCreationService` persists an interview shell plus `TheorySectionCreationService`; `Interview.start_shell` and theory-aware `interview_from_orm` reads
-- **Selection spec v2** — `SessionSelection` with `session_mode`, theory/coding branches; setup form session-mode picker (coding modes shown as coming soon); Alembic backfill for legacy rows
-- **Session page composition** — `SessionPageService` merges shell + `TheoryPageContext`; phase order from `session_mode`
-- **Session evaluation pipeline** — `SessionEvaluationAggregator`, `SessionEvaluatorService`, and `InterviewSection` protocol with theory prefetch via `on_phase_complete`
-
 ### Changed
 
-- **Section orchestration consolidation** — typed `SectionService` protocol with `is_user_facing` / `activate_if_pending`, shared section evaluation/review helpers, session evaluation models moved to `app/shared/evaluation_models.py`, multi-section score fallback sums both sections, unified results hub card builder via section registry, `score_breakdown` attached only at session completion via `attach_session_score_breakdown`
-- **Session orchestration refactor** — unified `SESSION_MODE_LABELS`, section service registry instead of unused `InterviewSection` protocol, single `InterviewUnitOfWork` for cross-section phase reads, shared section-feedback prefetch and task timer helpers, score resolution moved out of mappers
-- **Completed session navigation** — dashboard history links to `/interview/{id}/results`; active interview pages no longer embed final evaluation in the sidebar
-- **Session completion scoring** — `SessionCompletionService` merges theory and coding section summaries; `score_breakdown` exposes separate `theory` and `coding` totals; display score sums both sections
-- **Theory question planning** — excludes legacy `type: coding` rows still present in theory YAML banks
-- **Documentation** — `ARCHITECTURE.md` coding data flows and scoring; `README.md` setup/coding env vars; `CONTRIBUTING.md` coding task YAML format
-- **Coding naming** — domain/ORM fields use `task_count`, `task_id`, and `prompt_text` instead of legacy `question_*` names; `CodingSectionCreationService` requires shared `InterviewUnitOfWork` like theory
-- **Shared paths and questions** — `app/paths.py` and `app/questions.py` moved to `app/shared/paths.py` and `app/shared/questions.py`
-- **Theory question planning** — moved to `app/theory/services/planning.py`; excludes YAML `type: coding` rows
-- **Session read models** — `AnswerRead` is an alias of `TheoryTaskRead`; interview domain no longer defines an `Answer` entity
-- **Interview aggregate** — `Interview` is a session shell only; answers and theory config are composed at read time from `theory_sections`
-- **Interview completion** — `SessionCompletionService` loads read models and scores from merged section breakdown
-- **Interview creation** — setup uses `SessionCreationService.create_session` with shell + theory section persistence
-- **Setup form** — posts v2 `selection_json`; theory question count and timer stored on the theory branch
-
 ### Fixed
 
-- **Coding session UI** — dedicated `coding_interview.html` layout (assignment panel + editor); evaluating spinner no longer visible on load (`[hidden]` vs `display:flex` clash)
-- **Coding task bank** — tasks use `coding.assignment` (technical brief) instead of theory-style `question.text` prompts
-- **Coding-only session pages** — dashboard and interview page no longer 500 when theory sources are empty; titles and selection summary use coding branch data
-- **Coding phase activation** — `theory_then_coding` sessions promote coding sections from `pending` to `active` when theory finishes (`SessionPhaseOrchestrator`, `CodingPageService.activate_timer`)
-- **Theory-to-coding handoff** — completing the theory section auto-reloads into the coding page via shared `session_phases.js`; theory-complete state shows a **Continue to Coding** button as fallback
-- Configuration speech model panel tracks the selected Whisper size and locale in the form (status, download, and save now refer to the same model)
-- Piper and Whisper downloads in Docker no longer fail with ``Permission denied: '/.cache'`` (Hub cache uses ``data/.cache/huggingface``)
-- Per-question timer stops when the interview is ended or completed (including during final evaluation)
-- Configuration question voice panel tracks the selected interview language in the form (status and download now refer to the matching Piper voice)
-- Whisper and Piper voices can be downloaded from Configuration before any LLM model is saved; adding an audio-capable catalog entry no longer requires Whisper to be installed first
-
 ### Removed
 
-- **Legacy interview columns** — `question_count`, `question_ids`, `question_time_limit_seconds`, and `score` dropped from `interviews`; `answers.interview_id` removed (Alembic `20260608_0007`)
-- **Deprecated interview API paths** — `POST /interview/{id}/audio-answer` and `WS /interview/{id}/ws`; use `/theory/audio-answer` and `/theory/ws`
-- **Interview compat re-exports** — `AnswerProcessingService`, `InterviewPageService`, `InterviewCreationService`, `InterviewCompletionService`, and `app/interview/services/evaluator/`
+## 2026.6.12
+
+### Added
+
+- **Coding interviews** — practice live coding in the browser: editor, Run on public tests, Submit for evaluation, and a review page after the session; run `docker compose --profile coding up` to start Judge0 for code execution
+- **Coding question bank** — 33 Python language-focused tasks (junior: basics, strings, functions, control flow, exceptions, OOP, collections; middle: refactor, bug hunt, complete code, implement)
+
+### Changed
+
+- **New interview setup** — choose session mode (theory only, coding only, or both in sequence) and configure theory and coding topics separately on one screen
+
+### Fixed
+
+- **First-time configuration** — saving provider settings and downloading Whisper or Piper models now work on a fresh install, including in Docker
 
 ## 2026.5.31
 

diff --git a/README.md b/README.md
@@ -2,9 +2,9 @@
 
 [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
 [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-yellow.svg)](https://opensource.org/licenses/Apache-2.0)
-[![Version](https://img.shields.io/badge/version-2026.5.31-blue.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/version-2026.6.12-blue.svg)](CHANGELOG.md)
 
-Open-source AI technical interview trainer. Practice from curated YAML question banks, get structured scoring and follow-ups, and optionally use voice — with your own LLM (cloud or local).
+Open-source AI technical interview trainer. Practice **theory Q&A**, **live coding**, or **both in one session** from curated YAML banks — with structured scoring, follow-ups, optional voice, and a local results history. Bring your own LLM (cloud or local).
 
 [Why GrillKit](#why-grillkit-not-just-chatgpt) · [Quick start](#quick-start) · [Changelog](CHANGELOG.md) · [Architecture](ARCHITECTURE.md)
 
@@ -15,9 +15,10 @@ A general chat assistant is flexible, but it does not run an **interview** for y
 | What you need | ChatGPT-style chat | GrillKit |
 |---------------|-------------------|----------|
 | Curated technical questions | You prompt each time | Built-in **tracks** (Python, Kafka, System Design, …), **levels**, and **topics** |
-| Interview flow | Free-form thread | Fixed session: N questions, up to **2 AI follow-ups** per question, **1–5 scoring**, session summary |
-| Practice history | Scattered chats | **Dashboard** with past sessions stored locally |
-| Time pressure | None | Optional **per-round timer** (expired round → 0, move on) |
+| Interview flow | Free-form thread | Fixed session: theory Q&A and/or coding tasks, up to **2 AI follow-ups** per item, **1–5 scoring**, session summary |
+| Live coding practice | Paste code in chat | **Monaco editor**, **Run** against public tests, **Submit** for hidden tests + AI review (needs Judge0) |
+| Practice history | Scattered chats | **Dashboard** with past sessions; open **results** and per-section **review** pages after completion |
+| Time pressure | None | Optional **per-round timer** on theory and coding (an expired round scores 0 and the session moves on) |
 | Voice practice | Depends on product | Offline **Whisper** dictation; optional **Piper** question audio; **audio answers** when your model supports it |
 | Where data lives | Vendor cloud | **Self-hosted**: SQLite + `data/` on your machine; use **Ollama**, vLLM, or any OpenAI-compatible API |
 
@@ -45,21 +46,44 @@ A general chat assistant is flexible, but it does not run an **interview** for y
   <img src="./assets/interview-setup.png" alt="Interview setup" width="900" />
 </p>
 
-**Interview session** — real-time Q&A with AI scoring and final evaluation
+**Coding section** — Monaco editor, Run on public tests, Submit for AI evaluation
+
+<p align="center">
+  <img src="./assets/coding.png" alt="Coding interview session" width="900" />
+</p>
+
+**Theory section** — real-time Q&A with AI scoring and final evaluation
 
 <p align="center">
   <img src="./assets/interview-session.png" alt="Completed interview with evaluation" width="900" />
 </p>
 
 ## Features
 
-- **Interviews** — multi-track setup, several topics per session, WebSocket Q&A, AI scoring 1–5, up to 2 follow-ups per question
-- **Question banks** — Python, Database/SQL, System Design, Kafka, RabbitMQ, Docker, Kubernetes, Observability, Airflow, and more under `data/questions/{track}/` (junior / middle / senior where applicable)
-- **Timer** — optional per-round time limit; expired rounds score 0 and the session moves on
-- **Voice** — offline Whisper dictation for typed answers; optional Piper TTS to read questions aloud
-- **Audio answers** — when the configured model supports audio input and Whisper is ready, record and send a WAV answer from the interview page
-- **Setup** — model catalog on `/config`, interview locale (AI feedback language), Whisper/Piper downloads from the UI
-- **Dashboard** — recent interview history on the home page
+### Session modes
+
+Pick one mode on **New interview** (`/setup`):
+
+| Mode | What you practice |
+|------|-------------------|
+| **Theory only** | Technical Q&A from `data/questions/` — type, dictate, or record answers |
+| **Coding only** | Programming tasks from `data/coding/` — edit, Run, Submit |
+| **Theory then coding** | Q&A first, then the coding panel when theory finishes |
+| **Coding then theory** | Coding first, then theory |
+
+Coding modes need a running [Judge0](https://github.com/judge0/judge0) instance (see **Coding sessions** below).
+
+### Practice tools
+
+- **Theory** — WebSocket Q&A, AI scoring 1–5, up to 2 follow-ups per question
+- **Coding** — Monaco editor, Run (`POST /interview/{id}/coding/run`) on public tests, Submit (`WS /interview/{id}/coding/ws`) with hidden tests and AI feedback
+- **Question banks** — Python, Database/SQL, System Design, Kafka, RabbitMQ, Docker, Kubernetes, Observability, Airflow, and more (junior / middle / senior where applicable)
+- **Timer** — optional per-round limit on theory and coding; expired rounds score 0 and the session moves on
+- **Voice** — offline Whisper dictation; optional Piper TTS to read theory questions aloud
+- **Audio answers** — record a WAV theory answer when your model supports audio input and Whisper is ready
+- **Results hub** — after you finish, `/interview/{id}/results` shows the overall evaluation and links to **theory** and **coding** review pages with full chat/code history
+- **Dashboard** — recent sessions on the home page (completed sessions link to results)
+- **Setup** — model catalog on `/config`, interview locale, Whisper/Piper downloads from the UI
 - **Deployment** — Docker Compose on port 8000 with `./data` volume for config, DB, and models
 
 ## Quick start
@@ -106,9 +130,10 @@ On some Linux hosts Judge0 needs **cgroup v1** (`systemd.unified_cgroup_hierarch
 
 ### First-time flow
 
-1. **Configuration** (`/config`) — add one or more OpenAI-compatible models to the catalog, select an interview model, set interview locale; test connection, then save.
-2. **New interview** (`/setup`) — pick a **session mode** (theory only, coding only, or combined). Configure theory and/or coding tracks, topics, task counts, and per-task timers. Coding modes require Judge0 (see **Coding sessions** above).
-3. **Interview** (`/interview/{id}`) — theory answers over `WS /theory/ws`; coding uses Monaco + Run (`POST /coding/run`) and Submit (`WS /coding/ws`). End interview from the sidebar at any time.
+1. **Configuration** (`/config`) — add one or more OpenAI-compatible models to the catalog, select an interview model, set interview locale; test connection, then save. Download Whisper (and optionally a Piper voice) from the same page if you want voice features.
+2. **New interview** (`/setup`) — pick a **session mode** (theory only, coding only, or combined). Choose tracks, levels, topics, how many questions/tasks, and optional per-round timers. Coding modes require Judge0 (see **Coding sessions** above).
+3. **Practice** (`/interview/{id}`) — answer theory questions in the chat (type, dictate, or record audio). In coding phases, use the editor: **Run** to check public tests, **Submit** when ready. Combined sessions switch panels automatically when a section ends (or use **Continue to Coding**). End the interview from the sidebar at any time.
+4. **Review** (`/interview/{id}/results`) — after completion, read the overall evaluation, then open **Theory** or **Coding** review for full conversation history, scores, and feedback.
 
 Without saved provider config, `/setup` redirects to `/config`.
 
@@ -168,8 +193,8 @@ Optional environment variables (full list in [ARCHITECTURE.md](ARCHITECTURE.md#p
 
 | Document | Contents |
 |----------|----------|
-| [ARCHITECTURE.md](ARCHITECTURE.md) | Layers, HTTP/WebSocket routes, data flows, persistence, question banks |
-| [CONTRIBUTING.md](CONTRIBUTING.md) | Dev setup, tests, ruff/mypy/pytest, contribution workflow |
+| [ARCHITECTURE.md](ARCHITECTURE.md) | Feature modules, routes, data flows, persistence, test layout |
+| [CONTRIBUTING.md](CONTRIBUTING.md) | Dev setup, quality checks, question/coding YAML guidelines |
 | [CHANGELOG.md](CHANGELOG.md) | Release history |
 
 ## Security

diff --git a/app/main.py b/app/main.py
@@ -49,7 +49,7 @@ def create_app() -> FastAPI:
     app = FastAPI(
         title="GrillKit",
         description="AI Interview Trainer",
-        version="2026.5.31",
+        version="2026.6.12",
         lifespan=lifespan,
     )
 

diff --git a/assets/coding.png b/assets/coding.png