Merged
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -49,7 +49,7 @@ env/
*.sqlite
*.sqlite3
data/config.json
data/llm_models.json
llm_models.json
data/whisper-models/*
!data/whisper-models/.gitkeep
data/piper-voices/*
82 changes: 60 additions & 22 deletions ARCHITECTURE.md

Large diffs are not rendered by default.

23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,29 @@ Work in progress is accumulated under `[Unreleased]`; on release, that section b

### Removed

## 2026.6.16

### Added

- **Known questions** — mark theory or coding bank items as known during an interview (**I know this**) or on review pages; optionally exclude them when starting a new session; manage the list from **Known Questions** in the navigation bar

### Changed

- **Add model to catalog** — the catalog model id is generated automatically from the display name; removed the **Model id** field on `/config`
- **UI** — refreshed dark theme with clearer hierarchy, IDE-style coding editor, terminal-style run output, and updated status badges on the dashboard

### Fixed

- **Theory then coding sessions** — fixed errors when advancing from theory to coding in a combined session
- **Coding follow-ups** — explanation rounds now submit your typed explanation instead of the code in the editor
- **Coding timers** — expired rounds score 0 and the session advances automatically
- **Setup review** — known-questions option shows the correct hint for the checkbox
- **Early session end** — partial theory/coding scores are kept when you end a session before finishing every task
- **Theory answers** — more reliable submit flow for text and audio answers during AI evaluation
- **Dashboard** — faster interview history on the home page

### Removed

## 2026.6.12

### Added
11 changes: 8 additions & 3 deletions README.md
@@ -18,6 +18,7 @@ A general chat assistant is flexible, but it does not run an **interview** for y
| Interview flow | Free-form thread | Fixed session: theory Q&A and/or coding tasks, up to **2 AI follow-ups** per item, **1–5 scoring**, session summary |
| Live coding practice | Paste code in chat | **Monaco editor**, **Run** against public tests, **Submit** for hidden tests + AI review (needs Judge0) |
| Practice history | Scattered chats | **Dashboard** with past sessions; open **results** and per-section **review** pages after completion |
| Skip what you already know | You repeat the same prompts | **Known questions** — mark bank items during practice; optionally exclude them when starting a new session |
| Time pressure | None | Optional **per-round timer** on theory and coding (expired round → 0, move on) |
| Voice practice | Depends on product | Offline **Whisper** dictation; optional **Piper** question audio; **audio answers** when your model supports it |
| Where data lives | Vendor cloud | **Self-hosted**: SQLite + `data/` on your machine; use **Ollama**, vLLM, or any OpenAI-compatible API |
@@ -31,7 +32,10 @@ A general chat assistant is flexible, but it does not run an **interview** for y
**Demo video** — full flow from setup to scored feedback

<p align="center">
<img src="./assets/demo_cut.gif" alt="Demo video" width="900" />
<video src="./assets/demo-video.mp4" width="900" controls playsinline>
Your browser does not support the video tag.
<a href="./assets/demo-video.mp4">Download the demo video</a>.
</video>
</p>

**Dashboard** — recent sessions and quick start
@@ -82,6 +86,7 @@ Coding modes need a running [Judge0](https://github.com/judge0/judge0) instance
- **Voice** — offline Whisper dictation; optional Piper TTS to read theory questions aloud
- **Audio answers** — record a WAV theory answer when your model supports audio input and Whisper is ready
- **Results hub** — after you finish, `/interview/{id}/results` shows overall evaluation and links to **theory** and **coding** review pages with full chat/code history
- **Known questions** — mark theory or coding bank items as **I know this** during an interview or on review pages; optionally exclude them on **New interview** setup; manage the list at `/known-questions/manage`
- **Dashboard** — recent sessions on the home page (completed sessions link to results)
- **Setup** — model catalog on `/config`, interview locale, Whisper/Piper downloads from the UI
- **Deployment** — Docker Compose on port 8000 with `./data` volume for config, DB, and models
@@ -131,7 +136,7 @@ On some Linux hosts Judge0 needs **cgroup v1** (`systemd.unified_cgroup_hierarch
### First-time flow

1. **Configuration** (`/config`) — add one or more OpenAI-compatible models to the catalog, select an interview model, set interview locale; test connection, then save. Download Whisper (and optionally a Piper voice) from the same page if you want voice features.
2. **New interview** (`/setup`) — pick a **session mode** (theory only, coding only, or combined). Choose tracks, levels, topics, how many questions/tasks, and optional per-round timers. Coding modes require Judge0 (see **Coding sessions** above).
2. **New interview** (`/setup`) — pick a **session mode** (theory only, coding only, or combined). Choose tracks, levels, topics, how many questions/tasks, optional per-round timers, and whether to **exclude known questions**. Coding modes require Judge0 (see **Coding sessions** above).
3. **Practice** (`/interview/{id}`) — answer theory questions in the chat (type, dictate, or record audio). On coding phases, use the editor: **Run** to check public tests, **Submit** when ready. Combined sessions switch panels automatically when a section ends (or use **Continue to Coding**). End the interview from the sidebar at any time.
4. **Review** (`/interview/{id}/results`) — after completion, read the overall evaluation, then open **Theory** or **Coding** review for full conversation history, scores, and feedback.

@@ -160,7 +165,7 @@ Any **OpenAI-compatible** HTTP API works:

On `/config`:

- **Add model to catalog** — base URL, model name, optional API key; enable **Accepts audio input** only if the model supports multimodal audio (and download Whisper for transcription).
- **Add model to catalog** — display name, base URL, model name, optional API key (a stable catalog id is generated automatically from the display name); enable **Accepts audio input** only if the model supports multimodal audio (and download Whisper for transcription).
- **Interview model** — pick from the catalog, **Test Connection**, save.
- **Locale** — language for AI feedback and speech (stored in `data/config.json`, gitignored).
- **Whisper** — choose size (`small`, `medium`, `large`), download from the UI for dictation and audio answers.
26 changes: 26 additions & 0 deletions alembic/versions/20260612_0010_answers_expected_points.py
@@ -0,0 +1,26 @@
# Copyright 2026 GrillKit Contributors
# SPDX-License-Identifier: Apache-2.0
"""Add expected_points rubric snapshot column to answers."""

from collections.abc import Sequence

import sqlalchemy as sa

from alembic import op

revision: str = "20260612_0010"
down_revision: str | None = "20260610_0009"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    """Store rubric bullets on each theory answer row."""
    with op.batch_alter_table("answers") as batch_op:
        batch_op.add_column(sa.Column("expected_points", sa.Text(), nullable=True))


def downgrade() -> None:
    """Remove expected_points from answers."""
    with op.batch_alter_table("answers") as batch_op:
        batch_op.drop_column("expected_points")
35 changes: 35 additions & 0 deletions alembic/versions/20260615_0011_known_questions.py
@@ -0,0 +1,35 @@
# Copyright 2026 GrillKit Contributors
# SPDX-License-Identifier: Apache-2.0
"""Add known_questions table for excluding marked questions from planning."""

from collections.abc import Sequence

import sqlalchemy as sa

from alembic import op

revision: str = "20260615_0011"
down_revision: str | None = "20260612_0010"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    """Create known_questions table with composite primary key."""
    op.create_table(
        "known_questions",
        sa.Column("branch", sa.Text(), nullable=False),
        sa.Column("bank_item_id", sa.Text(), nullable=False),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            server_default=sa.text("CURRENT_TIMESTAMP"),
            nullable=False,
        ),
        sa.PrimaryKeyConstraint("branch", "bank_item_id"),
    )


def downgrade() -> None:
    """Drop known_questions table."""
    op.drop_table("known_questions")
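
The migration above only creates the table; the query that consults it during session planning is not part of this diff. As a minimal sketch, assuming candidates are filtered by a `(branch, bank_item_id)` lookup, exclusion could look like the following with the standard-library `sqlite3` module. The helper `exclude_known`, the branch value `"theory"`, and the sample item ids are hypothetical.

```python
import sqlite3

# Same columns as the migration's table, written as plain SQLite DDL.
SCHEMA = """
CREATE TABLE known_questions (
    branch TEXT NOT NULL,
    bank_item_id TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (branch, bank_item_id)
)
"""


def exclude_known(
    conn: sqlite3.Connection, branch: str, candidate_ids: list[str]
) -> list[str]:
    """Hypothetical helper: drop candidates already marked as known."""
    rows = conn.execute(
        "SELECT bank_item_id FROM known_questions WHERE branch = ?", (branch,)
    ).fetchall()
    known = {row[0] for row in rows}
    return [item for item in candidate_ids if item not in known]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Mark the same item twice; the composite primary key keeps a single row.
for _ in range(2):
    conn.execute(
        "INSERT OR IGNORE INTO known_questions (branch, bank_item_id) VALUES (?, ?)",
        ("theory", "py-gil"),  # assumed branch name and item id
    )

row_count = conn.execute("SELECT COUNT(*) FROM known_questions").fetchone()[0]
remaining = exclude_known(conn, "theory", ["py-gil", "py-asyncio"])
print(row_count, remaining)
```

The composite primary key is what makes marking idempotent in this sketch: a repeated **I know this** click cannot create a duplicate row.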
51 changes: 32 additions & 19 deletions app/ai/llm_models.py
@@ -2,14 +2,13 @@
# SPDX-License-Identifier: Apache-2.0
"""LLM model catalog types."""

from collections.abc import Collection
from dataclasses import dataclass
import re
from typing import Final

CUSTOM_PRESET_ID: Final[str] = "custom"
MODEL_ID_PATTERN: Final[re.Pattern[str]] = re.compile(
r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$"
)
_FALLBACK_MODEL_ID: Final[str] = "model"


@dataclass(frozen=True)
@@ -71,25 +70,39 @@ def normalize_model_id(model_id: str, catalog: LLMCatalog) -> str:
return value


def validate_new_model_id(model_id: str) -> str:
    """Normalize and validate a user-supplied catalog model id.
def slugify_model_id(text: str) -> str:
    """Convert arbitrary text into a catalog-safe id slug.

    Lowercases the text and replaces every run of non-alphanumeric characters
    with a single hyphen, trimming hyphens from both ends.

    Args:
        model_id: Proposed model id from the add-model form.
        text: Source text, typically a model display name.

    Returns:
        Normalized lowercase id.
        A slug using lowercase letters, digits, and hyphens, or an empty
        string when no usable characters remain.
    """
    return re.sub(r"[^a-z0-9]+", "-", text.strip().lower()).strip("-")

    Raises:
        ValueError: If the id is invalid or reserved.

def generate_model_id(display_name: str, existing_ids: Collection[str]) -> str:
    """Derive a unique catalog id from a model display name.

    Args:
        display_name: Human-readable model name from the add-model form.
        existing_ids: Catalog ids already in use.

    Returns:
        A unique slug; falls back to ``model`` when the name has no usable
        characters and appends a numeric suffix to avoid collisions.
    """
    value = model_id.strip().lower()
    if not value:
        raise ValueError("Model id is required")
    if value == CUSTOM_PRESET_ID:
        raise ValueError(f"Model id '{CUSTOM_PRESET_ID}' is reserved")
    if not MODEL_ID_PATTERN.fullmatch(value):
        raise ValueError(
            "Model id must use lowercase letters, digits, and hyphens only"
        )
    return value
    base = slugify_model_id(display_name) or _FALLBACK_MODEL_ID
    if base == CUSTOM_PRESET_ID:
        base = f"{CUSTOM_PRESET_ID}-{_FALLBACK_MODEL_ID}"
    candidate = base
    suffix = 2
    while candidate in existing_ids:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate
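
To make the new id generation concrete, the block below copies the two added functions into a standalone script and runs them on a few display names. The function bodies follow the diff; the sample display names and the `existing` set are illustrative assumptions, not values from the project.

```python
import re
from collections.abc import Collection

CUSTOM_PRESET_ID = "custom"
_FALLBACK_MODEL_ID = "model"


def slugify_model_id(text: str) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', trim edge hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.strip().lower()).strip("-")


def generate_model_id(display_name: str, existing_ids: Collection[str]) -> str:
    """Derive a unique catalog id, suffixing -2, -3, ... on collisions."""
    base = slugify_model_id(display_name) or _FALLBACK_MODEL_ID
    if base == CUSTOM_PRESET_ID:
        # "custom" is reserved for the preset, so it is never returned as-is.
        base = f"{CUSTOM_PRESET_ID}-{_FALLBACK_MODEL_ID}"
    candidate = base
    suffix = 2
    while candidate in existing_ids:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate


existing = {"llama-3-1-8b"}  # assumed catalog state for the demo

print(generate_model_id("Llama 3.1 (8B)", set()))     # llama-3-1-8b
print(generate_model_id("Llama 3.1 (8B)", existing))  # llama-3-1-8b-2
print(generate_model_id("Custom", set()))             # custom-model
print(generate_model_id("!!!", set()))                # model
```

Compared with the removed `validate_new_model_id`, this path never raises: unusable names fall back to `model` and collisions are resolved with a numeric suffix instead of being rejected.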
12 changes: 9 additions & 3 deletions app/coding/api/errors.py
@@ -12,7 +12,11 @@
CodingTaskNotCurrentError,
CodingTaskNotFoundError,
)
from app.interview.domain.exceptions import InterviewDomainError
from app.interview.domain.exceptions import (
InterviewDomainError,
InterviewNotActiveError,
InterviewNotFoundError,
)


def coding_ws_error_payload(
@@ -42,14 +46,16 @@ def http_exception_from_coding_error(
"""
if isinstance(
exc,
CodingSectionNotFoundError | CodingTaskNotFoundError,
InterviewNotFoundError | CodingSectionNotFoundError | CodingTaskNotFoundError,
):
return HTTPException(status_code=404, detail=str(exc))
if isinstance(exc, CodingRunLimitExceededError):
return HTTPException(status_code=429, detail=str(exc))
if isinstance(
exc,
CodingSectionNotActiveError | CodingTaskNotCurrentError,
InterviewNotActiveError
| CodingSectionNotActiveError
| CodingTaskNotCurrentError,
):
return HTTPException(status_code=400, detail=str(exc))
return HTTPException(status_code=400, detail=str(exc))
16 changes: 12 additions & 4 deletions app/coding/api/routes.py
@@ -19,8 +19,11 @@
run_attempt_to_response,
)
from app.coding.services.run_execution import CodingRunExecutionService
from app.coding.services.state import CodingStateService
from app.interview.api.deps import AIProviderDep
from app.interview.api.deps import (
AIProviderDep,
CodingStateServiceDep,
CodingSubmissionServiceDep,
)
from app.interview.domain.exceptions import InterviewDomainError

router = APIRouter(prefix="/interview", tags=["coding"])
@@ -74,7 +77,10 @@ async def coding_run(


@router.get("/{interview_id}/coding/state")
async def coding_state(interview_id: str) -> JSONResponse:
async def coding_state(
interview_id: str,
service: CodingStateServiceDep,
) -> JSONResponse:
"""Return coding session progress and Run history for the active task.

Args:
@@ -87,7 +93,7 @@ async def coding_state(interview_id: str) -> JSONResponse:
HTTPException: When the coding section does not exist.
"""
try:
state = CodingStateService.get_state(interview_id)
state = service.get_state(interview_id)
except CodingDomainError as exc:
raise http_exception_from_coding_error(exc) from exc
return JSONResponse(coding_state_to_dict(state))
@@ -98,6 +104,7 @@ async def coding_ws(
websocket: WebSocket,
interview_id: str,
provider: AIProviderDep,
submission_service: CodingSubmissionServiceDep,
) -> None:
"""WebSocket endpoint for coding task submit and feedback.

@@ -118,6 +125,7 @@ async def coding_ws(
raw,
interview_id=interview_id,
provider=provider,
submission_service=submission_service,
):
if not await _safe_send_json(websocket, message):
break
44 changes: 41 additions & 3 deletions app/coding/api/ws_session.py
@@ -26,15 +26,15 @@ async def iter_responses(
*,
interview_id: str,
provider: AIProvider,
submission_service: type[CodingSubmissionService] = CodingSubmissionService,
submission_service: CodingSubmissionService,
) -> AsyncIterator[dict[str, Any]]:
"""Handle one client message and yield JSON payloads for the socket.

Args:
raw: Parsed client JSON message.
interview_id: Interview session UUID.
provider: AI provider for coding evaluation.
submission_service: Coding submission service class.
submission_service: Request-scoped coding submission service.

Yields:
WebSocket message dicts to send to the client.
@@ -50,6 +50,15 @@ async def iter_responses(
yield message
return

        if msg_type == "timeout":
            async for message in CodingWebSocketService._handle_timeout(
                raw,
                interview_id=interview_id,
                submission_service=submission_service,
            ):
                yield message
            return

yield {
"type": "error",
"message": f"Unknown message type: {msg_type}",
@@ -61,7 +70,7 @@ async def _handle_submit(
*,
interview_id: str,
provider: AIProvider,
submission_service: type[CodingSubmissionService],
submission_service: CodingSubmissionService,
) -> AsyncIterator[dict[str, Any]]:
task_id = str(raw.get("task_id", "")).strip()
source_code = str(raw.get("source_code", ""))
@@ -88,3 +97,32 @@ async def _handle_submit(
"type": "error",
"message": ai_error_message_for_client(exc),
}

    @staticmethod
    async def _handle_timeout(
        raw: dict[str, Any],
        *,
        interview_id: str,
        submission_service: CodingSubmissionService,
    ) -> AsyncIterator[dict[str, Any]]:
        task_id = str(raw.get("task_id") or raw.get("question_id") or "").strip()
        round_num = raw.get("round")
        if not task_id or round_num is None:
            yield {"type": "error", "message": "Both task_id and round are required"}
            return

        try:
            async for event in submission_service.stream_timeout_submission(
                interview_id=interview_id,
                task_id=task_id,
                round_num=int(round_num),
            ):
                yield coding_event_to_message(event)
        except (InterviewDomainError, CodingDomainError) as exc:
            yield coding_ws_error_payload(exc)
        except Exception as exc:
            logger.exception("Coding timeout failed for interview %s", interview_id)
            yield {
                "type": "error",
                "message": ai_error_message_for_client(exc),
            }
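
The new `timeout` message type can be exercised without a WebSocket. The sketch below reproduces only the validation and streaming shape of `_handle_timeout`, with a stand-in service; `FakeSubmissionService`, the emitted event dict, and the ids `"demo"` and `"two-sum"` are hypothetical, since the real event payloads are not shown in this diff.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any


class FakeSubmissionService:
    """Hypothetical stand-in for the request-scoped submission service."""

    async def stream_timeout_submission(
        self, *, interview_id: str, task_id: str, round_num: int
    ) -> AsyncIterator[dict[str, Any]]:
        # Assumed event shape: an expired round is recorded with score 0.
        yield {"type": "score", "task_id": task_id, "round": round_num, "score": 0}


async def handle_timeout(
    raw: dict[str, Any],
    *,
    interview_id: str,
    submission_service: FakeSubmissionService,
) -> AsyncIterator[dict[str, Any]]:
    """Validate a client timeout message, then stream the service events."""
    task_id = str(raw.get("task_id") or raw.get("question_id") or "").strip()
    round_num = raw.get("round")
    if not task_id or round_num is None:
        yield {"type": "error", "message": "Both task_id and round are required"}
        return
    async for event in submission_service.stream_timeout_submission(
        interview_id=interview_id, task_id=task_id, round_num=int(round_num)
    ):
        yield event


async def collect(raw: dict[str, Any]) -> list[dict[str, Any]]:
    service = FakeSubmissionService()
    return [
        message
        async for message in handle_timeout(
            raw, interview_id="demo", submission_service=service
        )
    ]


ok = asyncio.run(collect({"type": "timeout", "task_id": "two-sum", "round": 1}))
bad = asyncio.run(collect({"type": "timeout", "task_id": "two-sum"}))
print(ok)
print(bad)
```

Because the handler accepts `question_id` as a fallback for `task_id`, a client built for theory rounds could presumably reuse the same message shape, though the diff does not say so explicitly.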