Skip to content

Commit a24aaf6

Browse files
feat(localization): Enhance smart event handling and UI interactions
- Added `Confidence` column to the localization table for smart events. - Implemented double-click functionality on confidence cells to confirm/reject annotations. - Introduced signals for confirming and rejecting annotations in the `_TableAdapter`. - Updated the `_SpottingTabsAdapter` to include a "Smart Inference" button. - Modified event handling to ensure persistence of confidence scores and proper UI updates. - Removed legacy smart keys from dataset normalization. - Enhanced tests to cover new smart inference workflows and ensure correct behavior of confidence handling.
1 parent e299735 commit a24aaf6

22 files changed

Lines changed: 2189 additions & 1066 deletions

AGENTS.md

Lines changed: 83 additions & 154 deletions
Original file line numberDiff line numberDiff line change
@@ -1,162 +1,91 @@
1-
# AGENTS Guide
1+
# AGENTS Guide (Compact)
22

33
## Purpose
4-
This file is the working contract for agents and contributors modifying this repository.
5-
It summarizes the current architecture and the business decisions that must stay stable unless explicitly changed.
4+
Working contract for contributors/agents. Preserve decisions here unless explicitly changed.
65

76
## Product Scope
8-
The product is a PyQt6 desktop annotation workspace for OSL-format sports video datasets.
9-
It supports the full edit loop for dataset curation:
10-
- Create/open/close/save/export dataset JSON projects.
11-
- Add/remove samples and manage dataset-level schema (`labels`) and metadata.
12-
- Annotate videos in four task modalities:
13-
- Classification: sample-level single-label and multi-label heads.
14-
- Localization (action spotting): timestamped event annotations.
15-
- Description: sample-level captions / structured text.
16-
- Dense Description: event-level timestamped textual descriptions.
17-
- Review and edit annotations through synchronized tree selection, timeline markers, tables, and media playback controls.
18-
- Maintain annotation quality and safety with undo/redo history and filtering.
19-
20-
From the docs perspective, the data model target is the OSL JSON structure:
21-
- Top-level dataset metadata + taxonomy (`labels`) + item list (`data`).
22-
- Per-sample multimodal `inputs` and task-specific keys such as `labels`, `events`, `captions`, `dense_captions`.
23-
24-
## Architecture Snapshot
25-
High-level layering:
26-
- UI layer (`annotation_tool/ui/...`):
27-
- Loads `.ui` forms, owns widget composition, and emits user-intent signals.
28-
- Contains view adapters (table models, button adapters, etc.) but not dataset mutation policy.
29-
- Controller layer (`annotation_tool/controllers/...`):
30-
- Implements business behavior and state transitions.
31-
- Converts UI intents to mutation requests and refresh events.
32-
- Composition layer (`annotation_tool/main_window.py`):
33-
- Instantiates panels/controllers and wires all cross-module signal-slot connections.
34-
- Acts as the only composition root.
35-
36-
Core runtime ownership:
37-
- `DatasetExplorerController`:
38-
- Canonical in-memory dataset document (`dataset_json`) owner.
39-
- Handles project lifecycle, tree/index/filter/selection, and emits selection/media intents.
40-
- `HistoryManager`:
41-
- Canonical mutation engine for tracked operations.
42-
- Owns undo/redo replay and emits refresh intents after state changes.
43-
- `MediaController`:
44-
- Canonical playback source/state owner.
45-
- Owns route/load/play/pause/seek/stop/mute decisions.
46-
- Mode controllers:
47-
- `ClassificationEditorController`
48-
- `LocalizationEditorController`
49-
- `DescEditorController`
50-
- `DenseEditorController`
51-
- Own mode-local UI state and emit typed mutation requests to history.
52-
53-
Signal flow (default):
54-
1. UI widget emits intent to its owning controller.
55-
2. Controller emits typed request signal (mutation/media/status/etc.).
56-
3. `MainWindow.connect_signals()` routes that signal to the owning target controller.
57-
4. Owning controller updates state and emits refresh/context signals.
58-
5. Dependent controllers/panels refresh from those signals.
59-
60-
## Hard Boundaries (Do Not Break)
61-
High-level module coupling rules:
62-
- Controllers do not own each other directly through constructor injection (scoped architecture).
63-
- No controller should accept/store `main_window`.
64-
- Cross-module coordination must happen through explicit Qt signal-slot wiring in `MainWindow.connect_signals()`.
65-
- No centralized event bus abstraction.
66-
- No `DatasetModelFacade`-style abstraction layer.
67-
- No legacy aliases such as `window.model` / `window.router`; use `window.dataset_explorer_controller`.
68-
- Mode controllers should only have their panel as constructor arguments.
69-
70-
UI vs Controller split:
71-
- UI modules:
72-
- Presentation only (layout, widgets, local rendering helpers, intent signals).
73-
- Must not implement dataset lifecycle or persistence policy.
74-
- Controllers:
75-
- Business rules, state transition orchestration, validation/no-op guards.
76-
- Emit/consume typed signals; avoid direct widget-to-widget coupling across modules.
77-
78-
Media boundary:
79-
- Explorer and editor controllers must not import/use `QMediaPlayer`.
80-
- `MediaController` is the only owner of playback source/state/mute logic.
81-
- Other modules request media actions by signals only.
82-
83-
Mutation boundary:
84-
- `HistoryManager` is the mutation + undo/redo authority for tracked dataset edits.
85-
- Do not duplicate mutation implementations across explorer/editors/history.
86-
87-
## Data Ownership and Mutation Rules
88-
- `dataset_json` in `DatasetExplorerController` is the canonical in-memory document.
89-
- All tracked dataset mutations should flow through `HistoryManager.execute_*` slots.
90-
- History contract:
91-
- Effective JSON mutation => exactly one undo entry.
92-
- No-op mutation => zero undo entries.
93-
- Forward mutation clears redo.
7+
PyQt6 desktop annotation tool for OSL sports-video datasets.
8+
9+
Must support:
10+
- Project lifecycle: create/open/close/save/export JSON.
11+
- Dataset curation: samples, metadata, schema (`labels`).
12+
- Modes: Classification, Localization, Description, Dense Description.
13+
- Editing/review UX: tree + timeline + table + media controls + filter + undo/redo.
14+
15+
Data model target:
16+
- Root: metadata + `labels` + `data`.
17+
- Sample: `inputs` and task keys like `labels`, `events`, `captions`, `dense_captions`.
18+
19+
## Architecture
20+
Layers:
21+
- UI (`annotation_tool/ui/...`): presentation + intent signals.
22+
- Controllers (`annotation_tool/controllers/...`): behavior/state orchestration.
23+
- Composition (`annotation_tool/main_window.py`): only place for cross-module wiring.
24+
25+
Runtime owners:
26+
- `DatasetExplorerController`: canonical in-memory `dataset_json`, lifecycle, tree/filter/selection.
27+
- `HistoryManager`: tracked mutations + undo/redo authority.
28+
- `MediaController`: playback route/state/mute authority.
29+
- Mode controllers (`Classification`, `Localization`, `Description`, `Dense`): mode-local behavior; emit mutation/media intents.
30+
31+
Default signal flow:
32+
1. UI emits intent.
33+
2. Controller emits typed request signal.
34+
3. `MainWindow.connect_signals()` routes to owner.
35+
4. Owner mutates state and emits refresh/context.
36+
37+
## Hard Boundaries
38+
- No controller may store/accept `main_window`.
39+
- No direct controller-to-controller constructor coupling in scoped modules.
40+
- No event bus.
41+
- No `DatasetModelFacade`.
42+
- No legacy aliases (`window.model`, `window.router`); use `window.dataset_explorer_controller`.
43+
- Mode controllers should only receive their panel.
44+
- Explorer/editor controllers must not use `QMediaPlayer` directly.
45+
- Do not duplicate tracked mutation logic outside `HistoryManager`.
46+
47+
## Data and History Contracts
48+
- Canonical document: `DatasetExplorerController.dataset_json`.
49+
- Tracked edits flow through `HistoryManager.execute_*`.
50+
- Effective mutation => exactly 1 undo entry.
51+
- No-op mutation => 0 undo entries.
52+
- Forward mutation clears redo.
9453
- Undo/redo must restore structural JSON equality.
9554

96-
## Selection, Refresh, and Playback Contracts
97-
- Tab switch must not repopulate the dataset tree.
98-
- Tab switch with unchanged selection/path must not restart media.
99-
- Undo/redo should avoid full tree rebuild when lightweight refresh is sufficient.
100-
- Filter edge case: if selected row becomes hidden, clear selection (do not force-select first visible).
101-
- Selection-triggered routing should still ensure playback when user explicitly selects a sample/input and player is not already playing it.
102-
103-
## Current UX Decisions to Preserve
104-
- No single-view vs multi-view project creation prompt.
105-
- No `is_multi_view` workflow branch for new project behavior.
106-
- Description controller consumes selected `sample` payload and emits caption-only updates.
107-
- Classification manual annotation saves immediately on effective value change.
108-
- Dense add flow is explicit modal add (`Add New Description`), while edits are table-driven.
109-
- Media mute control is icon-based and placed at the right side of the timeline row.
110-
111-
## Module Responsibilities
112-
### DatasetExplorerController
113-
- Own: dataset lifecycle (create/open/save/close), tree/filter/selection, selection context emission, media routing intent emission.
114-
- Do not own: per-mode annotation mutation policy.
115-
116-
### HistoryManager
117-
- Own: forward mutation execution + undo/redo replay for tracked operations.
118-
- Emit refresh intents; do not take over UI composition concerns.
119-
120-
### MediaController
121-
- Own: source routing, play/pause/seek/stop/mute logic and state signals.
122-
- Keep playback guard logic centralized here.
123-
124-
### Mode Controllers
125-
- Own: mode UI behavior and local mode state.
126-
- Emit typed mutation intents; avoid direct dataset mutation duplication.
127-
128-
## Refactoring Principles
129-
- Prefer deleting legacy compatibility paths instead of adding shims.
130-
- Prefer one canonical implementation per business operation.
131-
- Prefer simple, explicit logic over clever or heavily abstracted implementations.
132-
- Keep constructors minimal and explicit.
133-
- If logic duplicates across controllers, centralize it in the owner module (usually `HistoryManager` for mutations, `MediaController` for playback).
134-
- Remove functions that are no longer used after refactors.
135-
- Eliminate redundant logic and redundant helper functions; keep only the canonical path.
136-
- Keep files smaller by extracting coherent helpers, not by adding extra orchestration layers.
137-
138-
## Testing and Change Discipline
139-
When changing architecture or mutation/playback behavior, update and run relevant suites:
140-
- `tests/gui/test_signal_decoupling_contract.py`
141-
- `tests/gui/test_history_stack_contract.py`
142-
- `tests/gui/test_dataset_explorer_regressions.py`
143-
- `tests/gui/test_core_lifecycle.py`
144-
- `tests/gui/test_workflow_classification.py`
145-
- `tests/gui/test_workflow_localization.py`
146-
- `tests/gui/test_workflow_description.py`
147-
- `tests/gui/test_workflow_dense_description.py`
148-
- `tests/gui/test_media_player_controls.py`
149-
150-
Always run the relevant tests after changes to check for regressions before considering work complete.
55+
## Selection/Refresh/Playback Contracts
56+
- Tab switch must not repopulate tree.
57+
- Tab switch with same selection/path must not restart media.
58+
- Undo/redo should prefer lightweight refresh over full rebuild when possible.
59+
- If filter hides selected row: clear selection (do not auto-select first visible).
60+
- Explicit user sample/input selection should ensure playback when needed.
61+
62+
## UX Decisions to Preserve
63+
- No single-view vs multi-view creation prompt.
64+
- No `is_multi_view` new-project branch.
65+
- Description controller consumes selected `sample` and emits caption-only updates.
66+
- Classification manual edits save immediately on effective change.
67+
- Dense add remains explicit modal (`Add New Description`), edits table-driven.
68+
- Mute control: icon button on right side of timeline row.
69+
70+
## Refactoring Rules
71+
- Prefer simple, explicit logic.
72+
- Keep one canonical implementation per operation.
73+
- Delete legacy paths instead of adding shims.
74+
- Keep constructors minimal/explicit.
75+
- Remove unused functions after refactors.
76+
- Remove redundant logic/helpers.
77+
78+
## Testing Discipline
79+
For architecture, mutation, playback, or workflow changes, update/run relevant suites in the folder `tests/gui`. Always run relevant tests before considering work complete.
15180

15281
## PR/Agent Checklist
153-
1. Did you keep controller boundaries (no `main_window`, no direct controller-to-controller constructor dependency in scoped modules)?
154-
2. Did you preserve the 1-or-0 history push rule for mutations?
155-
3. Did you avoid unnecessary tree repopulate/media restart side effects?
156-
4. Did you wire new behavior through explicit signals in `MainWindow.connect_signals()`?
157-
5. Did you avoid reintroducing legacy aliases/facades?
158-
6. Did you keep the implementation simple and avoid unnecessary complexity?
159-
7. Did you remove functions that are no longer used?
160-
8. Did you remove redundant logic/functions and keep one canonical implementation?
161-
9. Did you update module README(s) if public behavior/contracts changed?
162-
10. Did you add or update regression tests for new behavior, and run relevant test suites to verify no regressions?
82+
1. Controller boundaries preserved?
83+
2. History 1-or-0 push rule preserved?
84+
3. No unnecessary tree/media regressions?
85+
4. New behavior wired through `MainWindow.connect_signals()`?
86+
5. No legacy aliases/facades reintroduced?
87+
6. Logic kept simple?
88+
7. Unused functions removed?
89+
8. Redundant logic removed and canonical path retained?
90+
9. README/docs updated if public contract changed?
91+
10. Relevant regression tests added/updated and run?

annotation_tool/README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,8 @@ The canonical persisted in-memory state is a single `dataset_json` owned by `Dat
4141
- Project root object (`dataset_json`) includes (non-exhaustive):
4242
- `version`, `date`, `dataset_name`, `description`, `metadata`, `labels`, `data`
4343
- Sample object (`dataset_json["data"][i]`) typically includes:
44-
- `id`, `inputs`, `labels`, `smart_labels`, `events`, `smart_events`, `captions`, `dense_captions`
44+
- `id`, `inputs`, `labels`, `events`, `captions`, `dense_captions`
45+
- Classification smart prediction marker: `labels[head].confidence_score` (optional float)
4546
- Input item: `{ "type": "video", "path": "..." }`
4647
- Localization event: `{ "head": str, "label": str, "position_ms": int }`
4748
- Dense event: `{ "position_ms": int, "lang": str, "text": str }`

annotation_tool/controllers/classification/README.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,21 +33,27 @@ Implements Classification mode behavior and coordinates smart inference/train he
3333
## Key Functions and Responsibilities
3434
- `setup_connections()`
3535
- Connects classification panel UI actions to controller actions.
36-
- `on_selected_sample_changed(sample, resolved_path="")`
36+
- `on_selected_sample_changed(sample)`
3737
- Refreshes manual/smart display for selected sample snapshot.
3838
- `on_schema_context_changed(schema)`
3939
- Rebuilds dynamic schema-driven controls from runtime schema context.
4040
- `save_manual_annotation(override_data=None, show_feedback=True)`
4141
- Normalizes annotation payload and emits tracked save intent.
4242
- `confirm_smart_annotation_as_manual()`
4343
- Delegates smart confirmation flow to `InferenceManager`.
44+
- `reject_smart_annotation_head(head)`
45+
- Rejects inferred smart annotation for one head and restores pre-inference manual state.
4446
- `clear_current_manual_annotation()` / `clear_current_smart_annotation()`
4547
- Clears manual/smart state with proper history behavior.
4648

4749
## Business Rules
4850
- Manual tab changes save immediately when effective value differs.
4951
- No-op saves (same normalized annotation) do nothing.
5052
- Schema operations enforce duplicate checks.
53+
- Smart inference is persisted in `labels[head]` with `confidence_score`.
54+
- Confirming smart annotation removes `confidence_score` only.
55+
- Rejecting/clearing smart annotation restores pre-inference manual state (or removes head if none existed).
56+
- Smart inference is triggered per head from the manual panel (no Smart tab flow).
5157

5258
## Conventions
5359
- UI stays in panel classes; controller performs behavior orchestration.

0 commit comments

Comments
 (0)