fix(colgrep): responsive Ctrl+C + non-resetting build progress while indexing #143
Merged
…indexing
Two indexing UX issues (neither corrupted the index):
1. Progress bar restarted at 0 every ~4096 units. The resumable build encodes in
BUILD_CHECKPOINT_UNITS batches, each running a fresh pipeline whose metadata
stage set the bar from a per-batch counter starting at 0 — so it climbed to
~4096 then jumped back. Use `pb.inc(delta)` so the shared bar accumulates
across batches up to the real total.
2. Ctrl+C appeared to do nothing during indexing:
- The acknowledgement only printed inside a critical section, so an interrupt
during the long encoding phase was silent. Always acknowledge the first
interrupt now.
- Interruption was only polled at chunk/batch boundaries, draining the whole
in-flight pipeline before stopping. The encode stage now bails between
chunks, and `encode_prepared_document_batches_cancellable` checks the flag
between batches so a Ctrl+C lands within ~one model forward pass. Stop
latency drops from a full-queue drain (many seconds) to ~1s.
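The two fixes above can be sketched together in a few lines of std-only Rust. This is a minimal illustration, not the actual colgrep code: `SharedBar` is a hypothetical stand-in for the real progress bar, `CHUNK_UNITS`, `encode_batch`, `build`, and the `on_chunk` hook are made up for the example, and the real encode work is elided. It shows a bar that accumulates via `inc(delta)` across batches and an interrupt flag that is checked between chunks rather than only at batch boundaries.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

const BUILD_CHECKPOINT_UNITS: u64 = 4096; // batch size quoted in the description
const CHUNK_UNITS: u64 = 512; // hypothetical chunk size, for illustration only

/// Hypothetical stand-in for the shared progress bar; only the position is modelled.
struct SharedBar {
    position: AtomicU64,
}

impl SharedBar {
    fn new() -> Self {
        SharedBar { position: AtomicU64::new(0) }
    }
    /// Advance by a delta, so a fresh per-batch counter can never reset the bar.
    fn inc(&self, delta: u64) {
        self.position.fetch_add(delta, Ordering::Relaxed);
    }
    fn position(&self) -> u64 {
        self.position.load(Ordering::Relaxed)
    }
}

/// Encode one batch chunk by chunk; returns the units finished in this batch.
fn encode_batch(
    batch_units: u64,
    bar: &SharedBar,
    interrupted: &AtomicBool,
    on_chunk: &mut impl FnMut(u64),
) -> u64 {
    let mut done = 0;
    while done < batch_units {
        // Bail between chunks instead of draining the whole batch.
        if interrupted.load(Ordering::SeqCst) {
            break;
        }
        let delta = CHUNK_UNITS.min(batch_units - done);
        // (the real encode work for this chunk would run here)
        done += delta;
        bar.inc(delta);
        on_chunk(bar.position());
    }
    done
}

/// Run batches until the total is reached or the interrupt flag is set.
fn build(
    total_units: u64,
    bar: &SharedBar,
    interrupted: &AtomicBool,
    on_chunk: &mut impl FnMut(u64),
) -> u64 {
    let mut completed = 0;
    while completed < total_units && !interrupted.load(Ordering::SeqCst) {
        let batch = BUILD_CHECKPOINT_UNITS.min(total_units - completed);
        completed += encode_batch(batch, bar, interrupted, &mut *on_chunk);
    }
    completed
}

fn main() {
    // Uninterrupted run: the bar only ever moves forward, across batch boundaries.
    let bar = SharedBar::new();
    let interrupted = AtomicBool::new(false);
    let mut positions: Vec<u64> = Vec::new();
    let done = build(10_000, &bar, &interrupted, &mut |pos: u64| positions.push(pos));
    assert_eq!(done, 10_000);
    assert!(positions.windows(2).all(|w| w[0] < w[1]));

    // Simulated Ctrl+C: the hook sets the flag mid-batch, and the build stops
    // at the next chunk boundary instead of finishing the batch.
    let bar = SharedBar::new();
    let interrupted = AtomicBool::new(false);
    let done = build(10_000, &bar, &interrupted, &mut |pos: u64| {
        if pos >= 5_000 {
            interrupted.store(true, Ordering::SeqCst);
        }
    });
    assert_eq!(done, 5_120);
    println!("stopped after {} units, bar at {}", done, bar.position());
}
```

In this sketch the stop granularity is one chunk. In the real pipeline the floor is whatever work cannot be interrupted once started, which the description below puts at about one model forward pass.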
Safety/resumability unchanged and verified: index writes stay inside a critical
section, state.json is checkpointed per batch, and build_resumable trims any
partial mid-batch write on resume. Interrupt+resume tested on a 30k-unit build
(interrupted before and after the first checkpoint): resumes to exactly 30000
docs — no duplicates, correct search.
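The resume guarantee can be illustrated with a small model. This is a sketch under stated assumptions, not the project's `build_resumable`: `Checkpoint`, `Index`, `resume_build_sketch`, and the `stop_after` parameter are hypothetical names, the index is a plain in-memory `Vec`, and the checkpoint is a struct rather than a `state.json` file. It assumes the checkpoint advances only after a whole batch is written, so anything past the checkpoint is a partial write that can be trimmed on resume.

```rust
const BATCH: usize = 4096; // batch size quoted in the description

/// Hypothetical stand-in for the persisted checkpoint.
struct Checkpoint {
    committed_units: usize,
}

/// Hypothetical in-memory index: one entry per indexed unit.
struct Index {
    docs: Vec<u64>,
}

/// Resume (or start) a build. `stop_after` simulates an interrupt once that many
/// units have been written in this run. Returns true when the build completed.
fn resume_build_sketch(
    index: &mut Index,
    ckpt: &mut Checkpoint,
    total: usize,
    stop_after: Option<usize>,
) -> bool {
    // Trim any partial mid-batch write left behind by an interrupted run.
    index.docs.truncate(ckpt.committed_units);

    let mut written_this_run = 0;
    while ckpt.committed_units < total {
        let end = (ckpt.committed_units + BATCH).min(total);
        for unit in ckpt.committed_units..end {
            if stop_after == Some(written_this_run) {
                return false; // interrupted; the checkpoint still marks the last full batch
            }
            index.docs.push(unit as u64);
            written_this_run += 1;
        }
        // Checkpoint only after the whole batch has been written.
        ckpt.committed_units = end;
    }
    true
}

fn main() {
    let total = 30_000;
    let mut index = Index { docs: Vec::new() };
    let mut ckpt = Checkpoint { committed_units: 0 };

    // Interrupt before the first checkpoint: 1000 units written, none committed.
    assert!(!resume_build_sketch(&mut index, &mut ckpt, total, Some(1_000)));
    assert_eq!(ckpt.committed_units, 0);

    // Interrupt after the first checkpoint, partway through the second batch.
    assert!(!resume_build_sketch(&mut index, &mut ckpt, total, Some(5_000)));
    assert_eq!(ckpt.committed_units, 4_096);

    // Final resume runs to completion with every unit present exactly once.
    assert!(resume_build_sketch(&mut index, &mut ckpt, total, None));
    assert_eq!(index.docs.len(), total);
    assert!(index.docs.iter().enumerate().all(|(i, &d)| d == i as u64));
    println!("resumed to {} docs", index.docs.len());
}
```

The point of the design is that the truncate step makes resume idempotent: rerunning from the same checkpoint always starts from the same index contents, which is what rules out duplicates.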
Fixes two indexing UX issues reported on a large fresh build. Neither corrupted the index — both are about progress display and interrupt responsiveness — and crash-safety/resumability is unchanged and verified.
1. Progress bar restarted at 0 every ~4096 units
The resumable build encodes in `BUILD_CHECKPOINT_UNITS` (4096) batches, each running a fresh pipeline. The metadata stage set the bar via `set_position(completed_units)` from a per-batch counter starting at 0, so the bar climbed to ~4096 then jumped back. Switched to `pb.inc(delta)` so the shared bar accumulates across batches to the true total.
2. Ctrl+C appeared to do nothing while indexing
`encode_prepared_document_batches_cancellable` checks the flag between batches, so a Ctrl+C lands within ~one model forward pass. Measured stop latency: full-queue drain → ~1 s.
The hard floor is one ONNX `session.run()` (an uninterruptible FFI call); the residual ~1 s is the clean shutdown finalizing in-flight work (k-means seed, the accumulated index write, queued metadata inserts) so the index stays consistent.
Safety / resumability (verified)
Index writes stay inside a `CriticalSectionGuard`, `state.json` is checkpointed per batch, and `build_resumable` trims any partial mid-batch write on resume. Interrupt+resume tested on a 30,000-unit build, interrupted both before and after the first checkpoint: _subset_ (no duplicates, no loss), and unique-token searches resolve to the correct file (4/4).
`make ci-quick` (fmt + clippy + 562 colgrep tests) passes.