Wearable navigation assistant for blind users built for Raspberry Pi 4 + Intel RealSense D435. The system detects obstacles, estimates threat from distance and time-to-collision, and speaks warnings through Bluetooth headphones using Piper neural TTS.
Current production version: v3.30 HEADLESS
- Production script: `raspberry_pi/yolo_realsense_navigation.py`
- Foundational regression suite: `tests/test_blindnav.py`
- Advanced voice/latency regression suite: `tests/test_blindnav_v326.py`
- Verified locally on April 27, 2026: 195 passed
- Runs YOLO26n ONNX inference on Pi 4 CPU.
- Samples depth per tracked object using adaptive stride + clustering.
- Compensates apparent approach speed using background-depth ego-motion (see the sketch after this list).
- Suppresses static-object chatter when the user is standing still.
- Speaks left/right/ahead warnings with distance-aware cooldown buckets.
- Logs per-alert latency timestamps to `events.log`.
- Provides on-demand scene description with the `d` key.
- Provides optional push-to-talk voice commands with OpenAI speech-to-text.
- Provides an optional OpenAI alert TTS field-test mode.
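The background-depth ego-motion idea above is the core of the threat math, so a concrete illustration helps. Below is a minimal sketch assuming hypothetical per-frame depth samples; the production script's names, filtering, and sign conventions differ:

```python
import statistics

def compensated_approach_speed(bg_prev, bg_curr, obj_prev, obj_curr, dt):
    """Approach speed of an object after removing the user's own motion.

    Background points are assumed static in the world, so the median
    frame-to-frame drop in background depth approximates how far the
    user moved forward. Depths in meters, dt in seconds.
    """
    # Median change of static background depth ~= user's forward motion.
    ego_shift = statistics.median(p - c for p, c in zip(bg_prev, bg_curr))
    # Raw apparent approach of the tracked object between the two frames.
    raw_shift = obj_prev - obj_curr
    # Whatever ego-motion does not explain is the object's own approach.
    return (raw_shift - ego_shift) / dt  # m/s, positive = approaching here

def time_to_collision(depth_m, approach_speed):
    """TTC in seconds; None when the object is not actually approaching."""
    return depth_m / approach_speed if approach_speed > 0 else None

# At 10 FPS the user advanced ~0.10 m between frames; a person 2.0 m away
# appears 0.15 m closer, so only 0.05 m is the person's own motion:
v = compensated_approach_speed([3.1, 3.4, 2.9], [3.0, 3.3, 2.8],
                               2.0, 1.85, dt=0.1)
print(round(v, 2), time_to_collision(1.85, v))  # ~0.5 m/s, TTC ~3.7 s
```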
- Added `BLINDNAV_ALERT_TTS=openai` as a field-test mode for urgent, warning, and cleared alert speech.
- Keeps detection, threat scoring, voice queueing, and `aplay` playback policy unchanged so the field test compares TTS output only.
- Uses OpenAI WAV speech output with local fallback enabled by default (see the fallback sketch after this list).
- Added `tools/run_tts_local.sh`, `tools/run_tts_openai.sh`, and `tools/FIELD_TEST_TTS_COMPARE.md` for repeatable side-by-side field tests.
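A minimal sketch of what the env-gated backend selection with local fallback can look like; `speak_openai` and `speak_piper` are hypothetical stand-ins, not the script's real functions:

```python
import os

def speak_piper(text):
    # Stand-in for the local Piper voice path, which is always available.
    print(f"[piper] {text}")

def speak_openai(text):
    # Stand-in for the OpenAI TTS request; raises on network/API failure.
    raise RuntimeError("network down")  # simulated failure for the demo

def speak_alert(text):
    """Route alert speech by BLINDNAV_ALERT_TTS; fall back to the local
    voice so a network failure can never silence a safety alert."""
    if os.environ.get("BLINDNAV_ALERT_TTS") == "openai":
        try:
            return speak_openai(text)
        except Exception:
            pass  # fall through to the always-available local voice
    speak_piper(text)

speak_alert("obstacle ahead, 1.2 meters")  # speaks locally unless opted in
```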
- Added optional push-to-talk voice command input with `BLINDNAV_VOICE_INPUT=1`.
- Records short commands with `arecord`, transcribes them through OpenAI STT, and routes deterministic commands: describe, nearest, people, status, repeat, and cancel (see the routing sketch after this list).
- Keeps safety alerts on the existing local Piper/VoiceAssistant path by default; OpenAI STT is used only for command input.
- Adds a thread-safe navigation snapshot so command responses can report the nearest object, people count, and runtime status without blocking detection.
- Adds `tools/test_transcribe.py` for command-line STT smoke testing.
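A minimal sketch of the deterministic routing and the thread-safe snapshot, with a hypothetical simplified state dict; the real script records audio with `arecord` and transcribes it before this step:

```python
import threading

class NavSnapshot:
    """Lock-protected copy of navigation state, written by the detection
    loop and read by the command thread without blocking detection."""
    def __init__(self):
        self._lock = threading.Lock()
        self._state = {"nearest": None, "people": 0, "status": "starting"}

    def update(self, **fields):
        with self._lock:
            self._state.update(fields)

    def read(self):
        with self._lock:
            return dict(self._state)  # copy, so callers never hold the lock

def route_command(text, snap):
    """Map a transcribed utterance to a deterministic spoken response.
    (describe/repeat/cancel would be routed the same way.)"""
    s = snap.read()
    words = text.lower()
    if "nearest" in words:
        return f"nearest object: {s['nearest']}"
    if "people" in words:
        return f"{s['people']} people visible"
    if "status" in words:
        return f"status: {s['status']}"
    return "command not recognized"

snap = NavSnapshot()
snap.update(nearest="person, 1.8 meters", people=2, status="running")
print(route_command("how many people", snap))  # -> "2 people visible"
```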
- Bucketed spoken distances to the same 30 cm voice buckets already used by cooldown keys, so repeat warnings reuse the same Piper phrases instead of synthesizing slightly different decimals (see the bucketing sketch after this list).
- Added richer voice diagnostics to `events.log`, including queue wait, synth time, launch wait, cache hit vs miss, and synthesis mode.
- Made left/right/ahead hysteresis frame-stable so repeated same-frame position reads no longer consume the switch threshold early.
- Promoted nearby side-pass people on the left/right while the user is moving, so a person walking by no longer depends entirely on radial TTC.
- Clamped bad-ego TTC usage to close range when the user is still, blocking spurious far-range alerts such as "person ahead, 6.4 meters".
- Switched non-person urgent/warning phrasing to cached `obstacle` wording so close-object warnings stay fast even when the classifier label changes.
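The 30 cm bucketing is worth a concrete sketch: the spoken phrase and the cooldown key derive from the same bucket, so a 1.21 m and a 1.28 m reading reuse one cached phrase and one cooldown entry. Bucket handling and wording below are illustrative, not the script's exact code:

```python
BUCKET_M = 0.30  # voice/cooldown bucket width in meters

def distance_bucket(distance_m):
    """Snap a distance to the center of its 30 cm bucket so nearby
    readings share one spoken phrase and one cooldown key."""
    index = int(distance_m / BUCKET_M)
    return round((index + 0.5) * BUCKET_M, 2)

def voice_phrase(label, distance_m):
    return f"{label} ahead, {distance_bucket(distance_m):.1f} meters"

def cooldown_key(zone, distance_m):
    # Zone-based key: tracker ID churn cannot retrigger the same warning.
    return (zone, distance_bucket(distance_m))

# 1.21 m and 1.28 m land in the same bucket -> same phrase, same key.
assert voice_phrase("person", 1.21) == voice_phrase("person", 1.28)
assert cooldown_key("ahead", 1.21) == cooldown_key("ahead", 1.28)
```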
- Switched urgent/warning alerts to Piper by default while keeping `espeak` available as an override through `BLINDNAV_ALERT_TTS=espeak`.
- Added a prewarmed Piper alert-clip cache for common short safety phrases so urgent/warning speech stays natural without paying full synthesis cost every time (see the cache sketch after this list).
- Kept `en_US-amy-medium` as the default Piper voice and added env-based voice overrides so `lessac-medium` can be tested without editing the script.
- Added large-jump confirmation and far-noise suppression so static people at roughly 2-3 m do not accumulate fake approach velocity.
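A minimal sketch of the prewarmed alert-clip cache, with a hypothetical `piper_synthesize()` standing in for shelling out to the Piper binary; the real phrase list and cache layout may differ:

```python
from pathlib import Path

CACHE_DIR = Path.home() / ".cache" / "blindnav_clips"  # illustrative path

def piper_synthesize(phrase, out_path):
    # Stand-in: the real code invokes the piper binary to write a WAV.
    out_path.write_bytes(b"RIFF...")  # placeholder bytes

def prewarm(phrases):
    """Synthesize common short safety phrases once at startup so an
    urgent alert replays a cached WAV instead of paying synthesis cost."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    clips = {}
    for phrase in phrases:
        path = CACHE_DIR / (phrase.replace(" ", "_") + ".wav")
        if not path.exists():
            piper_synthesize(phrase, path)
        clips[phrase] = path
    return clips

clips = prewarm(["obstacle ahead", "obstacle left", "obstacle right"])
# An urgent alert looks up clips[phrase] and hands the WAV to aplay.
```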
- Replaced the thirds-based left/right/ahead split with wide-angle-aware angle mapping plus per-track hysteresis (see the hysteresis sketch after this list).
- Unified filtered motion across threat scoring, TTC logging, console output, and CSV logging.
- Hardened shutdown so the voice queue drains cleanly and the capture thread is joined before process exit.
- Reduced ONNX Runtime to 3 threads so Piper synthesis gets CPU time.
- Added ego-Z clamp and confidence gating to block impossible velocity spikes.
- Switched to zone-based voice cooldown keys so tracker ID churn does not retrigger the same warning.
- Added per-alert latency logging.
- Extracted `_select_voice_message()` so alert wording is unit-testable.
- Fixed neutral wording leakage in close-distance branches when ego-motion is unreliable.
- Added safe urgent supersession: an urgent alert can cancel a lower-priority phrase only while that phrase is still synthesizing (see the supersession sketch after this list).
- Replaced terminate-style preemption with BT-safe skip-ahead before playback.
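Two of the items above benefit from sketches. First, the angle mapping with per-track hysteresis: the bearing comes from the bbox center and the camera's horizontal field of view, and a track only changes zone once it clears the boundary by a margin. Edges, margin, and FOV below are illustrative, not the script's tuned values:

```python
HFOV_DEG = 87.0                       # illustrative horizontal FOV
LEFT_EDGE, RIGHT_EDGE = -20.0, 20.0   # illustrative zone boundaries
MARGIN = 8.0                          # must cross an edge by this much

def pixel_to_angle(cx, frame_width):
    """Bbox center x -> horizontal bearing, so left/right reflects real
    angle instead of fixed image thirds."""
    return (cx / frame_width - 0.5) * HFOV_DEG

def raw_zone(angle):
    if angle < LEFT_EDGE:
        return "left"
    if angle > RIGHT_EDGE:
        return "right"
    return "ahead"

def stable_zone(angle, prev):
    """Per-track hysteresis: keep the previous zone unless the angle has
    crossed the separating edge by MARGIN degrees, so jitter near an
    edge cannot flip the spoken direction every frame."""
    new = raw_zone(angle)
    if prev is None or new == prev:
        return new
    edge = LEFT_EDGE if "left" in (prev, new) else RIGHT_EDGE
    return new if abs(angle - edge) >= MARGIN else prev

assert stable_zone(-22.0, "ahead") == "ahead"  # only 2 deg past the edge
assert stable_zone(-29.0, "ahead") == "left"   # 9 deg past: switch zones
```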
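Second, the safe urgent supersession invariant: skip a lower-priority phrase only while it is still synthesizing, and never touch `aplay` once playback starts. A minimal sketch with simple state flags (the production queueing is more involved):

```python
class Phrase:
    def __init__(self, text, priority):
        self.text = text
        self.priority = priority   # lower number = more urgent
        self.synthesizing = True   # True until aplay playback starts
        self.skipped = False

def supersede(current, incoming):
    """An urgent alert may cancel a lower-priority phrase only while it
    is still synthesizing; active aplay playback always finishes."""
    if incoming.priority < current.priority and current.synthesizing:
        current.skipped = True     # BT-safe skip-ahead before playback
        return True
    return False                   # never terminate active playback

warning = Phrase("person ahead, 2.1 meters", priority=2)
urgent = Phrase("obstacle ahead", priority=1)
assert supersede(warning, urgent)        # still synthesizing -> skipped

playing = Phrase("person left, 2.4 meters", priority=2)
playing.synthesizing = False             # aplay already started
assert not supersede(playing, urgent)    # playback is never interrupted
```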
- Preserved the hard rule that active `aplay` playback is never terminated.
- Never send SIGTERM to `aplay`.
- Never use numpy 2.0+; pin `numpy==1.26.4`.
- Never use `cv2.imshow` on the Pi; use the Flask MJPEG stream for display work.
- Never put the ghost filter inside `ObjectTracker`.
- Never announce a threat as cleared while its compensated velocity is still negative enough to indicate approach.
```
blind_navigation_aid/
|-- AGENTS.md
|-- README.md
|-- SETUP.md
|-- STATUS.md
|-- raspberry_pi/
|   `-- yolo_realsense_navigation.py
|-- tests/
|   |-- test_blindnav.py
|   `-- test_blindnav_v326.py
`-- .github/workflows/tests.yml
```
```bash
source ~/blindnav-venv/bin/activate
export ANTHROPIC_API_KEY="sk-..."
export OPENAI_API_KEY="sk-..."   # optional, for voice commands
export BLINDNAV_VOICE_INPUT=1    # optional, press v to speak command
python3 raspberry_pi/yolo_realsense_navigation.py
```

Press `d` for a scene description, or `v` for a voice command when enabled.
Use Ctrl+C to exit.
To compare alert TTS output only:
```bash
bash tools/run_tts_local.sh
OPENAI_API_KEY="sk-..." bash tools/run_tts_openai.sh
```

To upload each completed run's CSV and event log to GitHub automatically:

```bash
export BLINDNAV_LOG_UPLOAD=1
python3 raspberry_pi/yolo_realsense_navigation.py
```

Logs are pushed to the `blindnav-field-logs` branch by default. Set
`BLINDNAV_LOG_UPLOAD_BRANCH` or `BLINDNAV_LOG_UPLOAD_REMOTE` to override that.
The uploader writes its own `upload_*.log` file under `~/blindnav_logs`.
BlindNav keeps the newest 10 navigation runs by default and prunes older
`log_*.csv` / `events_*.log` pairs on startup and shutdown. Set
`BLINDNAV_LOG_RETENTION_RUNS` to change that. When upload is enabled, startup
also retries upload for the newest previous completed run, so logs from a
prior failed shutdown are not stranded locally.
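A minimal sketch of the retention pruning, assuming runs pair as `log_<stamp>.csv` / `events_<stamp>.log` under `~/blindnav_logs` (the actual pairing and error handling are more careful):

```python
import os
from pathlib import Path

LOG_DIR = Path.home() / "blindnav_logs"

def prune_runs(keep=None):
    """Keep the newest N runs (default 10, overridable through
    BLINDNAV_LOG_RETENTION_RUNS) and delete older CSV/event-log pairs."""
    if keep is None:
        keep = int(os.environ.get("BLINDNAV_LOG_RETENTION_RUNS", "10"))
    csvs = sorted(LOG_DIR.glob("log_*.csv"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
    for old_csv in csvs[keep:]:  # everything older than the newest N
        stamp = old_csv.stem[len("log_"):]
        (LOG_DIR / f"events_{stamp}.log").unlink(missing_ok=True)
        old_csv.unlink(missing_ok=True)
```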
All tests run without camera hardware, a RealSense device, Piper, or an IMU. Hardware modules are stubbed at import time.
```bash
pytest tests/test_blindnav.py tests/test_blindnav_v326.py -v
```

Current collected totals:
- `tests/test_blindnav.py`: 37 tests
- `tests/test_blindnav_v326.py`: 158 tests
- Combined: 195 tests
- Expected field FPS: roughly 8-14 depending on thermals.
- YOLO export must produce output shape `(1, 300, 6)` (see the session check after this list).
- ONNX Runtime stays on float32; INT8 was slower on Pi 4 ARM in project tests.
- Thermal throttling is still the main real-world performance limiter.
- Add a heatsink before field sessions.
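A minimal sketch tying the constraints above together: three ONNX Runtime intra-op threads (so Piper keeps a core, as noted in the changelog), a float32 model, and a `(1, 300, 6)` output check. The model filename, input size, and per-box layout comment are assumptions, not verified against the project's export:

```python
import numpy as np
import onnxruntime as ort

# Cap ORT at 3 of the Pi 4's 4 cores so Piper synthesis is not starved.
opts = ort.SessionOptions()
opts.intra_op_num_threads = 3

sess = ort.InferenceSession("yolo26n.onnx", sess_options=opts,
                            providers=["CPUExecutionProvider"])

# Sanity-check the export: one batch, 300 candidate detections, 6 values
# per detection (commonly x1, y1, x2, y2, score, class in NMS exports).
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)  # float32, not INT8
(out,) = sess.run(None, {sess.get_inputs()[0].name: dummy})
assert out.shape == (1, 300, 6), f"unexpected export shape: {out.shape}"
```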
- Review and merge the v3.30 repo state, then field-test it with Ricardo Salazar.
- Record bag-file scenarios for regression playback.
- Field-test push-to-talk voice input on the Pi with the real microphone.
- Field-test local Piper alert TTS against OpenAI alert TTS using the same walking route, and compare `[LATENCY] play_start` in `events.log`.
- Add traffic-light color classification after the base obstacle system is stable.
- Urgent audio is optimized for freshness, but active playback is not forcibly interrupted because Bluetooth stream renegotiation is worse than waiting for a short phrase to finish.
- Queueing, cooldown, latency, wording, and ego-motion regressions are all testable without hardware and are now covered in the advanced suite.