Containerized pipeline that ingests dashcam, audio, and bodycam recordings into a Neo4j knowledge graph. It provides distributed workers, a NAS-drop job queue, a REST API for triggering jobs, and scheduled cron workers.
cd /home/scott/git/auto-ingest
# Configure environment
cp deploy/path_profiles.env.example .env
# Edit .env for your host
# Build and start all services
docker compose up -d --build
# Verify
docker compose ps
curl http://localhost:8766/api/health
curl http://localhost:8766/api/status
# Via HTTP API
curl -X POST http://localhost:8766/api/enqueue \
  -H 'Content-Type: application/json' \
  -d '{"kind": "dashcam"}'
# Via shell script
./deploy/create_job.sh dashcam
./deploy/create_job.sh audio
./deploy/create_job.sh bodycam
./deploy/create_job.sh all
# Check status
curl http://localhost:8766/api/status
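The same enqueue call can be made from Python. This is a sketch using only the standard library. The endpoint and JSON body mirror the curl example. The set of kinds is taken from the `create_job.sh` arguments, and whether the HTTP API accepts every one of them (only `dashcam` is shown for it) is an assumption. `build_enqueue_request` and `enqueue` are hypothetical names.

```python
import json
import urllib.request

API_BASE = "http://localhost:8766"  # host and port from the curl examples

# Kinds accepted by deploy/create_job.sh; acceptance by the HTTP API is assumed.
KNOWN_KINDS = {"dashcam", "audio", "bodycam", "all"}


def build_enqueue_request(kind: str, base: str = API_BASE) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/enqueue for the given job kind."""
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown job kind: {kind!r}")
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/api/enqueue",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def enqueue(kind: str, base: str = API_BASE, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_enqueue_request(kind, base)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `enqueue("dashcam")` sends the same request as the curl command. The response is returned as raw text because its schema is not described in this document.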
| Service | Port | Poll | Purpose |
|---|---|---|---|
| job-api | 8766 | - | HTTP API for enqueueing jobs |
| ingest-service | - | 5 min | Runs run_ingest_all.sh continuously |
| ingest-worker | - | 30s | Claims .job files from /nas/drop/ |
| sync-service | - | 10 min | Syncs legacy drop from deathstar |
| content-service | - | 30 min | Content OS CLI status loop |
| ingest-cron | - | 5 min | Scheduled ingest cron |
| content-cron | - | 30 min | Scheduled content cron |
| neo4j | 7474/7687 | - | Graph database (20M+ nodes) |
+-------------------+     +------------------+     +-----------------+
| Job Trigger API   |     | Ingest Service   |     | Sync Service    |
| (HTTP on :8766)   |     | (loop 5 min)     |     | (loop 10 min)   |
+-------------------+     +------------------+     +-----------------+
          |                        |                        |
          v                        v                        v
+-------------------+     +------------------+     +-----------------+
| .job Queue        |<----| Ingest Worker    |<----| Legacy Drop     |
| /nas/drop/        |     | (loop 30s)       |     | /incoming/      |
| claimed/          |     |                  |     | deathstar/      |
| done/             |     |                  |     +-----------------+
| failed/           |     |                  |
+-------------------+     |                  |
                          |                  |
                          v                  v
           +------------------+     +-----------------+
           | Neo4j Graph DB   |     | Content OS      |
           | :7687 (:7474)    |     | (cron 30 min)   |
           +------------------+     +-----------------+
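The queue in the diagram relies on workers moving `.job` files between `/nas/drop/`, `claimed/`, `done/`, and `failed/`. One common way to let several workers share such a directory is to claim a job by renaming it. The sketch below illustrates that pattern; it is not the contents of `deploy/worker_ingest.sh`. It assumes that rename is atomic on the queue filesystem (true on a local POSIX filesystem; behavior on a network share depends on the NAS protocol) and that files keep their names as they move. All function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Callable, Optional


def claim_next_job(drop: Path) -> Optional[Path]:
    """Move the first pending .job file (by name) into claimed/ and return its new path."""
    claimed_dir = drop / "claimed"
    claimed_dir.mkdir(exist_ok=True)
    for job in sorted(drop.glob("*.job")):
        target = claimed_dir / job.name
        try:
            os.rename(job, target)  # only one worker can win this rename
        except FileNotFoundError:
            continue  # another worker claimed it first; try the next file
        return target
    return None


def finish_job(claimed: Path, ok: bool) -> Path:
    """Move a claimed job to done/ or failed/, both siblings of claimed/."""
    dest_dir = claimed.parent.parent / ("done" if ok else "failed")
    dest_dir.mkdir(exist_ok=True)
    dest = dest_dir / claimed.name
    os.rename(claimed, dest)
    return dest


def work_once(drop: Path, handler: Callable[[Path], None]) -> bool:
    """Claim and process one job; return False when the queue is empty."""
    job = claim_next_job(drop)
    if job is None:
        return False
    try:
        handler(job)
    except Exception:
        finish_job(job, ok=False)
    else:
        finish_job(job, ok=True)
    return True
```

A worker loop would call `work_once(Path("/nas/drop"), handler)` and sleep for the poll interval whenever it returns `False`.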
Core Neo4j node types:
| Label | Count | Description |
|---|---|---|
| PhoneLog | 20M | Phone call/SMS records |
| DashcamEmbedding | 4.2M | Dashcam video embeddings |
| YOLODetection | 4.1M | Vehicle/object detections |
| Frame | 3.7M | Video frames |
| Utterance | 420K | Speech utterances |
| Segment | 361K | Transcript segments |
| Speaker | 233K | Speaker entities |
| Transcription | 64K | Transcription records |
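To re-check one of these counts, a per-label query is cheaper than scanning every node. The sketch below only builds the Cypher text. Because the label is interpolated into the query rather than passed as a parameter, it is validated against the labels listed above. `count_query` is a hypothetical helper.

```python
# Labels from the node-type table above.
CORE_LABELS = (
    "PhoneLog",
    "DashcamEmbedding",
    "YOLODetection",
    "Frame",
    "Utterance",
    "Segment",
    "Speaker",
    "Transcription",
)


def count_query(label: str) -> str:
    """Return a Cypher query that counts nodes carrying one known label."""
    if label not in CORE_LABELS:
        # Reject anything outside the allowlist, since the label ends up in the query text.
        raise ValueError(f"unknown label: {label!r}")
    return f"MATCH (n:{label}) RETURN count(n) AS cnt"
```

The resulting string can be passed to `cypher-shell` or to a Neo4j driver session.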
| File | Purpose |
|---|---|
| docker-compose.yml | Service definitions |
| Dockerfile | Container image |
| deploy/job_trigger_api.py | HTTP API server |
| deploy/worker_ingest.sh | Distributed worker |
| deploy/sync_from_legacy_drop.sh | Legacy sync |
| deploy/create_job.sh | Job creation helper |
| deploy/start-cron.sh | Cron daemon starter |
| deploy/cron/ingest.crontab | Ingest schedule |
| deploy/cron/content_generation.crontab | Content schedule |
| deploy/path_profiles.env.example | Environment template |
| run_ingest_all.sh | Main ingest runner |
| ingest_transcriptsv5_3.py | Python ingest script |
cd /home/scott/git/auto-ingest
# Start/stop
docker compose up -d # start all
docker compose down # stop all
docker compose up -d --build # rebuild and start
# Logs
docker compose logs -f ingest-worker
docker compose logs -f ingest-service
docker compose logs -f sync-service
# Neo4j
docker exec neo4j cypher-shell -u neo4j -p knowledge_graph_2026 "RETURN 1"
docker exec neo4j cypher-shell -u neo4j -p knowledge_graph_2026 \
  "MATCH (n) RETURN labels(n) AS label, count(*) AS cnt ORDER BY cnt DESC LIMIT 10"
# Queue
ls -lah /nas/drop/ /nas/drop/claimed/ /nas/drop/done/ /nas/drop/failed/
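For a quick numeric summary instead of a directory listing, the queue state can be reduced to one count per directory. This is a minimal sketch, assuming jobs keep the `.job` suffix as they move between directories. `queue_counts` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict

QUEUE_ROOT = Path("/nas/drop")  # queue root from the ls command above


def queue_counts(drop: Path = QUEUE_ROOT) -> Dict[str, int]:
    """Count *.job files waiting in the root and in each state directory."""
    dirs = {
        "pending": drop,
        "claimed": drop / "claimed",
        "done": drop / "done",
        "failed": drop / "failed",
    }
    counts: Dict[str, int] = {}
    for state, path in dirs.items():
        # A missing directory counts as empty rather than raising.
        counts[state] = sum(1 for _ in path.glob("*.job")) if path.is_dir() else 0
    return counts
```

A growing `claimed` count with a flat `done` count suggests workers are claiming jobs but not finishing them.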
See Troubleshooting Skill for:
- Neo4j connection failures
- libGL/libvpx errors
- Legacy drop sync issues
- Job queue problems
- Cron job failures
- Diagnostic commands