🐾 FeedSentinel

Event-driven pet-feeding monitor that fuses multimodal LLM vision with a confidence-gated state machine to confirm meals in real time — and tells you about it on Telegram.

Why FeedSentinel

A naive "is the cat eating?" classifier alerts on any single ambiguous frame. FeedSentinel treats feeding as a temporal event rather than a snapshot: it requires N consecutive high-confidence observations (3 by default) before it confirms a meal. The result is a monitoring system that stays quiet when nothing matters and is reliable when it does, at an inference cost of roughly $0.10–$0.50 per day depending on snapshot cadence.

It is camera-agnostic by design: the application never touches the camera. Any external tool (ffmpeg, cron, a motion sensor) drops a frame into a watched directory, and the pipeline reacts. This decoupling makes it trivial to swap an RTSP CCTV feed for a webcam, a Raspberry Pi cam, or a folder of test images.

Key Features

Multimodal vision analysis: every frame is interpreted by GPT-4o mini against a strict JSON contract, returning activity, confidence, and human-readable reasoning for full auditability.
Confidence-gated state machine: meals are confirmed only after N consecutive high-confidence "eating" frames (CONSECUTIVE_FRAMES_REQUIRED, default 3), eliminating single-frame false positives.
Cooldown control: a configurable quiet period (MEAL_COOLDOWN_MINUTES, default 30) prevents one meal from generating a storm of alerts.
Cost-optimized model orchestration: the vision model is used only for perception, while a cheap text model writes the friendly notification copy.
Durable decoupled queue: the file watcher and the processing pipeline communicate through a SQLite-backed work queue, so frames survive restarts and bursty writes.
Resilient by default: Telegram or LLM failures are logged, never fatal; the daemon keeps running.
Full observability: every analyzed frame and every confirmed meal is persisted to SQLite for later analysis.
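
The strict JSON contract from the first feature could be enforced with a small validating parser. This is a minimal sketch, not the project's actual code: the field names activity, confidence, and reasoning come from the feature description, while the VisionResult layout, the parse_vision_result helper, and the set of allowed activity labels are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical label set: the README only names "eating" explicitly.
ALLOWED_ACTIVITIES = {"eating", "not_eating", "no_cat"}


@dataclass(frozen=True)
class VisionResult:
    activity: str
    confidence: float
    reasoning: str


def parse_vision_result(raw: str) -> VisionResult:
    """Validate a model reply against the JSON contract or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    activity = data.get("activity")
    confidence = data.get("confidence")
    reasoning = data.get("reasoning")

    if not isinstance(activity, str) or activity not in ALLOWED_ACTIVITIES:
        raise ValueError(f"unexpected activity: {activity!r}")
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if not isinstance(reasoning, str):
        raise ValueError("reasoning must be a string")

    return VisionResult(activity, float(confidence), reasoning)
```

Rejecting malformed replies at this boundary means a bad reply can be logged and skipped rather than reaching the state machine, which fits the "logged, never fatal" behavior described above.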

Architecture

[ffmpeg / cron]      ┌────────────────────── monitoring.db ──────────────────────┐
       │             │  image_queue   ·   frame_logs   ·   meal_events           │
       ▼             └────────────────────────────────────────────────────────────┘
  snapshots/  ──►  file_watcher.py  ──►  [image_queue]  ──►  main.py (daemon loop)
                                                                 │
                          ┌──────────────────────────────────────┼───────────────┐
                          ▼                  ▼                    ▼               ▼
                   llm_vision.py      state_machine.py        logger.py      notifier.py
                  (GPT-4o mini)    (consecutive-frame      (SQLite frame   (text LLM +
                                    counter + cooldown)     & meal log)     Telegram)

Design principle: the app is purely reactive. Capture cadence, source, and hardware are external concerns — the pipeline only ever sees new files appearing in snapshots/.
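
To make the watcher-to-daemon handoff concrete, here is a minimal sketch of a SQLite-backed work queue. Only the table name image_queue comes from the diagram; the column names, the status values, and the open_queue, enqueue, and dequeue helpers are assumptions, and the sketch assumes a single consumer.

```python
import sqlite3
from typing import Optional


def open_queue(db_path: str) -> sqlite3.Connection:
    """Open the database and create the queue table if it does not exist."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS image_queue (
            id     INTEGER PRIMARY KEY AUTOINCREMENT,
            path   TEXT NOT NULL UNIQUE,
            status TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, path: str) -> None:
    # INSERT OR IGNORE makes duplicate filesystem events for one file harmless.
    with conn:
        conn.execute("INSERT OR IGNORE INTO image_queue (path) VALUES (?)", (path,))


def dequeue(conn: sqlite3.Connection) -> Optional[str]:
    """Claim the oldest pending path, or return None when the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, path FROM image_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Simplification: the row is marked done at claim time.
        conn.execute("UPDATE image_queue SET status = 'done' WHERE id = ?", (row[0],))
        return row[1]
```

Because pending rows live on disk, frames enqueued before a restart are still there afterwards. A sturdier variant would mark a row done only after the frame has been fully processed, so that a crash mid-analysis does not drop it.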

How It Works

1. An external job (e.g. a 2-minute cron) writes a uniquely-named snapshot into snapshots/.
2. file_watcher.py (watchdog) debounces partial writes and enqueues the path into the image_queue table.
3. The main.py daemon dequeues paths, skipping work entirely while in cooldown.
4. llm_vision.py base64-encodes the frame and asks GPT-4o mini for a strict-JSON verdict → VisionResult.
5. Every frame is written to frame_logs.
6. state_machine.py increments its counter on high-confidence "eating" frames and resets on anything else.
7. On the N-th consecutive confirmation it fires a meal event: notifier.py generates a warm one-line message with a cheap text model and pushes it plus the confirming image to Telegram; the event is recorded in meal_events; the counter resets and a cooldown begins.
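
The counter-and-cooldown logic in the last two steps can be sketched as follows. This is an illustration under stated assumptions rather than the contents of state_machine.py: the class name, the method names, and in particular the 0.8 confidence threshold are hypothetical (the README does not say what counts as "high-confidence"). The defaults of 3 frames and 30 minutes mirror the Configuration section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FeedingStateMachine:
    frames_required: int = 3                     # CONSECUTIVE_FRAMES_REQUIRED
    cooldown: timedelta = timedelta(minutes=30)  # MEAL_COOLDOWN_MINUTES
    min_confidence: float = 0.8                  # hypothetical threshold
    streak: int = 0
    cooldown_until: Optional[datetime] = None

    def in_cooldown(self, now: datetime) -> bool:
        return self.cooldown_until is not None and now < self.cooldown_until

    def observe(self, activity: str, confidence: float, now: datetime) -> bool:
        """Feed one frame verdict; return True exactly when a meal is confirmed."""
        if self.in_cooldown(now):
            return False
        if activity == "eating" and confidence >= self.min_confidence:
            self.streak += 1
        else:
            # Anything other than a high-confidence "eating" frame breaks the streak.
            self.streak = 0
        if self.streak >= self.frames_required:
            self.streak = 0
            self.cooldown_until = now + self.cooldown
            return True
        return False
```

Passing the current time in as an argument, instead of reading the clock inside the class, keeps the logic deterministic and easy to unit test without any API calls.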

Quickstart

# 1. Clone & install
git clone https://github.com/<you>/feedsentinel.git
cd feedsentinel
pip install -r requirements.txt

# 2. Configure
cp .env.example .env        # then fill in your keys

# 3. Prepare runtime directories
mkdir -p snapshots data

# 4. Run
python main.py

Run as a background daemon:

nohup python main.py >> app.log 2>&1 &

Feeding it frames

FeedSentinel does not capture images itself. Point any tool at snapshots/. Example: one RTSP snapshot every 2 minutes via cron. Note the escaped % signs (cron treats an unescaped % as a newline) and the unique, timestamped filename, which ensures the watcher sees a new file each time:

# A crontab entry must fit on one line: cron does not support backslash line continuation.
*/2 * * * * /usr/bin/ffmpeg -rtsp_transport tcp -i "rtsp://user:pass@CAMERA_IP:554/stream" -frames:v 1 -y -loglevel error /abs/path/snapshots/snap_$(date +\%Y\%m\%d_\%H\%M\%S).jpg >> /abs/path/ffmpeg_cron.log 2>&1
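
For trying the pipeline without a camera, a folder of test images can be replayed into snapshots/. The script below is a hypothetical helper, not part of the repository: feed_test_images and its parameters are assumptions, and it relies on the watcher's debounce to cope with the copy in progress. It follows the same snap_<timestamp> naming as the cron example and adds an index so names stay unique.

```python
import shutil
import time
from datetime import datetime
from pathlib import Path


def feed_test_images(source_dir, snapshots_dir, interval_seconds=0.0):
    """Copy each .jpg from source_dir into snapshots_dir under a unique name."""
    source = Path(source_dir)
    target = Path(snapshots_dir)
    target.mkdir(parents=True, exist_ok=True)

    written = []
    for index, image in enumerate(sorted(source.glob("*.jpg"))):
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        # The index suffix keeps names unique even when two copies
        # land within the same second.
        destination = target / f"snap_{stamp}_{index:04d}.jpg"
        shutil.copyfile(image, destination)
        written.append(destination)
        time.sleep(interval_seconds)
    return written
```

Calling feed_test_images("test_images", "snapshots", interval_seconds=5) would drip-feed the folder at five-second intervals; the directory name test_images is only an example.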

Configuration

All configuration is environment-driven (.env, loaded via python-dotenv). The app fails loudly at startup if any required key is missing.

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `OPENAI_API_KEY` | ✅ | (none) | OpenAI API key for vision + messaging |
| `TELEGRAM_BOT_TOKEN` | ✅ | (none) | Telegram Bot API token |
| `TELEGRAM_CHAT_ID` | ✅ | (none) | Destination chat for alerts |
| `TELEGRAM_API_URL` | | `https://api.telegram.org` | Override for proxies/self-host |
| `SNAPSHOTS_DIR` | | `./snapshots` | Watched directory |
| `DB_PATH` | | `./data/monitoring.db` | SQLite database path |
| `CAT_NAME` | | `Cat` | Used to personalize notifications |
| `CONSECUTIVE_FRAMES_REQUIRED` | | `3` | N: consecutive confirmations needed per meal |
| `MEAL_COOLDOWN_MINUTES` | | `30` | Minimum gap between alerts |
| `LLM_MODEL` | | `gpt-4o-mini` | Vision model |
| `MESSAGING_MODEL` | | `gpt-3.5-turbo` | Notification-copy model |
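
The fail-loudly behavior and the defaults above could be implemented along these lines. This is a sketch, not the contents of config.py: the Settings dataclass and load_settings function are hypothetical names. The real app loads .env through python-dotenv first; the sketch reads a plain mapping so that it runs with the standard library alone.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    telegram_bot_token: str
    telegram_chat_id: str
    telegram_api_url: str
    snapshots_dir: str
    db_path: str
    cat_name: str
    consecutive_frames_required: int
    meal_cooldown_minutes: int
    llm_model: str
    messaging_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, raising at once if a required key is absent."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required configuration: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        telegram_chat_id=env["TELEGRAM_CHAT_ID"],
        telegram_api_url=env.get("TELEGRAM_API_URL", "https://api.telegram.org"),
        snapshots_dir=env.get("SNAPSHOTS_DIR", "./snapshots"),
        db_path=env.get("DB_PATH", "./data/monitoring.db"),
        cat_name=env.get("CAT_NAME", "Cat"),
        # Numeric settings arrive as strings; int() raises on malformed values.
        consecutive_frames_required=int(env.get("CONSECUTIVE_FRAMES_REQUIRED", "3")),
        meal_cooldown_minutes=int(env.get("MEAL_COOLDOWN_MINUTES", "30")),
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        messaging_model=env.get("MESSAGING_MODEL", "gpt-3.5-turbo"),
    )
```

Reporting every missing key in one error, rather than stopping at the first, saves a round of edit-and-restart when a fresh .env is being filled in.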

Tech Stack

Python 3.10+ · OpenAI (GPT-4o mini vision + text) · watchdog (filesystem events) · SQLite (durable queue + analytics) · Telegram Bot API via requests. No heavy ML frameworks, no OpenCV — pure Python.

Testing

python -m pytest tests/

test_state_machine.py — counter, cooldown, and edge-case logic (no API calls).
test_llm_vision.py — runs the vision module against static sample images.

Cost

| Snapshot cadence (8 h monitoring window) | Daily API calls | Approx. daily cost |
| --- | --- | --- |
| Every 1 min | ~480 | ~$0.50 |
| Every 2 min | ~240 | ~$0.25 |
| Every 5 min | ~96 | ~$0.10 |

Recommended: every 2–3 minutes — the sweet spot between responsiveness and spend.
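
The figures above follow from simple arithmetic: an 8-hour window is 480 minutes, so the call count is 480 divided by the cadence, and the first row implies roughly $0.50 / 480 per call. The sketch below reproduces that estimate for other cadences. The per-call cost is derived from the table, not from a pricing page, and the function names are hypothetical.

```python
# Per-call cost implied by the table: about $0.50 for 480 calls.
COST_PER_CALL_USD = 0.50 / 480


def daily_calls(cadence_minutes: float, window_hours: float = 8.0) -> int:
    """Number of snapshots analyzed per day for a given cadence."""
    return int(window_hours * 60 // cadence_minutes)


def daily_cost_usd(cadence_minutes: float, window_hours: float = 8.0) -> float:
    """Approximate daily vision-API spend in US dollars."""
    return daily_calls(cadence_minutes, window_hours) * COST_PER_CALL_USD


if __name__ == "__main__":
    for cadence in (1, 2, 3, 5):
        print(f"every {cadence} min: {daily_calls(cadence)} calls, "
              f"about ${daily_cost_usd(cadence):.2f}/day")
```

The estimate covers the vision calls only. The text-model call made once per confirmed meal is ignored here, since it happens a handful of times a day at most.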

Roadmap

Two-way control: REST endpoint to trigger analysis on demand
Remote runtime config (adjust N, cooldown) without restart
Daily Telegram digest of feeding history
Web dashboard over the frame & meal logs
Motion-triggered capture to cut API cost further
Missed-meal alerting

Known Limitations

Single-cat scenarios only · no portion/consumption estimation · daytime-optimized (low light degrades accuracy) · capture cadence is an external concern.

License

MIT — see LICENSE.
