Abhijendra/FeedSentinel

🐾 FeedSentinel

Event-driven pet-feeding monitor that fuses multimodal LLM vision with a confidence-gated state machine to confirm meals in real time — and tells you about it on Telegram.



Why FeedSentinel

A naive "is the cat eating?" classifier fires on any single ambiguous frame. FeedSentinel doesn't. It treats feeding as a temporal event rather than a snapshot, requiring multiple consecutive high-confidence observations before it confirms a meal. The result is a monitoring system that stays quiet when nothing matters and is reliable when it does, at an inference cost of roughly $0.10–$0.50 per day depending on snapshot cadence.

It is camera-agnostic by design: the application never touches the camera. Any external tool (ffmpeg, cron, a motion sensor) drops a frame into a watched directory, and the pipeline reacts. This decoupling makes it trivial to swap an RTSP CCTV feed for a webcam, a Raspberry Pi cam, or a folder of test images.


Key Features

  • Multimodal vision analysis — every frame is interpreted by GPT-4o mini against a strict JSON contract, returning activity, confidence, and human-readable reasoning for full auditability.
  • Confidence-gated state machine — meals are confirmed only after N consecutive high-confidence "eating" frames, eliminating single-frame false positives.
  • Cooldown control — a configurable quiet period prevents one meal from generating a storm of alerts.
  • Cost-optimized model orchestration: the vision-capable model is used only for perception, while a cheaper text-only model writes the friendly notification copy.
  • Durable decoupled queue — file watcher and processing pipeline communicate through a SQLite-backed work queue, so frames survive restarts and bursty writes.
  • Resilient by default — Telegram or LLM failures are logged, never fatal; the daemon keeps running.
  • Full observability — every analyzed frame and every confirmed meal is persisted to SQLite for later analysis.
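The strict JSON contract mentioned above can be enforced with a small validator. The following is a minimal sketch, not the project's actual code: the field names activity, confidence, and reasoning come from the feature list, while the function name `parse_vision_response`, the set of allowed activity labels, and the [0, 1] confidence range are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Assumed label set: the README only names "eating" explicitly.
ALLOWED_ACTIVITIES = {"eating", "not_eating", "no_cat"}


@dataclass(frozen=True)
class VisionResult:
    activity: str
    confidence: float
    reasoning: str


def parse_vision_response(raw: str) -> VisionResult:
    """Validate the model's JSON reply; raise ValueError on any contract violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    missing = {"activity", "confidence", "reasoning"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    activity = data["activity"]
    if not isinstance(activity, str) or activity not in ALLOWED_ACTIVITIES:
        raise ValueError(f"unexpected activity: {activity!r}")

    confidence = data["confidence"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:  # assumed range
        raise ValueError("confidence must be within [0, 1]")

    return VisionResult(activity, float(confidence), str(data["reasoning"]))
```

Rejecting malformed replies at this boundary means the state machine only ever sees well-typed verdicts, and a bad reply can be logged and skipped rather than crash the daemon.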

Architecture

[ffmpeg / cron]      ┌────────────────────── monitoring.db ──────────────────────┐
       │             │  image_queue   ·   frame_logs   ·   meal_events           │
       ▼             └────────────────────────────────────────────────────────────┘
  snapshots/  ──►  file_watcher.py  ──►  [image_queue]  ──►  main.py (daemon loop)
                                                                 │
                          ┌──────────────────────────────────────┼───────────────┐
                          ▼                  ▼                    ▼               ▼
                   llm_vision.py      state_machine.py        logger.py      notifier.py
                  (GPT-4o mini)    (consecutive-frame      (SQLite frame   (text LLM +
                                    counter + cooldown)     & meal log)     Telegram)

Design principle: the app is purely reactive. Capture cadence, source, and hardware are external concerns — the pipeline only ever sees new files appearing in snapshots/.
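To illustrate how a SQLite table can act as the durable hand-off between the watcher and the daemon loop, here is a hedged sketch using only the standard library. The README names the image_queue table but not its columns, so the schema, the status values, and the helper names `open_queue`, `enqueue`, and `dequeue` are all hypothetical. It also assumes a single consumer.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: only the table name comes from the README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS image_queue (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    path        TEXT NOT NULL UNIQUE,
    status      TEXT NOT NULL DEFAULT 'pending',
    enqueued_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_queue(db_path: str) -> sqlite3.Connection:
    """Open the database and make sure the queue table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, path: str) -> bool:
    """Add a snapshot path; return False if it was already queued."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO image_queue (path) VALUES (?)", (path,)
        )
    return cur.rowcount == 1


def dequeue(conn: sqlite3.Connection) -> Optional[str]:
    """Claim the oldest pending path, or return None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, path FROM image_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE image_queue SET status = 'claimed' WHERE id = ?", (row[0],)
        )
    return row[1]
```

Because pending rows live on disk, frames written while the daemon is down are still waiting when it restarts. The UNIQUE constraint makes duplicate filesystem events for the same file harmless.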


How It Works

  1. An external job (e.g. a 2-minute cron) writes a uniquely-named snapshot into snapshots/.
  2. file_watcher.py (watchdog) debounces partial writes and enqueues the path into the image_queue table.
  3. The main.py daemon dequeues paths, skipping work entirely while in cooldown.
  4. llm_vision.py base64-encodes the frame and asks GPT-4o mini for a strict-JSON verdict → VisionResult.
  5. Every frame is written to frame_logs.
  6. state_machine.py increments its counter on high-confidence "eating" frames and resets on anything else.
  7. On the N-th consecutive confirmation it fires a meal event: notifier.py generates a warm one-line message with a cheap text model and pushes it plus the confirming image to Telegram; the event is recorded in meal_events; the counter resets and a cooldown begins.
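Steps 3, 6, and 7 can be sketched as a small class. This is an illustration under stated assumptions, not the contents of state_machine.py: the class and method names are hypothetical, and the 0.8 confidence threshold is an assumed value, since the README does not say what counts as "high-confidence". The defaults for N and the cooldown mirror CONSECUTIVE_FRAMES_REQUIRED and MEAL_COOLDOWN_MINUTES.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FeedingStateMachine:
    frames_required: int = 3                      # CONSECUTIVE_FRAMES_REQUIRED
    cooldown: timedelta = timedelta(minutes=30)   # MEAL_COOLDOWN_MINUTES
    min_confidence: float = 0.8                   # assumed threshold, not from the README
    streak: int = 0
    cooldown_until: Optional[datetime] = None

    def in_cooldown(self, now: datetime) -> bool:
        return self.cooldown_until is not None and now < self.cooldown_until

    def observe(self, activity: str, confidence: float, now: datetime) -> bool:
        """Feed one frame verdict; return True when a meal is confirmed."""
        if self.in_cooldown(now):
            return False  # frames are ignored entirely during cooldown
        if activity == "eating" and confidence >= self.min_confidence:
            self.streak += 1
        else:
            self.streak = 0  # any other verdict breaks the run
        if self.streak >= self.frames_required:
            self.streak = 0
            self.cooldown_until = now + self.cooldown
            return True
        return False
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the logic deterministic and testable without API calls or sleeps.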

Quickstart

# 1. Clone & install
git clone https://github.com/<you>/feedsentinel.git
cd feedsentinel
pip install -r requirements.txt

# 2. Configure
cp .env.example .env        # then fill in your keys

# 3. Prepare runtime directories
mkdir -p snapshots data

# 4. Run
python main.py

Run as a background daemon:

nohup python main.py >> app.log 2>&1 &

Feeding it frames

FeedSentinel does not capture images itself. Point any tool at snapshots/. Example: one RTSP snapshot every 2 minutes via cron. Two details matter here: each % is escaped as \% because cron treats an unescaped % as a newline, and the timestamped filename ensures the watcher sees a new file each time.

*/2 * * * * /usr/bin/ffmpeg -rtsp_transport tcp -i "rtsp://user:pass@CAMERA_IP:554/stream" \
  -frames:v 1 -y -loglevel error \
  /abs/path/snapshots/snap_$(date +\%Y\%m\%d_\%H\%M\%S).jpg >> /abs/path/ffmpeg_cron.log 2>&1
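
For local testing without a camera, a folder of sample images can stand in for the cron job. The sketch below is an assumption-laden illustration: the helper name `feed_sample` is hypothetical, and the snap_ filename prefix simply mirrors the cron example. It copies one image into the watched directory under a unique timestamped name.

```python
import shutil
from datetime import datetime
from pathlib import Path


def feed_sample(src: Path, snapshots_dir: Path) -> Path:
    """Copy one image into snapshots_dir under a unique timestamped name."""
    snapshots_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    final = snapshots_dir / f"snap_{stamp}{src.suffix}"
    # Guard against two copies landing on the same timestamp.
    n = 1
    while final.exists():
        final = snapshots_dir / f"snap_{stamp}_{n}{src.suffix}"
        n += 1
    shutil.copyfile(src, final)
    return final
```

Calling `feed_sample(Path("samples/eating_01.jpg"), Path("snapshots"))` a few times in a row (the sample path is a placeholder) should exercise the full pipeline the same way consecutive camera frames would.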

Configuration

All configuration is environment-driven (.env, loaded via python-dotenv). The app fails loudly at startup if any required key is missing.

| Variable | Required | Default | Description |
|---|---|---|---|
| OPENAI_API_KEY | Yes | | OpenAI API key for vision + messaging |
| TELEGRAM_BOT_TOKEN | Yes | | Telegram Bot API token |
| TELEGRAM_CHAT_ID | Yes | | Destination chat for alerts |
| TELEGRAM_API_URL | No | https://api.telegram.org | Override for proxies/self-host |
| SNAPSHOTS_DIR | No | ./snapshots | Watched directory |
| DB_PATH | No | ./data/monitoring.db | SQLite database path |
| CAT_NAME | No | Cat | Used to personalize notifications |
| CONSECUTIVE_FRAMES_REQUIRED | No | 3 | N: confirmations needed per meal |
| MEAL_COOLDOWN_MINUTES | No | 30 | Minimum gap between alerts |
| LLM_MODEL | No | gpt-4o-mini | Vision model |
| MESSAGING_MODEL | No | gpt-3.5-turbo | Notification-copy model |
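
The fail-loudly behavior described above could look roughly like the following. This is a sketch rather than the project's loader: the `Settings` class and `load_settings` function are hypothetical names, only a subset of the variables is shown, and treating the three variables without defaults as the required ones is an assumption. In the real app python-dotenv would populate the environment first; here a plain mapping is accepted so the logic can be exercised in isolation.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed required set: the three variables that have no default in the table.
REQUIRED = ("OPENAI_API_KEY", "TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    telegram_bot_token: str
    telegram_chat_id: str
    snapshots_dir: str
    db_path: str
    cat_name: str
    frames_required: int
    cooldown_minutes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, raising if a required key is absent."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        telegram_chat_id=env["TELEGRAM_CHAT_ID"],
        snapshots_dir=env.get("SNAPSHOTS_DIR", "./snapshots"),
        db_path=env.get("DB_PATH", "./data/monitoring.db"),
        cat_name=env.get("CAT_NAME", "Cat"),
        frames_required=int(env.get("CONSECUTIVE_FRAMES_REQUIRED", "3")),
        cooldown_minutes=int(env.get("MEAL_COOLDOWN_MINUTES", "30")),
    )
```

Reporting every missing key in a single error, instead of stopping at the first, saves a restart cycle when several values are absent from .env.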

Tech Stack

Python 3.10+ · OpenAI (GPT-4o mini vision + text) · watchdog (filesystem events) · SQLite (durable queue + analytics) · Telegram Bot API via requests. No heavy ML frameworks, no OpenCV — pure Python.


Testing

python -m pytest tests/

  • test_state_machine.py — counter, cooldown, and edge-case logic (no API calls).
  • test_llm_vision.py — runs the vision module against static sample images.

Cost

| Snapshot cadence | Daily API calls | Approx. daily cost |
|---|---|---|
| Every 1 min | ~480 | ~$0.50 |
| Every 2 min | ~240 | ~$0.25 |
| Every 5 min | ~96 | ~$0.10 |

Call counts assume an 8-hour capture window per day.
Recommended: every 2–3 minutes — the sweet spot between responsiveness and spend.
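
The figures in the cost table follow from simple arithmetic, sketched below for other cadences. The per-call cost of $0.001 is an assumption inferred from the table (about $0.50 for about 480 calls), not a quoted price, and the function names are hypothetical.

```python
# Assumed average cost of one vision call, inferred from ~$0.50 / ~480 calls.
COST_PER_CALL_USD = 0.001


def daily_calls(interval_minutes: float, window_hours: float = 8.0) -> int:
    """Number of snapshots taken in one capture window."""
    return int(window_hours * 60 / interval_minutes)


def daily_cost(interval_minutes: float, window_hours: float = 8.0) -> float:
    """Upper-bound daily spend in USD, ignoring frames skipped during cooldown."""
    return daily_calls(interval_minutes, window_hours) * COST_PER_CALL_USD
```

For example, `daily_calls(2)` gives 240 calls for an 8-hour window. Actual spend should come in somewhat lower, since frames that arrive during a cooldown are skipped without an API call.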


Roadmap

  • Two-way control: REST endpoint to trigger analysis on demand
  • Remote runtime config (adjust N, cooldown) without restart
  • Daily Telegram digest of feeding history
  • Web dashboard over the frame & meal logs
  • Motion-triggered capture to cut API cost further
  • Missed-meal alerting

Known Limitations

Single-cat scenarios only · no portion/consumption estimation · daytime-optimized (low light degrades accuracy) · capture cadence is an external concern.


License

MIT — see LICENSE.
