EcoBrief

A distributed serverless AI news oracle.

EcoBrief is a distributed AI inference platform designed for high performance, low overhead, and near-zero cost. It curates web data automatically, summarizes it with Small Language Models (SLMs), and delivers personalized audio briefings through a minimalist web interface.

Built to emphasize hands-on control over MLOps and Cloud FinOps, the architecture centers on a strict "Scale-to-Zero" model: compute exists only while a workload is running, which keeps AWS operating costs close to zero.

🚀 Key Features

Automated Data Ingestion: A Java-based pipeline scrapes, parses, and sanitizes RSS feeds and third-party APIs, then writes the results to Amazon S3.
Hyper-Constrained AI Inference: Ray Serve and llama-cpp-python orchestrate text summarization with a quantized 3-billion-parameter language model (e.g., Llama-3.2-3B) that fits within a 2 GB RAM ephemeral environment.
Active Cloud FinOps (Scale-to-Zero): Boto3 automation scripts boot lean EC2 instances only for the duration of an inference workload, then terminate them immediately to prevent idle billing.
Audio Generation & Delivery: Amazon Polly synthesizes human-like MP3 audio briefings, which are served to users through an interactive React frontend.
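
To make the 2 GB inference constraint concrete, here is a rough back-of-the-envelope estimate of the memory a quantized 3B model might need. It is a sketch, not a measurement: the bits-per-weight figure, context length, and the layer, KV-head, and head-dimension counts are assumptions chosen for illustration, and the helper names are hypothetical rather than taken from the repository.

```python
GIB = 1024 ** 3


def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value=2):
    """Approximate KV cache size: one key and one value vector
    per layer, per KV head, per token in the context window."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value


def fits(total_bytes, budget_bytes):
    """True if the estimated footprint is within the memory budget."""
    return total_bytes <= budget_bytes


if __name__ == "__main__":
    # All figures below are illustrative assumptions, not project settings.
    weights = weight_bytes(n_params=3_000_000_000, bits_per_weight=4.5)
    kv = kv_cache_bytes(n_layers=28, n_kv_heads=8, head_dim=128, n_ctx=2048)
    total = weights + kv
    print(f"weights:  {weights / GIB:.2f} GiB")
    print(f"kv cache: {kv / GIB:.2f} GiB")
    print(f"total:    {total / GIB:.2f} GiB, fits in 2 GiB: {fits(total, 2 * GIB)}")
```

Under these assumed figures the model and cache come to roughly 1.8 GiB, which leaves only about 0.2 GiB for the operating system, Ray Serve, and the Python runtime. Actual headroom depends on the quantization format and context length in use, so the numbers are worth re-running against the real model file.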

🛠️ Technology Stack

Application & API

Frontend Library: React, Vite
Styling: Tailwind CSS
Core API: FastAPI (Python)

Data & Machine Learning

Big Data Processing: Java
Inference Orchestrator: Ray Serve
Model Pipeline: llama-cpp-python
Language Models: Quantized 3B Parameter Models

Cloud Infrastructure

Provider: AWS (Free Tier emphasized)
Storage & Services: Amazon S3, Amazon Polly
Compute & Automation: ARM EC2 Instances, Boto3

🏗️ Technical Architecture & Pipeline

Ingest & Prep: Scheduled Java applications fetch raw news content, sanitize the payload, and store it in S3 buckets.
Autonomous Spawn: A FastAPI microservice triggers an LRU (Least Recently Used) cache/multiplexer script, which uses Boto3 to boot a dedicated EC2 instance when a request arrives.
Inference & Audio: The EC2 instance loads the quantized 3B model into RAM, generates the briefing transcript, and calls Amazon Polly to synthesize the final MP3 briefing.
Auto-Termination: As soon as the audio files are generated and saved back to S3, the EC2 instance terminates itself, returning compute costs to zero.
User Interface: The React frontend retrieves the links so users can read the transcript and play the MP3 briefing in the browser.
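
The spawn and terminate steps of this pipeline can be sketched as a context manager around a Boto3 EC2 client. This is an illustration under stated assumptions, not the repository's orchestrator: the function name `ephemeral_instance`, the `t4g.small` instance type, and the caller-side termination are all hypothetical choices. The pipeline above describes the instance terminating itself; the sketch terminates from the caller as a safety net, and sets `InstanceInitiatedShutdownBehavior="terminate"` so that a shutdown issued from inside the instance also terminates it.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_instance(ec2, image_id, instance_type="t4g.small"):
    """Boot one EC2 instance, yield its ID, and always terminate it.

    `ec2` is expected to behave like a Boto3 EC2 client. The default
    instance type is an assumption made for this sketch.
    """
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        # An OS-level shutdown inside the instance terminates it
        # instead of leaving it in the stopped state.
        InstanceInitiatedShutdownBehavior="terminate",
    )
    instance_id = response["Instances"][0]["InstanceId"]
    try:
        # Block until EC2 reports the instance as running.
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        yield instance_id
    finally:
        # Runs on success, on waiter failure, and on errors in the
        # caller's block, so a launched instance is never left behind.
        ec2.terminate_instances(InstanceIds=[instance_id])
```

With real credentials, `ec2` would typically be `boto3.client("ec2")`, and the inference work would run inside `with ephemeral_instance(ec2, image_id) as instance_id:`. Taking the client as a parameter keeps the function testable with a stub and avoids any AWS calls at import time.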

📖 Operations & Runbook

For complete instructions on running the system locally with Docker Compose, deployment guidelines, and emergency playbooks for scale-to-zero telemetry failures, see RUNBOOK.md.

