Live Demo: https://dts.simonmilata.com/
A serverless app that collects, aggregates, and visualizes historical GitHub statistics for data-related repositories. The project focuses on end-to-end API design, serverless architecture, and cost-constrained cloud deployment.
- Deploy a public-facing API using FastAPI and AWS API Gateway.
- Build an interactive frontend to visualize historical trends.
- Implement a simple but realistic data ingestion and aggregation pipeline.
- Keep the entire system free or as close to free as possible.
- Understand architectural tradeoffs in a small production system.
(EventBridge → Lambda → GitHub APIs → S3 → aggregation Lambda → S3)
- Scheduled Extract Lambda fetches GitHub data
- Raw snapshots are stored in S3 (partitioned by date)
- Scheduled Aggregation Lambda produces weekly / monthly datasets
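
The ingestion steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the key layout, the `stars` field, the repository name, and both function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def raw_key(repo: str, day: date) -> str:
    """Build a date-partitioned S3 key for one raw snapshot (hypothetical layout)."""
    safe_repo = repo.replace("/", "__")
    return f"raw/year={day:%Y}/month={day:%m}/day={day:%d}/{safe_repo}.json"


def weekly_max_stars(snapshots: list[dict]) -> dict[tuple[str, str], int]:
    """Collapse daily snapshots into one value per (repo, ISO week)."""
    weekly: dict[tuple[str, str], int] = defaultdict(int)
    for snap in snapshots:
        iso = date.fromisoformat(snap["date"]).isocalendar()
        week = f"{iso[0]}-W{iso[1]:02d}"
        key = (snap["repo"], week)
        # Keep the highest star count seen during the week.
        weekly[key] = max(weekly[key], snap["stars"])
    return dict(weekly)


# Hypothetical daily snapshots, as the extract step might store them.
snapshots = [
    {"repo": "example-org/example-repo", "date": "2024-01-01", "stars": 100},
    {"repo": "example-org/example-repo", "date": "2024-01-03", "stars": 104},
    {"repo": "example-org/example-repo", "date": "2024-01-08", "stars": 110},
]

print(raw_key("example-org/example-repo", date(2024, 1, 3)))
print(weekly_max_stars(snapshots))
```

Partitioning keys by date keeps each day's snapshot immutable, so the aggregation step can be re-run at any time without touching the raw data.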
(Client → Cloudflare → API Gateway → Lambda → S3)
- Cloudflare handles DNS, SSL termination, and basic protection
- API Gateway routes requests to API Lambda
- FastAPI runs inside Lambda using Mangum
- API Lambda reads pre-aggregated data from S3
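
The read path can be illustrated with a small helper that a FastAPI route might delegate to. This is a sketch under stated assumptions: the `aggregates/<period>.json` key layout, the function names, and the sample payload are hypothetical, and the S3 call is replaced by an injectable `fetch` function so the example runs without AWS credentials.

```python
import json
from typing import Callable

ALLOWED_PERIODS = {"weekly", "monthly"}


def aggregate_key(period: str) -> str:
    """Map a requested period to the S3 key of its pre-aggregated file (hypothetical layout)."""
    if period not in ALLOWED_PERIODS:
        raise ValueError(f"unsupported period: {period!r}")
    return f"aggregates/{period}.json"


def load_aggregate(period: str, fetch: Callable[[str], bytes]) -> list[dict]:
    """Fetch and decode one pre-aggregated dataset.

    In Lambda, `fetch` could wrap boto3, for example:
        lambda key: s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    """
    return json.loads(fetch(aggregate_key(period)))


# In-memory stand-in for the bucket, used only for this example.
fake_store = {
    "aggregates/weekly.json": b'[{"repo": "example-org/example-repo", "week": "2024-W01", "stars": 104}]',
}

print(load_aggregate("weekly", fake_store.__getitem__))
```

Because the API only reads files that already exist, each request amounts to a single object fetch and a JSON decode, with no query or aggregation work at request time.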
- Serverless (Lambda + API Gateway): Chosen for scale-to-zero capabilities. With only tens of daily invocations, a dedicated server would sit 99% idle; Lambda incurs zero cost when inactive.
- S3 as Data Store: The "Write-Once-Read-Many" access pattern makes S3 (about $0.023 per GB-month) significantly cheaper than maintaining a database.
- Cloudflare: Acts as the entry point for DNS, SSL, and basic bot protection. This lets the project avoid AWS Route 53 hosted zone fees ($0.50/mo).
- Compute (Lambda): Monthly usage is ~8,610 GB-s, which is <3% of the 400,000 GB-s free monthly allowance.
- Storage (S3): The dataset grows by ~1 MB/day (raw snapshots + aggregates). Even as it grows, the storage cost is estimated at a few cents per month for the first few years.
- API Gateway: At current volumes (~1,800 requests/month), the cost is estimated at <$0.002/month.
- Cloudflare: The Free Tier covers DNS and SSL termination, so the edge layer adds nothing to the recurring cost.
While S3 storage and API calls technically accrue a few cents as the dataset grows, AWS typically waives these, resulting in a net cost of $0.00.
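
The estimates above can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in this section, reads the S3 price as per GB-month, and ignores request charges. The function name and the 1 GB = 1000 MB convention are assumptions made for the example.

```python
LAMBDA_FREE_GB_SECONDS = 400_000   # monthly free allowance quoted above
LAMBDA_USED_GB_SECONDS = 8_610     # monthly usage quoted above
S3_USD_PER_GB_MONTH = 0.023        # storage price quoted above


def s3_monthly_cost(days_of_data: int, mb_per_day: float = 1.0) -> float:
    """Estimated monthly storage bill once `days_of_data` days have accumulated."""
    stored_gb = days_of_data * mb_per_day / 1000
    return stored_gb * S3_USD_PER_GB_MONTH


lambda_share = LAMBDA_USED_GB_SECONDS / LAMBDA_FREE_GB_SECONDS
print(f"Lambda free-tier share: {lambda_share:.2%}")

for years in (1, 2, 3):
    cost = s3_monthly_cost(365 * years)
    print(f"S3 after {years} year(s): ${cost:.4f}/month")
```

At roughly 1 MB/day, three years of data is about 1.1 GB, so the storage line stays in the low cents per month.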
Backend & API
Data Engineering
AWS Infrastructure
Edge & DNS
- Not designed for high write volume or real-time updates
- No database (by design)
- Focused on clarity and cost efficiency over scale
- Data collection began at deployment, so there is no backfill; historical trends accumulate from that point on.
This is a backend-centric project. I used AI to build the UI so I could focus entirely on the data engineering, serverless architecture, and API logic.
