"Perfectly balanced, like all things should be."
Thanos is a serverless, multi-tenant AWS compliance platform that continuously monitors AWS resources for security misconfigurations, quantifies configuration drift, and surfaces actionable findings through a real-time dashboard. It is purpose-built on AWS-native serverless services — Lambda, DynamoDB, API Gateway, Cognito, S3, and SNS — requiring zero servers to manage and scaling automatically from a single account to thousands.
Configuration drift quietly snowballs into outages and security breaches. A "temporary" firewall rule, a misconfigured S3 bucket, an overly permissive IAM policy: left undetected, these become exploitable gaps and compliance violations. Thanos makes drift visible and quantifiable: every resource gets a drift score (0.0 = compliant, 1.0 = fully drifted), every deviation is categorized by severity, and hierarchical policies let teams define what "good" looks like at the org, group, and instance level.
Key differentiators vs. AWS Security Hub / Prisma Cloud:
- Per-resource drift scores rather than binary pass/fail
- Hierarchical desired-state modeling — base configs with group-level overrides, not ad-hoc rule exceptions
- AI-native interface via Model Context Protocol (MCP) — query your entire infrastructure in natural language
- Complete resource inventory — track ALL resources, not only the non-compliant ones
- Pay-per-use serverless — starts at ~$1.60/month for light usage
⏱ Jump to 6:00 – 11:00 for the live platform walkthrough (dashboard, scan, findings, MCP demo).
flowchart LR
subgraph Users["Users"]
direction TB
Admin["Admin"]
NewCust["New Customer"]
AiBot["AI Assistant"]
end
subgraph AuthGW["Auth & API Gateway"]
direction TB
Cognito["AWS Cognito\nUser Pools · JWT"]
ApiGW["API Gateway HTTP v2\nJWT Authorizer"]
end
subgraph Functions["Lambda Functions"]
direction TB
ScanFn["scan_handler\nFull scan lifecycle"]
QueryFn["findings · resources\nmetrics handlers"]
AdminFn["config · groups\ncustomers · registration"]
McpFn["MCP Server\nAI tool interface"]
end
subgraph DataLayer["Data & Notifications"]
direction TB
Dynamo["DynamoDB\nfindings · resources\nconfigs · customers"]
S3Snap["S3 Snapshots\nAES-256 encrypted"]
AlertSNS["SNS\nEmail Alerts"]
end
subgraph CustomerAcc["Customer AWS Account"]
direction TB
ReadRole["Read-Only IAM Role\nCloudFormation deployed"]
AwsRsrc["S3 · IAM · EC2\nSG · RDS · Lambda"]
end
Admin --> Cognito
NewCust --> ApiGW
AiBot --> McpFn
Cognito -->|"JWT Token"| ApiGW
McpFn --> ApiGW
ApiGW --> ScanFn
ApiGW --> QueryFn
ApiGW --> AdminFn
ScanFn -->|"STS AssumeRole + ExternalID"| ReadRole
ReadRole -->|"read-only"| AwsRsrc
AwsRsrc -->|"resource data"| ScanFn
ScanFn --> S3Snap
ScanFn --> Dynamo
ScanFn -->|"HIGH severity"| AlertSNS
QueryFn --> Dynamo
AdminFn --> Dynamo
| Step | What Happens |
|---|---|
| 1. Trigger | Admin selects tenant + regions → POST /scan hits API Gateway |
| 2. Auth | scan_handler calls STS AssumeRole with ExternalID to get temp credentials |
| 3. Collect | Parallel API calls across regions collect S3, IAM, EC2, SG, RDS, Lambda configs |
| 4. Snapshot | Full resource list (2–5 MB JSON) written to S3 for audit history |
| 5. Evaluate | Each resource is compared against its merged hierarchical desired config |
| 6. Score | Drift score computed: min(1.0, differences / 10). Compliant = 0.0 |
| 7. Store | Findings + resources written to DynamoDB; HIGH severity triggers SNS email |
| 8. Display | Dashboard updates in real time with compliance %, severity breakdown, findings |
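As an illustration of the Score step, here is a minimal Python sketch of the formula from the table, min(1.0, differences / 10). The helper `count_differences` and the sample config keys are assumptions made for this example; the project's actual evaluation logic lives in lambdas/common/eval.py and may count deviations differently.

```python
def count_differences(desired: dict, actual: dict) -> int:
    """Count leaf settings in `desired` that `actual` does not match.

    Hypothetical helper: nested dicts are walked recursively, and a missing
    key in `actual` counts as one difference per desired leaf beneath it.
    """
    count = 0
    for key, want in desired.items():
        have = actual.get(key)
        if isinstance(want, dict):
            count += count_differences(want, have if isinstance(have, dict) else {})
        elif have != want:
            count += 1
    return count


def drift_score(differences: int, max_differences: int = 10) -> float:
    """Map a difference count to a score in [0.0, 1.0], as in the Score step."""
    if differences < 0:
        raise ValueError("differences must be non-negative")
    return min(1.0, differences / max_differences)


# Sample desired and observed configs (illustrative keys, not the real schema).
desired = {"versioning": True, "encryption": {"enabled": True, "algorithm": "aws:kms"}}
actual = {"versioning": False, "encryption": {"enabled": True, "algorithm": "AES256"}}

score = drift_score(count_differences(desired, actual))
```

With this formula the score saturates at ten differences, so a resource with 10 deviations and one with 50 both report 1.0.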
| Layer | Technology |
|---|---|
| Frontend | React 18, TypeScript, Vite, TailwindCSS, shadcn/ui |
| Backend | AWS Lambda (Python 3.12), 9 functions |
| API | AWS API Gateway HTTP v2, JWT authorization |
| Auth | AWS Cognito User Pools |
| Database | Amazon DynamoDB (4 tables, on-demand) |
| Storage | Amazon S3 (snapshots + static hosting) |
| Alerts | Amazon SNS (email notifications) |
| AI Integration | Model Context Protocol (MCP) — SSE + stdio |
| IaC | Terraform |
terraform --version # >= 1.0
python3 --version # >= 3.12
node --version # >= 18
aws configure # AWS credentials with admin access
git clone https://github.com/manuvikash/thanos.git
cd thanos
make tf-init # First time only
make tf-apply # Deploy all AWS resources (~3 minutes)
# Retrieve admin credentials
cd infra && terraform output -raw admin_temporary_password
Default admin login: admin@example.com
make web-dev
# Dashboard at http://localhost:3001
- Navigate to Register (no login required)
- Enter AWS Account ID + select regions
- Click Create Role via CloudFormation — opens AWS Console to deploy the read-only IAM role
- Return to the page and click Verify & Save
- From the dashboard, select the tenant and click Run Scan
Thanos exposes 7 tools via Model Context Protocol, letting AI assistants query your infrastructure in natural language.
| Tool | Description |
|---|---|
| list_resources | Query resources with compliance status and drift scores |
| get_findings | Retrieve security violations, filtered by severity/type |
| get_dashboard_metrics | Compliance trends and scan history |
| trigger_scan | Initiate a new scan for any tenant |
| list_customers | List all registered tenants |
| get_rules | View active compliance rules |
| search_violations | Full-text search across all findings |
- Generate an API key from Dashboard → MCP Settings
- Add to claude_desktop_config.json:
{
"mcpServers": {
"thanos": {
"url": "https://your-mcp-lambda-url.amazonaws.com",
"headers": { "x-api-key": "thanos_mcp_your_key_here" }
}
}
}
Example queries:
"Show me all HIGH severity findings for customer-prod"
"What's the drift score for S3 buckets in us-east-1?"
"List all security groups allowing SSH from 0.0.0.0/0"
"Trigger a scan for acme-staging in eu-west-1"
See mcp/README.md for full setup including Gemini CLI.
thanos/
├── infra/ # Terraform — all AWS infrastructure
│ ├── main.tf # Provider + backend config
│ ├── lambda*.tf # Lambda function definitions (9 functions)
│ ├── dynamodb*.tf # DynamoDB tables (findings, resources, configs, customers)
│ ├── api*.tf # API Gateway routes and integrations
│ ├── cognito.tf # Cognito User Pool + app client
│ ├── s3.tf # Snapshot bucket + web hosting bucket
│ ├── sns.tf # Alert topic + email subscription
│ └── customer-onboarding-role.yaml # CloudFormation template for customer IAM role
│
├── lambdas/ # Python backend
│ ├── common/ # Shared libraries used by all handlers
│ │ ├── eval.py # Drift scoring + compliance evaluation engine
│ │ ├── config_merger.py # Hierarchical config deep-merge algorithm
│ │ ├── normalize.py # AWS resource config normalization
│ │ ├── resource_inventory.py # Cross-account AWS resource collection
│ │ ├── ddb.py # DynamoDB helpers
│ │ └── models.py # Shared data models
│ ├── scan_handler/ # Core: orchestrates full scan lifecycle
│ ├── findings_handler/ # Query and filter security findings
│ ├── resources_handler/ # Query resource inventory
│ ├── config_handler/ # CRUD for base configurations
│ ├── groups_handler/ # CRUD for resource groups + selectors
│ ├── customers_handler/ # Tenant management
│ ├── metrics_handler/ # Dashboard KPIs and compliance trends
│ ├── registration_handler/ # Customer self-service onboarding
│ ├── alerts_config_handler/ # Alert threshold configuration
│ ├── mcp_server/ # AI tool server (SSE transport for Lambda)
│ └── authorizer/ # Custom JWT authorizer (fallback)
│
├── web/ # React frontend
│ └── src/
│     ├── pages/ # Route-level views (Dashboard, Findings, Config, MCP…)
│     ├── components/ # Reusable UI components
│     ├── hooks/ # Custom React hooks (scan logic, metrics, toast)
│     └── api.ts # Typed API client
│
├── mcp/ # Local MCP server (stdio transport for Claude Desktop)
├── docs/ # Project documentation
│ └── Final_Report.pdf # Full technical report
├── Makefile # Build, deploy, and dev shortcuts
└── README.md
# Infrastructure
make tf-plan # Preview infrastructure changes
make tf-apply # Deploy to AWS
make tf-destroy # Tear down all resources
# Frontend
make web-dev # Start dev server (localhost:3001)
make web-build # Production build
Thanos uses a two-tier hierarchical model for desired-state configuration:
- Base Config — default desired state for all resources of a type (e.g., all S3 buckets must have versioning enabled)
- Resource Groups — tag/ARN/name-pattern matched overrides with numeric priority (e.g., production buckets also require KMS encryption)
During a scan, configs are deep-merged (base → groups by priority) to produce the final desired state per resource. Deviations generate findings and contribute to the drift score.
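The merge described above can be sketched in Python as follows. This is an illustrative sketch, not the code in lambdas/common/config_merger.py: the `priority` and `config` field names are hypothetical, and it assumes that groups with a higher priority number are applied later and therefore win on conflicting keys.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` is merged into `base`, key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge recursively instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def desired_state(base_config: dict, matching_groups: list) -> dict:
    """Apply group overrides on top of the base config in priority order.

    Assumption: a larger `priority` value is applied later, so it wins.
    """
    merged = deepcopy(base_config)
    for group in sorted(matching_groups, key=lambda g: g["priority"]):
        merged = deep_merge(merged, group["config"])
    return merged


# Base config for all S3 buckets, plus two groups that match one bucket.
base = {"versioning": True, "encryption": {"enabled": True}}
groups = [
    {"priority": 20, "config": {"encryption": {"algorithm": "aws:kms"}}},
    {"priority": 10, "config": {"encryption": {"algorithm": "AES256"}, "public_access": False}},
]

final_config = desired_state(base, groups)
```

Because the merge is recursive, the production group only needs to state the keys it changes; `encryption.enabled` from the base config survives the override.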
| Scale | Customers | Scans/day | Monthly Cost |
|---|---|---|---|
| Light | 10 | 2×/customer | ~$1.60 |
| Medium | 100 | 4×/customer | ~$90 |
| Heavy | 1,000 | 8×/customer | ~$4,400 |
Primary cost drivers at scale: DynamoDB writes and CloudWatch Logs ingestion. See docs/Final_Report.pdf for full cost breakdown and optimization strategies.
| Resource | Description |
|---|---|
| Full Technical Report | Architecture deep-dive, cost analysis, roadmap, challenges |
| MCP Integration Guide | AI assistant setup for Claude Desktop and Gemini CLI |
| MCP Troubleshooting | Common MCP issues and fixes |
- Cross-account access: STS AssumeRole with ExternalId prevents confused deputy attacks. The role is read-only (SecurityAudit + ViewOnlyAccess) with no write permissions
- Auth: Cognito JWT on every API request, 1-hour token expiry, MFA-capable
- Encryption: AES-256 at rest (DynamoDB + S3), TLS 1.2+ in transit
- Tenant isolation: All DynamoDB keys and S3 prefixes are scoped by tenant_id
- Least privilege: Separate IAM execution role per Lambda function
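To make the cross-account access pattern concrete, here is a hedged Python sketch of an AssumeRole call that passes an external ID. The function name, the default role name `ThanosReadOnlyRole`, and the session name are hypothetical; the STS client is passed in as a parameter (in a Lambda it would typically come from `boto3.client("sts")`), which also keeps the sketch testable without AWS credentials.

```python
def assume_read_only_role(
    sts_client,
    account_id: str,
    external_id: str,
    role_name: str = "ThanosReadOnlyRole",  # hypothetical role name
    session_name: str = "thanos-scan",      # hypothetical session name
) -> dict:
    """Exchange the platform's identity for temporary customer-account credentials.

    The ExternalId must match the value in the customer role's trust policy,
    otherwise STS rejects the call. That check is what guards against the
    confused deputy problem.
    """
    response = sts_client.assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName=session_name,
        ExternalId=external_id,
    )
    creds = response["Credentials"]
    # Keyword names match what boto3 clients and sessions accept.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dict can be unpacked into a boto3 client for the customer account, for example `boto3.client("s3", region_name=region, **credentials)`, so every read in a scan runs under the temporary read-only credentials.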
"The hardest choices require the strongest wills." — Keep your cloud infrastructure secure, one scan at a time.
Built by Manuvikash Saravanakumar, Mrunal Suhas Kotkar, and Vishwesh Krishna Hariharakrishnan — SJSU Cloud Computing, 2025