Sentinel is an ML data quality platform that detects leakage, imbalance, missing data patterns, and outliers before model training.

shrys1976/Sentinel

Sentinel

Sentinel is an ML data quality and model-readiness platform for tabular datasets.
Upload a CSV, run deterministic diagnostics, and get a production-style report with actionable fixes and visual diagnostics.

Highlights

  • Upload CSV datasets and run analysis asynchronously.
  • Optional target-aware analysis (target column can be provided at upload time).
  • V2 diagnostics stack:
    • missingness + structural risks
    • leakage heuristics
    • categorical / outlier checks
    • target signal diagnostics
    • lightweight model simulation
    • recommendation engine
  • V2 calibrated scoring (sentinel_score) with difficulty + modeling risk labels.
  • Visual diagnostics are generated once during analysis and persisted in DB.
  • Report page only opens after processing is complete.
  • Works for guest sessions and authenticated users (Supabase).

Tech Stack

Frontend

  • React + TypeScript + Vite
  • Tailwind CSS
  • shadcn-style component structure (frontend/components/ui)
  • Supabase JS client

Backend

  • FastAPI
  • SQLAlchemy
  • Pandas + SciPy + scikit-learn + Matplotlib
  • PostgreSQL (production) / SQLite (local fallback)
  • Background processing via FastAPI BackgroundTasks

Infra

  • Frontend: Vercel
  • Backend: Render
  • Database: Neon Postgres
  • Auth: Supabase

Repository Layout

.
├── frontend/          # Vite React app
├── backend/           # FastAPI API + analysis engine
├── render.yaml        # Render blueprint config
├── runtime.txt        # Runtime pin fallback
└── README.md

Local Development

1. Backend

cd backend
python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload

Backend URL: http://localhost:8000

2. Frontend

cd frontend
npm install
npm run dev

Frontend URL: http://localhost:5173

Environment Variables

Backend (backend/.env)

Required:

  • DATABASE_URL
  • SUPABASE_URL
  • SUPABASE_ANON_KEY
  • SUPABASE_JWT_SECRET

Optional:

  • APP_NAME (default: SentinelAI)
  • CORS_ALLOW_ORIGINS (comma-separated)
  • CORS_ALLOW_ORIGIN_REGEX
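
The variables above could be loaded and validated along the following lines. This is a minimal sketch, not the project's actual settings code: the function names `parse_cors_origins` and `load_settings` are hypothetical, and only the variable names and the `SentinelAI` default come from this README.

```python
import os

# Variable names taken from the "Required" list above.
REQUIRED_VARS = (
    "DATABASE_URL",
    "SUPABASE_URL",
    "SUPABASE_ANON_KEY",
    "SUPABASE_JWT_SECRET",
)


def parse_cors_origins(raw):
    """Split a comma-separated origin list, dropping blanks and whitespace."""
    return [origin.strip() for origin in (raw or "").split(",") if origin.strip()]


def load_settings(env=None):
    """Return a settings dict; raise if a required variable is missing.

    Hypothetical helper: the real backend may load its settings differently.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    settings["APP_NAME"] = env.get("APP_NAME", "SentinelAI")
    settings["CORS_ALLOW_ORIGINS"] = parse_cors_origins(env.get("CORS_ALLOW_ORIGINS"))
    settings["CORS_ALLOW_ORIGIN_REGEX"] = env.get("CORS_ALLOW_ORIGIN_REGEX") or None
    return settings
```

Failing fast on a missing required variable tends to surface configuration mistakes at startup rather than on the first request.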

Frontend (frontend/.env)

  • VITE_API_URL
  • VITE_SUPABASE_URL
  • VITE_SUPABASE_ANON_KEY

API Overview

Datasets

  • POST /datasets/upload
    multipart form fields: file, dataset_name, and optional target_column
  • GET /datasets
  • GET /datasets/{dataset_id}/status
  • DELETE /datasets/{dataset_id}
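
Because analysis runs asynchronously, a client typically uploads the file and then polls the status endpoint. The sketch below illustrates that flow under stated assumptions: the terminal status strings `"completed"` and `"failed"` are guesses (the README does not list the status values), and `fetch_status` is a caller-supplied function that performs the actual `GET /datasets/{dataset_id}/status` request and returns the status string.

```python
import time


def status_url(base_url, dataset_id):
    """Build the URL for GET /datasets/{dataset_id}/status."""
    return f"{base_url.rstrip('/')}/datasets/{dataset_id}/status"


def upload_form_fields(dataset_name, target_column=None):
    """Non-file multipart fields for POST /datasets/upload."""
    fields = {"dataset_name": dataset_name}
    if target_column:
        # target_column is optional; send it only when provided.
        fields["target_column"] = target_column
    return fields


def wait_for_analysis(fetch_status, dataset_id, done=("completed", "failed"),
                      interval=2.0, max_polls=60, sleep=time.sleep):
    """Poll until the status is terminal, then return it.

    The values in `done` are assumptions, not documented API values.
    """
    for _ in range(max_polls):
        status = fetch_status(dataset_id)
        if status in done:
            return status
        sleep(interval)
    raise TimeoutError(f"Dataset {dataset_id} not ready after {max_polls} polls")
```

Passing `fetch_status` and `sleep` as parameters keeps the loop independent of any particular HTTP client and makes it straightforward to exercise without a running backend.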

Reports

  • GET /reports/{dataset_id} (raw payload/status)
  • GET /reports/{dataset_id}/view (frontend view payload)

Plots

  • GET /plots/{dataset_id}/{plot_type} returns image/png
  • Plot types:
    • missing_heatmap
    • target_distribution
    • feature_importance
    • numeric_distribution
    • correlation_heatmap
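
As an illustration, a client might validate the plot type before requesting the image. This is a sketch only: `plot_url` and `fetch_plot_png` are hypothetical helper names, and the base URL is whatever your backend is served from (for example the local `http://localhost:8000`).

```python
import urllib.request

# Plot types listed above.
PLOT_TYPES = frozenset({
    "missing_heatmap",
    "target_distribution",
    "feature_importance",
    "numeric_distribution",
    "correlation_heatmap",
})


def plot_url(base_url, dataset_id, plot_type):
    """Build the URL for GET /plots/{dataset_id}/{plot_type}."""
    if plot_type not in PLOT_TYPES:
        raise ValueError(f"Unknown plot type: {plot_type!r}")
    return f"{base_url.rstrip('/')}/plots/{dataset_id}/{plot_type}"


def fetch_plot_png(base_url, dataset_id, plot_type):
    """Download one plot and return its PNG bytes."""
    with urllib.request.urlopen(plot_url(base_url, dataset_id, plot_type)) as resp:
        # The endpoint is documented as returning image/png.
        if resp.headers.get_content_type() != "image/png":
            raise RuntimeError("Expected an image/png response")
        return resp.read()
```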

Health

  • GET /health

Plot Storage Model

  • Plots are generated in the background worker once analysis finishes.
  • They are stored in the analysis_plots table, with a unique constraint on (dataset_id, plot_type).
  • They are served directly from the stored bytes; the normal request path does not regenerate plots on demand.
  • Deleting a dataset also removes its persisted plots.
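
The storage model above can be sketched with the standard-library `sqlite3` module. This is an illustration rather than the project's schema: only the table name `analysis_plots` and the uniqueness of (dataset_id, plot_type) come from this README, while the column names, the `png_bytes` column, and the helper functions are assumptions. The real backend uses SQLAlchemy.

```python
import sqlite3


def create_schema(conn):
    """Create a plots table with one row per (dataset_id, plot_type)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS analysis_plots (
            id INTEGER PRIMARY KEY,
            dataset_id TEXT NOT NULL,
            plot_type TEXT NOT NULL,
            png_bytes BLOB NOT NULL,      -- assumed column name
            UNIQUE (dataset_id, plot_type)
        )
        """
    )


def save_plot(conn, dataset_id, plot_type, png_bytes):
    """Persist a plot; a second save for the same key replaces the first."""
    conn.execute(
        "INSERT OR REPLACE INTO analysis_plots (dataset_id, plot_type, png_bytes) "
        "VALUES (?, ?, ?)",
        (dataset_id, plot_type, png_bytes),
    )


def load_plot(conn, dataset_id, plot_type):
    """Return the stored PNG bytes, or None if no plot was persisted."""
    row = conn.execute(
        "SELECT png_bytes FROM analysis_plots WHERE dataset_id = ? AND plot_type = ?",
        (dataset_id, plot_type),
    ).fetchone()
    return row[0] if row else None


def delete_dataset_plots(conn, dataset_id):
    """Remove every persisted plot for a dataset."""
    conn.execute("DELETE FROM analysis_plots WHERE dataset_id = ?", (dataset_id,))
```

The unique constraint is what makes "generate once, serve many times" safe: re-running analysis overwrites the existing row instead of accumulating duplicates.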

Deployment

Frontend (Vercel)

  • Root directory: frontend
  • Build command: npm run build
  • Output directory: dist

Set:

  • VITE_API_URL=https://<your-render-backend>.onrender.com
  • VITE_SUPABASE_URL=https://<your-project-ref>.supabase.co
  • VITE_SUPABASE_ANON_KEY=<anon-key>

Backend (Render)

Use render.yaml (recommended) or set manually:

  • Root directory: backend
  • Build command: python -m pip install --upgrade pip && pip install --only-binary=:all: -r requirements.txt
  • Start command: uvicorn app.main:app --host 0.0.0.0 --port $PORT

Set:

  • DATABASE_URL=postgresql+psycopg2://...
  • SUPABASE_URL
  • SUPABASE_ANON_KEY
  • SUPABASE_JWT_SECRET
  • CORS_ALLOW_ORIGINS=https://<your-vercel-domain>

Troubleshooting

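  • ERR_CERT_COMMON_NAME_INVALID (Supabase): check that VITE_SUPABASE_URL exactly matches your project URL (https://<your-project-ref>.supabase.co).
  • CORS blocked from Vercel: check that CORS_ALLOW_ORIGINS on the backend includes your Vercel domain, then redeploy the backend with the latest CORS fixes.
  • Render SciPy build failures: use the latest requirements.txt and the wheel-only install command (pip install --only-binary=:all: -r requirements.txt).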
