LabelFleet

License: Apache 2.0 · Node.js 20+

Open-source fleet manager for Label Studio Community Edition. Provisions one isolated Label Studio container per annotator with automated data distribution, multi-stage workflow chaining, rejection routing, gold-standard QA, and real-time monitoring.

The Problem

Label Studio CE has no per-user role-based access control. Every annotator sees every project. There is no automated data distribution, no workflow orchestration, and no built-in quality scoring at scale. Managing annotation teams of 10, 20, or 100+ people is not feasible out of the box.

Key Features

  • Workflow chaining — Chain workflows into multi-stage pipelines: labeler → reviewer → expert. Upstream annotations flow downstream as Label Studio predictions.
  • Rejection routing — Reviewers reject individual tasks; rejected work is archived to S3, routed to a specialist workflow, or returned to the original annotator, with configurable max-rework-iterations to prevent loops.
  • Per-annotator container isolation — Each annotator gets their own Label Studio instance with a dedicated database, subdomain, and API token. True data isolation, not just permission filtering.
  • Gold-standard QA — Inject known-good tasks into batches at a configurable rate. Automatic accuracy scoring (IoU for bounding boxes, exact match for classification, partial credit for multi-label). Quality alerts when accuracy drops below threshold.
  • Full REST API (agentic-ready) — 35+ endpoints with pagination, filtering, and bulk operations. An agent or CI/CD pipeline can programmatically create annotators, start containers, assign workflows, track assignments, query rejections, monitor progress, and check quality. See API Reference below and full schemas in docs/API.md.
  • Automated data distribution — S3-based batch assignment engine with auto-assignment on completion, image shuffling, gold task injection, and already-assigned exclusion.
  • Workflow DAG visualization — God's-eye-view SVG graph of your entire workflow network with accept/reject edges, terminal archive nodes, and live throughput indicators.
  • Real-time monitoring — Live throughput charts, time-per-annotation histograms, completion rates, and per-annotator drill-downs.
  • Labeling config templates — Ships with 18 pre-built templates (classification, detection, segmentation, NER, LLM evaluation, audio/video, chat, review-decision). Paste XML from Label Studio's editor or create custom configs.
  • Cost-efficient — Runs on Docker Compose locally or ECS Fargate Spot in production.
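
The gold-standard scoring mentioned above (IoU for bounding boxes, exact match for classification, partial credit for multi-label) can be sketched roughly as follows. This is an illustration under stated assumptions, not LabelFleet's actual code: the names `Box`, `iou`, `exactMatch`, `multiLabelScore`, and `belowThreshold` are hypothetical, and the partial-credit rule is assumed here to be Jaccard overlap between label sets, which the README does not specify.

```typescript
// Hypothetical scoring helpers. Names and the partial-credit formula are
// assumptions made for illustration only.

/** Axis-aligned bounding box: top-left corner plus width and height. */
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

/** Intersection over Union of two boxes, in the range [0, 1]. */
function iou(a: Box, b: Box): number {
  const left = Math.max(a.x, b.x);
  const top = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  const intersection = Math.max(0, right - left) * Math.max(0, bottom - top);
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

/** Exact match for single-label classification: 1 if equal, otherwise 0. */
function exactMatch(predicted: string, gold: string): number {
  return predicted === gold ? 1 : 0;
}

/** Partial credit for multi-label tasks: |A ∩ B| / |A ∪ B| (assumed rule). */
function multiLabelScore(predicted: string[], gold: string[]): number {
  const p = new Set(predicted);
  const g = new Set(gold);
  let shared = 0;
  p.forEach((label) => {
    if (g.has(label)) shared += 1;
  });
  const unionSize = p.size + g.size - shared;
  // Two empty label sets agree completely.
  return unionSize > 0 ? shared / unionSize : 1;
}

/** True when the mean score falls below the alert threshold. */
function belowThreshold(scores: number[], threshold: number): boolean {
  if (scores.length === 0) return false;
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return mean < threshold;
}
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so `iou` returns 1/7; a production scorer would also need to match predicted boxes to gold boxes before comparing them.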

Quick Start (local)

Prerequisites: Docker Desktop (or Docker Engine + Compose v2), Python 3, Git

git clone https://github.com/goodmorningcoffee/labelfleet.git
cd labelfleet
python3 setup.py local

The setup wizard checks prerequisites, builds images, starts services, runs all database migrations, seeds roles/templates, and creates an admin user. It will guide you if anything is missing.

python3 setup.py stop      # Stop services (keeps data)
python3 setup.py status    # Check health

You can also use the lower-level ./scripts/setup.sh directly.

Try the demo

Visit http://localhost:3000/about and click Try Demo to browse the admin UI with seeded demo data (no containers provisioned).

Architecture

┌─────────────┐   ┌──────────┐   ┌───────────────┐
│  Admin UI   │   │ Traefik  │   │ Label Studio  │
│  (Next.js)  │───┤  Router  ├───┤  (per user)   │
└──────┬──────┘   └──────────┘   └───────┬───────┘
       │                                 │
       │          ┌──────────┐           │
       └──────────┤ Postgres ├───────────┘
                  └──────────┘
                       │
                  ┌──────────┐
                  │ S3/local │   (images + exports)
                  └──────────┘
  • Admin app — Next.js 16 + React 19 + TypeScript. Handles workflow orchestration, assignment routing, webhook processing, quality scoring.
  • Container orchestrator — Pluggable ContainerOrchestrator interface. DockerOrchestrator for local dev, ECSOrchestrator for AWS production. Selected via ORCHESTRATOR_BACKEND env var.
  • Database — PostgreSQL + Drizzle ORM. Shared cluster; one database per Label Studio instance.
  • Traefik — Reverse proxy for per-annotator subdomains (alice.fleet.localhost, bob.fleet.localhost).
  • Storage — S3 + CloudFront for production; local filesystem fallback for dev.
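
The pluggable orchestrator described above might look something like the sketch below. Only the names `ContainerOrchestrator`, `DockerOrchestrator`, `ECSOrchestrator`, and `ORCHESTRATOR_BACKEND` (values `docker` or `ecs`, default `docker`) come from this README. The method names, the `ContainerStatus` type, the in-memory `StubOrchestrator` base class, and `createOrchestrator` are assumptions; the classes here are stand-ins that do not call Docker or ECS.

```typescript
// Sketch only: method names and the in-memory stubs are assumptions,
// not LabelFleet's real interface.

type ContainerStatus = "stopped" | "starting" | "running";

interface ContainerOrchestrator {
  start(annotator: string): Promise<void>;
  stop(annotator: string): Promise<void>;
  status(annotator: string): Promise<ContainerStatus>;
}

/** Stand-in backend that tracks state in memory instead of calling Docker or ECS. */
class StubOrchestrator implements ContainerOrchestrator {
  readonly backend: string;
  private readonly running = new Set<string>();

  constructor(backend: string) {
    this.backend = backend;
  }

  async start(annotator: string): Promise<void> {
    this.running.add(annotator);
  }

  async stop(annotator: string): Promise<void> {
    this.running.delete(annotator);
  }

  async status(annotator: string): Promise<ContainerStatus> {
    return this.running.has(annotator) ? "running" : "stopped";
  }
}

class DockerOrchestrator extends StubOrchestrator {
  constructor() {
    super("docker");
  }
}

class ECSOrchestrator extends StubOrchestrator {
  constructor() {
    super("ecs");
  }
}

/** Picks a backend from the ORCHESTRATOR_BACKEND value, defaulting to docker. */
function createOrchestrator(backend: string | undefined): ContainerOrchestrator {
  switch (backend ?? "docker") {
    case "docker":
      return new DockerOrchestrator();
    case "ecs":
      return new ECSOrchestrator();
    default:
      throw new Error(`Unknown ORCHESTRATOR_BACKEND: ${backend}`);
  }
}
```

Keeping the rest of the admin app dependent only on the interface is what would let local development and AWS production share the same workflow code.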

AWS Production Architecture

Internet → ALB (HTTPS, *.example.com)
             ├── admin.example.com → Fleet Admin (ECS Fargate, port 3000)
             └── {name}.example.com → Label Studio containers (ECS Fargate Spot, port 8080)

Fleet Admin → RDS PostgreSQL (shared, separate DB per annotator)
Fleet Admin → S3 (images, exports, gold standard data)
Fleet Admin → ECS API (creates/manages LS containers dynamically)

See DEPLOYMENT.md for full AWS deployment instructions with Terraform.

API Reference

LabelFleet is fully API-driven. Every admin operation is available as a REST endpoint, making it straightforward to integrate with external agents, automation tools, or CI/CD pipelines. All list endpoints support ?limit=N&offset=N pagination. All endpoints except /api/health, /api/version, and /api/webhook require NextAuth session authentication.

Full request/response schemas: docs/API.md
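
As a rough sketch of the `?limit=N&offset=N` pagination described above, a client could page through a list endpoint like this. The helper names `pageUrl` and `listAll` are hypothetical, and the code assumes each page comes back as a plain JSON array and that a short page marks the end; check docs/API.md for the real response shape. Authentication is left to the caller-supplied `fetchPage` function.

```typescript
// Sketch only: the response shape (a plain array per page) and the stop
// condition are assumptions, not taken from docs/API.md.

/** Builds a list-endpoint URL such as /api/annotators?limit=50&offset=100. */
function pageUrl(path: string, limit: number, offset: number): string {
  return `${path}?limit=${limit}&offset=${offset}`;
}

/**
 * Collects every item from a paginated list endpoint.
 * `fetchPage` performs the authenticated request and returns one page.
 */
async function listAll<T>(
  path: string,
  fetchPage: (url: string) => Promise<T[]>,
  limit = 50,
): Promise<T[]> {
  const items: T[] = [];
  for (let offset = 0; ; offset += limit) {
    const page = await fetchPage(pageUrl(path, limit, offset));
    items.push(...page);
    // A page shorter than the limit is treated as the last page.
    if (page.length < limit) return items;
  }
}
```

Injecting `fetchPage` keeps the paging logic independent of how the session cookie is attached, which makes it easy to exercise without a running server.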

| Group | Method | Path | Description |
|---|---|---|---|
| Annotators | GET | /api/annotators | List all annotators with roles |
| | POST | /api/annotators | Create annotator |
| | GET | /api/annotators/:id | Get annotator with assignments |
| | PUT | /api/annotators/:id | Update annotator |
| | DELETE | /api/annotators/:id | Delete annotator + cleanup containers |
| | POST | /api/annotators/:id/start | Start container (provision DB, deploy) |
| | POST | /api/annotators/:id/stop | Stop container |
| | GET | /api/annotators/:id/status | Container lifecycle status |
| | POST | /api/annotators/bulk | Bulk create annotators (up to 100) |
| | DELETE | /api/annotators/bulk | Bulk delete annotators |
| Workflows | GET | /api/workflows | List workflows with configs + roles |
| | POST | /api/workflows | Create workflow |
| | GET | /api/workflows/:id | Get workflow + full chain |
| | PUT | /api/workflows/:id | Update workflow |
| | DELETE | /api/workflows/:id | Delete workflow |
| | POST | /api/workflows/:id/assign | Trigger manual assignment |
| | GET | /api/workflows/graph | DAG visualization data (nodes + edges) |
| | POST | /api/workflows/bulk | Bulk create workflows (up to 50) |
| Assignments | GET | /api/assignments | List assignments (filter by workflow, annotator, status) |
| | GET | /api/assignments/:id | Get assignment with tasks + gold results |
| | DELETE | /api/assignments/:id | Cancel pending/in-progress assignment |
| Rejections | GET | /api/rejections | List rejections (filter by workflow, annotator, action) |
| | GET | /api/rejections/:id | Get rejection with full relations |
| Roles | GET | /api/roles | List roles |
| | POST | /api/roles | Create role |
| | GET | /api/roles/:id | Get role |
| | PUT | /api/roles/:id | Update role |
| | DELETE | /api/roles/:id | Delete role |
| Templates | GET | /api/labeling-configs | List labeling configs |
| | POST | /api/labeling-configs | Create config (with XML validation) |
| | GET | /api/labeling-configs/:id | Get config + usage counts |
| | PUT | /api/labeling-configs/:id | Update config |
| | DELETE | /api/labeling-configs/:id | Delete config (if unused) |
| Gold Tasks | GET | /api/gold-tasks | List gold tasks |
| | POST | /api/gold-tasks | Create gold task |
| | GET | /api/gold-tasks/:id | Get gold task |
| | PUT | /api/gold-tasks/:id | Update gold task |
| | DELETE | /api/gold-tasks/:id | Delete gold task |
| Monitoring | GET | /api/monitoring | System overview + hourly stats |
| | GET | /api/monitoring/annotator/:id | Per-annotator detail |
| Quality | GET | /api/quality | Accuracy metrics + rejection rates |
| Events | GET | /api/events | Audit log (container + annotation events) |
| Settings | GET | /api/settings | List all system settings |
| | GET | /api/settings/:key | Get single setting |
| | PUT | /api/settings/:key | Create or update setting |
| Reconcile | POST | /api/reconcile | Force-sync with Label Studio |
| Webhook | POST | /api/webhook | Label Studio event ingestion |
| Health | GET | /api/health | Health check (no auth) |
| Version | GET | /api/version | API version + feature flags (no auth) |
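
Because the bulk endpoints listed above are capped (up to 100 annotators and up to 50 workflows per call), an agent creating a larger roster would need to split its input into batches. The sketch below shows one way to do that. The caps and the `/api/annotators/bulk` path come from the endpoint list; the helper names and the request body shape (`{ annotators: [{ name }] }`) are guesses for illustration and are not taken from docs/API.md.

```typescript
// Sketch only: the payload shape is hypothetical; consult docs/API.md
// for the real bulk-create schema.

const BULK_LIMITS = { annotators: 100, workflows: 50 } as const;

/** Splits a list into consecutive batches of at most `size` items. */
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

interface BulkRequest {
  method: "POST";
  path: string;
  body: string;
}

/** Builds one bulk-create request per batch of annotator names. */
function bulkAnnotatorRequests(names: readonly string[]): BulkRequest[] {
  return chunk(names, BULK_LIMITS.annotators).map((batch): BulkRequest => ({
    method: "POST",
    path: "/api/annotators/bulk",
    body: JSON.stringify({ annotators: batch.map((name) => ({ name })) }),
  }));
}
```

With 250 names this yields three requests carrying 100, 100, and 50 annotators; sending them is left to whatever authenticated HTTP client the caller already uses.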

Configuration

All secrets come from environment variables — none are hardcoded. See docker-compose.yml for local dev variables and infrastructure/terraform/terraform.tfvars.example for the production set.

Key environment variables:

| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | |
| NEXTAUTH_SECRET | Session signing secret | |
| ORCHESTRATOR_BACKEND | docker or ecs | docker |
| FLEET_DOMAIN | Base domain for subdomains | fleet.localhost |
| S3_BUCKET | S3 bucket for images/exports | none (local fallback) |
| ADMIN_PASSWORD | Initial admin password | admin |

See .env.example for the full list.
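
To make the defaults above concrete, here is a minimal sketch of a config loader. The variable names and default values come from the list of key environment variables; the `FleetConfig` shape, the helper names, and the decision to treat `DATABASE_URL` and `NEXTAUTH_SECRET` as required (because no default is listed) are assumptions, not a description of how LabelFleet reads its environment.

```typescript
// Sketch only: the loader and the FleetConfig shape are hypothetical.

type Env = Record<string, string | undefined>;

interface FleetConfig {
  databaseUrl: string;
  nextAuthSecret: string;
  orchestratorBackend: "docker" | "ecs";
  fleetDomain: string;
  s3Bucket: string | null; // null means the local filesystem fallback
  adminPassword: string;
}

/** Returns the variable's value or throws if it is missing or empty. */
function requiredVar(env: Env, key: string): string {
  const value = env[key];
  if (!value) throw new Error(`Missing required environment variable: ${key}`);
  return value;
}

/** Validates ORCHESTRATOR_BACKEND, defaulting to docker when unset. */
function parseBackend(value: string | undefined): "docker" | "ecs" {
  if (value === undefined || value === "docker") return "docker";
  if (value === "ecs") return "ecs";
  throw new Error(`ORCHESTRATOR_BACKEND must be "docker" or "ecs", got "${value}"`);
}

function loadConfig(env: Env): FleetConfig {
  return {
    databaseUrl: requiredVar(env, "DATABASE_URL"),
    nextAuthSecret: requiredVar(env, "NEXTAUTH_SECRET"),
    orchestratorBackend: parseBackend(env["ORCHESTRATOR_BACKEND"]),
    fleetDomain: env["FLEET_DOMAIN"] ?? "fleet.localhost",
    s3Bucket: env["S3_BUCKET"] ?? null,
    adminPassword: env["ADMIN_PASSWORD"] ?? "admin",
  };
}

/** Per-annotator hostname, for example alice.fleet.localhost. */
function annotatorHost(name: string, fleetDomain: string): string {
  return `${name}.${fleetDomain}`;
}
```

In a Node.js process this would be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.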

Tech Stack

Next.js 16 · React 19 · TypeScript · PostgreSQL · Drizzle ORM · Docker · Traefik · AWS ECS Fargate · Terraform · S3

Project Structure

├── src/
│   ├── app/                   # Next.js routes (pages + API)
│   │   ├── (auth)/           # Login
│   │   ├── (dashboard)/      # Admin pages (annotators, workflows, monitoring, quality, etc.)
│   │   ├── (public)/         # About, landing
│   │   └── api/              # REST endpoints
│   ├── lib/
│   │   ├── orchestrator/     # Docker + ECS backends (ContainerOrchestrator interface)
│   │   ├── assignment-engine.ts
│   │   ├── rejection-handler.ts
│   │   ├── webhook-handler.ts
│   │   └── db/               # Drizzle schema + migrations
│   └── components/           # Shared React components
├── docker/                   # Custom Label Studio image build
├── drizzle/                  # Schema migrations
├── infrastructure/terraform/ # AWS deployment (see DEPLOYMENT.md)
├── scripts/                  # setup.sh, stop.sh
├── docs/                     # API reference
└── docker-compose.yml        # Local dev stack

Production Deployment

AWS deployment (ECS Fargate + Terraform) is documented in DEPLOYMENT.md. You will need:

  • AWS account + CLI configured
  • A domain + Route53 hosted zone
  • An ACM wildcard certificate

Estimated cost for 50 concurrent annotators: $226-260/month.

Troubleshooting

| Issue | Fix |
|---|---|
| Port 3000/80 already in use | lsof -i :3000 to find the process, then kill it |
| Docker daemon not running | Start Docker Desktop, or sudo systemctl start docker |
| fleet.localhost not resolving (Linux) | Add 127.0.0.1 admin.fleet.localhost to /etc/hosts |
| Container stuck in "starting" | Check logs: docker compose logs admin |
| ACM certificate stuck "pending" | Verify the CNAME record was added correctly in your DNS provider |
| DNS not propagating | Wait up to 60 minutes; check with dig NS yourdomain.com |
| Database migration errors | Run docker compose down -v to reset, then re-run setup |

Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines.

License

Apache License 2.0. See LICENSE.
