Penguin - FastAPI ML Project Scaffolder

Crates.io | License: MIT

A production-ready Rust CLI tool that scaffolds modern, fully functional FastAPI-based machine learning APIs with built-in templates, automatic dependency management, and smart project initialization.

Features

3 Built-in Templates

  • Minimal — Single model, simple request/response (ideal for single detectors)
  • Multi-model — Model registry pattern for managing multiple models
  • Async-heavy — Queue-based inference with background job processing (Redis-ready)

Framework Support

  • PyTorch (--framework torch)
  • ONNX Runtime (--framework onnx)
  • TensorFlow Lite (--framework tflite)
  • Framework-specific dependencies written into requirements.txt automatically

Smart Project Setup

  • Auto-creates Python virtual environment
  • Auto-installs all dependencies
  • Optional --no-setup flag to scaffold files only

Production-Ready Boilerplate

  • Pydantic models for type-safe request/response handling
  • FastAPI lifespan events for model loading/cleanup (see the sketch after this list)
  • Docker support (Dockerfile included)
  • Environment configuration (.env.example)
  • Comprehensive README in generated projects
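A minimal sketch of the Pydantic and lifespan pattern above, assuming illustrative names like PredictRequest and load_model (the generated identifiers may differ):

from contextlib import asynccontextmanager

from fastapi import FastAPI
from pydantic import BaseModel

class PredictRequest(BaseModel):
    data: str  # e.g. a base64-encoded image

class PredictResponse(BaseModel):
    result: dict

def load_model():
    """Placeholder; the generated stub loads a real model here."""
    return object()

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.model = load_model()  # load once at startup
    yield
    app.state.model = None  # release resources on shutdown

app = FastAPI(lifespan=lifespan)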

Model Management

  • Add new models to existing projects with penguin add model
  • Each model gets its own inference stub and utilities
  • Template-based for consistency

Project Initialization

  • Initialize existing FastAPI projects with penguin init
  • Automatic venv + dependency setup
  • Validates project structure

Get Started Instantly

cargo install penguin-ml
penguin new my-detector --template minimal --framework torch
cd my-detector && source venv/bin/activate && python src/main.py

Visit http://localhost:8000/docs to see your API!


Installation

Option 1: From Crates.io (Recommended)

cargo install penguin-ml

Then verify installation:

penguin --help

Option 2: From Source

git clone https://github.com/ArchitAnant/Penguin
cd Penguin
cargo install --path .

Option 3: Download Pre-built Binary

Download the latest release from GitHub Releases:

chmod +x penguin
mv penguin ~/.local/bin/  # or /usr/local/bin/

Quick Start

1. Create a New Project

# Interactive mode (prompts for template and framework)
penguin new my-detector

# With explicit options
penguin new my-detector --template minimal --framework torch

# Skip venv setup (manual setup later)
penguin new my-detector --template multi-model --framework onnx --no-setup

2. Start the API Server

cd my-detector
source venv/bin/activate
python src/main.py

Visit http://localhost:8000/docs to see interactive API documentation (Swagger UI).

3. Add Models to Your Project

penguin add model my_custom_detector

This creates:

src/models/my_custom_detector/
├── inference.py    # Model loading and inference logic
├── utils.py        # Preprocessing/postprocessing utilities
└── __init__.py
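
The inference.py stub is meant to be filled in with framework-specific code. Its rough shape, with hypothetical names (the generated file may differ):

class Model:
    def __init__(self, model_path: str):
        self.model_path = model_path
        self.model = None

    def _load_model(self) -> None:
        # Replace with framework-specific loading,
        # e.g. torch.load(self.model_path) for --framework torch
        self.model = object()  # placeholder

    def predict(self, data: dict) -> dict:
        if self.model is None:
            self._load_model()  # lazy-load on first request
        # preprocess -> run model -> postprocess
        return {"echo": data}  # placeholder output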

4. Initialize Existing Projects

If you have an existing FastAPI project and want to add venv + dependencies:

cd /path/to/existing/project
penguin init

Commands

penguin new <project-name>

Scaffold a new FastAPI ML project.

Options:

  • -t, --template <TEMPLATE> — Choose template: minimal, multi-model, async-heavy
  • -f, --framework <FRAMEWORK> — Choose framework: torch, onnx, tflite
  • --no-setup — Skip venv creation and dependency installation

Examples:

penguin new detector-api --template minimal --framework torch
penguin new multi-model-api --template multi-model --framework onnx --no-setup
penguin new queue-api --template async-heavy --framework tflite

penguin add model <model-name>

Add a new model to an existing penguin project.

Usage:

cd your-project
penguin add model yolo_v5
penguin add model faster_rcnn

Output:

  • src/models/<model-name>/inference.py — Model loading and prediction logic
  • src/models/<model-name>/utils.py — Helper functions
  • src/models/<model-name>/__init__.py — Package initialization

penguin init

Initialize venv and dependencies in an existing FastAPI project.

Usage:

cd existing-fastapi-project
penguin init

Requirements:

  • Project must contain src/app.py or src/main.py
  • requirements.txt must exist

Templates

Minimal Template

Best for single-model inference tasks.

API Endpoints:

  • GET / — Root endpoint
  • GET /api/health — Health check
  • POST /api/predict — Run inference

Structure:

src/
├── main.py              # Entry point
├── app.py               # FastAPI app factory
├── api/
│   ├── routes.py        # API endpoints
│   └── schemas.py       # Pydantic models
├── core/
│   └── config.py        # Settings & configuration
└── models/
    └── model_1/
        ├── inference.py # Model inference
        └── utils.py     # Utility functions

Multi-Model Template

Best for projects with multiple detectors/models.

API Endpoints:

  • GET /api/health — Health check
  • GET /api/models — List available models
  • POST /api/predict/{model_name} — Run inference with specific model

Key Components:

  • ModelRegistry — Central model management (sketched below)
  • /api/models endpoint shows registered models
  • Route-based model selection
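
A minimal sketch of the registry idea (method names are assumptions, not the exact generated API):

class ModelRegistry:
    def __init__(self):
        self._models: dict[str, object] = {}

    def register(self, name: str, model: object) -> None:
        self._models[name] = model

    def get(self, name: str) -> object:
        if name not in self._models:
            raise KeyError(f"unknown model: {name}")
        return self._models[name]

    def list_models(self) -> list[str]:
        return list(self._models)

The /api/predict/{model_name} route then resolves the model with a lookup like registry.get(model_name) before running inference.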

Structure:

src/
├── main.py
├── app.py
├── api/
│   ├── routes.py
│   └── schemas.py
├── core/
│   ├── config.py
│   └── model_registry.py   # Model management
└── models/
    ├── detector_1/
    └── detector_2/

Async-Heavy Template

Best for long-running inference tasks (image processing, batch jobs, etc.).

API Endpoints:

  • GET /api/health — Health check
  • POST /api/submit-job — Submit async inference job
  • GET /api/job-status/{job_id} — Check job progress
  • GET /api/results/{job_id} — Retrieve job results (when ready)

For Production: Replace in-memory queue with Redis or RabbitMQ for persistence and multi-worker support.
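
A minimal sketch of the submit/poll pattern with an in-memory store (names are illustrative; the dict below is exactly what Redis or RabbitMQ would replace):

import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs: dict[str, dict] = {}  # job_id -> {"status": ..., "result": ...}

def run_inference(job_id: str, data: dict) -> None:
    # Long-running model work happens here.
    jobs[job_id] = {"status": "done", "result": {"echo": data}}

@app.post("/api/submit-job")
def submit_job(payload: dict, background: BackgroundTasks):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}
    background.add_task(run_inference, job_id, payload)  # runs after the response is sent
    return {"job_id": job_id}

@app.get("/api/job-status/{job_id}")
def job_status(job_id: str):
    return {"job_id": job_id, "status": jobs[job_id]["status"]}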

Structure:

src/
├── main.py
├── app.py
├── api/
│   ├── routes.py
│   └── schemas.py
├── core/
│   ├── config.py
│   └── job_queue.py        # Queue management
└── workers/
    └── inference.py        # Background job processing

Framework Support

PyTorch (--framework torch)

Includes:

  • torch==2.11.0
  • torchvision==0.26.0
  • numpy==2.4.4

Example:

penguin new torch-detector --template minimal --framework torch

ONNX Runtime (--framework onnx)

Includes:

  • onnx==1.21.0
  • onnxruntime==1.24.4
  • numpy==2.4.4

Example:

penguin new onnx-api --template multi-model --framework onnx

TensorFlow Lite (--framework tflite)

Includes:

  • tensorflow==2.21.0
  • numpy==2.4.4

Example:

penguin new tflite-server --template async-heavy --framework tflite

Generated Project Structure

Each generated project follows this structure:

my-project/
├── src/
│   ├── main.py                 # Entry point
│   ├── app.py                  # FastAPI app factory
│   ├── api/
│   │   ├── __init__.py
│   │   ├── routes.py           # API endpoints
│   │   └── schemas.py          # Pydantic models
│   ├── core/
│   │   ├── __init__.py
│   │   └── config.py           # Settings (pydantic-settings)
│   └── models/
│       └── model_1/            # Your model(s)
│           ├── __init__.py
│           ├── inference.py    # Model class & predict()
│           └── utils.py        # Preprocessing/postprocessing
├── requirements.txt            # Python dependencies (framework-aware)
├── .env.example                # Environment template
├── Dockerfile                  # Production containerization
├── README.md                   # Project-specific docs
└── venv/                       # Virtual environment (created by penguin new)

Usage Examples

Example 1: Create a Single-Model Detection API

# Create minimal project with PyTorch
penguin new face-detector --template minimal --framework torch

cd face-detector
source venv/bin/activate

# Edit src/models/model_1/inference.py to load your face detection model
# Edit src/api/routes.py to handle face detection input/output
# Start server
python src/main.py

Visit http://localhost:8000/docs and test the /api/predict endpoint.


Example 2: Multi-Model Detection System

# Create multi-model project
penguin new detection-system --template multi-model --framework onnx

cd detection-system
source venv/bin/activate

# Add multiple detectors
penguin add model person_detector
penguin add model vehicle_detector
penguin add model pose_estimator

# Update src/core/config.py to register models
# Update each model's inference.py with actual ONNX logic
# Start server
python src/main.py

Now you can:

curl http://localhost:8000/api/models
# {"models": ["person_detector", "vehicle_detector", "pose_estimator"], "count": 3}

curl -X POST http://localhost:8000/api/predict/person_detector \
  -H "Content-Type: application/json" \
  -d '{"data": "base64-encoded-image"}'

Example 3: Async Batch Processing

# Create async-heavy project for long-running jobs
penguin new batch-processor --template async-heavy --framework tflite --no-setup

cd batch-processor
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Edit src/workers/inference.py for actual inference logic
python src/main.py

Usage:

# Submit a job
JOB=$(curl -X POST http://localhost:8000/api/submit-job \
  -H "Content-Type: application/json" \
  -d '{"data": "input"}' | jq -r .job_id)

# Poll for status
curl http://localhost:8000/api/job-status/$JOB

# Get results when complete
curl http://localhost:8000/api/results/$JOB

Configuration

Environment Variables

Each project includes a .env.example file:

cp .env.example .env
# Edit .env with your configuration

Common Variables:

  • HOST — Server host (default: 0.0.0.0)
  • PORT — Server port (default: 8000)
  • DEBUG — Enable debug mode (default: false)
  • RELOAD — Auto-reload on code changes (default: true)

See .env.example for template-specific variables.


Model Configuration

Edit src/core/config.py to customize model paths and settings:

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    model_name: str = "model_1"
    model_path: str = "models/model_1"  # Change this
    # ... other settings
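
Because Settings extends pydantic-settings' BaseSettings, environment variables (and a configured .env file) override these defaults. A quick illustration, assuming the class above:

import os

os.environ["MODEL_PATH"] = "models/yolo_v5"  # env var overrides the default
settings = Settings()
print(settings.model_path)  # -> models/yolo_v5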

Docker Deployment

Each generated project includes a Dockerfile:

# Build image
docker build -t my-detector .

# Run container
docker run -p 8000:8000 my-detector

# With environment variables
docker run -p 8000:8000 \
  -e PORT=8000 \
  -e RELOAD=false \
  my-detector

Troubleshooting

Issue: "Python 3 not found"

Solution:

# Install Python 3.8+
# macOS
brew install python@3.11

# Ubuntu/Debian
sudo apt-get install python3.11 python3.11-venv

# Then try again
penguin new my-project

Issue: pip install fails during project creation

Solution: Use --no-setup and install manually:

penguin new my-project --framework torch --no-setup
cd my-project
python -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Issue: Model not loading in inference.py

Solution: Edit src/models/<model>/inference.py and implement the _load_model() method:

def _load_model(self):
    """Load your actual model here."""
    import torch
    # map_location="cpu" keeps loading safe on hosts without a GPU
    self.model = torch.load("path/to/model.pth", map_location="cpu")
    self.model.eval()  # switch to inference mode

Issue: Virtual environment already exists

Solution: Remove and recreate:

rm -rf venv
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Development

Build from Source

git clone https://github.com/ArchitAnant/Penguin
cd Penguin
cargo build --release

Binary will be at: target/release/penguin

Run Tests

cargo test

Publishing to Crates.io

The project is already published on crates.io:

# Install from crates.io
cargo install penguin-ml

# To publish updates:
# 1. Update version in Cargo.toml
# 2. cargo publish

Current version: v0.1.0 (penguin-ml on crates.io)

Contributing

Contributions welcome! Areas for improvement:

  • Remote template registry support
  • Additional framework templates (JAX, Hugging Face Transformers)
  • Advanced monitoring/metrics
  • WebSocket support for streaming inference
  • Multi-GPU deployment helpers

Architecture

Modules

  • scaffolder.rs — Template rendering via Tera, file generation
  • dependencies.rs — Framework-specific dependency mapping
  • python_env.rs — Virtual environment and pip integration
  • progress.rs — CLI progress spinners and feedback

All templates are compiled into the binary using include_str!(), so the CLI needs no external template files at runtime.


Performance & Binary Size

  • Binary size: ~10 MB (optimized)
  • Startup: < 100ms
  • Project scaffold: < 1 second
  • Dependency install: 1-5 minutes (network dependent)

License

MIT


Support

For issues, questions, or feature requests:

  • Open a GitHub issue
  • Check existing issues
  • Review generated project READMEs for template-specific docs

Roadmap

  • Remote template registry for custom templates
  • penguin doctor — Environment diagnostics
  • Hot-reload file watcher
  • Multiple Python version support
  • Pre-commit hooks scaffolding
  • CI/CD pipeline templates (GitHub Actions, GitLab CI)
  • Kubernetes deployment helpers
  • Monitoring integration (Prometheus, DataDog)

Happy building with Penguin!
