Package, verify, and deploy AI models in air-gapped environments.
Forge is an open-source CLI framework that solves the problem of deploying LLMs into disconnected, classified, or air-gapped networks. It handles the entire lifecycle: downloading models on an internet-connected machine, creating cryptographically verified transfer bundles, and deploying with hardware-adaptive configuration on the target system.
No existing open-source tool provides cryptographic transfer verification, compliance-oriented audit logging, and one-command deployment for air-gapped AI. Defense contractors charge millions for proprietary solutions. Forge fills that gap.
Organizations operating in air-gapped environments (defense, intelligence, critical infrastructure, regulated industries) face a painful process to deploy AI:
- No standardized transfer process: models are walked across on USB drives with manual checksum verification
- No chain-of-custody logging: no audit trail of what was transferred, by whom, or when
- No hardware-adaptive deployment: every installation requires manual configuration for the target hardware
- No integrity verification: no automated way to confirm model weights weren't tampered with during transfer
LOW SIDE (internet) HIGH SIDE (air-gapped)
+-----------------+ USB/media +----------------------------+
| forge prepare | -------------> | forge verify |
| - download | transfer | - SHA-256 check all files |
| - package | | - tamper detection |
| - manifest | | |
| - audit log | | forge deploy |
+-----------------+ | - hardware auto-detect |
| - docker load + start |
| - health checks |
| - API ready |
+----------------------------+
pip install forge-airgap
# 1. Check hardware capabilities
forge info
# 2. Package a model for transfer (internet-connected machine)
forge prepare --model TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF --quant Q4_K_M --output ./bundle
# 3. Transfer bundle to air-gapped machine via approved media
# 4. Verify bundle integrity
forge verify ./bundle
# 5. Deploy
forge deploy ./bundle --profile auto --port 8080

The forge info command detects hardware and recommends a deployment profile.
+--------------------------+
| Forge Hardware Detection |
+--------------------------+
CPU
+------------------------------------------+
| Model | AMD Ryzen 9 9950X |
| Cores | 8 physical / 16 logical |
| AVX2 | Yes |
+------------------------------------------+
GPU (1 detected)
+---------------------------------------------+
| Name | NVIDIA GeForce RTX 5080 |
| VRAM | 16,303 MB |
| CUDA | 13.1 |
+---------------------------------------------+
Deployment Profiles
+---------------------------------------------+
| MINIMAL | Supported |
| STANDARD | Supported |
| ENTERPRISE | Supported <-- Recommended |
+---------------------------------------------+
Downloads a model, exports Docker images, generates a SHA-256 integrity manifest, and creates a self-contained transfer bundle.
forge prepare \
--model TheBloke/Llama-2-7B-Chat-GGUF \
--quant Q4_K_M \
  --output ./my-bundle

Verifies every file in a bundle against its SHA-256 manifest. Detects tampering, missing files, and unexpected additions.
forge verify ./my-bundle
# Exit code 0 = verified, 1 = failed

File Integrity Check
+----------------------------------------------+
| PASS | bundle.toml | 423 B |
| PASS | config/engine.toml | 227 B |
| PASS | models/llama-2-7b.Q4_K_M | 3.8 GB |
+----------------------------------------------+
+--------------------------------------------------+
| VERIFIED - All 6/6 files passed integrity check |
+--------------------------------------------------+
Deploys a verified bundle with automatic hardware detection and profile selection.
forge deploy ./my-bundle --profile auto --port 8080

- Auto-detects GPU, VRAM, RAM, CPU cores
- Selects optimal deployment profile (minimal/standard/enterprise)
- Loads Docker images from bundle tarballs
- Generates hardware-specific docker-compose configuration
- Starts llama.cpp server with GPU layer offloading
- Runs health checks and reports endpoints
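The compose-generation step could look roughly like the sketch below, which maps detected VRAM to a llama.cpp --n-gpu-layers value and emits a compose-style service definition. The image name, thresholds, and offload heuristic are illustrative assumptions, not Forge's actual output:

```python
def compose_config(vram_mb: int, total_layers: int, port: int = 8080) -> dict:
    """Build a docker-compose-style service for a llama.cpp server.

    Full GPU offload when VRAM is plentiful, partial otherwise,
    pure CPU inference when no usable GPU is present.
    """
    if vram_mb >= 16000:
        gpu_layers = total_layers          # full GPU offload
    elif vram_mb >= 6000:
        gpu_layers = total_layers // 2     # partial offload
    else:
        gpu_layers = 0                     # CPU-only
    service = {
        "image": "forge/llama-cpp-server",  # hypothetical image name
        "ports": [f"{port}:{port}"],
        "command": ["--n-gpu-layers", str(gpu_layers)],
    }
    if gpu_layers:
        # Compose-spec GPU reservation so the container can see the device
        service["deploy"] = {"resources": {"reservations": {
            "devices": [{"driver": "nvidia", "count": 1,
                         "capabilities": ["gpu"]}]}}}
    return {"services": {"inference": service}}

cfg = compose_config(vram_mb=16303, total_layers=32)
print(cfg["services"]["inference"]["command"])  # ['--n-gpu-layers', '32']
```

Keeping the output as a plain dict means the same structure can be dumped to YAML for docker compose or inspected directly in tests.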
Shows running Forge services.
forge status

A Forge bundle is a self-contained directory with everything needed for deployment:
my-bundle/
+-- manifest.json # SHA-256 hashes of all files + metadata
+-- bundle.toml # Bundle config (model, engine, profiles)
+-- models/
| +-- model.gguf # Model weights
+-- images/
| +-- inference.tar # llama.cpp server Docker image
| +-- ui.tar # Web UI Docker image
+-- config/
| +-- engine.toml # Inference engine settings
| +-- profiles/ # Hardware profile configs
+-- audit/
+-- prepare.log # Chain-of-custody from preparation
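A manifest like the one above can be produced by walking the bundle directory and hashing every file except the manifest itself; a minimal sketch (the manifest schema shown is an assumption, not Forge's exact format):

```python
import hashlib
import json
import tempfile
from pathlib import Path

def build_manifest(bundle_dir: Path) -> dict:
    """Hash every regular file in the bundle except the manifest itself."""
    files = {}
    for path in sorted(bundle_dir.rglob("*")):
        if path.is_file() and path.name != "manifest.json":
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            files[path.relative_to(bundle_dir).as_posix()] = digest
    return {"version": 1, "files": files}

# Demo with a throwaway bundle directory
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "models").mkdir()
    (bundle / "models" / "model.gguf").write_bytes(b"weights")
    (bundle / "bundle.toml").write_text('name = "demo"')
    manifest = build_manifest(bundle)
    (bundle / "manifest.json").write_text(json.dumps(manifest, indent=2))
    print(sorted(manifest["files"]))  # ['bundle.toml', 'models/model.gguf']
```

Excluding manifest.json from its own hash set is what lets the verify side recompute the exact same digest map on the high side.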
Forge auto-detects hardware and selects the optimal deployment configuration:
| Profile | GPU | RAM | Configuration |
|---|---|---|---|
| Minimal | None | 8+ GB | CPU inference, Q4_K_M quantization, 2048 context |
| Standard | 6+ GB VRAM | 16+ GB | Partial GPU offload, 4096 context |
| Enterprise | 16+ GB VRAM | 32+ GB | Full GPU offload, 8192 context |
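The selection logic implied by the table can be sketched as follows; the thresholds come from the table (with ~16 GB VRAM treated as 16,000 MB so cards like a 16,303 MB RTX 5080 qualify), while the real detector may weigh additional signals:

```python
def select_profile(vram_mb: int, ram_gb: int) -> str:
    """Pick a deployment profile from detected GPU VRAM and system RAM."""
    if vram_mb >= 16000 and ram_gb >= 32:
        return "enterprise"   # full GPU offload, 8192 context
    if vram_mb >= 6000 and ram_gb >= 16:
        return "standard"     # partial GPU offload, 4096 context
    if ram_gb >= 8:
        return "minimal"      # CPU inference, 2048 context
    raise RuntimeError("below minimum hardware requirements")

print(select_profile(vram_mb=16303, ram_gb=64))  # enterprise
print(select_profile(vram_mb=0, ram_gb=8))       # minimal
```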
Every operation is logged as structured JSON aligned with NIST 800-53 AU-2 audit event requirements:
{
"timestamp": "2026-03-22T00:55:50Z",
"level": "info",
"audit_event": "MODEL_DOWNLOADED",
"bundle_id": "3e7ba633-80dc-450c-9ba1-b95a6b250824",
"operator": "eclip",
"hostname": "low-side-workstation",
"model_name": "TinyLlama-1.1B-Chat-v1.0-GGUF",
"size_bytes": 668788096,
"sha256": "9fecc3b3cd76bba8...",
"forge_version": "0.1.0"
}

Events tracked: PREPARE_STARTED, MODEL_DOWNLOADED, MANIFEST_GENERATED, VERIFY_STARTED, VERIFY_COMPLETED, DEPLOY_STARTED, DEPLOY_COMPLETED, DEPLOY_FAILED, and more.
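Producing records in this shape needs little more than the standard library; a minimal sketch whose field set mirrors the example above (Forge itself uses structlog, and the operator is passed in explicitly here for illustration):

```python
import json
import socket
from datetime import datetime, timezone

def audit_event(event: str, operator: str, **fields) -> str:
    """Serialize one structured audit record as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": "info",
        "audit_event": event,
        "operator": operator,
        "hostname": socket.gethostname(),
        **fields,  # e.g. bundle_id, model_name, sha256
    }
    return json.dumps(record, sort_keys=True)

line = audit_event("MODEL_DOWNLOADED", "eclip",
                   model_name="TinyLlama-1.1B-Chat-v1.0-GGUF",
                   size_bytes=668788096)
print(line)
```

One JSON object per line keeps the log greppable and trivially ingestible by SIEM tooling on the high side.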
src/forge/
+-- cli/ # Typer CLI commands
+-- core/ # Bundle, manifest, Pydantic models
+-- hardware/ # GPU/CPU/RAM detection, profile selection
+-- engines/ # Inference engine configs (llama.cpp)
+-- packaging/ # Model download, Docker export, bundling
+-- deployment/ # Docker orchestration, compose generation
+-- audit/ # Structured JSON logging (structlog)
+-- ui/ # Minimal FastAPI + static web UI
+-- utils/ # SHA-256 hashing, Rich output helpers
Tech stack: Python 3.11+ | Typer + Rich | Pydantic v2 | docker-py | llama.cpp | structlog | SHA-256 | AES-256-GCM
What Forge handles:
- SHA-256 integrity verification of every file in a transfer bundle
- Tamper detection (modified, missing, or unexpected files)
- Chain-of-custody audit logging with operator identity and timestamps
- Optional AES-256-GCM bundle encryption for transport
What Forge does not handle (and why):
- Classification markings - derivative classification decisions have legal implications under EO 13526 and require a classification management officer
- Multi-level security - MLS is a multi-billion dollar problem; Forge operates within a single classification level
- Model provenance verification - validating that a model hasn't been poisoned is an open research problem
- Network isolation - the host OS and network architecture handle segmentation, not the deployment tool
# Clone and install in development mode
git clone https://github.com/zhadyz/forge.git
cd forge
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Lint
ruff check src/
# Type check
mypy src/forge/

- GPG manifest signing (multi-party verification)
- SBOM generation for Docker images
- vLLM engine support for server deployments
- Model provenance tracking
- FIPS 140-2 validated cryptography
- Qdrant vector database for enterprise RAG deployments
- Multi-node distributed inference
Apache-2.0