Package, verify, and deploy AI models in air-gapped environments.
Forge is an open-source CLI framework that solves the problem of deploying LLMs into disconnected, classified, or air-gapped networks. It handles the entire lifecycle: downloading models on an internet-connected machine, creating cryptographically verified transfer bundles, and deploying with hardware-adaptive configuration on the target system.
No existing open-source tool provides cryptographic transfer verification, compliance-oriented audit logging, and one-command deployment for air-gapped AI. Defense contractors charge millions for proprietary solutions. Forge fills that gap.
Organizations operating in air-gapped environments (defense, intelligence, critical infrastructure, regulated industries) face a painful process to deploy AI:
- No standardized transfer process: models are walked across on USB drives with manual checksum verification
- No chain-of-custody logging: no audit trail of what was transferred, by whom, or when
- No hardware-adaptive deployment: every installation requires manual configuration for the target hardware
- No integrity verification: no automated way to confirm model weights weren't tampered with during transfer
LOW SIDE (internet) HIGH SIDE (air-gapped)
+-----------------+ USB/media +----------------------------+
| forge prepare | -------------> | forge verify |
| - download | transfer | - SHA-256 check all files |
| - package | | - tamper detection |
| - manifest | | |
| - audit log | | forge deploy |
+-----------------+ | - hardware auto-detect |
| - docker load + start |
| - health checks |
| - API ready |
+----------------------------+
pip install forge-airgap
# 1. Check hardware capabilities
forge info
# 2. Package a model for transfer (internet-connected machine)
forge prepare --model TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF --quant Q4_K_M --output ./bundle
# 3. Transfer bundle to air-gapped machine via approved media
# 4. Verify bundle integrity
forge verify ./bundle
# 5. Deploy
forge deploy ./bundle --profile auto --port 8080

The forge info command detects hardware and recommends a deployment profile.
+--------------------------+
| Forge Hardware Detection |
+--------------------------+
CPU
+------------------------------------------+
| Model | AMD Ryzen 9 9950X |
| Cores | 8 physical / 16 logical |
| AVX2 | Yes |
+------------------------------------------+
GPU (1 detected)
+---------------------------------------------+
| Name | NVIDIA GeForce RTX 5080 |
| VRAM | 16,303 MB |
| CUDA | 13.1 |
+---------------------------------------------+
Deployment Profiles
+---------------------------------------------+
| MINIMAL | Supported |
| STANDARD | Supported |
| ENTERPRISE | Supported <-- Recommended |
+---------------------------------------------+
Downloads a model, exports Docker images, generates a SHA-256 integrity manifest, and creates a self-contained transfer bundle.
forge prepare \
--model TheBloke/Llama-2-7B-Chat-GGUF \
--quant Q4_K_M \
  --output ./my-bundle

Verifies every file in a bundle against its SHA-256 manifest. Detects tampering, missing files, and unexpected additions.
forge verify ./my-bundle
# Exit code 0 = verified, 1 = failed

File Integrity Check
+----------------------------------------------+
| PASS | bundle.toml | 423 B |
| PASS | config/engine.toml | 227 B |
| PASS | models/llama-2-7b.Q4_K_M | 3.8 GB |
+----------------------------------------------+
+--------------------------------------------------+
| VERIFIED - All 6/6 files passed integrity check |
+--------------------------------------------------+
Deploys a verified bundle with automatic hardware detection and profile selection.
forge deploy ./my-bundle --profile auto --port 8080

- Auto-detects GPU, VRAM, RAM, CPU cores
- Selects optimal deployment profile (minimal/standard/enterprise)
- Loads Docker images from bundle tarballs
- Generates hardware-specific docker-compose configuration
- Starts llama.cpp server with GPU layer offloading
- Runs health checks and reports endpoints
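The compose-generation step could look roughly like the sketch below, which maps detected VRAM to a llama.cpp --n-gpu-layers value and emits a compose-style service definition. The image name, thresholds, and offload heuristic are illustrative assumptions, not Forge's actual output:

```python
def compose_config(vram_mb: int, total_layers: int, port: int = 8080) -> dict:
    """Build a docker-compose-style service for a llama.cpp server.

    Full GPU offload when VRAM is plentiful, partial otherwise,
    pure CPU inference when no usable GPU is present.
    """
    if vram_mb >= 16000:
        gpu_layers = total_layers          # full GPU offload
    elif vram_mb >= 6000:
        gpu_layers = total_layers // 2     # partial offload
    else:
        gpu_layers = 0                     # CPU-only
    service = {
        "image": "forge/llama-cpp-server",  # hypothetical image name
        "ports": [f"{port}:{port}"],
        "command": ["--n-gpu-layers", str(gpu_layers)],
    }
    if gpu_layers:
        # Compose-spec GPU reservation so the container can see the device
        service["deploy"] = {"resources": {"reservations": {
            "devices": [{"driver": "nvidia", "count": 1,
                         "capabilities": ["gpu"]}]}}}
    return {"services": {"inference": service}}

cfg = compose_config(vram_mb=16303, total_layers=32)
print(cfg["services"]["inference"]["command"])  # ['--n-gpu-layers', '32']
```

Keeping the output as a plain dict means the same structure can be dumped to YAML for docker compose or inspected directly in tests.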
Shows running Forge services.
forge status

A Forge bundle is a self-contained directory with everything needed for deployment:
my-bundle/
+-- manifest.json # SHA-256 hashes of all files + metadata
+-- bundle.toml # Bundle config (model, engine, profiles)
+-- models/
| +-- model.gguf # Model weights
+-- images/
| +-- inference.tar # llama.cpp server Docker image
| +-- ui.tar # Web UI Docker image
+-- config/
| +-- engine.toml # Inference engine settings
| +-- profiles/ # Hardware profile configs
+-- audit/
+-- prepare.log # Chain-of-custody from preparation
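A manifest like the one above can be produced by walking the bundle directory and hashing every file except the manifest itself; a minimal sketch (the manifest schema shown is an assumption, not Forge's exact format):

```python
import hashlib
import json
import tempfile
from pathlib import Path

def build_manifest(bundle_dir: Path) -> dict:
    """Hash every regular file in the bundle except the manifest itself."""
    files = {}
    for path in sorted(bundle_dir.rglob("*")):
        if path.is_file() and path.name != "manifest.json":
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            files[path.relative_to(bundle_dir).as_posix()] = digest
    return {"version": 1, "files": files}

# Demo with a throwaway bundle directory
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "models").mkdir()
    (bundle / "models" / "model.gguf").write_bytes(b"weights")
    (bundle / "bundle.toml").write_text('name = "demo"')
    manifest = build_manifest(bundle)
    (bundle / "manifest.json").write_text(json.dumps(manifest, indent=2))
    print(sorted(manifest["files"]))  # ['bundle.toml', 'models/model.gguf']
```

Excluding manifest.json from its own hash set is what lets the verify side recompute the exact same digest map on the high side.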
Forge auto-detects hardware and selects the optimal deployment configuration:
| Profile | GPU | RAM | Configuration |
|---|---|---|---|
| Minimal | None | 8+ GB | CPU inference, Q4_K_M quantization, 2048 context |
| Standard | 6+ GB VRAM | 16+ GB | Partial GPU offload, 4096 context |
| Enterprise | 16+ GB VRAM | 32+ GB | Full GPU offload, 8192 context |
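The selection logic implied by the table can be sketched as follows; the thresholds come from the table (with ~16 GB VRAM treated as 16,000 MB so cards like a 16,303 MB RTX 5080 qualify), while the real detector may weigh additional signals:

```python
def select_profile(vram_mb: int, ram_gb: int) -> str:
    """Pick a deployment profile from detected GPU VRAM and system RAM."""
    if vram_mb >= 16000 and ram_gb >= 32:
        return "enterprise"   # full GPU offload, 8192 context
    if vram_mb >= 6000 and ram_gb >= 16:
        return "standard"     # partial GPU offload, 4096 context
    if ram_gb >= 8:
        return "minimal"      # CPU inference, 2048 context
    raise RuntimeError("below minimum hardware requirements")

print(select_profile(vram_mb=16303, ram_gb=64))  # enterprise
print(select_profile(vram_mb=0, ram_gb=8))       # minimal
```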
Every operation is logged as structured JSON aligned with NIST 800-53 AU-2 audit event requirements:
{
"timestamp": "2026-03-22T00:55:50Z",
"level": "info",
"audit_event": "MODEL_DOWNLOADED",
"bundle_id": "3e7ba633-80dc-450c-9ba1-b95a6b250824",
"operator": "eclip",
"hostname": "low-side-workstation",
"model_name": "TinyLlama-1.1B-Chat-v1.0-GGUF",
"size_bytes": 668788096,
"sha256": "9fecc3b3cd76bba8...",
"forge_version": "0.1.0"
}

Events tracked: PREPARE_STARTED, MODEL_DOWNLOADED, MANIFEST_GENERATED, VERIFY_STARTED, VERIFY_COMPLETED, DEPLOY_STARTED, DEPLOY_COMPLETED, DEPLOY_FAILED, and more.
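Producing records in this shape needs little more than the standard library; a minimal sketch whose field set mirrors the example above (Forge itself uses structlog, and the operator is passed in explicitly here for illustration):

```python
import json
import socket
from datetime import datetime, timezone

def audit_event(event: str, operator: str, **fields) -> str:
    """Serialize one structured audit record as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": "info",
        "audit_event": event,
        "operator": operator,
        "hostname": socket.gethostname(),
        **fields,  # e.g. bundle_id, model_name, sha256
    }
    return json.dumps(record, sort_keys=True)

line = audit_event("MODEL_DOWNLOADED", "eclip",
                   model_name="TinyLlama-1.1B-Chat-v1.0-GGUF",
                   size_bytes=668788096)
print(line)
```

One JSON object per line keeps the log greppable and trivially ingestible by SIEM tooling on the high side.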
src/forge/
+-- cli/ # Typer CLI commands
+-- core/ # Bundle, manifest, Pydantic models
+-- hardware/ # GPU/CPU/RAM detection, profile selection
+-- engines/ # Inference engine configs (llama.cpp)
+-- packaging/ # Model download, Docker export, bundling
+-- deployment/ # Docker orchestration, compose generation
+-- audit/ # Structured JSON logging (structlog)
+-- ui/ # Minimal FastAPI + static web UI
+-- utils/ # SHA-256 hashing, Rich output helpers
Tech stack: Python 3.11+ | Typer + Rich | Pydantic v2 | docker-py | llama.cpp | structlog | SHA-256 | AES-256-GCM
What Forge handles:
- SHA-256 integrity verification of every file in a transfer bundle
- Tamper detection (modified, missing, or unexpected files)
- Chain-of-custody audit logging with operator identity and timestamps
- Optional AES-256-GCM bundle encryption for transport
What Forge does not handle (and why):
- Classification markings - derivative classification decisions have legal implications under EO 13526 and require a classification management officer
- Multi-level security - MLS is a multi-billion dollar problem; Forge operates within a single classification level
- Model provenance verification - validating that a model hasn't been poisoned is an open research problem
- Network isolation - the host OS and network architecture handle segmentation, not the deployment tool
# Clone and install in development mode
git clone https://github.com/zhadyz/forge.git
cd forge
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Lint
ruff check src/
# Type check
mypy src/forge/

- GPG manifest signing (multi-party verification)
- SBOM generation for Docker images
- vLLM engine support for server deployments
- Model provenance tracking
- FIPS 140-2 validated cryptography
- Qdrant vector database for enterprise RAG deployments
- Multi-node distributed inference
Apache-2.0