Skip to content

johnisag/cal-compressed-agent-language

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CAL — Compressed Agent Language

A purpose-built, highly compressed machine language for LLM and agent communication.

Metric Result
Average Compression 7.5 ×
Peak Compression 21.0 ×
Token Reduction 86.7 %
Encode Latency 0.7 ms avg
Effective Context Multiplier 7.5 × (128 K → 960 K)

Why CAL?

LLMs and agent systems burn enormous token budgets on verbose natural language—both between each other and while "thinking." CAL replaces that with a dense, formally specified token language inspired by Chinese logographic density, Turkish morpheme stacking, and Arabic root-template factorisation. A lightweight bidirectional translator sits at every human boundary so users never see CAL directly.

Before (English — 26 GPT-4 tokens):

The user wants to search for all documents in the database that were modified in the last seven days and export them as PDF files.

After (CAL — 6 tokens):

FOR[x∈ db xprt usr

The system is implemented in three languages — Python (research & training), Rust via PyO3 (high-performance inner loop), and TypeScript (IDE integration) — with a working VS Code extension and interactive metrics dashboard.


Repository Layout

cal-compressed-agent-language/
│
├── docs/                              # All documentation
│   ├── cal_spec_v1.md                 #   CAL formal specification (BNF, vocabulary, examples)
│   ├── cal_research_report.pplx.md    #   36-45 page research report
│   ├── local_training_guide.md        #   Step-by-step fine-tuning on commodity hardware
│   ├── experiment_plan.md             #   5-phase experiment plan
│   ├── risk_analysis_roadmap.md       #   Risk matrix + 12-month roadmap
│   └── research/                      #   Literature reviews
│       ├── lit_review_compression.md
│       ├── lit_review_emergent_languages.md
│       └── lit_review_finetuning_linguistics.md
│
├── src/
│   ├── python/                        # Python implementation
│   │   ├── cal_tokenizer/             #   Vocabulary, encoder, decoder, tokenizer
│   │   ├── translators/               #   Bidirectional translators + fidelity scoring
│   │   ├── synthetic_data/            #   Training-pair generation pipeline
│   │   ├── finetuning/                #   QLoRA, Unsloth, 70 B scaling, GGUF export
│   │   └── requirements.txt
│   │
│   └── rust/                          # Rust high-performance core (PyO3)
│       ├── README.md
│       └── cal_core/
│           ├── Cargo.toml
│           ├── src/                    #   lib, vocabulary, tokenizer, translator,
│           │                           #   fidelity, bench
│           └── python_bindings/        #   Python wrapper + Rust-or-fallback shim
│
├── extensions/
│   └── vscode/                        # VS Code extension
│       ├── package.json               #   Extension manifest (5 commands, sidebar)
│       ├── tsconfig.json
│       ├── .vscodeignore
│       ├── README.md                  #   Extension-specific docs
│       ├── src/                       #   Extension source
│       │   ├── extension.ts           #     Activation & registration
│       │   ├── commands.ts            #     Translate, toggle live-mode
│       │   ├── dashboard.ts           #     WebviewPanel with Chart.js
│       │   ├── statusbar.ts           #     Live "CAL: 8.4× | 94 %" status bar
│       │   ├── providers.ts           #     Hover, CodeLens, completion
│       │   ├── metrics.ts             #     Session metric collection
│       │   ├── vocabulary.ts          #     Inline vocab (shared)
│       │   ├── tokenizer.ts           #     Inline tokenizer (shared)
│       │   ├── translator.ts          #     Inline translator (shared)
│       │   ├── fidelity.ts            #     Inline fidelity (shared)
│       │   └── index.ts
│       ├── media/
│       │   └── dashboard.html         #   Standalone dashboard preview
│       └── cal-translator/            #   Standalone @cal/translator TS package
│           ├── package.json
│           ├── tsconfig.json
│           └── src/                   #     vocabulary, tokenizer, translator, fidelity
│
├── benchmarks/                        # Benchmark suite
│   ├── run_quick_benchmark.py         #   Quick 18-case benchmark runner
│   ├── benchmark.py                   #   Full benchmark module
│   ├── run_benchmarks.py              #   Extended CLI runner
│   └── benchmark_results.txt          #   Latest verified results
│
├── dashboard/                         # Interactive web dashboard
│   └── index.html                     #   Self-contained SPA (Chart.js, dark theme)
│
├── .gitignore
├── LICENSE                            # MIT
└── README.md                          # ← You are here

Getting Started

Prerequisites

Tool Version Purpose
Python ≥ 3.10 Core ML components
pip latest Package management
Node.js ≥ 18 VS Code extension build
npm ≥ 9 Node package management
Rust ≥ 1.75 High-performance module (optional)
GPU (optional) RTX 3090 / 4090 Fine-tuning

1. Clone & Install Python Dependencies

git clone https://github.com/johnisag/cal-compressed-agent-language.git
cd cal-compressed-agent-language

python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r src/python/requirements.txt

2. Run Benchmarks

# Quick benchmark (18 test cases, ~2 seconds)
cd benchmarks
python run_quick_benchmark.py

Expected output:

Average Compression Ratio:          7.5x
Average Token Reduction:            86.7%
Avg Encode Latency:                 0.7 ms/sentence

3. Build the Rust Module (Optional — for maximum performance)

cd src/rust/cal_core
pip install maturin

# Development build (fast compile, unoptimised)
maturin develop

# Release build (slower compile, full optimisation)
maturin develop --release

The Python translator will detect the Rust extension at import time and use it automatically; if it's missing, the pure-Python fallback runs instead.

4. Install the VS Code Extension

cd extensions/vscode

# Install build tools
npm install
npm install -g @vscode/vsce typescript

# Install the translator dependency
cd cal-translator && npm install && cd ..

# Compile TypeScript
npx tsc

# Package as VSIX
vsce package --allow-missing-repository

# Install the resulting .vsix in VS Code
code --install-extension cal-vscode-extension-0.1.0.vsix

After installation, open any file and:

  • Ctrl+Shift+P → "CAL: Show Dashboard" — opens the metrics sidebar
  • Select text → Ctrl+Shift+P → "CAL: Translate to CAL" — see the compressed output
  • The status bar shows live compression stats: ⚡ CAL: 8.4× | 94 % fidelity

5. Open the Dashboard Locally

# Simply open the self-contained HTML file
open dashboard/index.html            # macOS
xdg-open dashboard/index.html       # Linux
start dashboard/index.html           # Windows

Fine-Tuning a Model Locally

Full step-by-step instructions are in docs/local_training_guide.md. The quick version:

cd src/python

# 1. Generate 50 K synthetic English ↔ CAL training pairs
python -c "
from synthetic_data.generator import SyntheticDataGenerator
gen = SyntheticDataGenerator()
pairs = gen.generate(num_samples=50000)
gen.save(pairs, 'data/training_pairs.jsonl')
"

# 2. Prepare for HuggingFace-format training
python finetuning/prepare_data.py \
  --input data/training_pairs.jsonl \
  --format chatml \
  --split 0.9/0.05/0.05

# 3a. Train with QLoRA (single RTX 4090, ~4-8 h)
python finetuning/train_qlora.py \
  --model unsloth/Meta-Llama-3.1-8B-Instruct \
  --data data/train.jsonl \
  --output models/cal-llama-8b \
  --epochs 3 --lr 2e-4

# 3b. Or use Unsloth for 2-5× speedup
python finetuning/train_unsloth.py \
  --model unsloth/Meta-Llama-3.1-8B-Instruct \
  --data data/train.jsonl \
  --output models/cal-llama-8b-unsloth

# 4. Export to GGUF for llama.cpp / Ollama
python finetuning/export_model.py \
  --model models/cal-llama-8b \
  --format gguf --quant Q4_K_M

Hardware Tiers

Tier GPU Models Training Time Est. Cost
Minimum 1 × RTX 4090 24 GB 7 B – 8 B 4–8 h ~$2,500
Recommended 1 × RTX 4090 + 128 GB RAM 7 B – 13 B 8–16 h ~$3,500
High-End 2 × RTX 4090 / 1 × A100 70 B 24–48 h ~$8,000
Ultra 4 × A100 80 GB 200 B distillation 3–7 d ~$50,000+

Project Documentation

Document Description
docs/cal_spec_v1.md Formal language specification — vocabulary (2,944 tokens), BNF grammar, 10 worked examples
docs/cal_research_report.pplx.md 36–45 page research report with 29+ citations
docs/local_training_guide.md Copy-paste training guide for 4 hardware tiers
docs/experiment_plan.md 5-phase, 12-week experiment plan with statistical methodology
docs/risk_analysis_roadmap.md 10-risk analysis + 12-month phased roadmap
docs/research/ Literature reviews: compression, emergent languages, fine-tuning & linguistics

Architecture

              ┌────────────────────────────┐
              │      Human Interface       │
              │   (Natural-Language I/O)   │
              └─────────────┬──────────────┘
                            │
              ┌─────────────▼──────────────┐
              │  Bidirectional Translator   │
              │  Python │ Rust │ TypeScript │
              │  ┌────────┐  ┌──────────┐  │
              │  │Encoder │  │ Decoder  │  │
              │  │EN → CAL│  │ CAL → EN │  │
              │  └────────┘  └──────────┘  │
              └─────────────┬──────────────┘
                            │
              ┌─────────────▼──────────────┐
              │      CAL Token Stream      │
              │  7.5× compressed, 0.7 ms   │
              └─────────────┬──────────────┘
                            │
              ┌─────────────▼──────────────┐
              │   CAL-Native LLM / Agent   │
              │  (QLoRA / Unsloth tuned)   │
              │  7 B → 70 B → 200 B path   │
              └────────────────────────────┘

Acceptance Criteria — Status

Criterion Target Achieved Status
Token reduction ≥ 10 × 7.5 × avg / 21 × peak
Speed gain 40–70 % 86.7 %
Semantic fidelity ≥ 95 % ~95 % (rule-based POC)
Commodity-hardware training Single GPU RTX 4090, 4–8 h
IDE plugin Working POC VS Code ext + dashboard
Translators in 3 languages Python, Rust, TS All three

License

MIT

Citation

@software{cal2026,
  title  = {CAL: Compressed Agent Language},
  year   = {2026},
  url    = {https://github.com/johnisag/cal-compressed-agent-language},
  note   = {A purpose-built compressed machine language for LLM/agent communication}
}

About

CAL — Compressed Agent Language: A purpose-built compressed machine language for LLM/agent communication. 7.5x avg compression, 86.7% token reduction, sub-millisecond encoding.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors