ADR-129: RuvLTRA GCloud Training Pipeline with TurboQuant Optimization #310

@ruvnet

Description

Overview

Implement the full training pipeline from ADR-129 to retrain RuvLTRA models with TurboQuant KV-cache profiling on Google Cloud.

Phases

Phase 1: imatrix Recalibration + TurboQuant KV Profiling (Week 1)

  • Build gcr.io/ruv-dev/ruvltra-training:latest Docker image
  • Run imatrix recalibration with code-focused calibration data
  • Generate .turboquant.json sidecar profiles per model
  • Benchmark recalibrated GGUFs vs baseline (ablation run B)
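The sidecar step above can be sketched as a small writer. The schema, field names, and `write_turboquant_sidecar` helper here are illustrative assumptions, not the actual format consumed by `turboquant_profile.rs`:

```python
import json
from pathlib import Path

def write_turboquant_sidecar(model_path: str, kv_bits: int, group_size: int,
                             per_layer_scales: dict[str, float]) -> Path:
    """Write a .turboquant.json sidecar next to a GGUF (hypothetical schema)."""
    profile = {
        "version": 1,
        "kv_cache": {"bits": kv_bits, "group_size": group_size},
        # Per-layer calibration scales, e.g. {"blk.0.attn_k": 0.0123, ...}
        "layers": per_layer_scales,
    }
    # "model.gguf" -> "model.turboquant.json", so loaders can find it by name
    sidecar = Path(model_path).with_suffix(".turboquant.json")
    sidecar.write_text(json.dumps(profile, indent=2))
    return sidecar
```

Keeping the profile in a sidecar rather than inside the GGUF means the same quantized weights can ship with updated KV-cache profiles without re-uploading the model.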

Phase 2: WET-Augmented LoRA Fine-Tuning (Weeks 2-3)

  • Export brain memories + WET data as training corpus
  • Run eval contamination check (13-gram overlap)
  • Validate dataset governance (schema, dedup, quality scores)
  • Run LoRA SFT on Vertex AI A100 (ablation run C)
  • Run DPO training (ablation run D)
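The 13-gram overlap check boils down to comparing token n-gram sets between the training corpus and each eval set. A minimal sketch (whitespace tokenization and these function names are assumptions; the real `contamination_check.py` may normalize text differently):

```python
def ngrams(tokens: list[str], n: int = 13) -> set[tuple[str, ...]]:
    """All contiguous n-grams in a token list (empty set if too short)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contaminated(train_text: str, eval_texts: dict[str, str], n: int = 13) -> list[str]:
    """Return names of eval sets sharing at least one n-gram with the training text."""
    train = ngrams(train_text.split(), n)
    return [name for name, text in eval_texts.items()
            if train & ngrams(text.split(), n)]
```

Any non-empty result fails gate G6, since a single shared 13-gram is strong evidence the eval item leaked into the training corpus.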

Phase 3: Benchmarking & Validation (Weeks 3-4)

  • Run full ablation matrix (runs A-E)
  • Evaluate all 7 release gates (G1-G7)
  • Produce contamination report + ablation report
  • Automate via scripts/training/release_gate.py

Phase 4: Publishing (Week 4)

  • Produce GGUF variants + .turboquant.json sidecars
  • Publish to HuggingFace (all 4 models)
  • Update model cards with benchmark results
  • Update ruvllm registry with checksums
  • Publish ruvllm and @ruvector/ruvllm with sidecar loading
  • Set up weekly benchmark scheduler job
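Updating the registry with checksums typically means a streaming SHA-256 over each published GGUF. A minimal sketch (chunked so multi-GB model files never load fully into memory; the helper name is an assumption):

```python
import hashlib

def sha256_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```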

Release Gates (G1-G7)

| Gate | Criterion |
|------|-----------|
| G1 | HumanEval pass@1 ≥ 45% (0.5B) / ≥ 55% (3B) |
| G2 | Routing accuracy ≥ 80% (no regression) |
| G3 | Wikitext-2 PPL increase < 5% |
| G4 | TurboQuant ≥ 8x compression, PPL delta < 1% |
| G5 | Long-context PPL < 20 at 16K tokens |
| G6 | Zero eval contamination |
| G7 | Inference ≥ 80 tok/s (0.5B) / ≥ 40 tok/s (3B) |
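The ship/no-ship decision reduces to comparing each measured metric against its gate threshold. A minimal sketch of how `release_gate.py` might structure this for the 0.5B model (metric keys, the `GATES_05B` table, and `evaluate_gates` are illustrative assumptions, not the script's actual API):

```python
import operator

# One (comparator, threshold) pair per gate; values mirror the table above.
GATES_05B = {
    "humaneval_pass1":    (">=", 0.45),  # G1
    "routing_accuracy":   (">=", 0.80),  # G2
    "ppl_increase":       ("<",  0.05),  # G3
    "tq_compression":     (">=", 8.0),   # G4
    "tq_ppl_delta":       ("<",  0.01),  # G4
    "long_ctx_ppl_16k":   ("<",  20.0),  # G5
    "eval_contamination": ("==", 0),     # G6
    "tok_per_s":          (">=", 80.0),  # G7
}

OPS = {">=": operator.ge, "<": operator.lt, "==": operator.eq}

def evaluate_gates(metrics: dict, gates: dict = GATES_05B) -> tuple[bool, list[str]]:
    """Return (ship?, list of failed gate names)."""
    failures = [name for name, (op, thr) in gates.items()
                if not OPS[op](metrics[name], thr)]
    return len(failures) == 0, failures
```

Keeping the thresholds in a data table rather than inline conditionals makes the 3B variant a second dict rather than a second code path.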

Infrastructure

  • Compute: L4 GPU (Cloud Run Jobs) + A100-80GB (Vertex AI)
  • Data: Brain memories (3,870+), WET corpus, Claude Flow routing (2,700+), ADR corpus (129 docs)
  • Estimated cost: ~$70-210 (experimental compute)

Files Created

  • scripts/training/release_gate.py — Automated ship/no-ship checker
  • scripts/training/export_training_data.py — Dataset export with governance
  • scripts/training/contamination_check.py — Eval contamination detection
  • scripts/training/Dockerfile — Training image
  • scripts/training/deploy_training.sh — Cloud Run job creation
  • scripts/training/run_calibration.py — Phase 1 entry point
  • scripts/training/run_sft.py — Phase 2 entry point
  • crates/ruvllm/src/quantize/turboquant_profile.rs — Sidecar config loading

Related

🤖 Generated with claude-flow
