
Tremayne Timms

ML & AI Engineer — Fine-Tuning · Agentic Systems · Edge Deployment · Production LLM Ops



About Me

I build production LLM systems from the metal up — from quantized models running on Jetson edge hardware to multi-agent cloud deployments with tool-use, permission gating, and audit trails. Currently focused on MoE fine-tuning (ZAYA1-8B), Blackwell-native FP4 quantization (NVFP4), and SOTA agentic coding benchmarks.

Dallas-Fort Worth, TX · ttimmsinternational@gmail.com

Python Rust TypeScript PyTorch CUDA Docker GitHub Actions PostgreSQL


Current Focus (May 2026)

| Project | What | Why It Matters |
|---------|------|----------------|
| zaya1-godspeed | Fine-tuning ZAYA1-8B MoE for agentic tool calling | 760M active params matching 14B models — closing a deliberate gap Zyphra left in the tech report |
| llama.cpp NVFP4 | Blackwell-native FP4 quantization with MSE-optimal scales | First consumer NVFP4 tooling on RTX 5070 Ti — PR #22897 awaiting upstream review |
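The MSE-optimal scale search mentioned above can be sketched in a few lines: instead of dividing by the block's absolute maximum, sweep candidate scales and keep the one that minimizes reconstruction error against the E2M1 (FP4) grid. The grid values below are the standard E2M1 magnitudes; the sweep range and step count are illustrative, not the values used in the llama.cpp PR.

```python
# Illustrative per-block FP4 (E2M1) quantization with an MSE-optimal
# scale search. Not the llama.cpp implementation; a sketch of the idea.

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes

def quantize_block(values, scale):
    """Map each value to the nearest representable FP4 point (scale > 0)."""
    out = []
    for v in values:
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(mag * scale if v >= 0 else -mag * scale)
    return out

def mse_optimal_scale(values, steps=64):
    """Sweep scales around amax/6 and keep the least-squared-error one."""
    amax = max(abs(v) for v in values) or 1.0
    base = amax / 6.0  # 6.0 is the largest FP4 magnitude (naive scale)
    best_scale, best_err = base, float("inf")
    for i in range(steps):
        s = base * (0.5 + i / steps)  # sweep 0.5x .. ~1.5x of naive scale
        err = sum((v - q) ** 2 for v, q in zip(values, quantize_block(values, s)))
        if err < best_err:
            best_scale, best_err = s, err
    return best_scale
```

By construction the naive amax/6 scale is one of the sweep candidates, so the searched scale never does worse than the naive one.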

What I'm Building

**godspeed-coding-agent**: Security-first open-source coding agent. Hand-rolled async ReAct loop with a 4-tier deny-first permission engine, SHA-256 hash-chained audit trail, and 200+ LLM providers via LiteLLM. 854 tests.

  • 30+ built-in tools with JSON Schema validation, MCP server + client
  • Parallel + speculative tool dispatch, cost budget enforcement
  • Self-evolution via LLM-guided mutations, multi-language verify gate with retry
  • Training data export (openai/chatml/sharegpt), per-step reward annotations for GRPO
  • SWE-bench Lite: 34.8% single-shot · 52.2% oracle best-of-5
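The hash-chained audit trail works like a tiny blockchain: each entry's SHA-256 digest covers the previous entry's digest, so any retroactive edit invalidates every later hash. A minimal sketch, with illustrative field names and an all-zero genesis hash (not the project's actual schema):

```python
# Sketch of a SHA-256 hash-chained, append-only audit log.
import hashlib
import json
import time

def append_entry(log, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"ts": time.time(), "event": event, "prev": prev}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append({**body, "hash": digest})
    return log

def verify_chain(log):
    """Recompute every hash; any tampering breaks the chain."""
    prev = "0" * 64
    for entry in log:
        body = {k: entry[k] for k in ("ts", "event", "prev")}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```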

**sovereign-edge**: Autonomous multi-agent personal intelligence system on the NVIDIA Jetson Orin Nano. 5 LangGraph expert agents, LiteLLM gateway (4 providers + Ollama), 3-tier ONNX intent router. 393 tests. Fully on-device, zero cloud dependencies.

**manna-trading**: Multi-agent algorithmic crypto trading pipeline with DeepSeek R1 reasoning at every stage. 4-agent pipeline (TA → Chief → Risk → Execution), Kelly Criterion position sizing, Monte Carlo risk simulation, real-time WebSocket market data.
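Kelly Criterion sizing reduces to one formula: f* = p - (1 - p) / b, where p is the win probability and b the win/loss payoff ratio. A minimal sketch with a fractional-Kelly cap (the half-Kelly default here is a common risk control, not necessarily this project's setting):

```python
# Sketch of Kelly Criterion position sizing with a fractional cap.

def kelly_fraction(win_prob, win_loss_ratio):
    """Classic Kelly: f* = p - (1 - p) / b, clamped to [0, 1]."""
    if win_loss_ratio <= 0:
        return 0.0
    f = win_prob - (1.0 - win_prob) / win_loss_ratio
    return max(0.0, min(1.0, f))

def position_size(balance, win_prob, win_loss_ratio, kelly_scale=0.5):
    """Risk a scaled fraction of the balance (half-Kelly by default)."""
    return balance * kelly_fraction(win_prob, win_loss_ratio) * kelly_scale
```

Fractional Kelly trades a little expected growth for a large reduction in drawdown variance, which is why full Kelly is rarely used live.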

**bible-ai-assistant**: Qwen3.5-4B fine-tuned with ORPO for biblical Q&A. Hybrid RAG (ChromaDB + BM25 + cross-encoder reranking), constitutional AI guardrails, voice pipeline (Whisper + Kokoro TTS), Gradio UI. 183 tests, 34 W&B runs, 5,925 training steps.
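Merging sparse (BM25) and dense (ChromaDB) result lists before cross-encoder reranking needs a fusion step; reciprocal rank fusion is one standard choice, sketched below. The k = 60 constant is the usual RRF default, and the fusion method itself is an assumption about this project, not taken from its code.

```python
# Sketch of reciprocal rank fusion (RRF) for hybrid retrieval.

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking.

    Each list contributes 1 / (k + rank) per document; documents that
    rank well in multiple lists accumulate the highest fused score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```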

**gpu-server-test-suite**: Comprehensive GPU fleet validation modeled on NVIDIA DCGM. 16 diagnostic modules, Prometheus + Grafana, fault injection, JUnit XML output for CI. 188 tests.

ML research control plane — experiment lifecycle management, model registry, cloud training launcher. Orchestrates gpu-server-test-suite (preflight checks) and llm-wiki (knowledge persistence). 28 tests, v0.1.0.

**llm-wiki**: Git-backed knowledge base following Karpathy's LLM Wiki pattern. LangGraph ingest/query pipelines, instructor + Pydantic structured output, BM25 search, Groq → Gemini → Ollama fallback via LiteLLM. 117 tests, 40 wiki pages.
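The Groq → Gemini → Ollama fallback amounts to trying providers in order until one succeeds. In the project this routing is delegated to LiteLLM; the hand-rolled loop below is only a sketch of the pattern, with stub provider callables standing in for real API clients.

```python
# Sketch of an ordered provider fallback chain (LiteLLM does this
# internally; stubs stand in for real Groq/Gemini/Ollama clients).

def with_fallback(providers, prompt):
    """Try each (name, call) pair in order; return the first success.

    Collects per-provider errors so a total failure is diagnosable.
    """
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would match specific errors
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```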

SQL + Python ETL pipeline for semiconductor quality analysis — supplier performance scoring, defect Pareto distributions, yield trend analysis.
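A defect Pareto analysis sorts categories by frequency and tracks the cumulative share, exposing the few categories that account for most defects. A dependency-free sketch with illustrative data (the actual pipeline uses SQL + pandas):

```python
# Sketch of a defect Pareto computation: sort by count, attach
# cumulative-share percentages.

def pareto(defect_counts):
    """Return (category, count, cumulative_pct) rows, largest first."""
    total = sum(defect_counts.values())
    rows, cum = [], 0
    for cat, n in sorted(defect_counts.items(), key=lambda kv: -kv[1]):
        cum += n
        rows.append((cat, n, round(100.0 * cum / total, 1)))
    return rows
```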

**tesla-tire-wear-ml**: Multi-model ML pipeline for Tesla tire wear prediction. Random Forest, XGBoost, and neural-network ensemble with Claude AI integration.


Open Source Contributions

  • llama.cpp #22897 — NVFP4 default type mapping + per-tensor scale tensors + MSE-optimal correction
  • llama.cpp #22858 — Missing LLAMA_FTYPE_MOSTLY_NVFP4 case fix (closed, replaced by #22897)
  • Zyphra/ZAYA1-8B — Agentic fine-tuning to complete the model's post-training (SFT + GRPO)

GitHub Activity


📈 Contribution Graph

Skills

| Area | Technologies |
|------|--------------|
| LLMs & Agents | LiteLLM (200+ providers), Ollama, llama.cpp, multi-agent orchestration, ReAct loops |
| Fine-Tuning | Unsloth, TRL (SFT/DPO/GRPO/ORPO), QLoRA, PEFT, MoE architectures, RLHF/RLAIF |
| Inference | vLLM (custom forks), speculative decoding (750 tok/s), TensorRT-LLM, EXL2 |
| Quantization | NVFP4 (Blackwell-native), GGUF, EXL2, FP8, NF4, GPTQ, AWQ |
| ML Infrastructure | PyTorch, CUDA 12.8, torch.compile, DeepSpeed, lm-eval, W&B, MLflow |
| Systems | Python, Rust, TypeScript, Docker, GitHub Actions CI/CD, systemd |
| Edge / Hardware | NVIDIA Jetson Orin Nano, RTX 5070 Ti (Blackwell sm_120), 16 GB VRAM optimization |
| Data | PostgreSQL, SQL, pandas, SQLAlchemy, ChromaDB, LanceDB, BM25 |

Tremayne Timms · GitHub · LinkedIn · Email
