ondeinference · keypair34 · Jun 27, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,66 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What this is
+
+`onde-cli` is a native Rust terminal UI (binary name `onde`) for managing an Onde Inference account and running a local model pipeline: fine-tune a safetensors small language model with LoRA, merge the adapter, export to GGUF, test it in a local chat, upload it to Hugging Face, and assign it to an Onde app. The packages on every distribution channel (npm, PyPI, NuGet, pub.dev, Homebrew, crates.io) are thin wrappers around this one binary; the Rust crate is the source of truth.
+
+## Build, run, test
+
+```sh
+cargo build                 # debug build
+cargo run                   # launches the TUI
+cargo test                  # unit tests only (the heavy ones are #[ignore])
+cargo clippy --all-targets
+```
+
+The build **bakes credentials in at compile time** via `build.rs`, which reads `.env` (or the environment in CI). The build fails if any of these are missing: `ONDE_APP_ID`, `ONDE_APP_SECRET`, `GRESIQ_API_KEY`, `GRESIQ_API_SECRET`. `HF_TOKEN` is optional (defaults to empty). Changing `.env` triggers a rebuild. These are exposed in code as `env!(...)` constants in `src/app.rs`.
+
+On macOS the build pulls in candle's `metal` and `accelerate` features, so inference and training run on Metal; elsewhere they fall back to CPU.
+
+### Tests that load real models
+
+`src/gguf.rs` has two `#[ignore]` integration tests that export a ~1 GB GGUF and run inference through `onde::mistralrs`. They need a Qwen3-0.6B model cached in the Onde App Group container and skip gracefully when it is absent.
+
+```sh
+cargo test gguf::tests::exported_qwen3_gguf_is_runnable -- --ignored --nocapture
+cargo test gguf::tests::finetune_merge_export_run     -- --ignored --nocapture
+```
+
+Env knobs for these: `ONDE_TEST_MODEL_DIR`, `ONDE_TEST_DTYPE` (`f16`/`q8_0`), `ONDE_TEST_LR`.
+
+### Debugging the TUI
+
+`main.rs` redirects both stdout and stderr to `~/.cache/onde/debug.log` before ratatui takes the alternate screen, because `mistral.rs` writes to both fds and would otherwise tear up the TUI. To see what the app or the inference engine is doing, tail that log; `println!`/`eprintln!`/`log::*` all land there, not on screen.
+
+## Architecture
+
+### TUI event loop (`app.rs`, `ui.rs`, `main.rs`)
+
+The whole app is one `App` struct plus a `Screen` enum acting as a state machine. `app::run` owns a single `tokio::sync::mpsc` channel of `AuthEvent`s and a `crossterm` `EventStream`, multiplexed with `tokio::select!`. Keystrokes mutate `App` synchronously; anything slow (network calls, downloads, fine-tune, merge, GGUF export, chat inference) is spawned as a background tokio task or OS thread that streams progress back as `AuthEvent`s, which `App::apply` folds into state. `ui.rs` is a pure render of `App` and intentionally does not depend on the SDK directly (it uses `OndeApp`/`OndeModel` through the re-exports in `app.rs`). When adding a feature, the pattern is: add a `Screen` variant, a key handler, an `AuthEvent` variant, and a background task that emits progress.
+
+Background work uses two task kinds deliberately: network/IO uses `tokio::spawn`; CPU-heavy tensor work (`finetune`, `merge`, `gguf`) uses `std::thread::spawn` so it does not starve the async runtime.
+
+### The local model pipeline
+
+These modules form the fine-tune-to-deploy chain, each running on a background thread and streaming a `*Progress` enum:
+
+- `finetune.rs` — hand-written LoRA trainer (candle). Builds a Qwen forward pass (RMSNorm, RoPE, GQA, optional Qwen3 QK-norm), trains LoRA A/B on q/v projections in F32, writes `lora_adapter.safetensors`. Gradients are sanitized (non-finite elements zeroed) and globally norm-clipped before each AdamW step; a step is skipped entirely if its global grad norm is non-finite. Skipping this guard corrupts every weight after the first bad step.
+- `merge.rs` — folds the LoRA adapter back into base weights (`W + scale·(B@A)`), writes a merged `model.safetensors` plus copied config/tokenizer.
+- `gguf.rs` — **hand-rolled GGUF writer** (no llama.cpp). Conventions that must hold for mistral.rs/candle to load and run the file correctly: tensor dims are written innermost-first (reverse of safetensors shape, because candle reverses on read); `head_dim` comes from config and is emitted as `attention.key_length`/`value_length` (Qwen3 decouples it from `hidden_size/num_heads`); `token_type` is an INT32 array; `general.architecture` is chosen from `model_type` so Qwen3 routes to candle's `quantized_qwen3` loader.
+- `chat.rs` — loads a local GGUF via `onde::mistralrs::GgufModelBuilder` (the same engine the Onde SDK uses) and streams a multi-turn chat, so a model can be tested before publishing.
+- `hf_upload.rs` / `hf_clone.rs` / `hf_search.rs` / `hf.rs` — Hugging Face Hub: upload the GGUF, check/create a repo, search models, and resolve/merge the local HF cache (incl. the macOS Onde App Group container).
+
+### Account / deploy side (`gresiq.rs`, `token.rs`, `project.rs`)
+
+`gresiq.rs` wraps `smbcloud-gresiq-sdk` for apps and the model catalog. "Deploying" a model means `assign_model(app_id, model_id)` against a catalog entry; the end app (e.g. `rumilearnpersian`) then fetches that assignment through the Onde SDK's `load_assigned_model` and downloads the GGUF. The CLI can only assign models that already exist in the GresIQ catalog. `token.rs` persists the auth token; `project.rs` manages per-project fine-tune workspaces under `~/.onde`.
+
+### The `onde` dependency
+
+The `onde` crate provides the inference engine (`onde::mistralrs`, a vendored mistral.rs) and `onde::inference::models::SUPPORTED_MODEL_INFO` (the supported-model catalog the inference picker mirrors). It is normally the published crate; `Cargo.toml` has commented `path`/`[patch.crates-io]` blocks for developing against local checkouts of `onde`, the `smbcloud-*` crates, and `candle`. Never commit with those uncommented.
+
+## Conventions
+
+- Git: merge feature/release/hotfix branches with `--no-ff` (explicit merge commits). Tag the merge commit that holds the final release state. See `.agents/skills/git/SKILL.md`.
+- Distribution changes: see `.agents/skills/distribution/SKILL.md` before touching any wrapper package; keep all channel versions aligned with the Rust crate version.
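
Review note: the merge rule quoted in the `merge.rs` bullet of CLAUDE.md, `W + scale·(B@A)`, can be illustrated with a small standalone sketch. This is not the code in `merge.rs`; it uses plain row-major `Vec<f32>` matrices instead of candle tensors, and the shapes (`A` is `rank × in`, `B` is `out × rank`) and the helper names `matmul` and `merge_lora` are assumptions made for illustration.

```rust
/// Row-major matrix multiply: (m x k) * (k x n) -> (m x n).
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut out = vec![0.0_f32; m * n];
    for i in 0..m {
        for p in 0..k {
            let a_ip = a[i * k + p];
            for j in 0..n {
                out[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    out
}

/// Hypothetical helper: fold a LoRA adapter into a base weight.
/// Assumed shapes: w is (out_dim x in_dim), a is (rank x in_dim),
/// b is (out_dim x rank). Returns w + scale * (b @ a).
fn merge_lora(
    w: &[f32],
    a: &[f32],
    b: &[f32],
    out_dim: usize,
    in_dim: usize,
    rank: usize,
    scale: f32,
) -> Vec<f32> {
    let delta = matmul(b, a, out_dim, rank, in_dim);
    w.iter().zip(&delta).map(|(&w, &d)| w + scale * d).collect()
}

fn main() {
    // 2x2 identity base weight with a rank-1 adapter.
    let w = [1.0, 0.0, 0.0, 1.0];
    let b = [1.0, 2.0]; // 2 x 1
    let a = [3.0, 4.0]; // 1 x 2
    let merged = merge_lora(&w, &a, &b, 2, 2, 1, 0.5);
    println!("{:?}", merged); // prints [2.5, 2.0, 3.0, 5.0]
}
```

The scale is passed in rather than derived; how `merge.rs` computes it is not stated in the diff.
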
diff --git a/src/app.rs b/src/app.rs
@@ -97,35 +97,44 @@ pub struct AdapterEntry {
     pub kind: ArtifactKind,
 }
 
-impl AdapterEntry {
-    /// Classify the location of this artifact for display purposes.
-    ///
-    /// - `"Onde Inference"`: inside the shared App Group container
-    /// - `"HF Cache"`: inside `~/.cache/huggingface/hub` or `$HF_HOME`
-    /// - the raw path string: anything else, usually a custom or local export
-    pub fn location_label(&self) -> String {
-        let path_str = self.path.to_string_lossy();
+/// Classify a filesystem path into a short, human-friendly location label.
+///
+/// Long absolute paths (e.g. the App Group container) get truncated to
+/// gibberish in narrow TUI fields, so anywhere we'd otherwise print a raw
+/// path we show this label instead.
+///
+/// - `"Onde Inference"`: inside the shared App Group container
+/// - `"HF Cache"`: inside `~/.cache/huggingface/hub` or `$HF_HOME`
+/// - the parent directory string: anything else, usually a custom or local export
+pub fn location_label_for_path(path: &std::path::Path) -> String {
+    let path_str = path.to_string_lossy();
+
+    // App Group container (macOS)
+    if path_str.contains("group.com.ondeinference.apps") {
+        return "Onde Inference".to_string();
+    }
 
-        // App Group container (macOS)
-        if path_str.contains("group.com.ondeinference.apps") {
-            return "Onde Inference".to_string();
-        }
+    // Standard HF cache locations
+    if path_str.contains(".cache/huggingface/hub") {
+        return "HF Cache".to_string();
+    }
+    if let Ok(hf_home) = std::env::var("HF_HOME")
+        && path_str.starts_with(&hf_home)
+    {
+        return "HF Cache".to_string();
+    }
 
-        // Standard HF cache locations
-        if path_str.contains(".cache/huggingface/hub") {
-            return "HF Cache".to_string();
-        }
-        if let Ok(hf_home) = std::env::var("HF_HOME")
-            && path_str.starts_with(&hf_home)
-        {
-            return "HF Cache".to_string();
-        }
+    // For custom or local paths, just show the directory.
+    path.parent()
+        .map(|p| p.to_string_lossy().to_string())
+        .unwrap_or_else(|| path_str.to_string())
+}
 
-        // For custom or local paths, just show the directory.
-        self.path
-            .parent()
-            .map(|p| p.to_string_lossy().to_string())
-            .unwrap_or_else(|| path_str.to_string())
+impl AdapterEntry {
+    /// Classify the location of this artifact for display purposes.
+    /// See [`location_label_for_path`].
+    pub fn location_label(&self) -> String {
+        location_label_for_path(&self.path)
     }
 
     /// Whether this GGUF should show the upload-to-HuggingFace UI.
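
Review note: now that the classification is a free function, it is easier to unit test. The sketch below is one possible way to do that, not part of this PR: `label_with_hf_home` is a hypothetical variant that follows the same branch order as `location_label_for_path` but takes the `HF_HOME` value as a parameter, so the result does not depend on the process environment. The example paths are made up.

```rust
use std::path::Path;

/// Hypothetical testable variant of `location_label_for_path`:
/// same checks in the same order, with `hf_home` injected by the caller.
fn label_with_hf_home(path: &Path, hf_home: Option<&str>) -> String {
    let path_str = path.to_string_lossy();

    // App Group container (macOS)
    if path_str.contains("group.com.ondeinference.apps") {
        return "Onde Inference".to_string();
    }

    // Standard HF cache location, then a caller-supplied HF_HOME prefix
    if path_str.contains(".cache/huggingface/hub") {
        return "HF Cache".to_string();
    }
    if let Some(home) = hf_home {
        if path_str.starts_with(home) {
            return "HF Cache".to_string();
        }
    }

    // Anything else: show the parent directory.
    path.parent()
        .map(|p| p.to_string_lossy().to_string())
        .unwrap_or_else(|| path_str.to_string())
}

fn main() {
    let cases = [
        ("/Users/me/Library/Group Containers/group.com.ondeinference.apps/m.gguf", None),
        ("/home/me/.cache/huggingface/hub/models--x/m.gguf", None),
        ("/data/hf/hub/m.gguf", Some("/data/hf")),
        ("/tmp/export/m.gguf", None),
    ];
    for (path, hf_home) in cases {
        println!("{path} -> [{}]", label_with_hf_home(Path::new(path), hf_home));
    }
}
```

The prefix check is a plain string comparison, as in the diff, so an empty `HF_HOME` would match every path; a test with `Some("")` would make that behavior visible.
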

diff --git a/src/gguf.rs b/src/gguf.rs
@@ -1078,7 +1078,10 @@ mod tests {
         });
         let adapter_path = adapter_path.expect("fine-tune must produce an adapter");
         eprintln!("[e2e] fine-tune done, last loss {last_loss}");
-        assert!(last_loss.is_finite(), "training loss is NaN/Inf — training diverged");
+        assert!(
+            last_loss.is_finite(),
+            "training loss is NaN/Inf — training diverged"
+        );
 
         // 2. Merge.
         let merged_dir = work.join("merged");
@@ -1158,8 +1161,15 @@ mod tests {
             Ok("f16") => GgufDtype::F16,
             _ => GgufDtype::Q8_0,
         };
-        eprintln!("[test] model_dir={} dtype={}", model_dir.display(),
-            if matches!(dtype, GgufDtype::F16) { "f16" } else { "q8_0" });
+        eprintln!(
+            "[test] model_dir={} dtype={}",
+            model_dir.display(),
+            if matches!(dtype, GgufDtype::F16) {
+                "f16"
+            } else {
+                "q8_0"
+            }
+        );
 
         let out_dir = std::env::temp_dir().join("onde-gguf-test");
         std::fs::create_dir_all(&out_dir).unwrap();

diff --git a/src/ui.rs b/src/ui.rs
@@ -1860,10 +1860,21 @@ fn render_finetune_done(frame: &mut Frame, app: &App, adapter_path: &std::path::
         .style(Style::new().bg(C_SURFACE_STRONG));
     let path_inner = path_block.inner(rows[3]);
     frame.render_widget(path_block, rows[3]);
+    // Show a friendly location badge plus the file name instead of the raw
+    // absolute path, which truncates to gibberish in this narrow box.
+    let adapter_file = adapter_path
+        .file_name()
+        .map(|n| n.to_string_lossy().to_string())
+        .unwrap_or_default();
     frame.render_widget(
-        Paragraph::new(adapter_path.to_string_lossy().to_string())
-            .style(Style::new().fg(C_NEON))
-            .wrap(Wrap { trim: true }),
+        Paragraph::new(Line::from(vec![
+            Span::styled(
+                format!("[{}]", crate::app::location_label_for_path(adapter_path)),
+                Style::new().fg(C_NEON).bold(),
+            ),
+            Span::styled(format!("  {adapter_file}"), Style::new().fg(C_TEXT)),
+        ]))
+        .wrap(Wrap { trim: true }),
         path_inner,
     );
 
@@ -1938,8 +1949,8 @@ fn render_merge_gguf_section(frame: &mut Frame, app: &App, area: Rect) {
                     Span::styled("✓ ", Style::new().fg(C_NEON)),
                     Span::styled("Merged → ", Style::new().fg(C_NEON).bold()),
                     Span::styled(
-                        output_path.to_string_lossy().to_string(),
-                        Style::new().fg(C_TEXT),
+                        format!("[{}]", crate::app::location_label_for_path(output_path)),
+                        Style::new().fg(C_TEXT).bold(),
                     ),
                 ])),
                 rows[0],