CoreML colgrep fails under default macOS TMPDIR; use stable model cache dir

## Summary

CoreML-enabled `colgrep` can fail during model loading on macOS when the default per-user temp directory is used.

The failure happens before indexing/searching starts, while ONNX Runtime's CoreML execution provider is compiling/loading the ONNX model into a temporary compiled CoreML bundle under `$TMPDIR/onnxruntime-*.model.mlmodelc`.

The same binary, model, and minimal project succeed when only `TMPDIR` is changed to a plain writable directory under `/private/tmp`.

This means users currently need to prefix every model-loading invocation with a custom `TMPDIR`, for example:

```bash
TMPDIR=/private/tmp/colgrep-tmp/ colgrep init
TMPDIR=/private/tmp/colgrep-tmp/ colgrep "query"
```

## Environment

- `colgrep 1.5.4`
- Installed from crates.io with CoreML enabled:

```bash
cargo install colgrep --features "accelerate,coreml" --force
```

- Active binary:

```text
/Users/$USER/.cargo/bin/colgrep
```

- Model:

```text
lightonai/LateOn-Code-edge (CoreML)
```

- Default temp directory:

```text
/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/
```

The same path via `/private/var/...` has Apple metadata:

```text
drwx------@ ... /private/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T
com.apple.rootless: folders
flags: sunlnk
```

`/private/tmp` does not have the same rootless metadata and works.

## Reproduction

Minimal isolated repro using a one-file project and isolated colgrep data dir.

This fails with the default macOS `$TMPDIR`:

```bash
rm -rf /private/tmp/colgrep-coreml-default-tmp
mkdir -p /private/tmp/colgrep-coreml-default-tmp/indices /private/tmp/colgrep-coreml-default-tmp/project
printf '{\n  "default_model": "lightonai/LateOn-Code-edge"\n}\n' > /private/tmp/colgrep-coreml-default-tmp/config.json
printf 'pub fn hello_colgrep_debug() -> &'\''static str { "hello" }\n' > /private/tmp/colgrep-coreml-default-tmp/project/lib.rs

COLGREP_DATA_DIR=/private/tmp/colgrep-coreml-default-tmp/indices \
  colgrep init /private/tmp/colgrep-coreml-default-tmp/project -y
```

Observed output:

```text
🤖 Model: lightonai/LateOn-Code-edge (CoreML)
📂 Building index...
Error: Failed to load ColBERT model

Caused by:
    0: Failed to load ONNX model
    1: Error compiling model: Failed to create a working directory appropriate for URL: file:///var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/
```

In a larger real repo, the same failure also appears as a CoreML compiled model load/plan-build failure:

```text
Error: Failed to load ColBERT model

Caused by:
    0: Failed to load ONNX model
    1: Failed to create MLModel, error: Failed to build the model execution plan using a model architecture file '/private/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/onnxruntime-11475FAC-14AE-4132-A80E-FD8C260D4D6C-95744-000000734CDC4F0E.model.mlmodelc/model.espresso.net' with error code: -2.
```

And previously:

```text
Failed to create MLModel, error: IO Error loading model from compiled model archive: Error opening file .../model.espresso.net: Operation not permitted
```

## Control Case

The same minimal project succeeds when `TMPDIR` is set to a plain directory under `/private/tmp`:

```bash
rm -rf /private/tmp/colgrep-coreml-fp32
mkdir -p /private/tmp/colgrep-coreml-fp32/indices /private/tmp/colgrep-coreml-fp32/tmp /private/tmp/colgrep-coreml-fp32/project
printf '{\n  "default_model": "lightonai/LateOn-Code-edge"\n}\n' > /private/tmp/colgrep-coreml-fp32/config.json
printf 'pub fn hello_colgrep_debug() -> &'\''static str { "hello" }\n' > /private/tmp/colgrep-coreml-fp32/project/lib.rs

COLGREP_DATA_DIR=/private/tmp/colgrep-coreml-fp32/indices \
TMPDIR=/private/tmp/colgrep-coreml-fp32/tmp/ \
  colgrep init /private/tmp/colgrep-coreml-fp32/project -y
```

Observed output:

```text
🤖 Model: lightonai/LateOn-Code-edge (CoreML)
📂 Building index...
Indexed /private/tmp/colgrep-coreml-fp32/project (added: 1, changed: 0, deleted: 0, unchanged: 0)
```

## Diagnosis

The failing boundary is CoreML temporary working directory selection, not the downloaded Hugging Face model cache and not the repo being indexed.

Evidence:

- Deleting the Hugging Face model cache and the repo index does not resolve the issue.
- A one-file project reproduces the failure with the default macOS `$TMPDIR`.
- The same one-file project succeeds when only `TMPDIR` changes to `/private/tmp/...`.
- The failing path is always an ONNX Runtime/CoreML-generated compiled model bundle:

```text
$TMPDIR/onnxruntime-*.model.mlmodelc/model.espresso.net
```

In `next-plaid-onnx`, CoreML is configured without a model cache directory:

```rust
fn configure_coreml(builder: SessionBuilder) -> Result<SessionBuilder> {
    builder
        .with_execution_providers([CoreMLExecutionProvider::default().build()])
        .map_err(|e| anyhow::anyhow!("Failed to configure CoreML execution provider: {e:?}"))
}
```

The `ort` crate's CoreML provider supports an explicit model cache directory:

```rust
CoreMLExecutionProvider::default()
    .with_model_cache_dir("/path/to/cache")
    .build()
```

Without that option, ONNX Runtime/CoreML compiles temporary model artifacts under the process temp dir. On this macOS environment, the default temp dir under `/var/folders/.../T` has rootless metadata and CoreML fails to create/open the compiled bundle there.

## Regression / Provenance

There are two related version boundaries:

1. **CoreML support / Apple Silicon release features existed earlier.**

   Commit `d9f2199` (`Fix release builds missing hardware acceleration (Accelerate, CoreML)`) made Apple Silicon release builds inject:

   ```text
   features = ["coreml", "accelerate"]
   ```

   That commit is contained in releases from `1.0.6` onward.

2. **The `colgrep init` regression appears to start in `v1.5.3`.**

   In `v1.5.2`, `IndexBuilder::ensure_model_created` still selected CPU for non-CUDA colgrep builds, even if the binary was compiled with `coreml`:

   ```rust
   #[cfg(not(feature = "cuda"))]
   let (num_sessions, execution_provider) = {
       crate::onnx_runtime::ensure_onnx_runtime()
           .context("Failed to initialize ONNX Runtime")?;

       (
           self.parallel_sessions
               .unwrap_or_else(crate::config::get_default_cpu_parallel_sessions),
           ExecutionProvider::Cpu,
       )
   };
   ```

   Commit `54da546` (`feat(onnx): add DirectML/MIGraphX/CoreML execution provider support (#119)`) changed that branch so non-CUDA builds with `coreml` select `ExecutionProvider::CoreML`:

   ```rust
   #[cfg(any(feature = "directml", feature = "migraphx", feature = "coreml"))]
   #[cfg(not(feature = "cuda"))]
   let (num_sessions, execution_provider) = {
       crate::onnx_runtime::ensure_onnx_runtime()
           .context("Failed to initialize ONNX Runtime")?;

       let provider = if cfg!(feature = "directml") {
           ExecutionProvider::DirectML
       } else if cfg!(feature = "migraphx") {
           ExecutionProvider::MIGraphX
       } else {
           ExecutionProvider::CoreML
       };

       (
           self.parallel_sessions
               .unwrap_or_else(crate::config::get_default_cpu_parallel_sessions),
           provider,
       )
   };
   ```

   `git tag --contains 54da546` shows this commit is first included in:

   ```text
   v1.5.3
   v1.5.4
   ```

   So the specific `colgrep init` failure where the model line prints `(CoreML)` is very likely introduced in `v1.5.3`, and remains present in `v1.5.4`.

Nuance: `set-model` validation and some search-load paths may have been able to hit CoreML earlier through the lower-level `ExecutionProvider::Auto` / Searcher path. However, the reproducible `init` regression changed specifically in `54da546` / `v1.5.3`.

## Version Workaround Verification

Downgrading to `colgrep 1.5.2` via Cargo avoids the `init` regression, even when compiled with `coreml`:

```bash
cargo install colgrep --version 1.5.2 --features "accelerate,coreml" --force
```

Verified in a real repo:

```text
~/repo/$repository  colgrep init
📋 Resuming interrupted build: 2 files already indexed, 281 remaining
🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
Indexed /Users/$USER/repo/$repository (added: 281, changed: 0, deleted: 0, unchanged: 2)
```

Also verified with a clean two-file smoke project:

```bash
rm -rf /private/tmp/colgrep-152-smoke
mkdir -p /private/tmp/colgrep-152-smoke/project
printf 'pub fn add(left: i32, right: i32) -> i32 { left + right }\n' > /private/tmp/colgrep-152-smoke/project/lib.rs
printf 'fn main() { println!("{}", important_business_logic()); }\nfn important_business_logic() -> &'\''static str { "ok" }\n' > /private/tmp/colgrep-152-smoke/project/main.rs
colgrep init /private/tmp/colgrep-152-smoke/project -y
```

Output:

```text
🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
Indexed /private/tmp/colgrep-152-smoke/project (added: 2, changed: 0, deleted: 0, unchanged: 0)
```

Note that the model line does not print `(CoreML)` on `1.5.2`, matching the source-level finding that `init` still selected `ExecutionProvider::Cpu` before `v1.5.3`.


## Expected Behavior

CoreML-enabled `colgrep` should work without requiring users to override `TMPDIR` for every invocation.

## Suggested Fix

Configure CoreML with a stable model cache directory instead of relying on `$TMPDIR`.

Possible implementation:

- Add support in `next-plaid-onnx` for `NEXT_PLAID_COREML_CACHE_DIR`.
- If unset, choose a stable default on macOS such as:

```text
~/Library/Caches/next-plaid/coreml
```

or a colgrep-owned cache path such as:

```text
~/.cache/colgrep/coreml
```

- Ensure the directory exists before constructing the CoreML EP.
- Use:

```rust
CoreMLExecutionProvider::default()
    .with_model_cache_dir(cache_dir.display().to_string())
    .build()
```

This should also reduce repeated CoreML compile cost across invocations and across multiple ONNX sessions.

## Workaround

Users can work around the issue by wrapping `colgrep`:

```zsh
mkdir -p /private/tmp/colgrep-tmp
chmod 700 /private/tmp/colgrep-tmp

colgrep() {
  TMPDIR=/private/tmp/colgrep-tmp/ command colgrep "$@"
}
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CoreML colgrep fails under default macOS TMPDIR; use stable model cache dir #129

Summary

Environment

Reproduction

Control Case

Diagnosis

Regression / Provenance

Version Workaround Verification

Expected Behavior

Suggested Fix

Workaround

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

CoreML colgrep fails under default macOS TMPDIR; use stable model cache dir #129

Description

Summary

Environment

Reproduction

Control Case

Diagnosis

Regression / Provenance

Version Workaround Verification

Expected Behavior

Suggested Fix

Workaround

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions