CoreML colgrep fails under default macOS TMPDIR; use stable model cache dir #129

@RBozydar

Summary

CoreML-enabled colgrep can fail during model loading on macOS when the default per-user temp directory is used.

The failure happens before indexing or searching starts, while ONNX Runtime's CoreML execution provider compiles the ONNX model into a temporary CoreML bundle under $TMPDIR/onnxruntime-*.model.mlmodelc and loads it.

The same binary, model, and minimal project succeed when only TMPDIR is changed to a plain writable directory under /private/tmp.

This means users currently need to prefix every model-loading invocation with a custom TMPDIR, for example:

TMPDIR=/private/tmp/colgrep-tmp/ colgrep init
TMPDIR=/private/tmp/colgrep-tmp/ colgrep "query"

Environment

  • colgrep 1.5.4
  • Installed from crates.io with CoreML enabled:
cargo install colgrep --features "accelerate,coreml" --force
  • Active binary:
/Users/$USER/.cargo/bin/colgrep
  • Model:
lightonai/LateOn-Code-edge (CoreML)
  • Default temp directory:
/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/

Listed via its /private/var/... path, the same directory carries the com.apple.rootless extended attribute and the sunlnk flag:

drwx------@ ... /private/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T
com.apple.rootless: folders
flags: sunlnk

/private/tmp does not have the same rootless metadata and works.
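
To check whether a given machine is in the affected configuration, a small helper can classify the process temp dir. This is only a heuristic sketch: the function names are hypothetical, the check is a plain path-prefix test rather than an inspection of the rootless attribute, and it assumes the temp dir that Rust's standard library reports (derived from TMPDIR) is the one ONNX Runtime/CoreML ends up using.

```rust
use std::path::Path;

/// Heuristic (assumption): treat any temp dir under /var/folders or
/// /private/var/folders as the per-user macOS location seen in this report.
/// `Path::starts_with` compares whole path components, so "/var/foldersX"
/// does not match.
fn is_per_user_macos_tmp(tmp: &Path) -> bool {
    tmp.starts_with("/var/folders") || tmp.starts_with("/private/var/folders")
}

/// Describe the temp dir this process sees (std reads TMPDIR on Unix).
fn describe_process_tmp() -> String {
    let tmp = std::env::temp_dir();
    let kind = if is_per_user_macos_tmp(&tmp) {
        "per-user macOS temp dir"
    } else {
        "other location"
    };
    format!("{} ({})", tmp.display(), kind)
}
```

Printing `describe_process_tmp()` with and without the TMPDIR override should show which of the two cases a given invocation falls into.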

Reproduction

A minimal repro using a one-file project and an isolated colgrep data directory.

This fails with the default macOS $TMPDIR:

rm -rf /private/tmp/colgrep-coreml-default-tmp
mkdir -p /private/tmp/colgrep-coreml-default-tmp/indices /private/tmp/colgrep-coreml-default-tmp/project
printf '{\n  "default_model": "lightonai/LateOn-Code-edge"\n}\n' > /private/tmp/colgrep-coreml-default-tmp/config.json
printf 'pub fn hello_colgrep_debug() -> &'\''static str { "hello" }\n' > /private/tmp/colgrep-coreml-default-tmp/project/lib.rs

COLGREP_DATA_DIR=/private/tmp/colgrep-coreml-default-tmp/indices \
  colgrep init /private/tmp/colgrep-coreml-default-tmp/project -y

Observed output:

🤖 Model: lightonai/LateOn-Code-edge (CoreML)
📂 Building index...
Error: Failed to load ColBERT model

Caused by:
    0: Failed to load ONNX model
    1: Error compiling model: Failed to create a working directory appropriate for URL: file:///var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/

In a larger real repo, the same failure also appears as a CoreML compiled model load/plan-build failure:

Error: Failed to load ColBERT model

Caused by:
    0: Failed to load ONNX model
    1: Failed to create MLModel, error: Failed to build the model execution plan using a model architecture file '/private/var/folders/c6/g90y1z5n60b06q22qmzr70zr0000gp/T/onnxruntime-11475FAC-14AE-4132-A80E-FD8C260D4D6C-95744-000000734CDC4F0E.model.mlmodelc/model.espresso.net' with error code: -2.

An earlier attempt produced:

Failed to create MLModel, error: IO Error loading model from compiled model archive: Error opening file .../model.espresso.net: Operation not permitted

Control Case

The same minimal project succeeds when TMPDIR is set to a plain directory under /private/tmp:

rm -rf /private/tmp/colgrep-coreml-fp32
mkdir -p /private/tmp/colgrep-coreml-fp32/indices /private/tmp/colgrep-coreml-fp32/tmp /private/tmp/colgrep-coreml-fp32/project
printf '{\n  "default_model": "lightonai/LateOn-Code-edge"\n}\n' > /private/tmp/colgrep-coreml-fp32/config.json
printf 'pub fn hello_colgrep_debug() -> &'\''static str { "hello" }\n' > /private/tmp/colgrep-coreml-fp32/project/lib.rs

COLGREP_DATA_DIR=/private/tmp/colgrep-coreml-fp32/indices \
TMPDIR=/private/tmp/colgrep-coreml-fp32/tmp/ \
  colgrep init /private/tmp/colgrep-coreml-fp32/project -y

Observed output:

🤖 Model: lightonai/LateOn-Code-edge (CoreML)
📂 Building index...
Indexed /private/tmp/colgrep-coreml-fp32/project (added: 1, changed: 0, deleted: 0, unchanged: 0)

Diagnosis

The failure lies in CoreML's choice of temporary working directory, not in the downloaded Hugging Face model cache or the repo being indexed.

Evidence:

  • Deleting the Hugging Face model cache and the repo index does not resolve the issue.
  • A one-file project reproduces the failure with the default macOS $TMPDIR.
  • The same one-file project succeeds when only TMPDIR changes to /private/tmp/....
  • The failing path is always an ONNX Runtime/CoreML-generated compiled model bundle:
$TMPDIR/onnxruntime-*.model.mlmodelc/model.espresso.net

In next-plaid-onnx, CoreML is configured without a model cache directory:

fn configure_coreml(builder: SessionBuilder) -> Result<SessionBuilder> {
    builder
        .with_execution_providers([CoreMLExecutionProvider::default().build()])
        .map_err(|e| anyhow::anyhow!("Failed to configure CoreML execution provider: {e:?}"))
}

The ort crate's CoreML provider supports an explicit model cache directory:

CoreMLExecutionProvider::default()
    .with_model_cache_dir("/path/to/cache")
    .build()

Without that option, ONNX Runtime/CoreML compiles temporary model artifacts under the process temp dir. In this macOS environment, the default temp dir under /var/folders/.../T has rootless metadata, and CoreML fails to create or open the compiled bundle there.

Regression / Provenance

There are two related version boundaries:

  1. CoreML support / Apple Silicon release features existed earlier.

    Commit d9f2199 (Fix release builds missing hardware acceleration (Accelerate, CoreML)) made Apple Silicon release builds inject:

    features = ["coreml", "accelerate"]
    

    That commit is contained in releases from 1.0.6 onward.

  2. The colgrep init regression appears to start in v1.5.3.

    In v1.5.2, IndexBuilder::ensure_model_created still selected CPU for non-CUDA colgrep builds, even if the binary was compiled with coreml:

    #[cfg(not(feature = "cuda"))]
    let (num_sessions, execution_provider) = {
        crate::onnx_runtime::ensure_onnx_runtime()
            .context("Failed to initialize ONNX Runtime")?;
    
        (
            self.parallel_sessions
                .unwrap_or_else(crate::config::get_default_cpu_parallel_sessions),
            ExecutionProvider::Cpu,
        )
    };

    Commit 54da546 (feat(onnx): add DirectML/MIGraphX/CoreML execution provider support (#119)) changed that branch so non-CUDA builds with coreml select ExecutionProvider::CoreML:

    #[cfg(any(feature = "directml", feature = "migraphx", feature = "coreml"))]
    #[cfg(not(feature = "cuda"))]
    let (num_sessions, execution_provider) = {
        crate::onnx_runtime::ensure_onnx_runtime()
            .context("Failed to initialize ONNX Runtime")?;
    
        let provider = if cfg!(feature = "directml") {
            ExecutionProvider::DirectML
        } else if cfg!(feature = "migraphx") {
            ExecutionProvider::MIGraphX
        } else {
            ExecutionProvider::CoreML
        };
    
        (
            self.parallel_sessions
                .unwrap_or_else(crate::config::get_default_cpu_parallel_sessions),
            provider,
        )
    };

    git tag --contains 54da546 shows this commit is contained in the following tags, v1.5.3 being the first:

    v1.5.3
    v1.5.4
    

    So the specific colgrep init failure, where the model line prints (CoreML), was very likely introduced in v1.5.3 and remains present in v1.5.4.

Nuance: set-model validation and some search-load paths may have been able to hit CoreML earlier through the lower-level ExecutionProvider::Auto / Searcher path. However, the reproducible init regression changed specifically in 54da546 / v1.5.3.

Version Workaround Verification

Downgrading to colgrep 1.5.2 via Cargo avoids the init regression, even when compiled with coreml:

cargo install colgrep --version 1.5.2 --features "accelerate,coreml" --force

Verified in a real repo:

~/repo/$repository  colgrep init
📋 Resuming interrupted build: 2 files already indexed, 281 remaining
🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
Indexed /Users/$USER/repo/$repository (added: 281, changed: 0, deleted: 0, unchanged: 2)

Also verified with a clean two-file smoke project:

rm -rf /private/tmp/colgrep-152-smoke
mkdir -p /private/tmp/colgrep-152-smoke/project
printf 'pub fn add(left: i32, right: i32) -> i32 { left + right }\n' > /private/tmp/colgrep-152-smoke/project/lib.rs
printf 'fn main() { println!("{}", important_business_logic()); }\nfn important_business_logic() -> &'\''static str { "ok" }\n' > /private/tmp/colgrep-152-smoke/project/main.rs
colgrep init /private/tmp/colgrep-152-smoke/project -y

Output:

🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
Indexed /private/tmp/colgrep-152-smoke/project (added: 2, changed: 0, deleted: 0, unchanged: 0)

Note that the model line does not print (CoreML) on 1.5.2, matching the source-level finding that init still selected ExecutionProvider::Cpu before v1.5.3.

Expected Behavior

CoreML-enabled colgrep should work without requiring users to override TMPDIR for every invocation.

Suggested Fix

Configure CoreML with a stable model cache directory instead of relying on $TMPDIR.

Possible implementation:

  • Add support in next-plaid-onnx for NEXT_PLAID_COREML_CACHE_DIR.
  • If unset, choose a stable default on macOS such as:
~/Library/Caches/next-plaid/coreml

or a colgrep-owned cache path such as:

~/.cache/colgrep/coreml
  • Ensure the directory exists before constructing the CoreML EP.
  • Use:
CoreMLExecutionProvider::default()
    .with_model_cache_dir(cache_dir.display().to_string())
    .build()

This should also reduce repeated CoreML compile cost across invocations and across multiple ONNX sessions.
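
The directory selection described above could be sketched roughly as follows. This is not code from next-plaid-onnx: NEXT_PLAID_COREML_CACHE_DIR is the variable proposed in this issue rather than an existing option, the function names are hypothetical, and the temp-dir fallback when HOME is unset is an extra assumption. The sketch uses only the standard library; the ort call is left as a comment.

```rust
use std::env;
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Proposed (hypothetical) override variable from this issue.
const CACHE_DIR_ENV: &str = "NEXT_PLAID_COREML_CACHE_DIR";

/// Choose the cache dir: a non-empty override wins, then a per-user default
/// under $HOME/Library/Caches, then (assumption) a subdirectory of the
/// process temp dir as a last resort.
fn resolve_coreml_cache_dir(override_dir: Option<OsString>, home: Option<OsString>) -> PathBuf {
    if let Some(dir) = override_dir.filter(|v| !v.is_empty()) {
        return PathBuf::from(dir);
    }
    match home.filter(|v| !v.is_empty()) {
        Some(home) => PathBuf::from(home).join("Library/Caches/next-plaid/coreml"),
        None => env::temp_dir().join("next-plaid-coreml"),
    }
}

/// Read the environment, resolve the directory, and create it so that it
/// exists before the CoreML execution provider is constructed.
fn ensure_coreml_cache_dir() -> io::Result<PathBuf> {
    let dir = resolve_coreml_cache_dir(env::var_os(CACHE_DIR_ENV), env::var_os("HOME"));
    fs::create_dir_all(&dir)?;
    // The returned path would then be passed to the provider, e.g.
    // .with_model_cache_dir(dir.display().to_string())
    Ok(dir)
}
```

Keeping the resolution step free of environment reads makes the precedence (override, then home default, then fallback) easy to unit-test without touching the real environment.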

Workaround

Users can work around the issue by wrapping colgrep:

mkdir -p /private/tmp/colgrep-tmp
chmod 700 /private/tmp/colgrep-tmp

colgrep() {
  TMPDIR=/private/tmp/colgrep-tmp/ command colgrep "$@"
}
