perf(coreml): cache compiled models by default (faster runs + fixes restricted $TMPDIR) (#129) #142
Merged
…ache-dir (#129)

CoreML compiles the ONNX model into a CoreML bundle at session creation. With no cache dir, ONNX Runtime compiles into the ephemeral $TMPDIR, so the model is recompiled on every invocation, and on macOS setups where $TMPDIR (under /var/folders/.../T) is rootless-restricted the compile fails outright (#129).

- next-plaid-onnx: point CoreML at a persistent per-user model cache dir by default (~/Library/Caches/next-plaid/coreml, honoring XDG_CACHE_HOME). The compiled model is reused across runs and never compiled under $TMPDIR. Precedence: NEXT_PLAID_COREML_CACHE_DIR -> default cache dir -> $TMPDIR (only if the cache dir cannot be created).
- colgrep: persistent `coreml_cache_dir` setting to relocate the cache (`colgrep settings --coreml-cache-dir <PATH>` / `--clear-coreml-cache-dir`), shown in `colgrep settings`, exported as NEXT_PLAID_COREML_CACHE_DIR at startup.

Benefit (per-invocation --force-gpu CoreML search, fresh process each run):
- LateOn-Code-edge: ~1.01s -> ~0.70s/run after first (~31% faster)
- answerai-small: ~1.85s -> ~1.16s/run after first (~37% faster)

First (cold) run matches main; every subsequent run loads the cached model. Also fixes the rootless-$TMPDIR crash by construction (no $TMPDIR compile).

Closes #129

Co-authored-by: RBozydar <34664833+RBozydar@users.noreply.github.com>
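As a quick sanity check on the quoted speedups, the percentages can be recomputed from the before/after timings in the commit message. This is only an arithmetic sketch: `percent_faster` is a hypothetical helper name, and the timings are the approximate figures quoted above, not new measurements.

```rust
/// Percentage reduction in per-run latency going from `before` to `after` (seconds).
/// Hypothetical helper for illustration only.
fn percent_faster(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // Approximate timings quoted in the commit message (seconds per run).
    let runs = [
        ("LateOn-Code-edge", 1.01, 0.70),
        ("answerai-small", 1.85, 1.16),
    ];
    for (model, before, after) in runs {
        println!("{model}: {:.0}% faster", percent_faster(before, after));
    }
}
```

Rounded to whole percentages, the results agree with the ~31% and ~37% figures stated in the commit message.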
Force-pushed from bab14b1 to 9885755.
Fixes #129. Co-authored with the issue reporter @RBozydar.
What & why
CoreML compiles the ONNX model into a CoreML bundle at session creation. With no cache dir, ONNX Runtime compiles into the ephemeral `$TMPDIR`, so:
- the model is recompiled on every `colgrep` invocation (each run is a fresh process), and
- on macOS setups where `$TMPDIR` (under `/var/folders/.../T`) is rootless-restricted, the compile fails outright (the reported crash).

This points CoreML at a persistent per-user cache dir by default, `~/Library/Caches/next-plaid/coreml` (honoring `XDG_CACHE_HOME`), so the compiled model persists across runs and never touches `$TMPDIR`.

Benefit (benchmarked: per-invocation `--force-gpu` CoreML search, fresh process each run)
- `LateOn-Code-edge`: ~1.01s -> ~0.70s/run after first (~31% faster)
- `answerai-small`: ~1.85s -> ~1.16s/run after first (~37% faster)

The first (cold) run matches `main`; every subsequent run loads the cached compiled model. Since searches run constantly, this compounds.

And it fixes the original crash by construction: CoreML no longer compiles under `$TMPDIR`, so the rootless-restriction failure can't occur.

Behavior / compatibility
- Compiled models are now written to a persistent cache under `~/Library/Caches` instead of using ephemeral temp. (Opted into for the speed + robustness win.)
- Precedence: `NEXT_PLAID_COREML_CACHE_DIR` / `colgrep settings --coreml-cache-dir` → default cache dir → `$TMPDIR` (only if the cache dir can't be created).

Override (persistent colgrep setting)
- `colgrep settings --coreml-cache-dir <PATH>` relocates the cache; `colgrep settings --clear-coreml-cache-dir` clears the override. The current value is shown in `colgrep settings` and exported as `NEXT_PLAID_COREML_CACHE_DIR` at startup.
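The precedence described here (explicit override, then the default per-user cache dir, then `$TMPDIR` only if the cache dir can't be created) could be sketched roughly as follows. This is not the code from the PR: `candidate_cache_dir` and `resolve_cache_dir` are hypothetical names, and the sketch assumes that `XDG_CACHE_HOME`, when set, replaces `~/Library/Caches` as the base directory and that an uncreatable override also falls back to temp.

```rust
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: picks the preferred cache dir without touching the filesystem.
/// Arguments are the values of NEXT_PLAID_COREML_CACHE_DIR, XDG_CACHE_HOME and HOME.
fn candidate_cache_dir(
    override_dir: Option<&str>,
    xdg_cache_home: Option<&str>,
    home: Option<&str>,
) -> Option<PathBuf> {
    // 1. An explicit, non-empty override wins.
    if let Some(dir) = override_dir.filter(|d| !d.is_empty()) {
        return Some(PathBuf::from(dir));
    }
    // 2. Default per-user cache dir. Assumption: XDG_CACHE_HOME replaces the base.
    let base = match xdg_cache_home.filter(|d| !d.is_empty()) {
        Some(xdg) => PathBuf::from(xdg),
        None => Path::new(home.filter(|h| !h.is_empty())?)
            .join("Library")
            .join("Caches"),
    };
    Some(base.join("next-plaid").join("coreml"))
}

/// Hypothetical resolver: falls back to the temp dir only when no candidate
/// exists or the candidate directory cannot be created.
fn resolve_cache_dir() -> PathBuf {
    let var = |name: &str| env::var(name).ok();
    let candidate = candidate_cache_dir(
        var("NEXT_PLAID_COREML_CACHE_DIR").as_deref(),
        var("XDG_CACHE_HOME").as_deref(),
        var("HOME").as_deref(),
    );
    match candidate {
        Some(dir) if fs::create_dir_all(&dir).is_ok() => dir,
        _ => env::temp_dir(), // 3. Last resort: ephemeral temp.
    }
}

fn main() {
    println!("CoreML cache dir: {}", resolve_cache_dir().display());
}
```

Keeping the candidate selection free of filesystem access makes the precedence easy to unit-test; only `resolve_cache_dir` creates directories.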
Validation
- Built with `--features coreml`; unit tests (`coreml_cache_dir` set/get/clear/serialize) + full suites green via `ci-quick`.
- Could not reproduce the `/var/folders` crash locally, but caching-by-default removes the `$TMPDIR` dependency that causes it. @RBozydar, confirming on your environment would be the final check.

Closes #129