Add OpenClaw recipes and document VRAM-tiered context defaults #1
kenvandine wants to merge 16 commits into
    },
    "model_name": "user.gemma-3-4b-it-GGUF",
    "labels": [
        "appear-builtin",
“appear-builtin” means “don’t show the user. prefix in the name”. Is that intentional here?
Main reason I ask is because the name gemma-3-4b-it-GGUF is very close to a builtin name.
That makes me wonder whether recipes should have a special prefix or suffix, such as openclaw.gemma-3-4b-it-GGUF or gemma-3-4b-it-GGUF-OpenClaw.
Good point. Switched to the openclaw. prefix across all recipes and dropped appear-builtin, so the names will display as openclaw.gemma-3-4b-it-GGUF etc.
        "main": "unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_M",
        "mmproj": "mmproj-F16.gguf"
    },
    "model_name": "user.Gemma-4-26B-A4B",
See comment above. In addition, let’s make sure the naming scheme is consistent: the prior file had the -GGUF suffix but this one doesn’t.
Fixed — model_name is now openclaw.Gemma-4-26B-A4B-GGUF.
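For reference, the naming convention settled on in this thread could be checked mechanically. The following is only a sketch: the regular expression and the `check_recipe` helper are hypothetical (not part of this PR), and the convention they encode (an `openclaw.` prefix, a `-GGUF` suffix, no `appear-builtin` label) is inferred from the comments above.

```python
import re

# Hypothetical convention inferred from the review thread:
# "openclaw." prefix, "-GGUF" suffix, and no "appear-builtin" label.
RECIPE_NAME = re.compile(r"^openclaw\.[A-Za-z0-9][A-Za-z0-9._-]*-GGUF$")


def check_recipe(recipe: dict) -> list[str]:
    """Return a list of naming problems found in one recipe dict."""
    problems = []
    name = recipe.get("model_name", "")
    if not RECIPE_NAME.match(name):
        problems.append(f"model_name {name!r} is not of the form openclaw.<model>-GGUF")
    if "appear-builtin" in recipe.get("labels", []):
        problems.append("appear-builtin would hide the prefix in the displayed name")
    return problems


if __name__ == "__main__":
    old = {"model_name": "user.Gemma-4-26B-A4B", "labels": ["appear-builtin"]}
    new = {"model_name": "openclaw.Gemma-4-26B-A4B-GGUF", "labels": []}
    print(check_recipe(old))
    print(check_recipe(new))
```

Run over every recipe file, a check like this would catch the inconsistency flagged above before review.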
    "recipe_options": {
        "ctx_size": 32768
    },
    "size": 2.56
Crazy thought… if we’re hardcoding the context size on these, could we statically factor that into the size? That would make VRAM-capacity-based filtering much easier on the client side.
Let me think about that and come back to you.
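To make the suggestion concrete, here is a rough sketch of what factoring the context size into the size could look like. It uses the standard KV-cache estimate for full attention (keys plus values, per layer, per KV head, per token). Everything model-specific is an assumption: the architecture numbers are placeholders rather than the real values for any recipe in this PR, `size` is assumed to be in GB, and compute buffers, sliding-window attention, and KV quantization are ignored, so the result should be read as an approximate upper bound.

```python
def kv_cache_gb(ctx_size, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV-cache size in GB for full attention.

    The factor 2 covers keys and values; bytes_per_elem=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_size * bytes_per_elem / 1e9


def footprint_gb(recipe, arch):
    """Weights ("size", assumed GB) plus KV cache at the recipe's hardcoded ctx_size."""
    ctx_size = recipe["recipe_options"]["ctx_size"]
    return recipe["size"] + kv_cache_gb(ctx_size, **arch)


def fits(recipe, arch, vram_gb, headroom_gb=1.0):
    """Client-side filter: keep a recipe only if it fits with some headroom."""
    return footprint_gb(recipe, arch) + headroom_gb <= vram_gb


if __name__ == "__main__":
    recipe = {"recipe_options": {"ctx_size": 32768}, "size": 2.56}
    # Placeholder architecture, NOT the real hyperparameters of any model here.
    arch = {"n_layers": 32, "n_kv_heads": 8, "head_dim": 128}
    print(round(footprint_gb(recipe, arch), 2))
    print(fits(recipe, arch, vram_gb=8.0), fits(recipe, arch, vram_gb=6.0))
```

If the footprint were precomputed and stored in the recipe, the client would only need the final comparison in `fits`.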
@@ -0,0 +1,13 @@
{
    "checkpoint": "unsloth/Qwen2.5-Coder-7B-Instruct-128K-GGUF:Qwen2.5-Coder-7B-Instruct-Q4_K_M.gguf",
What’s the reason for using Qwen2.5 and Qwen3-VL? I would have thought the newer Qwen3.5 models of the same size would be strictly better.
It's just from a list of recommended models I found in other sources. Now that I've proven the recipes mechanism works well with a decent collection of models, I can refresh that list, test some of the newer models, and remove some of the older ones.
Checked — there's no Qwen3.5-Coder at ~7-9B (the coder line jumps to Qwen3-Coder-30B and Qwen3-Coder-Next at 80B). Replaced Qwen2.5-Coder-7B with Qwen3.5-9B (5.68 GB Q4_K_M), which is the better general model at the same size tier. For vision, Qwen2.5-VL-7B is dropped since Qwen3-VL-8B is already in the set and is the more capable vision-specific model.
All Qwen3.5 models are vision-enabled, and in my tests (I have tested vision-enabled models a lot) they are much better than the older Qwen3-VL.
Good to know — updated both Qwen3.5-9B and Qwen3.5-35B-A3B with mmproj and the vision label, and dropped the Qwen3-VL-8B and Qwen3-VL-30B recipes.
- Replace user. prefix with openclaw. and drop appear-builtin label on all recipes
- Add -GGUF suffix to Gemma-4-26B-A4B model name for consistency
- Replace Qwen2.5-Coder-7B with Qwen3.5-9B (newer generation, better across the board)
- Remove Qwen2.5-VL-7B (superseded by Qwen3-VL-8B already in the set)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Qwen3.5 models are all vision-enabled and outperform Qwen3-VL.
- Add mmproj and vision label to Qwen3.5-9B and Qwen3.5-35B-A3B
- Remove Qwen3-VL-8B and Qwen3-VL-30B recipes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds --chat-template-kwargs to explicitly disable thinking mode. With thinking enabled, <think> blocks in the output may cause openclaw to not respond.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
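As an illustration of the commit above, the flag takes a JSON object that is forwarded to the model's chat template. The sketch below builds such an argument pair and, as a client-side fallback, strips any `<think>` blocks that still appear. The `enable_thinking` key is an assumption: it is the name used by Qwen3-family chat templates, and other templates may expect something else. How the recipes in this PR actually pass the flag is not shown here, and both helper names are hypothetical.

```python
import json
import re


def thinking_off_args() -> list[str]:
    """Build the llama.cpp server argument pair that disables thinking.

    Assumes the chat template reads an "enable_thinking" variable.
    """
    template_kwargs = {"enable_thinking": False}
    return ["--chat-template-kwargs", json.dumps(template_kwargs)]


# Non-greedy match so several separate <think> blocks are each removed.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Fallback: remove <think>...</think> blocks from a model response."""
    return THINK_BLOCK.sub("", text)


if __name__ == "__main__":
    print(thinking_off_args())
    print(strip_think("<think>plan the reply</think>\nHello"))
```

Disabling thinking at the template level is the cleaner option, since the model then never spends tokens on the hidden reasoning in the first place.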
Both models are small enough to fit in iGPU shared memory (2.3 GB and 4.1 GB at Q4_K_M) and support tool-calling via llama.cpp CPU/Vulkan backends.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix Phi-4-mini checkpoint to use quantization variant (Q4_K_M) instead of a full filename that doesn't exist in the repo. Remove Mistral-7B-v0.3 recipe as it requires HF authentication to download.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drop the Q4_K_M variant since no matching file is found in the repo. Let lemonade pick the first .gguf file automatically.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phi-4-mini was slow on Intel Vulkan and not responding in openclaw. Llama-3.2-3B has a well-supported tool-calling format and a similar size (~2 GB).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix reserveTokensFloor complaint by matching the ctx_size used by other recipes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This PR adds a collection of model recipes curated for OpenClaw-style agent and assistant use cases.
Changes
- openclaw/ directory with recipes for:
- openclaw/README.md documenting VRAM-tiered context size defaults
- README.md to reference the new openclaw/ directory
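A VRAM-tiered default can be thought of as a lookup from available memory to a context size. The sketch below shows one way a client might apply such a table. The `pick_ctx_size` helper is hypothetical, and the tier values in the usage example are placeholders chosen for illustration; the actual thresholds are the ones documented in openclaw/README.md, which are not reproduced here.

```python
def pick_ctx_size(vram_gb, tiers, fallback):
    """Return the ctx_size of the highest tier whose minimum VRAM is met.

    tiers is a list of (min_vram_gb, ctx_size) pairs; fallback is used
    when vram_gb is below every tier.
    """
    chosen = fallback
    for min_vram_gb, ctx_size in sorted(tiers):
        if vram_gb >= min_vram_gb:
            chosen = ctx_size
    return chosen


if __name__ == "__main__":
    # Placeholder tiers for illustration only, NOT the values from the README.
    example_tiers = [(8, 16384), (16, 32768)]
    for vram in (4, 8, 12, 24):
        print(vram, pick_ctx_size(vram, example_tiers, fallback=8192))
```

Keeping the tiers as data rather than hardcoding them in the function means the README table and the client logic can be updated together.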