common: fix --override-kv to support comma-separated values #18056

ServeurpersoCom · 2025-12-15T12:46:13Z

Two KV overrides working:

(root|~/llama.cpp.pascal) ./build/bin/llama-server --port 8081 \
  --model /var/www/ia/models/mradermacher/gemma-3-1b-it-i1-GGUF/gemma-3-1b-it.i1-Q6_K.gguf \
  --override-kv "tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=bool:false"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
... etc...
print_info: file size   = 958.64 MiB (8.04 BPW)
validate_override: Using metadata override ( bool) 'tokenizer.ggml.add_bos_token' = false
validate_override: Using metadata override ( bool) 'tokenizer.ggml.add_eos_token' = false
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
... etc...

Help message when an override has an invalid type:

(root|~/llama.cpp.pascal) ./build/bin/llama-server --port 8081 \
  --model /var/www/ia/models/mradermacher/gemma-3-1b-it-i1-GGUF/gemma-3-1b-it.i1-Q6_K.gguf \
  --override-kv "tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=INVALID:blah"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
string_parse_kv_override: invalid type for KV override 'tokenizer.ggml.add_eos_token=INVALID:blah'
error while handling argument "--override-kv": error: Invalid type for KV override: tokenizer.ggml.add_eos_token=INVALID:blah


usage:
--override-kv KEY=TYPE:VALUE,...        advanced option to override model metadata by key. use comma-separated
                                        list of overrides.
                                        types: int, float, bool, str. example: --override-kv
                                        tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=bool:false


to show complete usage, run with -h
(root|~/llama.cpp.pascal)

Fixes #18040
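
To make the `KEY=TYPE:VALUE,...` syntax from the help text concrete, here is a minimal sketch of a parser for it. It is not the code in common/arg.cpp: the struct and both function names are hypothetical, values are kept as text rather than converted, and comma escaping is not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed override (not llama.cpp's own struct).
struct kv_override_sketch {
    std::string key;
    std::string type;   // one of: int, float, bool, str
    std::string value;  // kept as text; a real parser would convert it
};

// Parse one "KEY=TYPE:VALUE" entry and append it to `out`.
// Returns false on a malformed entry or an unknown type.
static bool parse_kv_override_sketch(const std::string & entry,
                                     std::vector<kv_override_sketch> & out) {
    const size_t eq = entry.find('=');
    if (eq == std::string::npos || eq == 0) {
        return false;
    }
    const size_t colon = entry.find(':', eq + 1);
    if (colon == std::string::npos) {
        return false;
    }

    kv_override_sketch kv;
    kv.key   = entry.substr(0, eq);
    kv.type  = entry.substr(eq + 1, colon - eq - 1);
    kv.value = entry.substr(colon + 1);

    if (kv.type != "int" && kv.type != "float" && kv.type != "bool" && kv.type != "str") {
        return false;
    }
    out.push_back(kv);
    return true;
}

// Split the argument on ',' and parse each entry. Stops at the first invalid
// entry; entries parsed before it remain in `out`.
static bool parse_kv_override_list_sketch(const std::string & arg,
                                          std::vector<kv_override_sketch> & out) {
    std::stringstream ss(arg);
    std::string entry;
    while (std::getline(ss, entry, ',')) {
        if (!parse_kv_override_sketch(entry, out)) {
            return false;
        }
    }
    return true;
}
```

With the two command lines above, the first argument would yield two entries and the second would be rejected at `INVALID:blah`.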

ggerganov · 2025-12-15T12:49:33Z

I think it's OK, but would like @ngxson to confirm.

ngxson

We will probably need to handle the case key=str:a,b, where the string value is "a,b".

But this can be resolved later; there is a chance that no one actually uses it.

common/arg.cpp

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

personalmountains · 2025-12-15T15:27:50Z

Thanks for this PR, but I feel like it just introduces more problems. It will require further changes to escape the comma, it doesn't fix the other options, and it breaks backward compatibility. I can come up with a PR today that allows multiple parameters again, it's already done locally. Would that make sense or am I misunderstanding something?

ServeurpersoCom · 2025-12-15T15:44:46Z

> Thanks for this PR, but I feel like it just introduces more problems. It will require further changes to escape the comma, it doesn't fix the other options, and it breaks backward compatibility. I can come up with a PR today that allows multiple parameters again, it's already done locally. Would that make sense or am I misunderstanding something?

Thanks for the feedback! I understand your concern about backward compatibility.
However, I think the comma-separated approach is cleaner for several reasons:

Consistency: It matches --override-tensor (-ot) which already uses commas
Router compatibility: Multi-argument support doesn't work with the router/preset system (which uses std::map)
Simplicity: Users only need to change a few characters in their startup scripts

That said, I'm open to discussion. If the maintainers prefer keeping backward compatibility with multi-arguments, I can adjust the PR. But IMO, consistency across the codebase is more valuable than preserving a pattern that's broken in router mode anyway.
What do @ggerganov and @ngxson think?
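
The router compatibility point can be illustrated with a small sketch. It assumes only what the comment states, namely that the preset system stores arguments in a std::map; the type alias and function name are hypothetical and the real preset code may differ.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a map-backed argument store: one value per flag.
using arg_map_sketch = std::map<std::string, std::string>;

// Store (flag, value) pairs in order. Assigning to an existing key replaces
// the earlier value, so a repeated flag collapses to its last occurrence.
static arg_map_sketch collect_args_sketch(
        const std::vector<std::pair<std::string, std::string>> & args) {
    arg_map_sketch out;
    for (const auto & arg : args) {
        out[arg.first] = arg.second;
    }
    return out;
}
```

Passing `--override-kv` twice to such a store leaves a single entry holding the second value, whereas one comma-separated value keeps both overrides.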

ngxson · 2025-12-15T16:10:04Z

@personalmountains I'm quite against supporting repeated args because, as explained in the OP, the environment variable does not accept them.

The arg.cpp and preset.cpp system should be designed to treat CLI args, ini presets and env vars the same way. I think having dedicated logic to escape the comma won't hurt anyway.

ngxson · 2025-12-15T16:12:47Z

Also, the design of presets using std::map was for a reason: some args should not be specified twice (which can potentially lead to undefined behavior). It also allows easier merging of multiple presets in the future, where one preset can extend another.

So IMO we should not make repeated args the default experience. Other config formats like JSON or YAML apparently don't want it either, so why would we?
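
A rough sketch of the preset-extension idea, assuming presets are plain std::map objects keyed by argument name. The alias, function name and sample keys are illustrative only, and no such merging exists in the code discussed here yet.

```cpp
#include <map>
#include <string>

// Hypothetical preset type: argument name -> single value.
using preset_sketch = std::map<std::string, std::string>;

// Build a preset that extends `base`: keys set in `child` win, and every
// other key is inherited from `base`.
static preset_sketch extend_preset_sketch(const preset_sketch & base,
                                          const preset_sketch & child) {
    preset_sketch merged = child;
    // The range overload of insert() skips keys that already exist,
    // so values from `child` are never overwritten.
    merged.insert(base.begin(), base.end());
    return merged;
}
```

Because each key holds exactly one value, the merge rule stays a simple "child overrides base". With repeated arguments the merge would also have to decide whether to append or replace.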

personalmountains · 2025-12-15T16:25:57Z

Fair enough.

personalmountains · 2025-12-15T17:32:17Z

Sorry, maybe also add a noisy warning when options are discarded? I'm just concerned about the silent backward compatibility break. This can break a model in subtle ways by failing to override certain keys. I found this by chance, really.

ggerganov

I agree with @ngxson. It's not ideal that the change is not backwards compatible, but the prospect of improving the new config system is a good tradeoff IMO.

ServeurpersoCom · 2025-12-15T17:46:04Z

I propose we add a warning directly in the parser for all duplicated arguments. This way everyone gets into good habits, and it allows for progressive migration toward cleaner config patterns.

ngxson · 2025-12-15T17:50:54Z

Yes, that sounds good. We should remove all help messages that mention repeated args too, so new users won't follow the same pattern.

ServeurpersoCom · 2025-12-15T17:51:56Z

For --override-kv, comma escaping could be supported for edge cases where a comma appears in the value itself:

diff --git a/common/arg.cpp b/common/arg.cpp
index 002c3c168..1ecad7e9c 100644
--- a/common/arg.cpp
+++ b/common/arg.cpp
@@ -2196,7 +2196,32 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
         "advanced option to override model metadata by key. to specify multiple overrides, either use comma-separated or repeat this argument.\n"
         "types: int, float, bool, str. example: --override-kv tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=bool:false",
         [](common_params & params, const std::string & value) {
-            for (const auto & kv_override : string_split<std::string>(value, ',')) {
+            std::vector<std::string> kv_overrides;
+
+            std::string current;
+            bool escaping = false;
+
+            for (const char c : value) {
+                if (escaping) {
+                    current.push_back(c);
+                    escaping = false;
+                } else if (c == '\\') {
+                    escaping = true;
+                } else if (c == ',') {
+                    kv_overrides.push_back(current);
+                    current.clear();
+                } else {
+                    current.push_back(c);
+                }
+            }
+
+            if (escaping) {
+                current.push_back('\\');
+            }
+
+            kv_overrides.push_back(current);
+
+            for (const auto & kv_override : kv_overrides) {
                 if (!string_parse_kv_override(kv_override.c_str(), params.kv_overrides)) {
                     throw std::runtime_error(string_format("error: Invalid type for KV override: %s\n", kv_override.c_str()));
                 }
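
For experimenting outside the parser, the escaping loop from the diff above can be lifted into a standalone function. The function name is hypothetical; the logic mirrors the added lines of the diff.

```cpp
#include <string>
#include <vector>

// Split `value` on unescaped commas. A backslash makes the next character
// literal (so "\," yields ','), and a trailing lone backslash is kept.
static std::vector<std::string> split_escaped_commas_sketch(const std::string & value) {
    std::vector<std::string> parts;
    std::string current;
    bool escaping = false;

    for (const char c : value) {
        if (escaping) {
            current.push_back(c);   // take the escaped character verbatim
            escaping = false;
        } else if (c == '\\') {
            escaping = true;        // drop the backslash, escape the next char
        } else if (c == ',') {
            parts.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }

    if (escaping) {
        current.push_back('\\');    // input ended right after a backslash
    }
    parts.push_back(current);
    return parts;
}
```

Two consequences of this loop: an empty input produces one empty entry, and every backslash is consumed, so a literal backslash in a value has to be written as `\\`.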

ServeurpersoCom · 2025-12-15T17:52:50Z

> Yes that sounds good. We should remove all help messages mentioning about repeated args too, so new users won't follow the same pattern

OK, I'll do it in this PR.

ServeurpersoCom · 2025-12-15T18:29:42Z

It looks like this:

DEPRECATED: argument '--override-kv' specified multiple times, use comma-separated values instead (only last value will be used)

(root|~/llama.cpp.pascal) ./build/bin/llama-server --port 8081 --model /var/www/ia/models/mradermacher/gemma-3-1b-it-i1-GGUF/gemma-3-1b-it.i1-Q6_K.gguf --override-kv "tokenizer.ggml.add_bos_token=bool:false" --override-kv "tokenizer.ggml.add_eos_token=bool:false"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
DEPRECATED: argument '--override-kv' specified multiple times, use comma-separated values instead (only last value will be used)
main: setting n_parallel = 4 and kv_unified = true (add -kvu to disable this)
build: 7435 (1e0bca00b) with GNU 12.2.0 for Linux x86_64
system info: n_threads = 16, n_threads_batch = 16, total_threads = 32
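
A minimal sketch of how such a warning could be produced, using the `insert().second` idiom named in one of this PR's commit titles. The function names are hypothetical and the real parser is not shown in this thread; only the message text is taken from the log above.

```cpp
#include <set>
#include <string>
#include <vector>

// Return each flag that appears more than once, reported a single time.
// std::set::insert() returns a pair whose .second is false when the element
// was already present, so one call both records the flag and tests for it.
static std::vector<std::string> find_repeated_flags_sketch(const std::vector<std::string> & flags) {
    std::set<std::string> seen;
    std::set<std::string> reported;
    std::vector<std::string> repeated;
    for (const auto & flag : flags) {
        if (!seen.insert(flag).second && reported.insert(flag).second) {
            repeated.push_back(flag);
        }
    }
    return repeated;
}

// Format the warning text shown in the log above for one repeated flag.
static std::string deprecation_message_sketch(const std::string & flag) {
    return "DEPRECATED: argument '" + flag +
           "' specified multiple times, use comma-separated values instead"
           " (only last value will be used)";
}
```

Using `insert().second` avoids a separate lookup followed by an insert, which is presumably the optimization the commit title refers to.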

ServeurpersoCom · 2025-12-15T18:33:32Z

And escaping for --override-kv (the escaped commas survive splitting and reach the type check intact):

string_parse_kv_override: invalid type for KV override 'tokenizer.ggml.add_eos_token=ESCAP,ING:bl,ah'

(root|~/llama.cpp.pascal) ./build/bin/llama-server --port 8081   --model /var/www/ia/models/mradermacher/gemma-3-1b-it-i1-GGUF/gemma-3-1b-it.i1-Q6_K.gguf   --override-kv "tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=ESCAP\,ING:bl\,ah"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
string_parse_kv_override: invalid type for KV override 'tokenizer.ggml.add_eos_token=ESCAP,ING:bl,ah'
error while handling argument "--override-kv": error: Invalid type for KV override: tokenizer.ggml.add_eos_token=ESCAP,ING:bl,ah


usage:
--override-kv KEY=TYPE:VALUE,...        advanced option to override model metadata by key. to specify multiple
                                        overrides, either use comma-separated or repeat this argument.
                                        types: int, float, bool, str. example: --override-kv
                                        tokenizer.ggml.add_bos_token=bool:false,tokenizer.ggml.add_eos_token=bool:false


to show complete usage, run with -h
(root|~/llama.cpp.pascal)

common/arg.cpp

Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>

ServeurpersoCom · 2025-12-15T19:25:37Z

> Yes that sounds good. We should remove all help messages mentioning about repeated args too, so new users won't follow the same pattern

Should we clean up ALL "can be repeated" help text in this PR, or keep it focused on --override-kv?
The warning is already general (applies to all args), but most other args don't support comma-separated syntax yet. We could:

Migrate all args to comma-separated syntax inside this PR (more work but complete solution ~~-> would need a common utility function for escaping~~)
Remove all "can be repeated" mentions now (forces migration even without comma support available yet)
Keep this PR focused on --override-kv and migrate other args progressively in follow-up PRs

What do you prefer?

Edit: Turns out there weren't that many, so I migrated them all to eliminate the technical debt.

common: fix --override-kv to support comma-separated values

98ea3ee

ServeurpersoCom requested a review from ggerganov as a code owner December 15, 2025 12:46

loci-dev mentioned this pull request Dec 15, 2025

UPSTREAM PR #18056: common: fix --override-kv to support comma-separated values auroralabs-loci/llama.cpp#576

Open

ggerganov requested a review from ngxson December 15, 2025 12:49

ngxson approved these changes Dec 15, 2025

Update common/arg.cpp

b83309d

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

ggerganov approved these changes Dec 15, 2025

common: deprecate repeated arguments, suggest comma-separated values

502f718

common: add comma escape support for --override-kv

a37539e

personalmountains reviewed Dec 15, 2025

personalmountains reviewed Dec 15, 2025

common: optimize duplicate detection with insert().second

ea7044e

Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>

common: migrate all repeated args to comma-separated syntax

cb2321c

ggerganov merged commit 487674f into ggml-org:master Dec 17, 2025
67 of 76 checks passed

ngxson mentioned this pull request Dec 18, 2025

arg: fix ASAN error on sampler_type_names empty #18167

Merged

4 participants
