feat(providers): add ZCode (z.ai GLM-5.2) usage provider #537
Merged
Reads ZCode CLI usage from ~/.zcode/cli/db/db.sqlite. The model_usage table has exact per-request tokens; cost is computed from the pricing table since ZCode stores none (GLM-5.2 runs on the z.ai start-plan subscription).

- Split cached tokens out of input_tokens (OpenAI-style) so fresh input and cache reads price correctly
- Attach each turn's tool calls to one request to avoid double-counting
- Map GLM-5.2 to glm-5p1 (GLM-5.1 rate) until LiteLLM lists it
- Register as a lazy SQLite provider; add test and provider doc
What
Adds CodeBurn support for ZCode, the z.ai CLI coding agent (GLM-5.2).
Where the data comes from
ZCode writes a single SQLite db at `~/.zcode/cli/db/db.sqlite`. The `model_usage` table carries exact per-request token counts (input, output, reasoning, cache read/creation), model id, status, and timestamps. The JSONL activity log redacts token counts and the desktop app dir holds only Electron state, so the db is the only usable source.

No source records a dollar cost: GLM-5.2 runs on the z.ai start-plan subscription, so ZCode logs tokens only. Cost is computed from the pricing table, the same notional approach used for a Claude Max plan.
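As a rough illustration of the notional cost computation described above, the sketch below multiplies token counts by per-million-token rates looked up in a pricing table. The `PRICING` map, its rates, the `UsageTokens` shape, and the `notionalCostUsd` helper are all hypothetical placeholders for this example, not CodeBurn's actual pricing data or API.

```typescript
// Hypothetical per-million-token rates in USD.
// Placeholder values for illustration only, not real GLM pricing.
interface Rates {
  input: number;
  output: number;
  cacheRead: number;
}

const PRICING: Record<string, Rates> = {
  "glm-5p1": { input: 1.0, output: 4.0, cacheRead: 0.25 },
};

// Assumed token breakdown for one request (field names are illustrative).
interface UsageTokens {
  freshInput: number;
  cacheRead: number;
  output: number;
}

// Returns a notional dollar cost, or null when the model has no pricing entry.
function notionalCostUsd(model: string, t: UsageTokens): number | null {
  const r = PRICING[model];
  if (!r) return null;
  const total =
    t.freshInput * r.input + t.cacheRead * r.cacheRead + t.output * r.output;
  return total / 1_000_000;
}
```

Returning `null` for an unknown model, rather than zero, keeps an unpriced model distinguishable from a genuinely free request; that is a design choice of this sketch only.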
Notable details
- Cached tokens are counted inside `input_tokens` (OpenAI-style). The parser subtracts cache read and creation so fresh input bills at the input rate and cached at the cache-read rate. Verified against the nested Anthropic usage in `provider_metadata_json` (100 input = 36 fresh + 64 cached).
- Tool calls come from `tool_usage`, which links to a turn rather than a specific request, so each turn's tools attach to one request to avoid double-counting. Bash command text is not stored, so `bashCommands` is empty.
- GLM-5.2 maps to `glm-5p1` (GLM-5.1 rate) in `BUILTIN_ALIASES`. Reports show it as `glm-5p1`, consistent with how every aliased model displays. Drop the alias once LiteLLM lists GLM-5.2.
- Registered as a lazy SQLite provider (`node:sqlite`), matching crush, cursor, and opencode.

Tests
`tests/providers/zcode.test.ts`: discovery, token normalization plus pricing, and dedup.

Files
- `src/providers/zcode.ts` (new)
- `src/providers/index.ts` (register lazy provider)
- `src/models.ts` (GLM-5.2 pricing alias)
- `tests/providers/zcode.test.ts` (new)
- `docs/providers/zcode.md` (new) and `docs/providers/README.md` (index row)
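The token normalization and tool-call attachment listed under Notable details can be sketched roughly as follows. This is a minimal illustration, not the code in `src/providers/zcode.ts`: the row shapes, field names, and the choice to attach tools to the first request seen for each turn are assumptions made for the example.

```typescript
// Hypothetical row shapes; the real model_usage and tool_usage columns may differ.
interface UsageRow {
  requestId: string;
  turnId: string;
  inputTokens: number; // OpenAI-style: includes cached tokens
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

interface ToolRow {
  turnId: string;
  toolName: string;
}

// Fresh input is what remains after removing cache read and cache creation.
// Clamped at zero in case a row reports inconsistent counts.
function freshInputTokens(row: UsageRow): number {
  return Math.max(
    0,
    row.inputTokens - row.cacheReadTokens - row.cacheCreationTokens,
  );
}

// Attach each turn's tools to exactly one request (assumed here: the first
// request seen for that turn), so a turn with several requests does not
// count its tools more than once. Other requests get an empty list.
function attachTools(
  requests: UsageRow[],
  tools: ToolRow[],
): Map<string, string[]> {
  const toolsByTurn = new Map<string, string[]>();
  for (const t of tools) {
    const list = toolsByTurn.get(t.turnId) ?? [];
    list.push(t.toolName);
    toolsByTurn.set(t.turnId, list);
  }

  const claimedTurns = new Set<string>();
  const toolsByRequest = new Map<string, string[]>();
  for (const r of requests) {
    if (claimedTurns.has(r.turnId)) {
      toolsByRequest.set(r.requestId, []);
    } else {
      claimedTurns.add(r.turnId);
      toolsByRequest.set(r.requestId, toolsByTurn.get(r.turnId) ?? []);
    }
  }
  return toolsByRequest;
}
```

For the case cited in the description, a row with 100 input tokens and 64 cached tokens (treated here as cache reads with no cache creation, which is an assumption) yields 36 fresh input tokens.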