diff --git a/docs/providers/devin.md b/docs/providers/devin.md index 04c1cb43..d1566473 100644 --- a/docs/providers/devin.md +++ b/docs/providers/devin.md @@ -4,7 +4,7 @@ Cognition Devin CLI local usage tracking. - **Source:** `src/providers/devin.ts` - **Loading:** eager (`src/providers/index.ts`) -- **Test:** `tests/providers/devin.test.ts` (336 lines) +- **Test:** `tests/providers/devin.test.ts` ## Where it reads from @@ -14,7 +14,7 @@ Devin CLI data lives under: ~/.local/share/devin/cli/ ``` -The MVP usage source is transcript JSON: +Usage comes from transcript JSON: ```text ~/.local/share/devin/cli/transcripts/*.json @@ -59,117 +59,106 @@ appear in CLI/UI results until configured. ## Storage format -Transcript root is a JSON object following the [ATIF-v1.4 trajectory schema][atif], -with Devin-specific additions such as per-step `metadata`. The parser does not -validate `schema_version`; it only requires a parseable object with `steps[]`. - +Observed [ATIF trajectory schema][atif] variants used by Devin are **ATIF-v1.4** and **ATIF-v1.7**. +The parser does not validate `schema_version`; it only requires a parseable object with a `steps[]` array. +The parser also supports Devin-specific additions such as per-step `metadata`. Core fields include `session_id`, `agent.model_name`, and `steps[]`. -Each counted step can provide: - -- `step_id` -- `metadata.committed_acu_cost` -- `metadata.metrics.input_tokens` -- `metadata.metrics.output_tokens` -- `metadata.metrics.cache_creation_tokens` -- `metadata.metrics.cache_read_tokens` -- `metadata.created_at` -- `metadata.generation_model` -- `metadata.request_id` -- `tool_calls[].function_name` - -User-input steps (`metadata.is_user_input === true`) are skipped. Non-user -steps are included only if they have positive ACU usage or positive token usage. 
+### Field normalization + +The provider normalizes equivalent usage fields across versions: + +- **Cost inputs (ACU before conversion to USD):** + - `metadata.committed_acu_cost` + - `extra.committed_acu_cost` + - `metadata.committed_credit_cost / 10000` + - `extra.committed_credit_cost / 10000` +- **Input tokens:** + - legacy: `metadata.metrics.input_tokens` + - newer prompt-style: `metrics.prompt_tokens - cache_read - cache_creation` +- **Output tokens:** + - `metrics.output_tokens` or `metrics.completion_tokens` +- **Cache write tokens:** + - `metrics.cache_creation_tokens` + - `metrics.cache_creation_input_tokens` + - `metrics.extra.cache_creation_input_tokens` +- **Cache read tokens:** + - `metrics.cache_read_tokens` + - `metrics.cache_read_input_tokens` + - `metrics.cached_tokens` + - `metrics.extra.cache_read_input_tokens` + +### User/agent step detection + +- User steps are skipped when either: + - `metadata.is_user_input === true` (older exports) + - `source === "user"` (ATIF step source) +- Non-user steps are included only when they contain positive ACU usage or + positive token usage. + +### Session, model, and timestamp fallback + +- **sessionId:** `session_id` -> `trajectory_id` -> transcript filename +- **model:** `step.extra.generation_model` -> `step.metadata.generation_model` -> `step.model_name` -> `agent.model_name` -> `sessions.model` -> `devin` +- **timestamp:** `step.metadata.created_at` -> `step.timestamp` -> `sessions.last_activity_at` -> `sessions.created_at` ## Pricing -`metadata.committed_acu_cost` is per step, not cumulative. 
The provider converts -each step with: +`costUSD` is always provider-supplied and uses configured ACU conversion: ```text costUSD = committed_acu_cost * devin.acuUsdRate ``` +If a step only has `committed_credit_cost`, CodeBurn converts credits to ACU +using Devin's current export convention: + +```text +committed_acu_cost = committed_credit_cost / 10000 +``` + Token-only steps are still included when they have positive token metrics, but -their `costUSD` is `0` if `committed_acu_cost` is absent. +their `costUSD` is `0` if no committed cost is present. `src/parser.ts` preserves Devin's provider-supplied `costUSD` instead of re-pricing it through LiteLLM. ## sessions.db enrichment -The provider currently reads these columns from `sessions`: - -| Column | Use | -| ------------------- | ----------------------------------------------------------------------------------------------------------- | -| `id` | join key with transcript `session_id` during parsing; discovery uses the transcript filename before `.json` | -| `working_directory` | `projectPath` and derived project name | -| `model` | model fallback | -| `title` | project name fallback | -| `created_at` | timestamp fallback | -| `last_activity_at` | preferred session timestamp fallback | -| `hidden` | skip hidden sessions | - -`message_nodes`, `prompt_history`, and `tool_call_state` are not parsed yet. - -## Timestamps +The provider reads these columns from `sessions`: -Step timestamps come from `metadata.created_at`, falling back to -`sessions.last_activity_at`, then `sessions.created_at`. - -Transcript step timestamps are passed through as ATIF string timestamps. -Numeric normalization is only applied to `sessions.db` timestamps: - -- less than `10_000_000_000`: seconds -- otherwise: milliseconds - -## Model Resolution - -Model names resolve in this order: - -1. `step.metadata.generation_model` -2. `step.model_name` -3. `transcript.agent.model_name` -4. `sessions.model` -5. 
`devin` - -## Caching - -No provider-level cache. - -The normal session cache stores parsed provider calls, but Devin is always -reparsed by `src/parser.ts` because `sessions.db` can change without the -transcript JSON fingerprint changing. +| Column | Use | +| ------------------- | ------------------------------------------------------------------------------------------------------------ | +| `id` | join key with transcript session id during parsing; discovery uses transcript filename before `.json` | +| `working_directory` | `projectPath` and derived project name | +| `model` | model fallback | +| `title` | project name fallback | +| `created_at` | timestamp fallback | +| `last_activity_at` | preferred session timestamp fallback | +| `hidden` | skip hidden sessions | ## Deduplication `devin:<sessionId>:<step_id>` -The provider name is part of the key via the `devin:` prefix. +When `step_id` is missing, the parser falls back to the 1-based step index. ## Quirks -- The transcript directory has usage; `sessions.db` is enrichment only. -- `committed_acu_cost` is per-generation/per-step ACU usage. Never treat it as cumulative. +- Transcript JSON is the usage source; `sessions.db` only enriches metadata. - There is no default ACU-to-USD rate. Missing config intentionally hides Devin. - Hidden sessions from `sessions.db` are skipped in discovery and parsing. -- Tool names come directly from `tool_calls[].function_name`; the provider assumes valid ATIF tool-call records. -- If SQLite is unavailable or `sessions.db` cannot be opened, the provider still parses transcripts without enrichment. +- Tool names are taken from either `tool_calls[].function_name` or + `tool_calls[].function.name`. +- If SQLite is unavailable or `sessions.db` cannot be opened, transcript parsing + still works without enrichment. ## When fixing a bug here -1. First check whether `~/.config/codeburn/config.json` contains a valid - `devin.acuUsdRate`. Without it, no Devin sessions should appear. -2. 
For usage total bugs, compare against: - - ```bash - jq '[.steps[] | select(.metadata.committed_acu_cost != null) | .metadata.committed_acu_cost] | add' ~/.local/share/devin/cli/transcripts/.json - ``` - -3. If project/model/timestamp metadata is wrong, inspect `sessions.db`, not the transcript. -4. If a hidden session appears, check the `hidden` column. Discovery can only - hide sessions whose transcript filename matches `sessions.id`; parsing uses - the transcript `session_id` when present. -5. Run `tests/providers/devin.test.ts` after parser changes. It covers ACU conversion, disabled-until-configured behavior, timestamp parsing, deduplication, hidden sessions, and `sessions.db` enrichment. +1. Confirm `~/.config/codeburn/config.json` contains a valid positive + `devin.acuUsdRate`. +2. Validate a transcript step's raw usage fields first (cost + token fields). +3. If project/model/timestamp metadata is wrong, inspect `sessions.db`. +4. Run `tests/providers/devin.test.ts` after parser changes. 
[atif]: https://github.com/harbor-framework/harbor/blob/main/rfcs/0001-trajectory-format.md diff --git a/src/providers/devin.ts b/src/providers/devin.ts index 3061f96d..da00f0eb 100644 --- a/src/providers/devin.ts +++ b/src/providers/devin.ts @@ -17,51 +17,84 @@ import { isPositiveNumber, safeNumber } from "../parser.js"; type AgentTrajectory<T> = { schema_version: string; session_id?: string; + trajectory_id?: string; agent: Agent; steps: T[]; }; +type ToolDefinition = { + type?: string; + function?: { + name?: string; + description?: string; + parameters?: unknown; + }; +}; + type Agent = { name: string; version: string; model_name?: string; + tool_definitions?: ToolDefinition[]; }; type ToolCall = { - tool_call_id: string; - function_name: string; + tool_call_id?: string; + function_name?: string; + function?: { + name?: string; + }; arguments: unknown; }; +type DevinMetrics = { + input_tokens?: number; + output_tokens?: number; + cache_creation_tokens?: number; + cache_read_tokens?: number; + prompt_tokens?: number; + completion_tokens?: number; + cached_tokens?: number; + total_input_tokens?: number; + cache_creation_input_tokens?: number; + cache_read_input_tokens?: number; + extra?: { + cache_creation_input_tokens?: number; + cache_read_input_tokens?: number; + }; +}; + type DevinMetadata = { created_at?: string; committed_acu_cost?: number; + committed_credit_cost?: number; generation_model?: string; is_user_input?: boolean; num_tokens?: number; request_id?: string; finish_reason?: string; - metrics?: { - input_tokens?: number; - output_tokens?: number; - cache_creation_tokens?: number; - cache_read_tokens?: number; - tokens_per_sec?: number; - total_time_ms?: number; - ttft_ms?: number; - tpot_ms?: number; - }; + metrics?: DevinMetrics; +}; + +type DevinStepExtra = { + committed_acu_cost?: number; + committed_credit_cost?: number; + generation_model?: string; }; type Step = { - step_id: number; - source: string; 
+ timestamp?: string; model_name?: string; - message: string; + message?: unknown; + metrics?: DevinMetrics; + metadata?: DevinMetadata; + extra?: DevinStepExtra; tool_calls?: Array<ToolCall>; }; -type DevinStep = Step & { metadata?: DevinMetadata }; +type DevinStep = Step; type DevinAgentTrajectory = AgentTrajectory<DevinStep>; @@ -96,10 +129,15 @@ const DEVIN_PROVIDER_NAME = "devin"; const DEVIN_PROVIDER_DISPLAY_NAME = "Devin"; const DEVIN_TRANSCRIPTS_SUBDIR = "transcripts"; const DEVIN_SESSIONS_DB = "sessions.db"; +const DEVIN_CREDITS_PER_ACU = 10_000; function parseTranscript(raw: string): DevinAgentTrajectory | null { try { - return JSON.parse(raw) as DevinAgentTrajectory; + const parsed = JSON.parse(raw) as unknown; + if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) { + return null; + } + return parsed as DevinAgentTrajectory; } catch { return null; } @@ -110,37 +148,117 @@ function parseNumericTimestamp(value: number): string { return new Date(millis).toISOString(); } -function getUsage( - metadata: DevinMetadata | undefined | null, -): DevinUsage | null { - if (!metadata) return null; - const metrics = metadata.metrics; +function normalizeMessageText(message: unknown): string { + if (typeof message === "string") return message.trim(); + + if (Array.isArray(message)) { + return message + .map((part) => { + if (typeof part === "string") return part; + if (!part || typeof part !== "object") return ""; + const text = (part as Record<string, unknown>).text; + return typeof text === "string" ? 
text : ""; + }) + .map((text) => text.trim()) + .filter(Boolean) + .join(" "); + } + + if (message && typeof message === "object") { + const text = (message as Record<string, unknown>).text; + if (typeof text === "string") return text.trim(); + } + + return ""; +} + +function getCommittedAcuCost(step: DevinStep): number { + const metadataAcuCost = safeNumber(step.metadata?.committed_acu_cost); + if (isPositiveNumber(metadataAcuCost)) return metadataAcuCost; + + const extraAcuCost = safeNumber(step.extra?.committed_acu_cost); + if (isPositiveNumber(extraAcuCost)) return extraAcuCost; + + const metadataCreditCost = safeNumber(step.metadata?.committed_credit_cost); + if (isPositiveNumber(metadataCreditCost)) { + return metadataCreditCost / DEVIN_CREDITS_PER_ACU; + } + + const extraCreditCost = safeNumber(step.extra?.committed_credit_cost); + if (isPositiveNumber(extraCreditCost)) { + return extraCreditCost / DEVIN_CREDITS_PER_ACU; + } + + return 0; +} + +function getUsage(step: DevinStep): DevinUsage | null { + const metrics = step.metrics ?? step.metadata?.metrics; + const committedAcuCost = getCommittedAcuCost(step); + + const cacheCreationInputTokens = safeNumber( + metrics?.cache_creation_tokens ?? + metrics?.cache_creation_input_tokens ?? + metrics?.extra?.cache_creation_input_tokens, + ); + + const cacheReadInputTokens = safeNumber( + metrics?.cache_read_tokens ?? + metrics?.cache_read_input_tokens ?? + metrics?.cached_tokens ?? + metrics?.extra?.cache_read_input_tokens, + ); + + const promptTokens = safeNumber( + metrics?.prompt_tokens ?? metrics?.total_input_tokens, + ); + + let inputTokens = safeNumber(metrics?.input_tokens); + if (inputTokens === 0 && promptTokens > 0) { + inputTokens = Math.max( + 0, + promptTokens - cacheReadInputTokens - cacheCreationInputTokens, + ); + } + + const outputTokens = safeNumber( + metrics?.output_tokens ?? 
metrics?.completion_tokens, + ); const hasAnyUsage = [ - metadata.committed_acu_cost, - metrics?.input_tokens, - metrics?.output_tokens, - metrics?.cache_creation_tokens, - metrics?.cache_read_tokens, + committedAcuCost, + inputTokens, + outputTokens, + cacheCreationInputTokens, + cacheReadInputTokens, ].some((x) => isPositiveNumber(x)); if (!hasAnyUsage) return null; return { - committedAcuCost: safeNumber(metadata.committed_acu_cost), - inputTokens: safeNumber(metrics?.input_tokens), - outputTokens: safeNumber(metrics?.output_tokens), - cacheCreationInputTokens: safeNumber(metrics?.cache_creation_tokens), - cacheReadInputTokens: safeNumber(metrics?.cache_read_tokens), + committedAcuCost, + inputTokens, + outputTokens, + cacheCreationInputTokens, + cacheReadInputTokens, }; } +function isUserInputStep(step: DevinStep): boolean { + return step.metadata?.is_user_input === true || step.source === "user"; +} + function getSessionId( source: SessionSource, transcript: DevinAgentTrajectory, ): string { const fromTranscript = transcript.session_id?.trim(); - return fromTranscript || basename(source.path, ".json"); + if (fromTranscript) return fromTranscript; + + const fromTrajectoryId = transcript.trajectory_id?.trim(); + if (fromTrajectoryId) return fromTrajectoryId; + + return basename(source.path, ".json"); } function projectNameFromPath(path: string): string { @@ -170,6 +288,7 @@ function getTimestamp( ): string | undefined { return [ step.metadata?.created_at, + step.timestamp, session?.lastActivityAt, session?.createdAt, ] @@ -184,6 +303,7 @@ function getModelName( ): string { return ( [ + step.extra?.generation_model, step.metadata?.generation_model, step.model_name, transcript.agent?.model_name, @@ -194,8 +314,41 @@ function getModelName( ); } -function getToolNames(step: DevinStep): string[] { - return (step.tool_calls ?? 
[]).map((call) => call.function_name); +function getDeclaredToolNames( + transcript: DevinAgentTrajectory, +): Map<string, string> { + const declaredToolNames = new Map<string, string>(); + const definitions = Array.isArray(transcript.agent?.tool_definitions) + ? transcript.agent.tool_definitions + : []; + + for (const definition of definitions) { + if (!definition || typeof definition !== "object") continue; + const toolName = definition.function?.name?.trim(); + if (!toolName) continue; + declaredToolNames.set(toolName.toLowerCase(), toolName); + } + + return declaredToolNames; +} + +function getToolNames( + step: DevinStep, + declaredToolNames: Map<string, string>, +): string[] { + const tools: string[] = []; + const toolCalls = Array.isArray(step.tool_calls) ? step.tool_calls : []; + + for (const call of toolCalls) { + const rawToolName = (call.function_name ?? call.function?.name)?.trim(); + if (!rawToolName) continue; + + const canonicalToolName = + declaredToolNames.get(rawToolName.toLowerCase()) ?? rawToolName; + tools.push(canonicalToolName); + } + + return tools; } function getFirstUserMessageBeforeStep( @@ -204,8 +357,8 @@ function getFirstUserMessageBeforeStep( ): string | null { for (let i = index - 1; i >= 0; i--) { const step = steps[i]; - if (!step?.metadata?.is_user_input) continue; - const message = step.message?.trim(); + if (!step || !isUserInputStep(step)) continue; + const message = normalizeMessageText(step.message); if (message) return message; } return null; @@ -267,7 +420,7 @@ class DevinSessionParser implements SessionParser { if (!raw) return; const transcript = parseTranscript(raw); - if (!transcript?.steps) return; + if (!transcript?.steps || !Array.isArray(transcript.steps)) return; const sessionId = getSessionId(this.source, transcript); const session = this.sessionMetadata.get(sessionId) ?? 
null; @@ -277,23 +430,25 @@ class DevinSessionParser implements SessionParser { const projectPath = getProjectPath(session); const costFactor = await getCostFactor(); if (costFactor === null) return; + const declaredToolNames = getDeclaredToolNames(transcript); for (let index = 0; index < transcript.steps.length; index++) { const step = transcript.steps[index]; - if (step.metadata?.is_user_input) continue; + if (!step || typeof step !== "object" || Array.isArray(step)) continue; + if (isUserInputStep(step)) continue; - const usage = getUsage(step.metadata); + const usage = getUsage(step); if (!usage) continue; const timestamp = getTimestamp(step, session) ?? ""; - - const deduplicationKey = `devin:${sessionId}:${step.step_id}`; + const stepId = `${step.step_id ?? index + 1}`; + const deduplicationKey = `devin:${sessionId}:${stepId}`; if (this.seenKeys.has(deduplicationKey)) continue; this.seenKeys.add(deduplicationKey); const model = getModelName(transcript, step, session); - const tools = getToolNames(step); + const tools = getToolNames(step, declaredToolNames); const userMessage = getFirstUserMessageBeforeStep(transcript.steps, index) ?? 
""; diff --git a/tests/providers/devin.test.ts b/tests/providers/devin.test.ts index 5197140f..72870e3e 100644 --- a/tests/providers/devin.test.ts +++ b/tests/providers/devin.test.ts @@ -100,7 +100,7 @@ describe('devin provider', () => { it('parses per-step ACUs, tokens, tools, and model resolution', async () => { await configureDevinRate() const filePath = await writeTranscript('glimmer-platinum.json', { - schema_version: '1', + schema_version: 'ATIF-v1.4', session_id: 'session-123', agent: { model_name: 'agent-model' }, steps: [ @@ -214,6 +214,189 @@ describe('devin provider', () => { expect(calls[0]!.costUSD).toBeCloseTo(1, 12) }) + it('supports ATIF-v1.5 agent tool_definitions without changing parsing', async () => { + await configureDevinRate() + const filePath = await writeTranscript('atif-v15.json', { + schema_version: 'ATIF-v1.5', + session_id: 'atif-v15', + agent: { + model_name: 'agent-model', + tool_definitions: [ + { + type: 'function', + function: { name: 'read_file' }, + }, + ], + }, + steps: [ + { + step_id: 1, + source: 'user', + message: 'hello from v1.5', + metadata: { + is_user_input: true, + created_at: '2027-01-15T08:00:00.000Z', + }, + }, + { + step_id: 2, + source: 'agent', + message: 'done', + metadata: { + created_at: '2027-01-15T08:00:01.000Z', + committed_acu_cost: 0.25, + metrics: { + input_tokens: 12, + output_tokens: 3, + }, + }, + tool_calls: [{ function_name: 'read_file' }], + }, + ], + }) + + const calls = await parseTranscript(filePath) + + expect(calls).toHaveLength(1) + expect(calls[0]).toMatchObject({ + sessionId: 'atif-v15', + userMessage: 'hello from v1.5', + inputTokens: 12, + outputTokens: 3, + costUSD: 0.25, + tools: ['read_file'], + }) + }) + + it('supports ATIF-v1.6 content-part messages and step-level metrics', async () => { + await configureDevinRate() + const filePath = await writeTranscript('atif-v16.json', { + schema_version: 'ATIF-v1.6', + session_id: 'atif-v16', + agent: { model_name: 'agent-v16' }, + steps: [ + { 
+ step_id: 1, + source: 'user', + message: [ + { type: 'text', text: 'review image' }, + { type: 'image', image_source: { path: 'images/1.png' } }, + ], + metadata: { + created_at: '2027-01-15T08:00:00.000Z', + is_user_input: true, + }, + }, + { + step_id: 2, + source: 'agent', + timestamp: '2027-01-15T08:00:02.000Z', + model_name: 'Claude Opus 4.6', + message: [ + { type: 'text', text: 'analysis complete' }, + ], + metrics: { + prompt_tokens: 200, + completion_tokens: 40, + cached_tokens: 150, + extra: { + cache_creation_input_tokens: 20, + }, + }, + extra: { + generation_model: 'claude-opus-4-6-thinking', + committed_credit_cost: 300, + }, + tool_calls: [ + { + function: { name: 'read_file' }, + }, + ], + }, + ], + }) + + const calls = await parseTranscript(filePath) + + expect(calls).toHaveLength(1) + expect(calls[0]).toMatchObject({ + sessionId: 'atif-v16', + userMessage: 'review image', + model: 'claude-opus-4-6-thinking', + inputTokens: 30, + outputTokens: 40, + cacheReadInputTokens: 150, + cacheCreationInputTokens: 20, + costUSD: 0.03, + tools: ['read_file'], + timestamp: '2027-01-15T08:00:02.000Z', + }) + }) + + it('supports ATIF-v1.7 by reading source/timestamp/step metrics and trajectory_id fallback', async () => { + await configureDevinRate(2) + const filePath = await writeTranscript('atif-v17.json', { + schema_version: 'ATIF-v1.7', + trajectory_id: 'traj-001', + agent: { + model_name: 'Claude Opus 4.6', + tool_definitions: [], + }, + steps: [ + { + step_id: 1, + source: 'user', + timestamp: '2027-01-15T08:00:00.000Z', + message: [ + { type: 'text', text: 'Run checks' }, + ], + extra: { telemetry: { source: 'user' } }, + }, + { + step_id: 2, + source: 'agent', + timestamp: '2027-01-15T08:00:01.000Z', + model_name: 'GPT-5.3-Codex', + message: '', + metrics: { + prompt_tokens: 1200, + completion_tokens: 45, + cached_tokens: 1100, + extra: { + cache_creation_input_tokens: 50, + }, + }, + extra: { + committed_credit_cost: 150, + generation_model: 
'gpt-5-3-codex-xhigh', + }, + tool_calls: [{ function_name: 'shell_command' }], + }, + ], + final_metrics: { + total_prompt_tokens: 1200, + total_completion_tokens: 45, + }, + }) + + const calls = await parseTranscript(filePath) + + expect(calls).toHaveLength(1) + expect(calls[0]).toMatchObject({ + sessionId: 'traj-001', + deduplicationKey: 'devin:traj-001:2', + userMessage: 'Run checks', + model: 'gpt-5-3-codex-xhigh', + inputTokens: 50, + outputTokens: 45, + cacheReadInputTokens: 1100, + cacheCreationInputTokens: 50, + costUSD: 0.03, + tools: ['shell_command'], + timestamp: '2027-01-15T08:00:01.000Z', + }) + }) + it('falls back to filename session id and deduplicates by step id', async () => { await configureDevinRate() const filePath = await writeTranscript('fallback-session.json', {