feat: `agentic-core` conversation/responses hydration (ADR-03) by maralbahari · Pull Request #46 · vllm-project/agentic-api

maralbahari · 2026-06-03T10:21:40Z

Summary

Implements the `agentic-core` executor module as specified in ADR-03: Layered Crate Architecture.

Each step of the agentic loop is exposed as a composable public function, usable standalone or through execute(). For now, execute() handles text-only messages.

Public step functions:

Function	Role
`create_conversation(ctx)`	Create a new conversation record
`rehydrate_conversation(request, ctx)`	Build `RequestContext` — load history, resolve tools
`call_inference(json, url, client, auth)`	Stream raw SSE lines from the LLM backend
`persist_response(payload, ctx, ch, rh)`	Write new-turn items to conversation or response store
`execute(request, ctx)`	Full loop: rehydrate → infer → persist; returns `Either<ResponsePayload, BoxStream>`

Note on engine.rs: temporary consolidation

rehydrate_conversation, call_inference, and persist_response are currently co-located in engine.rs as a deliberate short-term decision to avoid merge conflicts while parallel work lands.

Per ADR-03, each of these public functions will be homed in its proper module once the in-flight features are integrated:

Function	Target module
`rehydrate_conversation`	`conversation.rs`
`call_inference`	`inference.rs`
`persist_response`	`store.rs`

These are plain Rust public functions with no gateway-specific API. Once moved, they will be re-exported through lib.rs at the same paths (agentic_core::rehydrate_conversation, etc.) so callers remain unaffected.

The tidy-up will follow once this PR and its dependents are merged.

Key design decisions matching ADR-03:

execute() is the convenience entry point; each sub-function is independently callable (D1, D2)
Two execution paths: run_blocking (from_json) and run_stream (from_stream via SSE accumulator with spawn_blocking for CPU-bound JSON parsing)
ConversationHandler and ResponseHandler own all store operations including rehydration.
ExecutionContext holds handlers + HTTP client + LLM base URL; conversations_url() and responses_url() as convenience methods
Persistence is synchronous (inline await), so response state is consistent before returning. This makes it safe for sequential multi-turn callers.

SSE accumulator (ResponseAccumulator):

from_json for non-streaming path; from_stream (channel + spawn_blocking) for streaming path
Handles both response.done (vLLM) and response.completed (OpenAI) terminal events
In-flight message state owned by the struct; finalize_current_message() deduplicates text-delta assembly
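
The accumulator behaviour listed above can be sketched roughly as follows. This is a simplified stand-in, not the PR's `ResponseAccumulator`: `MiniAccumulator` and `on_event` are hypothetical names, and the sketch assumes each SSE line has already been parsed into an event type string plus an optional text delta.

```rust
/// Hypothetical, simplified stand-in for the PR's `ResponseAccumulator`.
#[derive(Debug, Default)]
pub struct MiniAccumulator {
    /// Text of the message currently being assembled from deltas.
    current: String,
    /// Messages that have been finalized.
    pub messages: Vec<String>,
    /// Set once a terminal event has been seen.
    pub completed: bool,
}

impl MiniAccumulator {
    /// Feed one already-parsed event: its `type` and an optional text delta.
    pub fn on_event(&mut self, event_type: &str, delta: Option<&str>) {
        match event_type {
            "response.output_text.delta" => {
                if let Some(d) = delta {
                    self.current.push_str(d);
                }
            }
            // vLLM emits `response.done`; OpenAI emits `response.completed`.
            "response.done" | "response.completed" => {
                self.finalize_current_message();
                self.completed = true;
            }
            // Every other event type is ignored in this sketch.
            _ => {}
        }
    }

    /// Move the in-flight text into `messages`; a no-op when nothing is
    /// pending, so a repeated terminal event cannot duplicate a message.
    fn finalize_current_message(&mut self) {
        if !self.current.is_empty() {
            self.messages.push(std::mem::take(&mut self.current));
        }
    }
}
```

Keeping the in-flight text inside the struct means both terminal event spellings share one finalization path.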

Test Plan

Unit tests (85 passing):

ResponseAccumulator: delta accumulation, text assignment, usage extraction, status transitions
ExecutorError: display formatting, source chaining via thiserror
ConversationHandler / ResponseHandler: all methods error correctly on disabled store

Integration tests (10 passing) — cassette-based, no live model:

stateful_responses_integration (5 tests):

Single-turn non-streaming and streaming
Two-turn previous_response_id chaining, non-streaming and streaming
store=false response rejected as previous_response_id

stateful_conversation_integration (5 tests):

Two-turn conversation_id non-streaming and streaming
Conversation isolation (two independent conversations, 3 turns each)
Branch off turn 1 via previous_response_id (mixed conversation + response chain)
5-turn chain with 2 inline branches

All 113 tests pass. Zero clippy warnings.

Running Tests

cargo test -p agentic-core
# or with explicit thread count
cargo test -p agentic-core -- --test-threads=16

Running Benchmarks

# All benchmarks (storage + executor), default depth 5
cargo bench --bench benches

# Executor only, custom depth and sample size
BENCH_MAX_DEPTH=10 cargo bench --bench benches -- execute --sample-size=10

# Storage only
cargo bench --bench benches -- storage

# Rehydrate only
cargo bench --bench benches -- rehydrate

Benchmark groups:

Group	Measures
`execute/blocking/turns N`	rehydrate (N-1 prior turns from DB) + JSON fetch + persist
`execute/streaming/turns N`	rehydrate + SSE accumulate (spawn_blocking) + persist
`rehydrate_only/prev_response_depth N`	pure rehydrate step, no LLM call

DB is cleared between groups to prevent cross-contamination.

Benchmark Results

Environment: SQLite in-process, local axum mock server returning canned minimal responses (no network or model latency). Numbers measure the cost of one turn in isolation; the prior chain is seeded before criterion starts timing. sample-size=10 per depth.

`execute/blocking` and `execute/streaming` - per-turn cost at each chain depth

Prior turns	Blocking (median)	Streaming (median)
0 (turn 1)	1.56 ms	1.61 ms
1 (turn 2)	2.71 ms	2.78 ms
2 (turn 3)	2.89 ms	2.70 ms
3 (turn 4)	3.03 ms	2.77 ms
4 (turn 5)	2.80 ms	2.79 ms
5 (turn 6)	2.80 ms	2.78 ms
6 (turn 7)	2.83 ms	3.11 ms
7 (turn 8)	3.09 ms	2.83 ms
8 (turn 9)	2.89 ms	2.76 ms
9 (turn 10)	2.80 ms	2.78 ms

`rehydrate_only` - DB read step, no LLM call

Depth 1-10	Time (median)
all depths	220-285 us

Analysis

Per-turn cost is O(1) with respect to chain depth. After the first turn, every subsequent turn costs a constant ~2.8 ms regardless of how many prior turns exist. The prior benchmark showing linear growth to 12 ms at depth 10 was a measurement bug: the seed time was included inside the timed routine. That is now fixed.

Blocking and streaming are within 10% of each other at every depth. SSE accumulation via spawn_blocking adds no meaningful overhead.

Rehydration is flat (~250 us, isolated). rehydrate_from_response fetches only the immediate prior response item list via a single indexed query. The ~1.2 ms overhead visible in the full execute benchmarks (vs ~250 us isolated) reflects the DB write (persist) that also occurs each turn.

Signed-off-by: maral <maralbahari.98@gmail.com>

Add executor module: rehydration, LLM inference, SSE accumulation, and persistence for both conversation and response stateful flows. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

ashwing

Read through the core source. Structure makes sense: RequestContext for per-turn stuff, ExecutionContext for runtime deps, handlers wrapping store ops. The spawn_blocking trick for JSON parsing on the streaming path is a good call.

Few things I noticed thinking about how function_calls will plug in on top of this:

The streaming path in run_stream doesn't actually stream to the client — it consumes the full SSE response via from_stream(), then yields one big payload.as_responses_chunk() at the end. Works fine for text-only, but clients setting stream=true expect incremental response.output_text.delta events as they arrive. ADR-01 §3 also calls this out explicitly ("SSE stream to the client is interleaved with the tool loop — events go out in real time, not buffered until done"). I'll need to tee the stream for the tool dispatch layer anyway — forward to client while accumulating for tool-call detection. Not blocking this PR on that, just flagging it.

execute() is one pass right now (rehydrate → infer → persist). For function_calls we'll need it to loop: detect tool calls → dispatch → re-enter inference. I'm thinking a LoopDecision enum (Continue/Done/Incomplete) driving re-entry. Would you rather that wrap execute() from the outside, or should we refactor execute() itself to become the loop?
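
One possible shape for the "wrap from the outside" option, as a sketch only. The variant names come from the comment above; their exact meaning, the `run_loop` driver, and the iteration cap are assumptions made for illustration.

```rust
/// Outcome of inspecting one inference pass. Variant names follow the
/// review comment; the semantics here are assumed for the sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LoopDecision {
    /// Tool calls were detected and dispatched: run inference again.
    Continue,
    /// The model produced a final answer.
    Done,
    /// The loop stopped before reaching a final answer.
    Incomplete,
}

/// Hypothetical driver that wraps a single-pass step from the outside.
/// `step` receives the zero-based iteration index and returns a decision.
/// Returns the final decision and the number of passes that ran.
pub fn run_loop<F>(max_iterations: usize, mut step: F) -> (LoopDecision, usize)
where
    F: FnMut(usize) -> LoopDecision,
{
    for i in 0..max_iterations {
        match step(i) {
            LoopDecision::Continue => continue,
            other => return (other, i + 1),
        }
    }
    // The cap was reached while the step still wanted to continue.
    (LoopDecision::Incomplete, max_iterations)
}
```

Wrapping from the outside leaves the single-pass function untouched, at the cost of re-running rehydration on each pass unless the step closure caches it.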

ashwing · 2026-06-03T21:42:38Z

+}
+
+fn run_stream(ctx: RequestContext, exec_ctx: Arc<ExecutionContext>) -> BoxStream {
+    let url = exec_ctx.responses_url();


This accumulates everything before yielding to the caller, so "streaming" here means streaming from upstream but not to the client. Intentional for now? Asking because for tool dispatch I'll need to tap the stream mid-flight — forward deltas to the client while watching for function_call items completing.
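
A minimal, synchronous sketch of the tee idea: each upstream line is forwarded to the client as it arrives while a copy is kept for inspection. `tee_lines` is a hypothetical helper, and a standard-library channel stands in for the async stream the real code would use.

```rust
use std::sync::mpsc::Sender;

/// Forward each upstream line to the client and keep a copy for later
/// inspection (for example, tool-call detection). Hypothetical helper.
pub fn tee_lines<I>(upstream: I, client: &Sender<String>) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut seen = Vec::new();
    for line in upstream {
        // Forward first so the client is not delayed by accumulation.
        // A send error means the client went away; keep accumulating.
        let _ = client.send(line.clone());
        seen.push(line);
    }
    seen
}
```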

@ashwing yes, we need to tap the stream and add an event normalizer to catch reasoning deltas and the rest of the SSE event types. Before we moved to Rust, we were planning to use PydanticAI to handle all the event normalization for us in https://github.com/vllm-project/agentic-api/pull/21/changes#diff-38d6d6323f9401ad47e9230c6d3fc779e2530c09a502cc55d32acde0277f7d89R7
Now that we need to implement this in Rust, we can draw inspiration from PydanticAI's design and handle many of those objects natively. As you mentioned in the other comment, the set of SSE event types will grow.
We need to design the events and their handling during streaming carefully, so that the code stays easy to maintain and the streaming loop doesn't regress. Performance is a key point here. Your proposal should cover this: handling the SSE stream lines and normalizing the events.
This is one of the important parts. I think it can live in a separate module in agentic-core that the executor imports. The SSE enum in Types could then be removed, and SSE handling and event normalization would move into a separate core module so the accumulator doesn't get bloated.
What do you think?

+1 on the separate module approach. Here's the shape I'm thinking:

crates/agentic-core/src/
  events/
    mod.rs        // pub mod normalize; pub mod types;
    types.rs      // SSEEventType (expanded to 28+ variants) + typed EventPayload enum
    normalize.rs  // normalize_sse_line(&str) -> EventFrame { event_type, payload }

The normalizer takes a raw data: {...} line and produces a typed EventFrame. The accumulator then pattern-matches on EventFrame instead of doing inline JSON parsing — keeps the streaming loop tight and makes adding new event types a one-line match arm.
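
A dependency-free sketch of what such a normalizer could look like. Several things here are assumptions: `EventKind` is a small hypothetical subset of the event enum, the function returns `Option` rather than a bare `EventFrame`, and `extract_type` is a naive string scan standing in for real JSON parsing (it takes the first `"type"` key it finds, which is not safe for arbitrary payloads).

```rust
/// Hypothetical subset of event kinds; the real enum is expected to be larger.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EventKind {
    OutputTextDelta,
    FunctionCallArgumentsDelta,
    OutputItemDone,
    Completed,
    Other,
}

/// Normalized frame; `payload` borrows from the input line (no copy).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EventFrame<'a> {
    pub event_type: EventKind,
    pub payload: &'a str,
}

/// Turn one raw SSE line into a typed frame. Non-`data:` lines, `[DONE]`,
/// and payloads without a readable `type` yield `None`.
pub fn normalize_sse_line(line: &str) -> Option<EventFrame<'_>> {
    let payload = line.strip_prefix("data: ")?.trim();
    if payload == "[DONE]" || !payload.starts_with('{') {
        return None;
    }
    let event_type = match extract_type(payload)? {
        "response.output_text.delta" => EventKind::OutputTextDelta,
        "response.function_call_arguments.delta" => EventKind::FunctionCallArgumentsDelta,
        "response.output_item.done" => EventKind::OutputItemDone,
        // vLLM and OpenAI use different names for the terminal event.
        "response.done" | "response.completed" => EventKind::Completed,
        _ => EventKind::Other,
    };
    Some(EventFrame { event_type, payload })
}

/// Naive scan for the first `"type": "..."` value (assumption: a real
/// implementation would use a JSON parser instead).
fn extract_type(json: &str) -> Option<&str> {
    let rest = &json[json.find("\"type\"")? + "\"type\"".len()..];
    let rest = rest.trim_start().strip_prefix(':')?.trim_start();
    let rest = rest.strip_prefix('"')?;
    let end = rest.find('"')?;
    Some(&rest[..end])
}
```

With a frame type like this, the accumulator can match on `event_type` and only parse `payload` for the kinds it cares about.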

I'll look at PydanticAI's StreamedResponse._process_event() for the dispatch table shape — that's basically what we're porting to Rust with zero-copy where possible.

This module has no dependency on the executor, so I can open a PR against main this week while #46 is still in review. Then once #46 lands, the accumulator switches to consuming EventFrame as a follow-up.

I'll start with the function_call event types (response.function_call_arguments.delta, response.output_item.done, response.function_call_arguments.done) since those are the minimum for tool dispatch to work, plus reasoning deltas. Sound good?

@ashwing sounds all good. Thanks

ashwing · 2026-06-03T21:42:38Z

+///
+/// Used by [`run_blocking`] so it can pass the result to [`ResponseAccumulator::from_json`].
+async fn fetch_response_json(
+    upstream_json: String,


These two (fetch_response_json + send_inference_request) do the same error mapping (timeout→504, connect fail→502, non-2xx body read). Could share a helper that builds+sends+maps errors, with the callers just differing on .text().await vs .bytes_stream(). Minor — just noticed the duplication.
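
The shared helper could take roughly this shape. This is a sketch under assumptions, not the refactor that was pushed: `TransportError`, `UpstreamError`, and `send_and_map` are hypothetical names, and passing a non-2xx upstream status through unchanged is an assumed policy that the comment above does not specify.

```rust
/// Hypothetical transport failure categories.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TransportError {
    Timeout,
    Connect,
    /// Upstream answered with a non-2xx status; `body` is the response body.
    Status { code: u16, body: String },
}

/// Hypothetical caller-facing error carrying an HTTP status.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UpstreamError {
    pub status: u16,
    pub message: String,
}

/// Single place for the mapping: timeout -> 504, connect failure -> 502.
pub fn map_transport_error(err: TransportError) -> UpstreamError {
    match err {
        TransportError::Timeout => UpstreamError {
            status: 504,
            message: "upstream timed out".to_string(),
        },
        TransportError::Connect => UpstreamError {
            status: 502,
            message: "could not connect to upstream".to_string(),
        },
        // Assumption: pass the upstream status and body through unchanged.
        TransportError::Status { code, body } => UpstreamError {
            status: code,
            message: body,
        },
    }
}

/// Callers differ only in what `send` produces (full text vs. byte stream).
pub fn send_and_map<T>(
    send: impl FnOnce() -> Result<T, TransportError>,
) -> Result<T, UpstreamError> {
    send().map_err(map_transport_error)
}
```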

thanks. I pushed a commit to refactor.

ashwing · 2026-06-03T21:42:38Z

+    ///
+    /// Non-`data:` lines, `[DONE]`, and malformed JSON are silently skipped.
+    fn process_sse_line(&mut self, line: &str) {
+        let Some(data_str) = line.strip_prefix("data: ") else {


Right now Other silently drops everything that isn't text message events. For function_calls, response.output_item.done is where we detect a completed tool call in the output. I'll extend this when building dispatch — just noting where the hook goes.

yes thank you

ashwing · 2026-06-03T21:42:38Z

+/// Owns the storage handlers, HTTP client, and LLM endpoint configuration.
+#[derive(Debug)]
+pub struct ExecutionContext {
+    pub conv_handler: ConversationHandler,


When tool dispatch lands we'll need MCP clients, web search providers, etc. accessible from context. Would you rather grow this struct with optional fields, or pass a separate ToolContext into dispatch_tools()? I'd lean toward the latter to keep this focused on the inference flow.

I think we should keep contexts as separate as possible: the tools would have their own ToolContext, and the agent loop would resolve the contexts. I don't have the full picture yet, but the idea is to keep the modules in core small, each with its own context or config, and handle orchestration of the components later in the agentic loop.
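
To illustrate the separation being discussed, here is a small sketch. The struct fields and the body of `dispatch_tools` are assumptions; only the names `ExecutionContext`, `ToolContext`, `dispatch_tools`, and `responses_url()` appear in this thread.

```rust
/// Inference-side context (fields are assumed for the sketch).
pub struct ExecutionContext {
    pub llm_base_url: String,
}

impl ExecutionContext {
    /// Convenience URL builder for the responses endpoint.
    pub fn responses_url(&self) -> String {
        format!("{}/v1/responses", self.llm_base_url.trim_end_matches('/'))
    }
}

/// Tool-side context, kept separate so the inference context stays focused.
pub struct ToolContext {
    pub enabled_tools: Vec<String>,
}

/// Hypothetical dispatch entry point: it only sees the tool context.
/// Returns the requested tool names that are actually enabled.
pub fn dispatch_tools(tool_ctx: &ToolContext, requested: &[&str]) -> Vec<String> {
    requested
        .iter()
        .copied()
        .filter(|name| tool_ctx.enabled_tools.iter().any(|t| t.as_str() == *name))
        .map(|name| name.to_string())
        .collect()
}
```

Because `dispatch_tools` never receives `ExecutionContext`, tool providers can be added without touching the inference flow.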

ashwing · 2026-06-03T21:42:38Z

+/// response generation process.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
+pub enum SSEEventType {
+    /// Response object created; contains initial response metadata.


The OpenAI Responses API has 28+ event types. Once function_calls land we'll need at minimum response.function_call_arguments.delta, response.output_item.done, and a few more. Current setup with Other as fallback is fine — just expect this enum to grow.

…dation

Reframes the design doc as a hybrid reference:
- Acknowledges PR vllm-project#46 (maralbahari) as the base executor loop
- Defines clear ownership boundaries (base loop vs tool dispatch)
- Organizes into 4 implementation phases, each = one PR
- Phase 1 (SSE events) independent of PR vllm-project#46
- Phases 2-4 build on top of PR vllm-project#46

Removes speculative API surface (AgenticState, AgenticConfig, full trait definitions) in favor of concrete code snippets matching actual implementation targets. Keeps just enough detail to execute follow-up PRs without over-specifying.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

- Phase 1 correctly depends on PR vllm-project#46 (accumulator.rs lives there)
- call_inference is sync fn returning lazy stream, not async
- persist_response takes explicit handler params (noted)
- Native async traits instead of #[async_trait] (Rust 1.85)
- Removed undefined ContextSize type, use &str
- Phase 2 explicitly non-streaming (streaming gated on Phase 3)
- Removed max_iterations redundancy from dispatch_tools params
- ADR-01 reference reworded as paraphrase not quote

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

maralbahari · 2026-06-04T06:20:10Z

Yes, for streaming we need to add interleaved streaming; right now it only works for text-only messages.

For now, execute() exists to test the text-message rehydration flow. ADR-03 suggests a per-step flow, so for whole-loop orchestration each individual step would be called sequentially. We keep execute() for testing and benchmarking the core functionality and correctness of previous-response hydration and storage until the full agentic loop orchestration is implemented.

Phase 1 now targets a separate `events/` module in agentic-core per @maralbahari's feedback on PR vllm-project#46 — avoids bloating the accumulator and has no PR vllm-project#46 dependency (lands on main directly). Also adds OGX compatibility note to Phase 4 traits and updates the Praxis filter mapping with the event normalizer step. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

maralbahari · 2026-06-10T04:54:22Z

Note on engine.rs: temporary consolidation

rehydrate_conversation, call_inference, and persist_response are currently co-located in engine.rs as a deliberate short-term decision to avoid merge conflicts while parallel work lands.

Per ADR-03, each of these public functions will be homed in its proper module once the in-flight features are integrated:

Function	Target module
`rehydrate_conversation`	`conversation.rs`
`call_inference`	`inference.rs`
`persist_response`	`store.rs`

These are plain Rust public functions with no gateway-specific API. Once moved, they will be re-exported through lib.rs at the same paths (agentic_core::rehydrate_conversation, etc.) so callers remain unaffected.

## Summary

Design reference for the `agentic-core` public API extensions. This doc defines what we're building on top of PR #46's base executor loop.

**Ownership split:**
- @maralbahari - base loop: `execute()`, `rehydrate_conversation()`, `call_inference()`, `persist_response()` (PR #46)
- @ashwing - tool dispatch, loop control, streaming tee, executor traits (this doc + follow-up PRs)

**Implementation phases (each = one PR):**

| Phase | Scope | Depends On |
|-------|-------|------------|
| 1 | SSEEventType expansion + accumulator FunctionCall detection | nothing (lands on main) |
| 2 | `LoopDecision` + `dispatch_tools()` + `execute_loop()` | PR #46 |
| 3 | Streaming tee (forward to client + accumulate) | PR #46 |
| 4 | Tool executor traits + mock impls (MCP, web_search, vector_store) | Phase 2 |

**What changed in this update:**
- Removed all code stubs (per @maralbahari's feedback: impl goes in follow-up PRs)
- Reframed around PR #46 as the foundation
- Reduced from 626 lines to ~230, just enough to execute, not over-specify
- Added Praxis filter mapping table with ownership per step

## Test Plan

Design doc only, no code changes. Verification via follow-up implementation PRs with integration tests.

---------

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

franciscojavierarceo

hey this looks pretty good, some small adjustments and then i think we're good to go.

franciscojavierarceo · 2026-06-10T10:42:04Z

+        new_items.extend(ctx.new_input_items.into_iter().map(InOutItem::Input));
+        new_items.extend(output_items.into_iter().map(InOutItem::Output));
+
+        self.store


Can we add one more ADR-02 checkpoint test around this path? For non-conversation Responses continuations, each stored response should be a continuation checkpoint: history_item_ids should represent the full ordered model-visible history at that point, not just this turn's new input/output items. Otherwise a 3-turn previous_response_id chain can rehydrate only turn 2 when turn 3 is sent. A good regression test would run three response turns and assert the third upstream request includes turn 1 + turn 2 + the new input.
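
The checkpoint property requested here can be illustrated with a small in-memory sketch. `StoredResponse`, `checkpoint`, and `upstream_items` are hypothetical names, and items are reduced to string IDs; only `history_item_ids` comes from the comment above.

```rust
/// Hypothetical stored record for one response in a
/// `previous_response_id` chain.
#[derive(Debug, Clone)]
pub struct StoredResponse {
    pub id: String,
    /// Full ordered model-visible history at this point in the chain.
    pub history_item_ids: Vec<String>,
}

/// Build the checkpoint for a new turn: prior history (if any) followed by
/// this turn's new input and output items.
pub fn checkpoint(
    id: &str,
    previous: Option<&StoredResponse>,
    new_input: &[&str],
    new_output: &[&str],
) -> StoredResponse {
    let mut history = previous
        .map(|p| p.history_item_ids.clone())
        .unwrap_or_default();
    history.extend(new_input.iter().map(|s| s.to_string()));
    history.extend(new_output.iter().map(|s| s.to_string()));
    StoredResponse {
        id: id.to_string(),
        history_item_ids: history,
    }
}

/// Items sent upstream for the next turn: the stored history of the
/// referenced response plus the new input.
pub fn upstream_items(previous: Option<&StoredResponse>, new_input: &[&str]) -> Vec<String> {
    let mut items = previous
        .map(|p| p.history_item_ids.clone())
        .unwrap_or_default();
    items.extend(new_input.iter().map(|s| s.to_string()));
    items
}
```

If `checkpoint` stored only the new items of each turn, the third request in a three-turn chain would lose turn 1, which is the regression the proposed test is meant to catch.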

franciscojavierarceo · 2026-06-10T10:42:04Z

+        conversation_id: None,
+    };
+
+    if !ctx.original_request.store {


Can we add coverage for store=false with existing state? store should control whether this new response is persisted, not whether existing previous_response_id / conversation_id context is hydrated for inference. With this early return, store=false + previous_response_id only validates the prior response exists and then forwards just the new input upstream. I'd like tests for store=false + previous_response_id and store=false + conversation_id that assert the upstream request is hydrated but no new state is persisted.
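
The intended semantics (hydrate always, persist only when `store` is true) can be sketched as below. `MemoryStore`, `TurnRequest`, and `run_turn` are hypothetical and stand in for the real handlers and request types.

```rust
/// Hypothetical in-memory stand-in for the persisted history.
#[derive(Debug, Default)]
pub struct MemoryStore {
    pub history: Vec<String>,
}

/// Hypothetical request: new input plus the `store` flag.
pub struct TurnRequest<'a> {
    pub input: &'a str,
    pub store: bool,
}

/// Hydrate unconditionally; persist only when `store` is true.
/// Returns the items that were sent upstream.
pub fn run_turn(
    store: &mut MemoryStore,
    req: &TurnRequest<'_>,
    infer: impl Fn(&[String]) -> String,
) -> Vec<String> {
    // Hydration does not depend on `req.store`.
    let mut upstream = store.history.clone();
    upstream.push(req.input.to_string());

    let output = infer(&upstream);

    // `store` only controls whether this turn is written back.
    if req.store {
        store.history.push(req.input.to_string());
        store.history.push(output);
    }
    upstream
}
```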

franciscojavierarceo · 2026-06-10T10:42:05Z

+        return Ok(ctx);
+    }
+
+    if ctx.original_request.conversation_id.is_some() {


Please add an explicit validation test for requests that provide both conversation_id and previous_response_id. Right now conversation hydration silently wins, but the response later still carries the original previous_response_id, which makes it look like that branch was used. We should reject the ambiguous combination before inference.
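
A sketch of the up-front validation being asked for, with hypothetical type and function names. It resolves the hydration source once and rejects the ambiguous combination before any inference happens.

```rust
/// Where history should be loaded from (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HydrationSource<'a> {
    Conversation(&'a str),
    PreviousResponse(&'a str),
    /// No prior state: a fresh single-turn request.
    Fresh,
}

/// Hypothetical validation error.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ValidationError {
    /// Both `conversation_id` and `previous_response_id` were supplied.
    AmbiguousState,
}

/// Decide the hydration source, rejecting requests that set both IDs.
pub fn resolve_hydration_source<'a>(
    conversation_id: Option<&'a str>,
    previous_response_id: Option<&'a str>,
) -> Result<HydrationSource<'a>, ValidationError> {
    match (conversation_id, previous_response_id) {
        (Some(_), Some(_)) => Err(ValidationError::AmbiguousState),
        (Some(c), None) => Ok(HydrationSource::Conversation(c)),
        (None, Some(r)) => Ok(HydrationSource::PreviousResponse(r)),
        (None, None) => Ok(HydrationSource::Fresh),
    }
}
```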

franciscojavierarceo · 2026-06-10T10:42:05Z

+        let handle = tokio::spawn(async move {
+            let app = Router::new()
+                .route(
+                    "/v1/responses",


The mock should probably capture/assert the incoming /v1/responses request body for the stateful tests. Right now it ignores _body and just dequeues the next cassette response, so tests can pass even if rehydration sends the wrong history upstream. The new tests for ADR-02 checkpoints, store=false hydration, and ambiguous ID validation will need this to be meaningful.
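
One way the mock could record request bodies, sketched without the web framework. `RecordingMock` is a hypothetical type; in the real test server the route handler would call something like `handle` with the incoming body, and tests would then assert on `captured_bodies`.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical mock state: canned responses plus every request body seen.
/// Cloning shares the same underlying state, so a handler and a test can
/// each hold a copy.
#[derive(Clone, Default)]
pub struct RecordingMock {
    cassette: Arc<Mutex<VecDeque<String>>>,
    captured: Arc<Mutex<Vec<String>>>,
}

impl RecordingMock {
    pub fn new(responses: Vec<String>) -> Self {
        Self {
            cassette: Arc::new(Mutex::new(VecDeque::from(responses))),
            captured: Arc::new(Mutex::new(Vec::new())),
        }
    }

    /// Record the request body, then dequeue the next canned response.
    /// Returns `None` when the cassette is exhausted.
    pub fn handle(&self, body: &str) -> Option<String> {
        self.captured.lock().unwrap().push(body.to_string());
        self.cassette.lock().unwrap().pop_front()
    }

    /// Snapshot of all request bodies received so far, in order.
    pub fn captured_bodies(&self) -> Vec<String> {
        self.captured.lock().unwrap().clone()
    }
}
```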

franciscojavierarceo

actually let's land this and address my comments in a follow up PR

CC @leseb @ashwing

Introduces `events/` module in agentic-core with:
- SSEEventType enum (20 variants covering all Responses API events)
- EventPayload enum (typed extraction per event type)
- EventFrame struct as the normalized output
- normalize_sse_line() pure function: raw SSE data line → typed frame

Handles both vLLM (response.done) and OpenAI (response.completed) wire formats. No dependency on the executor module, so it lands on main independently of PR vllm-project#46.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

Move inline unit tests from normalize.rs to tests/event_normalizer_test.rs. All tests use the public API only (normalize_sse_line) so they don't need module-private access. Matches the repo convention established by PR vllm-project#46's cassette tests. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

maralbahari and others added 17 commits May 21, 2026 12:02

feat: implement storage CRUD layer with SQLx and benchmarks

b0272b4

Signed-off-by: maral <maralbahari.98@gmail.com>

use rust criterion for benchmarking and clean code

473485d

Signed-off-by: maral <maralbahari.98@gmail.com>

clean code

0bb4eb8

Signed-off-by: maral <maralbahari.98@gmail.com>

cover more unit tests and add integration tests

9ad6c66

Signed-off-by: maral <maralbahari.98@gmail.com>

Merge remote-tracking branch 'origin/main' into impl-database-crud

c6b9968

Signed-off-by: maral <maralbahari.98@gmail.com>

avoid unnecessary clone

b681865

Signed-off-by: maral <maralbahari.98@gmail.com>

move integration test in agentic-core

4a29e6f

Signed-off-by: maral <maralbahari.98@gmail.com>

fix multi-thread unit test and clean the main cargo.toml

0f9d2c3

Signed-off-by: maral <maralbahari.98@gmail.com>

fix cargo clippy

3ec60a2

Signed-off-by: maral <maralbahari.98@gmail.com>

fix clippy errors in benchmark

06be1e1

Signed-off-by: maral <maralbahari.98@gmail.com>

Merge remote-tracking branch 'origin/main' into agentic-core-executor

fdf5696

Signed-off-by: maral <maralbahari.98@gmail.com>

add integration test based on pre-recorded cassets from openai

89f8f8c

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

clean code and fix cargo clippy

b024ef2

Signed-off-by: maral <maralbahari.98@gmail.com>

improve error handling

da38e55

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

improve apis

8d2d843

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

simplify call_inference

080aabe

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

maralbahari marked this pull request as ready for review June 3, 2026 14:23

maralbahari requested review from bbrowning, franciscojavierarceo, jiahuei, leseb, noobHappylife, qandrew and tjtanaa as code owners June 3, 2026 14:23

maralbahari mentioned this pull request Jun 3, 2026

docs: core public API design — phased implementation plan #44

Merged

ashwing reviewed Jun 3, 2026

fix benchmarking to record per turn pref and merge all benches into s…

04a0271

…ame entry Signed-off-by: maral <maralbahari.98@gmail.com>

maralbahari added 2 commits June 4, 2026 06:01

Merge remote-tracking branch 'origin/main' into agentic-core-executor

2d17980

fix cargo clippy

fc25c96

Signed-off-by: maral <maralbahari.98@gmail.com>

clean code

734cae1

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

franciscojavierarceo changed the title ~~[FEAT]: agentic-core implement agentic loop executor (ADR-03)~~ [FEAT]: agentic-core conversation/responses hydration (ADR-03) Jun 4, 2026

maralbahari mentioned this pull request Jun 8, 2026

feat: agentic-server stateful conversation/responses endpoints. #48

Draft

ashwing mentioned this pull request Jun 10, 2026

feat: add SSE event normalizer module #49

Open

maralbahari changed the title ~~[FEAT]: agentic-core conversation/responses hydration (ADR-03)~~ feat: agentic-core conversation/responses hydration (ADR-03) Jun 10, 2026

fix rehydrate priority by conversation

192e878

Signed-off-by: maral <maralbahari.98@gmail.com>

franciscojavierarceo reviewed Jun 10, 2026

franciscojavierarceo mentioned this pull request Jun 10, 2026

feat: add OGX integration with agentic loop and state hydration #34

Draft

5 tasks

franciscojavierarceo approved these changes Jun 10, 2026

franciscojavierarceo merged commit 8bd32dc into vllm-project:main Jun 10, 2026
3 checks passed
