docs: core public API design — phased implementation plan by ashwing · Pull Request #44 · vllm-project/agentic-api

ashwing · 2026-05-29T22:07:31Z

Summary

Design reference for the agentic-core public API extensions. This doc defines what we're building on top of PR #46's base executor loop.

Ownership split:

@maralbahari: base loop, i.e. execute(), rehydrate_conversation(), call_inference(), persist_response() (PR #46, "feat: agentic-core conversation/responses hydration (ADR-03)")
@ashwing: tool dispatch, loop control, streaming tee, executor traits (this doc + follow-up PRs)

Implementation phases (each = one PR):

| Phase | Scope | Depends On |
|-------|-------|------------|
| 1 | SSEEventType expansion + accumulator FunctionCall detection | nothing (lands on main) |
| 2 | `LoopDecision` + `dispatch_tools()` + `execute_loop()` | PR #46 |
| 3 | Streaming tee (forward to client + accumulate) | PR #46 |
| 4 | Tool executor traits + mock impls (MCP, web_search, vector_store) | Phase 2 |
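To make the Phase 2 row concrete, here is a rough, non-streaming sketch of how `LoopDecision`, `dispatch_tools()` and `execute_loop()` could fit together. Only those three names come from this PR; `OutputItem`, the closure-based `infer` and `run_tool` parameters, the variant names, and the error type are assumptions made for illustration, and the real signatures will be settled in the follow-up PRs.

```rust
/// Hypothetical shape of one model output item (not defined in this PR).
#[derive(Debug, Clone, PartialEq)]
enum OutputItem {
    Text(String),
    FunctionCall { name: String, arguments: String },
}

/// What the loop does after inspecting one inference result.
/// The variants are assumptions; only the type name is from the plan.
#[derive(Debug, PartialEq)]
enum LoopDecision {
    /// No tool calls were requested: return the final text.
    Finish(String),
    /// Tool calls ran: feed their outputs into the next inference call.
    Continue(Vec<String>),
}

/// Runs every requested tool call and decides whether to loop again.
/// Note: no iteration limit here; that belongs to the loop itself.
fn dispatch_tools(
    items: &[OutputItem],
    run_tool: &dyn Fn(&str, &str) -> String,
) -> LoopDecision {
    let mut text = String::new();
    let mut tool_outputs = Vec::new();
    for item in items {
        match item {
            OutputItem::Text(chunk) => text.push_str(chunk),
            OutputItem::FunctionCall { name, arguments } => {
                tool_outputs.push(run_tool(name.as_str(), arguments.as_str()));
            }
        }
    }
    if tool_outputs.is_empty() {
        LoopDecision::Finish(text)
    } else {
        LoopDecision::Continue(tool_outputs)
    }
}

/// Calls inference until the model stops requesting tools or the limit is hit.
fn execute_loop(
    infer: &mut dyn FnMut(&[String]) -> Vec<OutputItem>,
    run_tool: &dyn Fn(&str, &str) -> String,
    max_iterations: usize,
) -> Result<String, String> {
    let mut context: Vec<String> = Vec::new();
    for _ in 0..max_iterations {
        let items = infer(context.as_slice());
        match dispatch_tools(&items, run_tool) {
            LoopDecision::Finish(text) => return Ok(text),
            LoopDecision::Continue(outputs) => context.extend(outputs),
        }
    }
    Err(format!("no final answer after {max_iterations} iterations"))
}
```

In this sketch the iteration cap lives only on `execute_loop()`, which keeps `dispatch_tools()` a pure "run the calls and decide" step that can be tested on its own.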

What changed in this update:

Removed all code stubs (per @maralbahari's feedback — impl goes in follow-up PRs)
Reframed around PR #46 ("feat: agentic-core conversation/responses hydration (ADR-03)") as the foundation
Reduced from 626 lines to ~230 — just enough to execute, not over-specify
Added Praxis filter mapping table with ownership per step

Test Plan

Design doc only — no code changes. Verification via follow-up implementation PRs with integration tests.

maralbahari · 2026-06-02T11:30:36Z

@ashwing Thank you for the work on this PR. It clarifies the general functions required in the core module. However, I have a concern: the design is based on agent assumptions about inputs and outputs that are not explicit. Once the actual implementation is in place, the design might not hold up, developers would need to make many changes, and the review process would become complex.
I feel we should focus on implementing small features, each with benchmarks and a smoke test, so we know the actual usage and purpose of each module.

ashwing · 2026-06-02T19:14:10Z

Fair point — stubs that drift from implementation would just create rework. I'll shift the approach: instead of landing all the type definitions at once, I'll implement one complete slice end-to-end (e.g., rehydrate_conversation with a real test against your store layer from PR #33) so the types are validated by actual usage rather than assumptions.

The design doc stays as a reference for the overall direction, but the PRs will be small, tested features — one step function at a time with integration tests proving the interfaces work.

ashwing · 2026-06-02T19:26:17Z

Updated — stripped the code stubs and kept this as a pure design doc for inline review. Implementation will follow as separate small PRs with integration tests against the store layer.

@leseb @franciscojavierarceo @maralbahari would appreciate your review on the design direction in docs/design/core-public-api.md. Happy to adjust based on feedback before starting implementation.

maralbahari · 2026-06-03T14:32:08Z

Thank you @ashwing for the proposal. I have a simple response/conversation flow in #46. That PR only implements the stateful agent loop for simple text input messages; it might look big because of the integration tests and the cassette recordings against OpenAI's conversation and responses APIs used for assessment. However, the agentic loop there is not complete: it lacks function_calls, instructions, and the default tool calls that your proposal covers, which is a bigger skeleton to ship. As discussed on Slack, we can work on #46 to get a more explicit implementation and complete the agentic loop. I think we could start with the implementation of function_calls, handling of reasoning tokens, etc.?

ashwing · 2026-06-03T21:42:48Z

@maralbahari sounds good — I reviewed PR #46 and left some comments there. Makes sense for you to own the stateful text flow as the base, and I'll build function_calls, tool dispatch, and the looped execution on top once #46 lands. The design doc here stays as the reference for the broader skeleton.

@franciscojavierarceo

Consolidated proposal incorporating leseb's expanded 14-function vision with phased implementation plan. Intended for inline PR review per @franciscojavierarceo's request.

Ref: vllm-project#42

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

…dation

Reframes the design doc as a hybrid reference:
- Acknowledges PR vllm-project#46 (maralbahari) as the base executor loop
- Defines clear ownership boundaries (base loop vs tool dispatch)
- Organizes into 4 implementation phases, each = one PR
- Phase 1 (SSE events) independent of PR vllm-project#46
- Phases 2-4 build on top of PR vllm-project#46

Removes speculative API surface (AgenticState, AgenticConfig, full trait definitions) in favor of concrete code snippets matching actual implementation targets. Keeps just enough detail to execute follow-up PRs without over-specifying.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

- Phase 1 correctly depends on PR vllm-project#46 (accumulator.rs lives there)
- call_inference is sync fn returning lazy stream, not async
- persist_response takes explicit handler params (noted)
- Native async traits instead of #[async_trait] (Rust 1.85)
- Removed undefined ContextSize type, use &str
- Phase 2 explicitly non-streaming (streaming gated on Phase 3)
- Removed max_iterations redundancy from dispatch_tools params
- ADR-01 reference reworded as paraphrase not quote

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
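For readers unfamiliar with the "native async traits instead of #[async_trait]" item, the following is a minimal sketch of what that choice looks like. The `ToolExecutor` trait, `EchoTool`, `run_tool`, and the tiny `block_on` driver are all hypothetical names invented for this example; none of them is the trait shape proposed in the design doc.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical executor trait using a native `async fn`,
/// with no `#[async_trait]` macro and no boxed futures.
trait ToolExecutor {
    async fn execute(&self, arguments: &str) -> Result<String, String>;
}

/// Toy implementation that echoes its arguments back.
struct EchoTool;

impl ToolExecutor for EchoTool {
    async fn execute(&self, arguments: &str) -> Result<String, String> {
        Ok(format!("echo: {arguments}"))
    }
}

/// Static dispatch through a generic parameter: a trait with a native
/// `async fn` cannot be used as `dyn ToolExecutor`.
async fn run_tool<T: ToolExecutor>(tool: &T, arguments: &str) -> Result<String, String> {
    tool.execute(arguments).await
}

/// Waker that does nothing; enough for futures that never really wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal driver for this example only; real code would use an async runtime.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Calling `block_on(run_tool(&EchoTool, "hi"))` drives the future to completion. The trade-off worth keeping in mind is the one noted in the comment: dropping `#[async_trait]` avoids boxing, but callers then need generics or an enum rather than trait objects.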

ashwing · 2026-06-04T06:54:27Z

@franciscojavierarceo @maralbahari @leseb Gentle nudge to review the updated implementation plan. The doc is now minimal, with the work items split into phases.

maralbahari · 2026-06-04T07:19:12Z

+
+### Phase 4: Tool Executor Traits + Mock Implementations (depends on Phase 2)
+
+**PR scope:** `executor/tools/` module.


@ashwing I think the tools should live directly under agentic-core/tools, as in ADR-03.
We need to keep the modules in agentic-core as small as possible to avoid circular imports and dependencies, which also makes them easier to maintain.

@ashwing just a nit: the tools would be in agentic-core/tools, not executor/tools.

Makes sense — keeps tool traits at the crate top level rather than coupling them to the executor. crates/agentic-core/src/tools/ it is. The executor just imports and calls them.

maralbahari · 2026-06-04T07:21:22Z

+
+### Phase 3: Streaming Tee (depends on PR #46)
+
+**PR scope:** `executor/stream_tee.rs`, refactor `run_stream` path.


@ashwing I think that as the SSE streaming code grows, we can also maintain it in its own small module in agentic-core/sse.
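For reference, the "forward to client + accumulate" idea behind the streaming tee can be sketched with the standard library alone. This is only an illustration: `tee_stream` is a hypothetical name, a plain iterator stands in for the upstream stream, an `mpsc` channel stands in for the client connection, and ignoring a disconnected client is an assumption rather than a decision made in this PR.

```rust
use std::sync::mpsc::Sender;

/// Forwards each chunk to the client and also accumulates it.
/// Returns the full accumulated text once the upstream is exhausted.
fn tee_stream<I>(upstream: I, client: &Sender<String>) -> String
where
    I: IntoIterator<Item = String>,
{
    let mut accumulated = String::new();
    for chunk in upstream {
        // Accumulate first so the stored response is complete
        // even if the client has gone away.
        accumulated.push_str(&chunk);
        // Assumption: a disconnected client does not abort accumulation,
        // so a failed send is deliberately ignored here.
        let _ = client.send(chunk);
    }
    accumulated
}
```

Feeding it the chunks `"Hel"` and `"lo"` sends both chunks down the channel in order and returns `"Hello"`, which is the value a later persist step would store.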

maralbahari · 2026-06-04T07:25:49Z

+
+## Open Questions
+
+1. **`execute_loop` vs refactoring `execute`:** Should the loop wrapper be a new function or replace PR #46's `execute()`? Pending maralbahari's response on PR #46 review.


The execute in #46 is temporary: the loop does not yet have the full functionality, since it only validates text-only responses. That lets us test each small component and verify its correctness as we implement each composable piece.
We will write the final execute_loop once all the pieces are in.

maralbahari · 2026-06-08T11:02:34Z

Thank you @ashwing. I have responded to your questions in #46 (comment).
I would need your feedback on the SSE events and how they are handled during streaming in more detail. Thanks.

@maralbahari

Phase 1 now targets a separate `events/` module in agentic-core per @maralbahari's feedback on PR vllm-project#46. This avoids bloating the accumulator and has no PR vllm-project#46 dependency (lands on main directly). Also adds OGX compatibility note to Phase 4 traits and updates the Praxis filter mapping with the event normalizer step.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
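As a rough picture of what an event normalizer in a standalone `events/` module might do, the sketch below maps raw SSE frames to a typed event kind. The enum name `SseEventType`, its variants, and `parse_frame` are assumptions for illustration (the real `SSEEventType` may differ), and the event names are the Responses streaming names as I understand them, so treat them as unverified here.

```rust
/// Hypothetical subset of event kinds; the real `SSEEventType` may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SseEventType {
    OutputTextDelta,
    FunctionCallArgumentsDelta,
    FunctionCallArgumentsDone,
    Completed,
    /// Anything not recognized is kept verbatim instead of being dropped.
    Unknown(String),
}

impl SseEventType {
    /// Maps a wire-level event name to a typed variant.
    fn from_event_name(name: &str) -> Self {
        match name {
            "response.output_text.delta" => Self::OutputTextDelta,
            "response.function_call_arguments.delta" => Self::FunctionCallArgumentsDelta,
            "response.function_call_arguments.done" => Self::FunctionCallArgumentsDone,
            "response.completed" => Self::Completed,
            other => Self::Unknown(other.to_string()),
        }
    }
}

/// Parses one SSE frame ("event: ..." and "data: ..." lines) into
/// (event type, data). Returns None when the frame has no event line.
fn parse_frame(frame: &str) -> Option<(SseEventType, String)> {
    let mut event = None;
    let mut data: Vec<&str> = Vec::new();
    for line in frame.lines() {
        if let Some(rest) = line.strip_prefix("event:") {
            event = Some(SseEventType::from_event_name(rest.trim()));
        } else if let Some(rest) = line.strip_prefix("data:") {
            // SSE strips a single leading space after the colon.
            data.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
    }
    // Multiple data lines are joined with newlines, as in the SSE format.
    event.map(|kind| (kind, data.join("\n")))
}
```

Keeping an `Unknown` variant means new upstream event names pass through the normalizer without breaking the accumulator, which seems in line with keeping this module independent of PR #46.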

franciscojavierarceo · 2026-06-10T03:22:48Z

+
+This PR includes mock implementations for integration testing (in-memory tool executors that return canned responses). Real implementations (MCP client, Brave search, Qdrant) come in later PRs.
+
+**Note:** These traits must be compatible with @franciscojavierarceo's OGX integration (PR #34). The trait-based approach allows OGX to be one implementation behind `McpToolExecutor` — the dispatch layer doesn't care whether tools run via OGX or native Rust.


@ashwing Can you clarify what you mean here? OGX is a vector store search service—the existing integration (vector_search/ogx.rs) implements a VectorSearch trait and hits OGX's REST API directly. It's not an MCP server.

Did you mean OGX would be an implementation behind VectorStoreClient (not McpToolExecutor)? That would make sense and is basically what we already have.

If you're proposing routing OGX through MCP as an intermediary, I'd push back on that—it adds indirection for something that's a straightforward REST call.

You're right — that was sloppy wording on my part. OGX is a VectorStoreClient implementation, not MCP. The dispatch layer routes by tool type: function calls → McpToolExecutor, file_search → VectorStoreClient (your OGX impl), web_search → WebSearchProvider. No MCP indirection for direct REST calls.

Won't push a fix since you've already approved — the correct mapping is in the Praxis filter table below and will be reflected in the code PRs.
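To illustrate the routing described above (function calls to `McpToolExecutor`, file_search to `VectorStoreClient`, web_search to `WebSearchProvider`), here is a small sketch with in-memory mocks that return canned responses. The three trait names and the `execute()`/`search()` method names come from the filter table; the method signatures, the `ToolCall` enum, the `Dispatcher` struct, and the mock types are assumptions, and the traits are synchronous here only to keep the example self-contained.

```rust
/// Trait names are from the plan; these signatures are assumed.
trait McpToolExecutor {
    fn execute(&self, name: &str, arguments: &str) -> String;
}
trait VectorStoreClient {
    fn search(&self, query: &str) -> Vec<String>;
}
trait WebSearchProvider {
    fn search(&self, query: &str) -> Vec<String>;
}

/// Hypothetical tool call kinds the dispatch layer routes on.
enum ToolCall {
    Function { name: String, arguments: String },
    FileSearch { query: String },
    WebSearch { query: String },
}

/// Hypothetical dispatch layer: it only knows the traits, not the backends.
struct Dispatcher<M, V, W> {
    mcp: M,
    vector_store: V,
    web: W,
}

impl<M: McpToolExecutor, V: VectorStoreClient, W: WebSearchProvider> Dispatcher<M, V, W> {
    /// Routes by tool type; no MCP indirection for the search backends.
    fn dispatch(&self, call: &ToolCall) -> String {
        match call {
            ToolCall::Function { name, arguments } => self.mcp.execute(name, arguments),
            ToolCall::FileSearch { query } => self.vector_store.search(query).join("\n"),
            ToolCall::WebSearch { query } => self.web.search(query).join("\n"),
        }
    }
}

/// In-memory mocks returning canned responses, for integration tests.
struct MockMcp;
impl McpToolExecutor for MockMcp {
    fn execute(&self, name: &str, arguments: &str) -> String {
        format!("mcp:{name}({arguments})")
    }
}

struct MockVectorStore;
impl VectorStoreClient for MockVectorStore {
    fn search(&self, query: &str) -> Vec<String> {
        vec![format!("doc about {query}")]
    }
}

struct MockWeb;
impl WebSearchProvider for MockWeb {
    fn search(&self, query: &str) -> Vec<String> {
        vec![format!("https://example.com/?q={query}")]
    }
}
```

With this shape, an OGX-backed type would implement `VectorStoreClient` and slot into the `vector_store` field, while the mocks stand in for all three backends during tests.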

franciscojavierarceo · 2026-06-10T03:23:58Z

+| 0 | `request_validate` | `validate_request()` | Future | — |
+| 1 | `response_store` (init) | `init_store()` | Future | — |
+| 2 | `rehydrate` | `rehydrate_conversation()` | PR #46 | @maralbahari |
+| 3 | `file_resolve` | `resolve_files()` | Future | — |


happy to take this one on

franciscojavierarceo · 2026-06-10T03:24:09Z

+| 1 | `response_store` (init) | `init_store()` | Future | — |
+| 2 | `rehydrate` | `rehydrate_conversation()` | PR #46 | @maralbahari |
+| 3 | `file_resolve` | `resolve_files()` | Future | — |
+| 4 | `tool_parse` | `parse_tools()` | Future | — |


franciscojavierarceo · 2026-06-10T03:24:15Z

+| 8 | `mcp_tool` | `McpToolExecutor::execute()` | Phase 4 | @ashwing |
+| 9 | `web_search` | `WebSearchProvider::search()` | Phase 4 | @ashwing |
+| 10 | `file_search` | `VectorStoreClient::search()` | Phase 4 | @ashwing |
+| 11 | `compact` | `compact_context()` | Future | — |


franciscojavierarceo · 2026-06-10T03:24:20Z

+| 9 | `web_search` | `WebSearchProvider::search()` | Phase 4 | @ashwing |
+| 10 | `file_search` | `VectorStoreClient::search()` | Phase 4 | @ashwing |
+| 11 | `compact` | `compact_context()` | Future | — |
+| 12 | `reasoning` | `summarize_reasoning()` | Future | — |


franciscojavierarceo · 2026-06-10T03:24:40Z

+
+| # | Praxis Filter | Core Function | Phase | Owner |
+|---|---------------|---------------|-------|-------|
+| 0 | `request_validate` | `validate_request()` | Future | — |


i can do this one too

Awesome — updated ownership split in my head:

You:

file_resolve / resolve_files()

tool_parse / parse_tools()

web_search / WebSearchProvider (Brave or similar)

file_search / VectorStoreClient (OGX)

Me:

event_normalize (PR feat: add SSE event normalizer module #49 — submitted)

stream_events / streaming tee

tool_dispatch / dispatch_tools() + LoopDecision

mcp_tool / McpToolExecutor

The dispatch layer calls your providers via the trait interface so we can develop in parallel once trait definitions are agreed. Let me know if you want to define the trait signatures for your pieces or if you'd rather I propose them in a follow-up PR and you iterate.

franciscojavierarceo

some small nits but otherwise this lgtm

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

ashwing · 2026-06-10T05:57:58Z

@maralbahari @franciscojavierarceo Updated — tools path, OGX mapping, and ownership table all reflect your feedback. Thanks!

ashwing mentioned this pull request Jun 1, 2026

feat: agentic-core public API — composable step functions per ADR-03 #42

Closed

ashwing force-pushed the feat/core-public-api branch from a994ccc to 59f1ae4 Compare June 2, 2026 05:31

ashwing force-pushed the feat/core-public-api branch from 59f1ae4 to 1232f6e Compare June 2, 2026 19:17

ashwing changed the title ~~feat: agentic-core public API — types, traits, and executor stubs~~ docs: agentic-core public API design proposal Jun 2, 2026

ashwing force-pushed the feat/core-public-api branch 6 times, most recently from d92b158 to 3501d67 Compare June 2, 2026 21:35

ashwing added 2 commits June 3, 2026 16:53

ashwing changed the title ~~docs: agentic-core public API design proposal~~ docs: core public API design — phased implementation plan Jun 3, 2026

ashwing force-pushed the feat/core-public-api branch from 3501d67 to bc210e5 Compare June 3, 2026 23:56

maralbahari reviewed Jun 4, 2026

ashwing mentioned this pull request Jun 10, 2026

feat: add SSE event normalizer module #49

Open

franciscojavierarceo reviewed Jun 10, 2026

franciscojavierarceo approved these changes Jun 10, 2026

ashwing marked this pull request as ready for review June 10, 2026 04:41

ashwing requested review from bbrowning, jiahuei, leseb, noobHappylife, qandrew and tjtanaa as code owners June 10, 2026 04:41

docs: update tools path, OGX mapping, and ownership table

abe8b85

Signed-off-by: Ashwin Giridharan <girida@amazon.com>

maralbahari merged commit 453b0f8 into vllm-project:main Jun 10, 2026
3 checks passed


