Ergonomic sys_op traits with generic SysOpOutput<T> and auto-generated glue#3088
Ergonomic sys_op traits with generic SysOpOutput<T> and auto-generated glue#3088
Conversation
…d glue Replace verbose manual sys_op function signatures with per-module traits (SysOpFs, SysOpSys, SysOpNet, SysOpHttp, SysOpLlm) generated by the `generate_sys_op_traits` proc macro. Trait methods have clean typed signatures and return `SysOpOutput<T>` where T is the concrete return type (e.g. FsFile, String, bool, ()). Generated glue methods handle argument extraction, validation, and error wrapping automatically. - Add `SysOpOutput<T = BexExternalValue>` generic enum with typed ok/err/async_op helpers and `into_result()` that converts T -> BexExternalValue via Into - Add `From<owned::*> for BexExternalValue` impls for all owned builtin types, plus From<()> (-> Null) and From<PromptAst> - Generate per-module traits from DSL #[sys_op] annotations with typed returns - Generate SysOps::from_impl<T>() constructor to wire trait impls automatically - Rewrite sys_native to implement traits directly with clean typed returns - Update llm_ops::build_request to return owned::HttpRequest directly - Add Debug derives to all owned builtin types Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThis PR refactors system operation dispatch by introducing a centralized Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client Code
participant Engine as BexEngine
participant SysOps as SysOps Table
participant Context as SysOpContext
participant Handler as Handler Fn
Note over Engine,Handler: New Context-Aware Dispatch Pattern
Client->>Engine: execute_sys_op(heap, op, args)
Engine->>Context: Access precomputed LLM metadata
Note over Context: Contains llm_functions<br/>and function_global_indices
Engine->>SysOps: get(op) → function pointer
Note over SysOps: Macro-generated table<br/>with all sys_ops
SysOps-->>Engine: SysOpFn
Engine->>Handler: call(heap, args, &context)
Note over Handler: Handler now receives<br/>context parameter
Handler->>Handler: Use context for metadata lookup
Handler-->>Engine: SysOpResult
Engine-->>Client: result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
CodSpeed Performance ReportMerging this PR will degrade performance by 14.57%Comparing
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | WallTime | bench_incremental_modify_function |
79.1 µs | 91.8 µs | -13.8% |
| ❌ | WallTime | bench_incremental_add_new_file |
76.9 µs | 89.1 µs | -13.64% |
| ❌ | WallTime | bench_incremental_close_string |
263.8 µs | 298.5 µs | -11.6% |
| ❌ | WallTime | bench_incremental_add_attribute |
264.3 µs | 299.4 µs | -11.74% |
| ❌ | WallTime | bench_incremental_add_field |
79.7 µs | 93.3 µs | -14.57% |
| ❌ | WallTime | bench_incremental_no_change |
43.8 µs | 51 µs | -14.11% |
| ❌ | WallTime | bench_incremental_add_string_char |
260.5 µs | 296.4 µs | -12.11% |
Footnotes
-
63 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
baml_language/crates/llm_ops/src/specialize_prompt/mod.rs (1)
20-32: 🧹 Nitpick | 🔵 TrivialDocumentation is misplaced after adding the new wrapper.
The detailed doc comment (lines 20-25) describing the three transformations now sits above
specialize_prompt_from_owned, but it actually documents the privatespecialize_promptfunction below. Consider moving the detailed documentation to the private function or consolidating.📝 Suggested fix
+/// Apply prompt specialization given already-extracted owned types. pub(crate) fn specialize_prompt_from_owned( client: &PrimitiveClient, prompt: bex_vm_types::PromptAst, ) -> bex_vm_types::PromptAst { specialize_prompt(client, prompt) } +/// Specialize a prompt for a specific provider. +/// +/// Applies three transformations in order: +/// 1. Merge adjacent same-role messages +/// 2. Consolidate system prompts (when `max_one_system_prompt` is true) +/// 3. Filter role metadata (strip disallowed metadata keys) fn specialize_prompt(baml_language/crates/llm_ops/src/lib.rs (1)
363-391:⚠️ Potential issue | 🟡 MinorValidate the PrimitiveClient arg in
execute_parse_response.
Right now the first argument is ignored, so type mismatches won’t be caught.✅ Add minimal type validation
- let _arg0 = args.remove(0); + let arg0 = args.remove(0); let arg1 = args.remove(0); let arg2 = args.remove(0); let (response, function_name) = heap .with_gc_protection(|protected| { + let _ = arg0 + .as_builtin_class::<builtin_types::PrimitiveClient>(&protected)?; let response = arg1.as_string(&protected).cloned()?; let function_name = arg2.as_string(&protected).cloned()?; Ok::<_, bex_heap::AccessError>((response, function_name)) }) .map_err(OpErrorKind::AccessError)?;
| /// Extract the module name (second path segment) from a `sys_op` path. | ||
| /// | ||
| /// E.g., `"baml.fs.open"` → `"fs"`, `"baml.llm.PrimitiveClient.parse"` → `"llm"`. | ||
| fn module_from_path(path: &str) -> &str { | ||
| path.split('.') | ||
| .nth(1) | ||
| .expect("sys_op path should have at least 2 segments (e.g., baml.fs.open)") | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider returning Option or Result instead of panicking.
module_from_path uses expect() which will cause a compile-time panic if a malformed sys_op path is provided. While this is acceptable for a proc macro (errors surface at compile time), consider whether a more descriptive error message would help debugging.
fn module_from_path(path: &str) -> &str {
path.split('.')
.nth(1)
.expect("sys_op path should have at least 2 segments (e.g., baml.fs.open)")
}The current message is adequate, but you might want to include the actual path in the error for easier debugging:
💡 Suggested improvement
fn module_from_path(path: &str) -> &str {
path.split('.')
.nth(1)
- .expect("sys_op path should have at least 2 segments (e.g., baml.fs.open)")
+ .unwrap_or_else(|| panic!("sys_op path '{path}' should have at least 2 segments (e.g., baml.fs.open)"))
}| fn sys_op_ref_type_ident(type_name: &str) -> Ident { | ||
| match type_name { | ||
| "File" => format_ident!("FsFile"), | ||
| "Socket" => format_ident!("NetSocket"), | ||
| "Response" => format_ident!("HttpResponse"), | ||
| "Request" => format_ident!("HttpRequest"), | ||
| "PrimitiveClient" => format_ident!("PrimitiveClient"), | ||
| other => panic!("Unknown builtin struct for sys_op extraction: {other}"), | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Hardcoded type mapping may become out of sync with builtin definitions.
sys_op_ref_type_ident has a hardcoded mapping from DSL struct names to Rust type identifiers. If a new builtin struct is added to the DSL without updating this function, the macro will panic at compile time.
Consider adding a comment to remind maintainers to update this function when adding new builtin structs, or explore whether this mapping could be derived from the DSL definitions themselves.
| use sys_types::{FunctionRef, OpErrorKind, SysOpContext}; | ||
|
|
||
| // ============================================================================ | ||
| // SysOp Implementations | ||
| // Clean (owned-type) entry points for trait-based dispatch | ||
| // ============================================================================ | ||
|
|
||
| /// Render a Jinja template given already-extracted owned types. | ||
| /// | ||
| /// `args` is expected to be `BexExternalValue::Map { entries, .. }`. | ||
| pub fn execute_render_prompt_from_owned( | ||
| client: &builtin_types::owned::PrimitiveClient, | ||
| template: &str, | ||
| args: &BexExternalValue, | ||
| ) -> Result<bex_vm_types::PromptAst, OpErrorKind> { | ||
| let template_args: indexmap::IndexMap<String, BexExternalValue> = match args { | ||
| BexExternalValue::Map { entries, .. } => entries.clone(), | ||
| _ => { | ||
| return Err(OpErrorKind::TypeError { | ||
| expected: "map", | ||
| actual: args.type_name().to_string(), | ||
| }); | ||
| } | ||
| }; | ||
|
|
||
| let render_ctx = llm_jinja::RenderContext { | ||
| client: llm_jinja::RenderContextClient { | ||
| name: client.name.clone(), | ||
| provider: client.provider.clone(), | ||
| default_role: client.default_role.clone(), | ||
| allowed_roles: client.allowed_roles.clone(), | ||
| }, | ||
| output_format: llm_types::OutputFormatContent::new(bex_external_types::Ty::String), | ||
| tags: indexmap::IndexMap::new(), | ||
| enums: std::collections::HashMap::new(), | ||
| }; | ||
|
|
||
| let prompt_ast = llm_jinja::render_prompt(template, &template_args, &render_ctx)?; | ||
| Ok(std::sync::Arc::new(prompt_ast)) | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Avoid cloning template args; borrow the map instead.
This removes an unnecessary allocation on hot paths.
♻️ Suggested tweak
- let template_args: indexmap::IndexMap<String, BexExternalValue> = match args {
- BexExternalValue::Map { entries, .. } => entries.clone(),
+ let template_args = match args {
+ BexExternalValue::Map { entries, .. } => entries,
_ => {
return Err(OpErrorKind::TypeError {
expected: "map",
actual: args.type_name().to_string(),
});
}
};
@@
- let prompt_ast = llm_jinja::render_prompt(template, &template_args, &render_ctx)?;
+ let prompt_ast = llm_jinja::render_prompt(template, template_args, &render_ctx)?;| fn baml_net_socket_close(socket: builtin_types::owned::NetSocket) -> SysOpOutput<()> { | ||
| drop(socket); | ||
| SysOpOutput::ok(()) | ||
| } |
There was a problem hiding this comment.
Socket close doesn't remove handle from registry — same leak as file close.
Same issue as baml_fs_file_close: dropping the socket doesn't unregister it from the registry.
🔧 Suggested fix
fn baml_net_socket_close(socket: builtin_types::owned::NetSocket) -> SysOpOutput<()> {
- drop(socket);
+ registry::REGISTRY.unregister_socket(socket._handle.key());
SysOpOutput::ok(())
}There was a problem hiding this comment.
this is expected. dropping from the registry happens on drop automatically
There was a problem hiding this comment.
@hellovai, thank you for the clarification! I understand now — the handles use RAII and the Drop implementation takes care of registry cleanup automatically. That's a clean design.
✏️ Learnings added
Learnt from: hellovai
Repo: BoundaryML/baml PR: 3088
File: baml_language/crates/sys_native/src/lib.rs:146-149
Timestamp: 2026-02-08T05:41:13.256Z
Learning: In the baml_language codebase, registry cleanup for resource handles (FsFile, NetSocket, HttpResponse, etc.) happens automatically via the Drop trait implementation on the handle types. There is no need to manually call unregister methods when dropping these handles — the _handle field's Drop implementation handles registry cleanup automatically.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| // TODO: Implement proper response parsing | ||
| SysOpOutput::err(OpErrorKind::NotImplemented { | ||
| message: "parse not yet implemented with clean types".into(), | ||
| }) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unimplemented parse function returns error.
The parse operation returns NotImplemented with a TODO comment. This is a known gap, but it may cause runtime failures if called.
Would you like me to open an issue to track implementing the proper response parsing?
Summary
SysOpFs,SysOpSys,SysOpNet,SysOpHttp,SysOpLlm) auto-generated bygenerate_sys_op_traitsproc macro from DSL#[sys_op]annotations — eliminates manual function signatures, argument extraction boilerplate, and error wrappingSysOpOutput<T>enum whereTdefaults toBexExternalValue— trait methods return typed values (e.g.SysOpOutput<FsFile>,SysOpOutput<String>,SysOpOutput<()>) and the generated glue converts toSysOpResultviaInto<BexExternalValue>automaticallySysOps::from_impl<T>()constructor wires per-module trait impls into the function pointer table, replacing manual field-by-field assignmentFromimpls for all owned builtin types (FsFile,NetSocket,HttpRequest,HttpResponse,PrimitiveClient),()(→ Null), andPromptAst→BexExternalValuellm_ops::build_requestnow returnsowned::HttpRequestdirectly instead ofBexExternalValue#[derive(Debug)]to all owned builtin typesTest plan
cargo checkpasses cleancargo test --lib -p sys_types -p sys_native -p llm_ops -p bex_heap -p bex_external_types)Made with Cursor
Summary by CodeRabbit
Release Notes