Optional scoped retrieval tools for recursive child RLMs (depth-1 multi-hop) #35

@apenab

Summary

Allow recursive child RLMs to opt in to a scoped set of retrieval tools (a retriever and/or doc_tools), instead of always being constructed with retriever=None / doc_tools=None. This would turn recursion into genuine multi-hop document-graph traversal, letting a child resolve a sub-question by fetching more context rather than only reasoning over the snippet it was handed.

Current behavior

When recursion_impl="child", a recursive subcall re-enters run() as a child RLM built via replace(self, …) in rlm.py (~L647–676). The child is deliberately stripped of retrieval:

child = replace(
    self,
    depth=depth,
    policy=child_policy,
    system_prompt_supplement="",   # reset so es_* docs don't leak into the child
    retriever=None,
    doc_tools=None,
    compaction=False,
    ...
)
result_text, sub_trace = child.run(query_text, Context.from_text(snippet))

So the child can only reason over the snippet placed in its context (Context.from_text(snippet)). The budget plumbing already exists: the child gets its own Policy seeded from the parent's remaining budget (~L640–645) and rolls its spend back up (policy.account_subtree(...)).
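For readers unfamiliar with the budget plumbing, the seed-and-roll-up pattern can be sketched as below. This is a toy model, not the code in policy.py: `ToyPolicy`, `max_tokens`, `spent_tokens`, `spawn_child` and `charge` are hypothetical names, and the real `Policy.account_subtree` signature may differ.

```python
from dataclasses import dataclass


@dataclass
class ToyPolicy:
    """Hypothetical stand-in for Policy; it only tracks a token budget."""
    max_tokens: int
    spent_tokens: int = 0

    @property
    def remaining(self) -> int:
        return max(self.max_tokens - self.spent_tokens, 0)

    def spawn_child(self) -> "ToyPolicy":
        # Seed the child with whatever the parent has left (assumed behavior).
        return ToyPolicy(max_tokens=self.remaining)

    def charge(self, tokens: int) -> None:
        if tokens > self.remaining:
            raise RuntimeError("budget exhausted")
        self.spent_tokens += tokens

    def account_subtree(self, child: "ToyPolicy") -> None:
        # Roll the child's spend back up into the parent (assumed signature).
        self.spent_tokens += child.spent_tokens


parent = ToyPolicy(max_tokens=1000)
parent.charge(300)              # root does some work first
child = parent.spawn_child()    # child can spend at most 700
child.charge(200)               # e.g. a retrieval call inside the child
parent.account_subtree(child)   # parent now reflects 300 + 200 spent
```

Under this model a retrieving child cannot overspend the parent, because its ceiling is the parent's remaining budget at spawn time.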

Motivation

In downstream engines (e.g. an Elasticsearch-backed doc engine) the natural task shape is multi-hop: a row references a detail page via a doc://page/<comp>/<pageId> link, the detail page references a dedicated component, etc. Today the root must hand-roll this traversal in its system_prompt_supplement (explicit "follow the cross-ref" REPL protocols), because children can't retrieve. If a depth-1 child could call es_*/doc_tools, the root could simply decompose — "for each candidate, ask a child to find X" — and the child would follow the links itself. This is a cleaner divide-and-conquer and removes brittle hand-written multi-hop code from the prompt.

Proposed change

Add an opt-in flag, e.g. recursive_child_tools: bool = False (or recursive_child_retriever / recursive_child_max_tool_depth). When enabled, it passes the parent's retriever / doc_tools into the child replace(...) call, ideally only for depth-1 children, so that grandchildren stay tool-less and fan-out stays bounded.
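A minimal sketch of how the flag could gate tool inheritance, assuming the RLM is a dataclass (as the existing replace(self, …) call suggests). `ToyRLM` and `make_child` are hypothetical stand-ins for illustration; only `retriever`, `doc_tools`, `depth` and the proposed `recursive_child_tools` name come from this issue.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class ToyRLM:
    """Hypothetical stand-in for the real RLM class."""
    depth: int = 0
    retriever: Optional[Any] = None
    doc_tools: Optional[Any] = None
    recursive_child_tools: bool = False  # proposed flag, off by default

    def make_child(self) -> "ToyRLM":
        child_depth = self.depth + 1
        # Only depth-1 children inherit tools; deeper levels stay tool-less.
        keep_tools = self.recursive_child_tools and child_depth == 1
        return replace(
            self,
            depth=child_depth,
            retriever=self.retriever if keep_tools else None,
            doc_tools=self.doc_tools if keep_tools else None,
        )


root = ToyRLM(retriever="es-retriever", doc_tools="doc-tools",
              recursive_child_tools=True)
child = root.make_child()        # depth 1: keeps retriever and doc_tools
grandchild = child.make_child()  # depth 2: both reset to None

legacy_child = ToyRLM(retriever="es-retriever").make_child()  # flag off
```

With the flag left at its default, `legacy_child` is built exactly as today, with retriever=None and doc_tools=None.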

Design considerations / risks

  1. Budget & fan-out. A retrieving child can pull large documents and spawn its own subcalls (grandchildren). Restrict tools to depth-1 and rely on the existing seeded-budget mechanism; consider a separate sub-budget for retrieval.
  2. Context window. The child currently sets compaction=False; a retrieving child handling large docs would likely need compaction re-enabled.
  3. System prompt. The child supplement is reset to "" to avoid leaking es_* docs. A tool-having child would instead need a scoped supplement that documents only the retrieval tools it actually has.
  4. Traceability. Nested retrieval subtrees are harder to follow than today's flat root-level retrieval; trace output may need work.
  5. Backward compatibility. The flag must default to current behavior (tools off) so existing runs are unaffected.
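Points 2 and 3 could be handled together when building the child's overrides. The sketch below is one possible shape, not a proposal for the final API: `scoped_supplement`, `child_overrides` and the tool names `es_search` / `es_get` are hypothetical, chosen only to illustrate the es_* family mentioned above.

```python
from typing import Dict, List


def scoped_supplement(tool_docs: Dict[str, str], granted: List[str]) -> str:
    """Document only the tools the child was actually granted."""
    lines = [f"- {name}: {tool_docs[name]}" for name in granted if name in tool_docs]
    return "\n".join(lines)


def child_overrides(has_tools: bool, tool_docs: Dict[str, str],
                    granted: List[str]) -> Dict[str, object]:
    if not has_tools:
        # Current behavior: empty supplement, compaction disabled.
        return {"system_prompt_supplement": "", "compaction": False}
    # A tool-having child gets a scoped supplement and compaction re-enabled.
    return {
        "system_prompt_supplement": scoped_supplement(tool_docs, granted),
        "compaction": True,
    }


# Hypothetical tool documentation held by the root.
docs = {
    "es_search": "full-text search over the index",
    "es_get": "fetch one document by id",
}
with_tools = child_overrides(True, docs, ["es_get"])
without_tools = child_overrides(False, docs, [])
```

Here the child granted only `es_get` never sees the `es_search` documentation, which keeps the original no-leak intent of resetting the supplement.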

Context

Surfaced while designing a multi-hop retrieval experiment in the autodoc-rlm engine, which follows doc:// cross-references (PNPE → detail page → SI component). Recursion is currently kept OFF there, partly because tool-less children can't help with multi-hop; this feature would make recursion a viable lever for that class of task.

Related code: rlm.py child construction (~L632–681), Policy.account_subtree (policy.py).
