fix: FINAL() callable in REPL; parser ignores FINAL/FINAL_VAR inside code fences by jkbrooks · Pull Request #115 · alexzhang13/rlm

jkbrooks · 2026-02-19T05:05:49Z

Problem

Two bugs caused the RLM to silently return wrong answers when a model chose to call FINAL() or FINAL_VAR() inside a ```repl ``` code block — a natural pattern given the system prompt examples.

Bug 1 – Parser (`utils/parsing.py`)

find_final_answer() ran its FINAL(...) / FINAL_VAR(...) regex over the raw assistant response, including inside fenced code blocks. So when the model wrote:

```repl
FINAL(final_answer)
```

the parser matched FINAL(final_answer) and returned the literal string "final_answer" as the completion response instead of the variable's value. No error was raised — the RLM just silently returned a wrong string.
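The failure can be reproduced in isolation. The sketch below is a minimal illustration, not the code in `utils/parsing.py`: `FINAL_PATTERN` and `naive_find_final` are hypothetical stand-ins that only assume the parser searches the raw response text.

```python
import re

# Hypothetical, simplified stand-in for the parser's FINAL(...) regex.
FINAL_PATTERN = re.compile(r"FINAL\((.*?)\)", re.DOTALL)

FENCE = "`" * 3  # built programmatically to avoid nesting literal fences


def naive_find_final(text):
    """Search the raw response, fenced code included (the pre-fix behavior)."""
    match = FINAL_PATTERN.search(text)
    return match.group(1).strip() if match else None


response = (
    "I will compute the sum and submit it.\n"
    f"{FENCE}repl\n"
    "final_answer = sum(range(1, 11))\n"
    "FINAL(final_answer)\n"
    f"{FENCE}\n"
)

# The argument text is captured verbatim; the variable is never evaluated.
print(naive_find_final(response))  # prints: final_answer
```

The regex has no notion of Python scope, so it can only return the source text between the parentheses.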

Bug 2 – Runtime (`environments/local_repl.py`, `environments/base_env.py`)

The system prompt offers FINAL(value) as option 1 for submitting a final answer:

Use FINAL(your final answer here) to provide the answer directly

But FINAL was never injected into the REPL globals — only FINAL_VAR was. So when the model called FINAL(x) inside a repl block it got a NameError, and the REPL stderr was shown to the model, wasting an iteration. The only reason the run didn't always fail is that the parser still picked up FINAL(...) from the response text — leading to bug 1 above.
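The NameError follows directly from how `exec` resolves names. The sketch below assumes the REPL runs model code with `exec` against a globals dict; `run` and the `FINAL_VAR` lambda are hypothetical simplifications, not the `LocalREPL` implementation.

```python
def run(code, namespace):
    """Execute code in the namespace; return an error message, if any."""
    try:
        exec(code, namespace)
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


# Only FINAL_VAR is injected, mirroring the pre-fix state described above.
namespace = {}
namespace["FINAL_VAR"] = lambda name: namespace.get(name)

error = run("x = 55\nFINAL(x)", namespace)
print(error)  # prints: NameError: name 'FINAL' is not defined
```

The assignment to `x` succeeds, so the model has the right value in hand and loses the iteration only because the helper is missing.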

Fix

utils/parsing.py — strip all fenced code blocks from the response before running the FINAL/FINAL_VAR regex. This ensures only prose-level FINAL(...) signals termination.

environments/local_repl.py — add _final(value) method (mirrors _final_var for direct values), inject it as globals["FINAL"], and restore it in _restore_scaffold().

environments/base_env.py — add "FINAL" to RESERVED_TOOL_NAMES so it can't be overwritten by user code.
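The runtime side of the fix can be sketched as follows. `SketchREPL` is a hypothetical, pared-down stand-in for `LocalREPL`; storing the answer as a string, the contents of `RESERVED_TOOL_NAMES`, and the `register_tool` helper are assumptions made for illustration, not details confirmed by the diff.

```python
RESERVED_TOOL_NAMES = {"FINAL", "FINAL_VAR"}  # assumed contents, for illustration


class SketchREPL:
    """Hypothetical stand-in for LocalREPL."""

    def __init__(self):
        self.final_answer = None
        self.globals = {}
        self._restore_scaffold()

    def _final(self, value):
        """Record a direct value as the final answer."""
        self.final_answer = str(value)
        return self.final_answer

    def _final_var(self, name):
        """Record the value of a named REPL variable as the final answer."""
        self.final_answer = str(self.globals[name])
        return self.final_answer

    def _restore_scaffold(self):
        """Re-inject the helpers so executed code cannot leave them overwritten."""
        self.globals["FINAL"] = self._final
        self.globals["FINAL_VAR"] = self._final_var

    def register_tool(self, name, fn):
        """Hypothetical helper: reject custom tools that shadow reserved names."""
        if name in RESERVED_TOOL_NAMES:
            raise ValueError(f"{name!r} is a reserved tool name")
        self.globals[name] = fn

    def execute(self, code):
        exec(code, self.globals)
        self._restore_scaffold()


repl = SketchREPL()
repl.execute("total = sum(range(1, 11))\nFINAL(total)")
print(repl.final_answer)  # prints: 55

repl.execute("FINAL = None")  # executed code clobbers the helper
print(callable(repl.globals["FINAL"]))  # prints: True (restored after execution)
```

Restoring after every execution and reserving the name cover two different paths: the first handles code that reassigns `FINAL` at runtime, the second handles custom tools registered under that name.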

Tests added (`tests/test_parsing.py`)

| Test | Covers |
| --- | --- |
| `test_final_inside_repl_code_block_not_parsed_as_terminal` | Parser ignores `FINAL()` in code fence |
| `test_final_var_inside_repl_code_block_not_parsed_as_terminal` | Parser ignores `FINAL_VAR()` in code fence |
| `test_final_in_prose_still_works_alongside_repl_block` | Prose `FINAL()` still terminates after code fence |
| `test_final_callable_in_repl_environment` | `FINAL(x)` callable in REPL sets `final_answer` |

All 16 TestFindFinalAnswer tests pass.

How to reproduce (before fix)

from rlm import RLM
rlm = RLM(backend="openai", backend_kwargs={"model_name": "gpt-4o-mini"}, environment="local", max_depth=1)
result = rlm.completion("Compute the sum of integers from 1 to 10 and print only the number.")
print(result.response)  # prints "final_answer" instead of "55"


…code fences

Two bugs caused the RLM to return wrong answers when a model called FINAL() or FINAL_VAR() inside a ```repl``` code block:

1. Parser bug (utils/parsing.py): find_final_answer() ran its regex over the raw response including fenced code blocks, so FINAL(x) in a repl block was parsed as the terminal answer, returning the literal string "final_answer" instead of its value.
   Fix: strip all fenced code blocks before running the FINAL/FINAL_VAR regex. FINAL/FINAL_VAR in prose still work as before.

2. Runtime bug (environments/local_repl.py, environments/base_env.py): The system prompt advertises FINAL(value) as option 1 for submitting a final answer, but FINAL was never injected into the REPL globals - only FINAL_VAR was. Models calling FINAL() inside a repl block got a NameError.
   Fix: add _final() method to LocalREPL (mirrors _final_var for direct values), inject it as globals["FINAL"], restore it in _restore_scaffold, and add "FINAL" to RESERVED_TOOL_NAMES.

Tests added to TestFindFinalAnswer:
- test_final_inside_repl_code_block_not_parsed_as_terminal
- test_final_var_inside_repl_code_block_not_parsed_as_terminal
- test_final_in_prose_still_works_alongside_repl_block
- test_final_callable_in_repl_environment

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Fixes incorrect termination behavior when models emit FINAL(...) / FINAL_VAR(...) inside fenced REPL blocks, and makes FINAL(...) callable inside the local REPL to align runtime behavior with the system prompt.

Changes:

- Update find_final_answer() to ignore FINAL(...) / FINAL_VAR(...) that appear inside fenced code blocks.
- Inject FINAL into LocalREPL globals and restore it after executions; reserve FINAL as a non-overridable tool name.
- Add parsing + REPL runtime tests covering the regressions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/test_parsing.py` | Adds regression tests ensuring fenced `FINAL`/`FINAL_VAR` are ignored and `FINAL()` is callable in `LocalREPL`. |
| `rlm/utils/parsing.py` | Strips fenced code blocks before scanning for terminal `FINAL`/`FINAL_VAR` markers. |
| `rlm/environments/local_repl.py` | Adds `_final()` helper and injects/restores `FINAL` in REPL globals. |
| `rlm/environments/base_env.py` | Adds `FINAL` to `RESERVED_TOOL_NAMES` to prevent override by custom tools/user code. |


Copilot · 2026-02-19T05:09:32Z

    """
+    # Remove fenced code blocks first so FINAL()/FINAL_VAR() inside ```repl``` code
+    # does not get parsed as a terminal answer.
+    text_no_code = re.sub(r"```[\s\S]*?```", "", text)


text_no_code = re.sub(r"```[\s\S]*?```", "", text) strips all fenced code blocks from the entire response before searching for FINAL(...). This can silently corrupt a legitimate final answer if the payload inside FINAL(...) includes a markdown code fence (e.g., returning a code snippet), because the fenced content will be removed before extraction. Consider narrowing the stripping to only repl blocks (the ones the runtime executes), or performing code-fence removal only outside the matched FINAL(...)/FINAL_VAR(...) span so the final payload is preserved verbatim.

Suggested change

-    text_no_code = re.sub(r"```[\s\S]*?```", "", text)
+    text_no_code = re.sub(r"```repl[\s\S]*?```", "", text)
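The difference between the two substitutions can be checked with a small sketch. It is illustrative only: `FINAL_PATTERN` and `find_final` are hypothetical stand-ins for the real parser, and the greedy pattern is an assumption chosen so that a multi-line payload is captured whole.

```python
import re

FENCE = "`" * 3  # built programmatically to avoid nesting literal fences

# Hypothetical, simplified stand-in for the parser's FINAL(...) regex.
FINAL_PATTERN = re.compile(r"FINAL\((.*)\)", re.DOTALL)

STRIP_ALL = re.compile(FENCE + r"[\s\S]*?" + FENCE)  # as in the PR
STRIP_REPL = re.compile(FENCE + r"repl[\s\S]*?" + FENCE)  # as suggested


def find_final(text, fence_pattern):
    """Remove fenced blocks matching fence_pattern, then look for FINAL(...)."""
    stripped = fence_pattern.sub("", text)
    match = FINAL_PATTERN.search(stripped)
    return match.group(1).strip() if match else None


# Case 1: FINAL() called inside a repl block is not terminal under either variant.
in_repl = f"{FENCE}repl\nFINAL(final_answer)\n{FENCE}\n"
print(find_final(in_repl, STRIP_ALL))   # prints: None
print(find_final(in_repl, STRIP_REPL))  # prints: None

# Case 2: a prose-level FINAL() whose payload contains a fenced snippet.
with_snippet = f"FINAL(Use this:\n{FENCE}python\nprint(55)\n{FENCE}\n)"
print(repr(find_final(with_snippet, STRIP_ALL)))  # prints: 'Use this:'
print("print(55)" in find_final(with_snippet, STRIP_REPL))  # prints: True
```

Both variants fix the original bug; only the narrower one keeps a fenced snippet inside a final answer intact.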

Copilot AI review requested due to automatic review settings February 19, 2026 05:05

Copilot started reviewing on behalf of jkbrooks February 19, 2026 05:06

Copilot AI reviewed Feb 19, 2026

jkbrooks wants to merge 1 commit into alexzhang13:main from jkbrooks:fix/final-in-repl-block
