
renderer: drop incorrect minimax_m2 xfail; add renderer README #408

Open

Hecate0821 wants to merge 1 commit into main from chengxi/minimax-fix-think-newline

Conversation

@Hecate0821
Collaborator

Summary

  • Drop the incorrect xfail marker on the `minimax_m2-single-turn` HF parity case. `pytest.xfail()` short-circuits before running the comparison, so the previous marker recorded the live-gateway divergence without ever verifying any divergence against HF on CPU (see the sketch after this list). Running the parity check directly shows the renderer matches HF byte-for-byte; the gateway is the odd one out (`<think>` vs HF/renderer's `<think>\n`). The right action is to keep the renderer as-is and let the test enforce parity going forward.
  • Add a compact `training/renderer/README.md` that routes agents to `skills/renderer/SKILL.md` (implement) and `skills/verifier/SKILL.md` (validate) and embeds the verifier token-stream screenshot.
  • Drop the now-redundant "Renderer skills" section from the top-level cookbook README; the renderer README owns that entry point.
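
The failure mode with imperative `pytest.xfail()` is easy to miss, so here is a minimal, self-contained sketch of the difference between the imperative call and the marker form. The helper functions, test names, and xfail reasons below are stand-ins for illustration, not the actual parity harness in `training/tests/unit/test_renderer_hf_parity.py`.

```python
import pytest


def fake_renderer_output() -> str:
    return "<think>\n"  # stand-in for the renderer's rendered assistant prefix


def fake_hf_template_output() -> str:
    return "<think>\n"  # stand-in for HF's apply_chat_template output


def test_parity_imperative_xfail():
    # Imperative xfail: execution stops at this call, so the comparison below
    # never runs -- the test reports XFAIL even though both sides agree.
    pytest.xfail("renderer emits an extra '\\n' after <think>")
    assert fake_renderer_output() == fake_hf_template_output()


@pytest.mark.xfail(reason="suspected extra '\\n' after <think>", strict=True)
def test_parity_marker_xfail():
    # Marker form: the body still runs; with strict=True an unexpected pass
    # (XPASS) fails the suite, which would have surfaced the stale rationale.
    assert fake_renderer_output() == fake_hf_template_output()
```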

Test plan

  • `pytest -q training/tests/unit/test_renderer_hf_parity.py` → 7/7 pass
  • CI `Training CI / Unit And Import Tests` job goes green on this PR

🤖 Generated with Claude Code

The minimax_m2 parity case was marked xfail with the rationale
"renderer emits an extra '\n' after <think>". That rationale was
inferred from the empirical sweep (live gateway disagreed) and never
verified against HF — pytest.xfail() short-circuits before running
the comparison, so no one actually checked. Running the comparison
directly:

  Renderer:     ]~b]ai\n<think>\n   (31 tokens)
  HF template:  ]~b]ai\n<think>\n   (31 tokens)   ← match ✓
  Gateway:      ]~b]ai\n<think>     (30 tokens)   ← gateway is the
                                                    odd one out

The renderer matches upstream HF byte-for-byte. The bug, if any,
lives in the Fireworks gateway's serving template for minimax-m2p7.
Drop the xfail marker so the parity test enforces the (already
correct) behaviour going forward; if the renderer ever drifts, the
test goes red instead of silently green.
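
For reference, a rough way to reproduce the HF side of the comparison above, assuming the upstream checkpoint is `MiniMaxAI/MiniMax-M2` (the model id, message content, and flags here are illustrative; the real check lives in `training/tests/unit/test_renderer_hf_parity.py`):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("MiniMaxAI/MiniMax-M2", trust_remote_code=True)

messages = [{"role": "user", "content": "hello"}]

# add_generation_prompt=True appends the assistant prefix, which is where the
# trailing "\n" after <think> does (or does not) show up.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
token_ids = tok.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)

print(repr(prompt[-16:]))   # does the prompt end with "<think>" or "<think>\n"?
print(len(token_ids))       # token count to compare against the renderer
```

If the prompt ends with `<think>\n`, HF agrees with the renderer and the remaining mismatch is confined to the gateway's serving template.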

Docs:
- training/renderer/README.md: new compact entry point that points
  agents at the renderer + verifier skills and shows the verifier
  token-stream screenshot.
- README.md: drop the "Renderer skills" section; the renderer
  README now owns the entry point.

7/7 parity cases pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
