feat: Add MTP projection-to-logits smoke #650

Open
hashiqiqixian wants to merge 1 commit into hw-native-sys:main from hashiqiqixian:deepseek-v4-mtp-logits

@hashiqiqixian hashiqiqixian commented Jun 30, 2026

Summary

  • Refactor mtp_projection.py to expose mtp_projection_impl as a reusable inline JIT while keeping the existing standalone mtp_projection validation entry.
  • Add deepseek_v4_mtp_logits.py to validate the MTP logits path from mtp_hidden to candidate_logits, including local logits and shared-head-norm logits cases.
  • Add deepseek_v4_mtp_tail.py to compose mtp_projection_impl -> mtp_local_logits, validating the MTP projection-to-logits tail path end to end.
  • Align logits matmul with the existing DeepSeek V4 decode matmul style by padding token rows to MATMUL_T_TILE = 16 while preserving the real [T, VOCAB_SHARD] output contract.
  • Add --dump-passes support to the new smoke entry points for compile pipeline debugging.
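The row-padding step in the summary can be illustrated outside PyPTO. This is a minimal plain-Python sketch, not the kernel from this PR: `pad_rows`, `matmul`, and `local_logits` are hypothetical helpers, and the `[K, VOCAB_SHARD]` weight layout is an assumption. Only `MATMUL_T_TILE = 16` and the `[T, VOCAB_SHARD]` output shape come from the description above.

```python
MATMUL_T_TILE = 16  # tile height quoted in the PR summary


def pad_rows(x, tile):
    """Append zero rows until the row count is a multiple of `tile`."""
    rows, cols = len(x), len(x[0])
    padded_rows = -(-rows // tile) * tile  # round up to the next multiple
    return x + [[0.0] * cols for _ in range(padded_rows - rows)]


def matmul(a, b):
    """Naive [M, K] @ [K, N] product over nested lists."""
    return [
        [sum(a_ik * b_kj for a_ik, b_kj in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def local_logits(hidden, weight_t):
    """Pad token rows to the tile height, multiply, then drop the padding.

    hidden:   [T, K] token rows (hypothetical layout)
    weight_t: [K, VOCAB_SHARD] weight (hypothetical layout)
    """
    t = len(hidden)
    padded = pad_rows(hidden, MATMUL_T_TILE)
    full = matmul(padded, weight_t)  # [T_padded, VOCAB_SHARD]
    return full[:t]                  # back to the real [T, VOCAB_SHARD]
```

Because the padded rows are zero and are sliced away after the product, callers only ever see `T` rows, which is how a tile-aligned matmul can keep an unpadded output contract.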

Testing

  • git diff --check origin/main...HEAD
  • python -m py_compile models/deepseek/v4/mtp_projection.py models/deepseek/v4/deepseek_v4_mtp_logits.py models/deepseek/v4/deepseek_v4_mtp_tail.py
  • Compile-only validation passed for MTP projection, logits, shared-head logits, and projection-to-logits tail.
  • NPU runtime validation passed for MTP projection, logits, shared-head logits, and projection-to-logits tail.
  • Golden comparison passed for all validated outputs.

Related Issues

None

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd59658b-658b-4b28-9715-a86d710a6ce2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Refactors mtp_projection into an inlinable mtp_projection_impl, then adds two new smoke-test modules: deepseek_v4_mtp_logits.py with JIT kernels for local and shared-head MTP candidate-logits generation (plus golden references and contract verification), and deepseek_v4_mtp_tail.py that chains projection and logits into a single combined JIT function with its own CLI runner.

Changes

DeepSeek-V4 MTP Logits and Tail Smoke Tests

| Layer / File(s) | Summary |
| --- | --- |
| mtp_projection refactored into inlinable impl (models/deepseek/v4/mtp_projection.py) | Renames the existing JIT body to mtp_projection_impl with @pl.jit.inline, and adds a thin @pl.jit mtp_projection wrapper that delegates to it, enabling reuse by the tail function. |
| JIT logits kernels (models/deepseek/v4/deepseek_v4_mtp_logits.py) | Defines module constants and divisibility assertions, implements mtp_local_logits (chunked BF16→FP32 matmul against a vocab shard), mtp_shared_head_norm (pipelined per-token RMS norm), and the two JIT entry points deepseek_v4_mtp_logits and deepseek_v4_mtp_shared_head_logits. |
| Golden references and contract verification (models/deepseek/v4/deepseek_v4_mtp_logits.py) | Adds _rms_norm and _local_logits PyTorch helpers, golden_mtp_local_logits/golden_mtp_shared_head_logits entrypoints, and _check_logits_contract enforcing output shape, row-to-candidate correspondence, and argmax bounds. |
| Tensor specs, CASES map, and CLI runner (models/deepseek/v4/deepseek_v4_mtp_logits.py) | Implements build_tensor_specs with optional shared-head norm weight, CASES dict pairing JIT functions with golden functions, and a __main__ CLI that runs run_jit with ratio_allclose tolerances. |
| deepseek_v4_mtp_tail JIT, golden, and CLI (models/deepseek/v4/deepseek_v4_mtp_tail.py) | Adds deepseek_v4_mtp_tail that chains mtp_projection_impl into mtp_local_logits, the golden reference using golden_mtp_projection and _local_logits, build_tensor_specs excluding hidden_states_out, and a CLI runner. |
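To make the golden-reference rows above concrete, here is a rough plain-Python sketch of an RMS norm followed by a vocab-shard projection and a shape/argmax check. It is not the PR's `_rms_norm`, `_local_logits`, or `_check_logits_contract` (those are PyTorch helpers whose code is not shown here). The function names, the `eps` default, and the `[VOCAB_SHARD, D]` weight layout are all assumptions made for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale each row by 1 / sqrt(mean(x^2) + eps), then by a per-channel weight."""
    out = []
    for row in x:
        inv_rms = 1.0 / math.sqrt(sum(v * v for v in row) / len(row) + eps)
        out.append([v * inv_rms * w for v, w in zip(row, weight)])
    return out


def local_logits(hidden, vocab_weight):
    """[T, D] hidden rows against an assumed [VOCAB_SHARD, D] weight -> [T, VOCAB_SHARD]."""
    return [
        [sum(h * w for h, w in zip(row, w_row)) for w_row in vocab_weight]
        for row in hidden
    ]


def check_logits_contract(logits, num_tokens, vocab_shard):
    """Check the output shape and return the per-row argmax indices."""
    assert len(logits) == num_tokens, "expected one logits row per candidate token"
    for row in logits:
        assert len(row) == vocab_shard, "row width must equal the vocab shard size"
    argmax_ids = [max(range(len(row)), key=row.__getitem__) for row in logits]
    assert all(0 <= i < vocab_shard for i in argmax_ids)
    return argmax_ids
```

Under these assumptions, the shared-head case would be `local_logits(rms_norm(hidden, norm_weight), vocab_weight)`, while the local case skips the norm.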

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • hw-native-sys/pypto-lib#431: Introduces a TP-sharded lm_head local logits kernel with the same vocab-shard projection contract used by mtp_local_logits in this PR.

Poem

🐇 Hop hop, the hidden states fly,
Through RMS norms beneath the sky.
Chunked matmuls in FP32 bloom,
Golden logits light up the room.
The tail now chains projection right,
Candidate tokens glowing bright! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The description matches the changeset, covering the projection refactor plus the new logits and tail smoke modules. |
| Title check | ✅ Passed | The title clearly matches the main change: adding an MTP projection-to-logits smoke test. |

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces DeepSeek-V4 MTP logits and tail contract validation smoke tests, implementing deepseek_v4_mtp_logits and deepseek_v4_mtp_tail using PyPTO. It also refactors mtp_projection.py to expose an inlined implementation. Feedback highlights an assertion failure due to T_TILE being larger than T, and suggests an optimization to use a pipelined loop with a conditional check in mtp_local_logits to improve hardware utilization on CANN/Ascend.
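The two review points can be pictured with a small plain-Python sketch. The exact assertion in the PR is not shown on this page, so `strict_tile_count` only assumes it was a divisibility check. `tile_ranges` is a hypothetical helper that shows a loop with a clipped final tile. Neither is PyPTO code.

```python
def strict_tile_count(num_rows, tile):
    """Tile count under a strict divisibility check (assumed form of the assertion)."""
    assert num_rows % tile == 0, f"{num_rows} rows do not divide into tiles of {tile}"
    return num_rows // tile


def tile_ranges(num_rows, tile):
    """Yield (start, stop) row ranges covering num_rows in steps of `tile`.

    The final range is clipped to num_rows, so the row count need not be
    a multiple of `tile` and may even be smaller than it.
    """
    for start in range(0, num_rows, tile):
        yield start, min(start + tile, num_rows)
```

With a tile of 16 and 4 token rows, the strict check raises because `4 % 16` is 4, whereas the clipped loop yields the single range `(0, 4)`.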

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread models/deepseek/v4/deepseek_v4_mtp_logits.py Outdated
Comment thread models/deepseek/v4/deepseek_v4_mtp_logits.py Outdated
@hashiqiqixian hashiqiqixian force-pushed the deepseek-v4-mtp-logits branch 3 times, most recently from a9ab2ec to 186c486 on July 1, 2026 08:54
@hashiqiqixian hashiqiqixian changed the title from "feat(dsv4): Add MTP projection-to-logits smoke" to "feat: Add MTP projection-to-logits smoke" on Jul 1, 2026
@hashiqiqixian hashiqiqixian force-pushed the deepseek-v4-mtp-logits branch from 186c486 to a9ab2ec on July 1, 2026 08:58
@hashiqiqixian hashiqiqixian force-pushed the deepseek-v4-mtp-logits branch from a9ab2ec to bd333e0 on July 1, 2026 09:01