Pull requests: pytorch/torchtitan


Pull requests list

[rl] Fix batch invariant logprob calculation by forcing vllm use trainer's function ciflow/rl ciflow/8gpu CLA Signed
#3629 opened Jun 11, 2026 by wwwjn Contributor
Memory snapshot: python-only stacks for fast dumps + configurable max_entries ciflow/8gpu CLA Signed
#3628 opened Jun 11, 2026 by SherlockNoMad Contributor
[MoE] Use CPU split-size sum for EP permute output size ciflow/8gpu CLA Signed
#3627 opened Jun 10, 2026 by sanketpurandare Contributor
Add StickySessionRoutingStrategy for GeneratorRouter ciflow/rl ciflow/8gpu CLA Signed
#3625 opened Jun 10, 2026 by pzhan9 Contributor
Add num_generators field to RLTrainer.Config ciflow/rl ciflow/8gpu CLA Signed
#3624 opened Jun 10, 2026 by pzhan9 Contributor
[Checkpointer] Remove the dependencies on PyTorch distributed state_dict APIs ciflow/8gpu CLA Signed
#3623 opened Jun 10, 2026 by fegin Contributor
[graph_trainer] Device-hoist CooR artifacts (drop --virtual-local-rank) ciflow/8gpu CLA Signed
#3620 opened Jun 10, 2026 by bobrenjc93 Contributor Draft
4 tasks
[graph_trainer] dsv3 run: local_batch_size=16 + Run 6 batch-scaling results CLA Signed
#3618 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Match Eager FSDP bucket order ciflow/8gpu CLA Signed
#3617 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Force cudagraph for MinimalAsyncEP on H100 + Run 4 results ciflow/8gpu CLA Signed
#3616 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
Add MinimalAsyncEP + offset aware swiglu kernels (#3561) ciflow/8gpu CLA Signed
#3615 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Enable ChunkedCELoss for deepseek_v3 and qwen3 (#3248) ciflow/8gpu CLA Signed
#3614 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Replace stable_topological_sort with _move_overlap_nodes in fsdp_passes ciflow/8gpu CLA Signed
#3613 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Add DSv3 scaling run/profiling helper script CLA Signed
#3612 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Match Eager FSDP bucket order ciflow/8gpu CLA Signed
#3611 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Force cudagraph for MinimalAsyncEP on H100 + Run 4 results ciflow/8gpu CLA Signed
#3610 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
Add MinimalAsyncEP + offset aware swiglu kernels (#3561) ciflow/8gpu CLA Signed
#3609 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Enable ChunkedCELoss for deepseek_v3 and qwen3 (#3248) ciflow/8gpu CLA Signed
#3608 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Replace stable_topological_sort with _move_overlap_nodes in fsdp_passes ciflow/8gpu CLA Signed
#3607 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[graph_trainer] Add DSv3 scaling run/profiling helper script CLA Signed
#3606 opened Jun 10, 2026 by SherlockNoMad Contributor Draft
[GraphTrainer] Fix test_deterministic: update model hash and calling update_from_config ciflow/8gpu CLA Signed
#3605 opened Jun 10, 2026 by wwwjn Contributor
[Bug] Fix MoE SP token combine indices bug ciflow/8gpu CLA Signed high priority
#3604 opened Jun 10, 2026 by Matrix-Z97
[flex_shard] Add DeepSeek V3 eager training entry point ciflow/8gpu CLA Signed
#3603 opened Jun 10, 2026 by weifengpy Contributor Draft
[rl] Search-R1: multi-turn retrieval-augmented GRPO ciflow/rl ciflow/8gpu CLA Signed
#3602 opened Jun 10, 2026 by yichuan-w Member
Add deterministic topk for MoE routing ciflow/h100.8 ciflow/8gpu CLA Signed
#3600 opened Jun 10, 2026 by sanketpurandare Contributor