Open
77 commits
474d542
Enable experimental rollout flag for CI tests (#492)
fzyzcjy Jan 22, 2026
72bafb1
Fix PYTHONPATH for AMD container Megatron-LM location (#506)
lizamd Jan 22, 2026
37c96a5
Revert "Enable experimental rollout flag for CI tests" (#507)
fzyzcjy Jan 23, 2026
df87211
Add new API with extensibility and compatibility adapters (#432)
fzyzcjy Jan 23, 2026
2dbe0d7
Copy and split sglang_rollout.py to modular_rollout (#433)
fzyzcjy Jan 23, 2026
4adb662
Use new rollout function API for modular rollout (#434)
fzyzcjy Jan 23, 2026
d08836c
Add mock SGLang server (#435)
fzyzcjy Jan 23, 2026
e4c2dbf
Add integration test for rollout generation for several combinations …
fzyzcjy Jan 23, 2026
79368ff
Add new sample generation API (#437)
fzyzcjy Jan 23, 2026
fd7a755
Remove global variables in modular rollout (#438)
fzyzcjy Jan 23, 2026
d6b23e0
Remove misplaced fields in GenerateState (#439)
fzyzcjy Jan 23, 2026
8b1fe7f
Temporarily remove DP rank balancing in generate state (#440)
fzyzcjy Jan 23, 2026
e436f1c
Cleanup and shorten modular rollout code (#441)
fzyzcjy Jan 23, 2026
b915eb3
Add tests for all reward functions (#443)
fzyzcjy Jan 23, 2026
3e3ce1a
Enhance mock sglang server with concurrency and requests recording an…
fzyzcjy Jan 23, 2026
274fc42
Add FunctionRegistry to patch load_function (#445)
fzyzcjy Jan 23, 2026
5b1eb16
Add integration tests to cover various modes and features in rollout …
fzyzcjy Jan 23, 2026
dda5031
Support speculative information in mock sglang server (#449)
fzyzcjy Jan 23, 2026
4cc51e1
Add thorough test for single turn generate function (#450)
fzyzcjy Jan 23, 2026
d1f29ed
Refactor single turn generate function (#451)
fzyzcjy Jan 23, 2026
491e71b
Allow user-provided function to add extra arguments (#452)
fzyzcjy Jan 23, 2026
f75a9a7
Copy core of retool example into multi_turn.py and adapt to new API (…
fzyzcjy Jan 23, 2026
7157eba
Support tool response tokenization logic (#454)
fzyzcjy Jan 23, 2026
1e74a8a
Support mock tools and corresponding server replies (#456)
fzyzcjy Jan 23, 2026
b609350
Refactor to extract generation fixtures (#458)
fzyzcjy Jan 23, 2026
ab03942
Support multi-turn testing with snapshot test utils (#459)
fzyzcjy Jan 23, 2026
a1dca85
Update multi turn single sample implementation to use standard toolin…
fzyzcjy Jan 23, 2026
0c0cc8a
Support comparison tests and edge case tests for multi-turn-single-sa…
fzyzcjy Jan 23, 2026
cc46921
Change behavior of multi turn single sample to match single turn in d…
fzyzcjy Jan 23, 2026
0626e73
Refactor and unify multi-turn-single-sample and single-turn (#466)
fzyzcjy Jan 23, 2026
c309ddd
Support and refactor rollout-max-context-len (#468)
fzyzcjy Jan 23, 2026
4c60a34
Support multiple output samples in addition to single sample in multi…
fzyzcjy Jan 23, 2026
c466744
Fix mock tool response with stop tokens (#471)
fzyzcjy Jan 23, 2026
9e45a1f
Add integration test for router (#472)
fzyzcjy Jan 23, 2026
61c8d8b
Support other http actions for http request utility (#473)
fzyzcjy Jan 23, 2026
759fc8b
Support OpenAI format for tool execution (#474)
fzyzcjy Jan 23, 2026
2072ac8
Support openai endpoint for mock sglang server (#475)
fzyzcjy Jan 23, 2026
b197e99
Support session based API in router with tracing (#476)
fzyzcjy Jan 23, 2026
9be8ea7
Support tracing OpenAI endpoint and converting to Sample (#477)
fzyzcjy Jan 23, 2026
dec72fa
Fix sample filter flaky tests (#478)
fzyzcjy Jan 23, 2026
fec5805
Support blackbox agents with tool calling (#479)
fzyzcjy Jan 23, 2026
ad6052b
Support merging samples to construct trajectory (#480)
fzyzcjy Jan 23, 2026
8a67e90
Support agentic rollout to generate one single sample for the whole t…
fzyzcjy Jan 23, 2026
5d296bf
Add three turn integration testing and refactor related stubs (#482)
fzyzcjy Jan 23, 2026
7419ca7
Add rollout level integration test for (multi-turn, agentic) x (singl…
fzyzcjy Jan 23, 2026
adf9d72
Add environment variable to guard enabling the new rollout (#484)
fzyzcjy Jan 23, 2026
c410f42
Improve fault tolerance for router session retrieval (#485)
fzyzcjy Jan 23, 2026
716c2dd
Support rollout routing replay for multi turn (#486)
fzyzcjy Jan 23, 2026
1c26271
Change max_num_tokens according to rollout_max_context_len (#487)
fzyzcjy Jan 23, 2026
4faeb7a
Minor code and test cleanup (#488)
fzyzcjy Jan 23, 2026
204ecb1
Cleanup file and folder structure for rollout (#489)
fzyzcjy Jan 23, 2026
6ecdec9
Add CPU-only tests to CI (#490)
fzyzcjy Jan 23, 2026
59fa9f1
Use new rollout function by default when corresponding flag is on (#491)
fzyzcjy Jan 23, 2026
f20b28c
Enable experimental rollout flag for CI tests (#508)
fzyzcjy Jan 23, 2026
f652c2c
rather professional readme document (#511)
zhaochenyang20 Jan 25, 2026
5585843
[Docs] Remove linked blog (#518)
zijiexia Jan 26, 2026
a8c8687
Adds new blogs to latest update (#520)
zhaochenyang20 Jan 27, 2026
81ea4a8
[CI] Re-organize and enable necessary end2end CI cases (#499)
yushengsu-thu Jan 27, 2026
9440fd1
fix: fix arg parsing by making help a string instead of a tuple (#522)
AgainstEntropy Jan 27, 2026
a681b87
[fix] mbridge incorrectly handle all weight precision to bf16 (#524)
yueming-yuan Jan 28, 2026
d22bc8c
fix int4 kernel setup.py (#527)
yueming-yuan Jan 29, 2026
ebdea20
fix: missing comma in runtime env JSON for qwen3-235B-A22B (#531)
harvenstar Jan 30, 2026
be0c840
Fix accuracy bug in data packing for thd (#542)
guapisolo Feb 1, 2026
511e560
Fix Memory Leak on Rocm Offload (#545)
zyzshishui Feb 2, 2026
6bc0dd5
[bugfix] Fix R3 padding (#551)
guapisolo Feb 3, 2026
e8c5a8c
Super tiny fix for blog link (#553)
jhinpan Feb 3, 2026
e67fc49
Update CODEOWNERS with new paths and owners (#564)
Ying1123 Feb 5, 2026
32d4c85
feat: Support OAI TITO v1 (#502)
guapisolo Feb 5, 2026
824b849
Update CODEOWNERS for miles directory ownership (#569)
Ying1123 Feb 6, 2026
a93d484
[Doc] Add doc for miles router (#538)
Hecate0821 Feb 7, 2026
05e73c3
fix: allow --seq-length CLI override in megatron backend (#568)
BHZ-BER Feb 10, 2026
6fd53f6
[AMD] Bump Megatron to 3714d81d and SGLang v0.5.7 (#563)
zyzshishui Feb 10, 2026
500d9e9
[CI] Fix CI oom in Qwen-30B-A3B (#580)
guapisolo Feb 11, 2026
9286b1a
docs: add Miles server arguments (#517)
Ratish1 Feb 11, 2026
725dcef
[AMD] Unify run-qwen3-4B.sh to support both AMD and NVIDIA GPUs
lizamd Feb 13, 2026
daab467
Address PR review comments
lizamd Feb 17, 2026
0ea7c3b
Add --sglang-disable-custom-all-reduce for AMD
lizamd Feb 17, 2026
5 changes: 5 additions & 0 deletions .github/CODEOWNERS
@@ -1,3 +1,8 @@
.github/CODEOWNERS @fzyzcjy @Ying1123
.github/workflows/ @yushengsu-thu
/miles/ @fzyzcjy @yueming-yuan
/miles/backends/ @fzyzcjy @yueming-yuan @maocheng23
/miles/ray/ @fzyzcjy @yueming-yuan @maocheng23
/miles/rollout/ @fzyzcjy @yueming-yuan @guapisolo
/miles/router/ @fzyzcjy @yueming-yuan @guapisolo
/miles/utils/ @fzyzcjy @yueming-yuan @guapisolo @maocheng23
227 changes: 169 additions & 58 deletions .github/workflows/pr-test.yml

Large diffs are not rendered by default.

112 changes: 61 additions & 51 deletions .github/workflows/pr-test.yml.j2
@@ -1,76 +1,95 @@
 <% set jobs = {
+'fast': {
+'test_executor': 'pytest',
+'tests': [
+{'test_file': 'fast', 'num_gpus': 0},
+],
+},
+'unit-test': {
+'label': 'run-unit-test',
+'tests': [
+{'test_file': 'e2e/fsdp/test_qwen3_4B_fsdp_true_on_policy.py', 'num_gpus': 2}
+],
+},
+'e2e-test-sglang': {
+'label': 'run-ci-sglang',
+'test_executor': 'pytest',
+'tests': [
+{'test_file': 'e2e/sglang_patch/test_chat_input_ids_equivalence.py', 'num_gpus': 1},
+],
+},
 'e2e-test-short': {
 'label': 'run-ci-short',
 'tests': [
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_async_short.py', 'num_gpus': 4},
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_short.py', 'num_gpus': 4},
-{'test_file': 'test_qwen3_0.6B_fsdp_colocated_2xGPU.py', 'num_gpus': 2},
+{'test_file': 'e2e/short/test_qwen2.5_0.5B_gsm8k_async_short.py', 'num_gpus': 4},
+{'test_file': 'e2e/short/test_qwen2.5_0.5B_gsm8k_short.py', 'num_gpus': 4},
+{'test_file': 'e2e/short/test_qwen3_0.6B_fsdp_colocated_2xGPU.py', 'num_gpus': 2},
 ],
 },
 'e2e-test-fsdp': {
 'label': 'run-ci-fsdp',
 'tests': [
-{'test_file': 'test_qwen3_4B_fsdp_true_on_policy.py', 'num_gpus': 2},
-{'test_file': 'test_qwen3_vl_4B_fsdp.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_0.6B_fsdp_distributed.py', 'num_gpus': 2},
-{'test_file': 'test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
+{'test_file': 'e2e/fsdp/test_qwen3_4B_fsdp_true_on_policy.py', 'num_gpus': 2},
+{'test_file': 'e2e/fsdp/test_qwen3_vl_4B_fsdp.py', 'num_gpus': 8},
+{'test_file': 'e2e/fsdp/test_qwen3_0.6B_fsdp_distributed.py', 'num_gpus': 2},
+{'test_file': 'e2e/fsdp/test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
 ],
 },
 'e2e-test-megatron': {
 'label': 'run-ci-megatron',
 'tests': [
-{'test_file': 'test_quick_start_glm4_9B.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_30B_A3B.py', 'num_gpus': 8, 'use_deepep': '1', 'use_fp8_rollout': '1'},
-{'test_file': 'test_qwen3_30B_A3B_r3.py', 'num_gpus': 8, 'use_deepep': '1', 'use_fp8_rollout': '1', 'enable_eval': '0'},
-{'test_file': 'test_qwen3_30B_A3B_r3.py', 'num_gpus': 8, 'enable_eval': '0'},
-{'test_file': 'test_qwen3_4B_ppo.py', 'num_gpus': 8},
-{'test_file': 'test_moonlight_16B_A3B.py', 'num_gpus': 8},
-{'test_file': 'test_moonlight_16B_A3B_r3.py', 'num_gpus': 8, 'enable_eval': '0'},
-{'test_file': 'test_mimo_7B_mtp_only_grad.py', 'num_gpus': 8},
+{'test_file': 'e2e/megatron/test_quick_start_glm4_9B.py', 'num_gpus': 8},
+{'test_file': 'e2e/megatron/test_qwen3_30B_A3B.py', 'num_gpus': 8, 'use_deepep': '1', 'use_fp8_rollout': '1'},
+{'test_file': 'e2e/megatron/test_qwen3_30B_A3B_r3.py', 'num_gpus': 8, 'use_deepep': '1', 'use_fp8_rollout': '1', 'enable_eval': '0'},
+{'test_file': 'e2e/megatron/test_qwen3_30B_A3B_r3.py', 'num_gpus': 8, 'enable_eval': '0'},
+{'test_file': 'e2e/megatron/test_qwen3_4B_ppo.py', 'num_gpus': 8},
+{'test_file': 'e2e/megatron/test_moonlight_16B_A3B.py', 'num_gpus': 8},
+{'test_file': 'e2e/megatron/test_moonlight_16B_A3B_r3.py', 'num_gpus': 8, 'enable_eval': '0'},
+{'test_file': 'e2e/megatron/test_mimo_7B_mtp_only_grad.py', 'num_gpus': 8},
 ],
 },
 'e2e-test-precision': {
 'label': 'run-ci-precision',
 'tests': [
-{'test_file': 'test_qwen3_0.6B_parallel_check.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
+{'test_file': 'e2e/precision/test_qwen3_0.6B_parallel_check.py', 'num_gpus': 8},
+{'test_file': 'e2e/precision/test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
 ],
 },
 'e2e-test-ckpt': {
 'label': 'run-ci-ckpt',
 'tests': [
-{'test_file': 'test_qwen3_4B_ckpt.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_4B_ckpt.py --async-save', 'num_gpus': 8},
+{'test_file': 'e2e/ckpt/test_qwen3_4B_ckpt.py', 'num_gpus': 8},
+{'test_file': 'e2e/ckpt/test_qwen3_4B_ckpt.py --async-save', 'num_gpus': 8},
 ],
 },
 'e2e-test-long': {
 'label': 'run-ci-long',
 'tests': [
-{'test_file': 'test_qwen2.5_0.5B_gsm8k.py', 'num_gpus': 2},
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_async.py', 'num_gpus': 2},
+{'test_file': 'e2e/long/test_qwen2.5_0.5B_gsm8k.py', 'num_gpus': 2},
+{'test_file': 'e2e/long/test_qwen2.5_0.5B_gsm8k_async.py', 'num_gpus': 2},
 ],
 },
 'e2e-test-image': {
 'label': 'run-ci-image',
-'image': 'radixark/miles-test:latest',
+'image': 'radixark/miles:latest',
 'tests': [
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_async_short.py', 'num_gpus': 4},
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_short.py', 'num_gpus': 4},
-{'test_file': 'test_qwen3_0.6B_fsdp_colocated_2xGPU.py', 'num_gpus': 2},
-{'test_file': 'test_qwen3_4B_fsdp_true_on_policy.py', 'num_gpus': 2},
-{'test_file': 'test_qwen3_vl_4B_fsdp.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_0.6B_fsdp_distributed.py', 'num_gpus': 2},
-{'test_file': 'test_quick_start_glm4_9B.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_30B_A3B.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_4B_ppo.py', 'num_gpus': 8},
-{'test_file': 'test_moonlight_16B_A3B.py', 'num_gpus': 8},
-{'test_file': 'test_mimo_7B_mtp_only_grad.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_0.6B_parallel_check.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
-{'test_file': 'test_qwen3_4B_ckpt.py', 'num_gpus': 8},
-{'test_file': 'test_qwen3_4B_ckpt.py --async-save', 'num_gpus': 8},
-{'test_file': 'test_qwen2.5_0.5B_gsm8k.py', 'num_gpus': 2},
-{'test_file': 'test_qwen2.5_0.5B_gsm8k_async.py', 'num_gpus': 2},
+{'test_file': 'e2e/image/test_qwen2.5_0.5B_gsm8k_async_short.py', 'num_gpus': 4},
+{'test_file': 'e2e/image/test_qwen2.5_0.5B_gsm8k_short.py', 'num_gpus': 4},
+{'test_file': 'e2e/image/test_qwen3_0.6B_fsdp_colocated_2xGPU.py', 'num_gpus': 2},
+{'test_file': 'e2e/image/test_qwen3_4B_fsdp_true_on_policy.py', 'num_gpus': 2},
+{'test_file': 'e2e/image/test_qwen3_vl_4B_fsdp.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_0.6B_fsdp_distributed.py', 'num_gpus': 2},
+{'test_file': 'e2e/image/test_quick_start_glm4_9B.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_30B_A3B.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_4B_ppo.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_moonlight_16B_A3B.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_mimo_7B_mtp_only_grad.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_0.6B_parallel_check.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_0.6B_megatron_fsdp_align.py', 'num_gpus': 4},
+{'test_file': 'e2e/image/test_qwen3_4B_ckpt.py', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen3_4B_ckpt.py --async-save', 'num_gpus': 8},
+{'test_file': 'e2e/image/test_qwen2.5_0.5B_gsm8k.py', 'num_gpus': 2},
+{'test_file': 'e2e/image/test_qwen2.5_0.5B_gsm8k_async.py', 'num_gpus': 2},
 ],
 },
 } %>
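One way to catch typos in a table like this before rendering is a small lint pass. The sketch below is hypothetical and not part of this PR: `lint_jobs` and the `JOBS` excerpt are illustrative names, and the rules it checks (a non-empty `tests` list, a string `test_file`, a non-negative integer `num_gpus`) are assumptions inferred from the entries above.

```python
# Hypothetical excerpt that mirrors the shape of the `jobs` table above.
JOBS = {
    "fast": {
        "test_executor": "pytest",
        "tests": [{"test_file": "fast", "num_gpus": 0}],
    },
    "e2e-test-ckpt": {
        "label": "run-ci-ckpt",
        "tests": [
            {"test_file": "e2e/ckpt/test_qwen3_4B_ckpt.py", "num_gpus": 8},
            {"test_file": "e2e/ckpt/test_qwen3_4B_ckpt.py --async-save", "num_gpus": 8},
        ],
    },
}


def lint_jobs(jobs):
    """Return a list of readable problems; an empty list means no issue was found."""
    problems = []
    for name, config in jobs.items():
        tests = config.get("tests")
        if not tests:
            problems.append(f"{name}: no tests listed")
            continue
        for i, test in enumerate(tests):
            test_file = test.get("test_file")
            if not isinstance(test_file, str) or not test_file:
                problems.append(f"{name}[{i}]: missing test_file")
            gpus = test.get("num_gpus")
            # bool is a subclass of int, so it is rejected explicitly.
            if not isinstance(gpus, int) or isinstance(gpus, bool) or gpus < 0:
                problems.append(f"{name}[{i}]: num_gpus must be a non-negative int")
    return problems


if __name__ == "__main__":
    for problem in lint_jobs(JOBS):
        print(problem)
```

Note that `label`, `image`, and `test_executor` are left unchecked here because the template treats all three as optional.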
@@ -98,7 +117,7 @@ concurrency:
 jobs:
 <% for job_name, config in jobs.items() %>
 << job_name >>:
-if: (github.event_name == 'workflow_dispatch') || (github.event.pull_request && contains(github.event.pull_request.labels.*.name, '<< config.label >>'))
+if: (github.event_name == 'workflow_dispatch') || (github.event.pull_request<% if config.label %> && contains(github.event.pull_request.labels.*.name, '<< config.label >>')<% endif %>)
 runs-on: self-hosted
 container:
 image: << config.image if config.image else 'radixark/miles:latest' >>
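The new `if:` line emits the label check only when a job defines `label`, so a job without one (such as `fast`) appears to run on every pull request event. The sketch below reproduces that string assembly in plain Python to make the two rendered forms easy to compare. `render_if_condition` is a hypothetical helper for illustration, not something the repository defines, and it assumes a missing `label` is falsy in the template.

```python
def render_if_condition(label=None):
    """Build the job-level `if:` expression the way the template appears to.

    The label clause is appended only when `label` is set (assumption: a
    missing label is falsy in the Jinja template).
    """
    pr_clause = "github.event.pull_request"
    if label:
        pr_clause += (
            " && contains(github.event.pull_request.labels.*.name, "
            f"'{label}')"
        )
    return f"(github.event_name == 'workflow_dispatch') || ({pr_clause})"


if __name__ == "__main__":
    print(render_if_condition("run-ci-short"))  # label-gated job
    print(render_if_condition())                # job with no label
```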
@@ -153,14 +172,5 @@ jobs:

 - name: Execute
 shell: bash
-run: python tests/ci/gpu_lock_exec.py --count ${{ matrix.info.num_gpus }} -- python tests/${{ matrix.info.test_file }}
-
-- name: Post-test cleanup
-if: always()
-shell: bash
-run: |
-pkill -9 -f 'ray::' 2>/dev/null || true
-pkill -9 -f raylet 2>/dev/null || true
-ray stop --force 2>/dev/null || true
-rm -rf /tmp/ray/* 2>/dev/null || true
-<% endfor %>
+run: python tests/ci/gpu_lock_exec.py --count ${{ matrix.info.num_gpus }} -- << config.test_executor | default('python') >> tests/${{ matrix.info.test_file }}
+<% endfor %>
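The `Execute` step now takes its runner from `test_executor` and falls back to `python`. Because `test_file` is pasted into the command verbatim, an entry such as `e2e/ckpt/test_qwen3_4B_ckpt.py --async-save` carries its flag along as a separate shell argument. The sketch below shows the resulting command lines under those assumptions. `build_execute_command` is a hypothetical name used only for illustration, and the `None` fallback is a simplification of Jinja's `default` filter.

```python
import shlex


def build_execute_command(test, executor=None):
    """Assemble the command the Execute step would run for one matrix entry.

    `executor` stands in for the job's optional `test_executor`; None here
    plays the role of an undefined value that falls back to 'python'.
    """
    runner = executor if executor is not None else "python"
    return (
        f"python tests/ci/gpu_lock_exec.py --count {test['num_gpus']} "
        f"-- {runner} tests/{test['test_file']}"
    )


if __name__ == "__main__":
    fast = build_execute_command({"test_file": "fast", "num_gpus": 0}, "pytest")
    ckpt = build_execute_command(
        {"test_file": "e2e/ckpt/test_qwen3_4B_ckpt.py --async-save", "num_gpus": 8}
    )
    print(fast)
    # shlex.split shows how a POSIX shell would tokenize the command.
    print(shlex.split(ckpt))
```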