Pull requests: jd-opensource/xllm
Pull requests list
feat: support GLM5 on NPU with PD disaggregation, mtp, and reasoning parser
#928 opened Feb 14, 2026 by sunbaosong
refactor: extract MTPWorkerImpl from SpeculativeWorkerImpl.
#926 opened Feb 12, 2026 by Clement-Wang26
feat: adapt for CANN 8.5 and PyTorch 2.7.1 for npu device.
#923 opened Feb 12, 2026 by haimbb000
[Draft] bugfix: temporary workaround of mtp draft models bugs for mlu.
#919 opened Feb 12, 2026 by a120092009 • Draft
bugfix: fix the coredump error in chat template when assistant messages are present.
#913 opened Feb 11, 2026 by DongheJin
refactor: clarify multimodal input error messages in callback.
#909 opened Feb 10, 2026 by xanecdotex
feat: add layers that support Qwen3 model on musa device.
#902 opened Feb 10, 2026 by FleckyFelix
refactor: abstract and refactor some similar functions in prepare_batch.
#897 opened Feb 9, 2026 by weizhehuang0827
feat: support index cache transfer in PD separate deployment scenario.
#895 opened Feb 7, 2026 by sunbaosong
feat: adapt for CANN 8.5 and PyTorch 2.7.1 for npu device.
#891 opened Feb 6, 2026 by haimbb000
bugfix: fix mrope position tensor shape mismatch in NPU graph mode.
#885 opened Feb 4, 2026 by QwertyJack
2 tasks done
feat: add VMM submitter APIs for non-blocking vmm::map/unmap.
#874 opened Feb 3, 2026 by shifengmin
feat: implement rpc interface in APIService for xllm service internal usage.
#837 opened Jan 30, 2026 by weizhehuang0827
bugfix: fix MTP k>1 crash by loading embed_tokens weights
#836 opened Jan 29, 2026 by QwertyJack • Draft
3 tasks done