Pull requests: NVIDIA/Megatron-LM
Pull requests list
Support mxfp8 proj gemm weight quant caching
#4489
opened Apr 28, 2026 by
gdengk
Contributor
5 tasks
Checkpoint conversion between GPT_model and Hybrid_model
#4482
opened Apr 27, 2026 by
guihong-nv
Contributor
Draft
1 of 5 tasks
[dev] [DeepSeek-v4] Part 2: Hash MoE, SwiGLU clamp, and new mHC contract
dev branch
Dev branch related issues and development
Start draft PR for MLA support to Muon optimizer
community-request
#4477
opened Apr 26, 2026 by
Prachi-kushwaha
Draft
5 tasks
Start draft PR for get_tensor_device fix
community-request
#4476
opened Apr 26, 2026 by
Prachi-kushwaha
feat(attention): Add attention_per_head_gate and rotary_base_per_laye…
#4473
opened Apr 26, 2026 by
shifangx
Contributor
5 tasks
docs(moe): correct moe_router_topk_scaling_factor docstring
community-request
complexity: low
waiting-on-customer
Waiting on the original author to respond
[codex] Fix Mamba conv params under fine-grained FSDP gather
complexity: low
#4467
opened Apr 24, 2026 by
ilml
Contributor
ci: Fix event name reference in CI workflow condition for merge group
Approved
All necessary approvals have been made
complexity: low
[dev] [DeepSeek-v4] Part 1: Hybrid Attention with CSA and HCA
dev branch
Dev branch related issues and development