Pull requests: HabanaAI/vllm-hpu-extension
[aice/v1.22.0][WIP] add static moe swiglustep for bf16 (#411), opened Mar 11, 2026 by ranzhejiang (Contributor)
Add dynamic_quant_for_gaudi2.py script to convert model (#387), opened Oct 29, 2025 by wenbinc-Bin
[SW-238300] Disabling dynamic quantization in mlp module (#383), opened Oct 26, 2025 by HolyFalafel
pass chunk_size and global_num_experts to the MoE kernel (#369), opened Sep 19, 2025 by yangulei (Contributor)
Add flag pin_memory to call from hpu.py in vllm (#325), opened Aug 5, 2025 by xuechendi (Contributor)
Fix the fusedsdpa with sliding window alignment issue (#298), opened Jul 17, 2025 by libinta (Contributor)
Draft: Proper chunked prefill bucketing (#295), opened Jul 16, 2025 by kzawora-intel (Collaborator)
Allow usage of fused_block_softmax_adjustment for Qwen with Lazy (#246), opened Jun 27, 2025 by mswiniarsk (Contributor) [Draft]
[SW-225565] Enable triangular softmax with merged prefill (#197), opened May 26, 2025 by kamil-kaczor (Contributor) [Draft]