Pull requests: GeeeekExplorer/nano-vllm
fix(model_runner): cover all decode batch sizes in CUDA graph buckets (#216, opened Apr 25, 2026 by voidborne-d)
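The bucket lookup that #216's title refers to can be sketched as follows. The bucket sizes and function name here are assumptions for illustration, not taken from nano-vllm's model_runner:

```python
import bisect

# Hypothetical CUDA-graph capture sizes: decode batches are padded up to the
# nearest captured graph size. The values below are assumed, not nano-vllm's.
CAPTURED_SIZES = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

def pick_graph_bucket(batch_size: int) -> int:
    """Return the smallest captured graph size that covers batch_size.

    A batch size that no bucket covers forces a fallback (or a crash) --
    a gap of that kind is what the PR title describes fixing.
    """
    i = bisect.bisect_left(CAPTURED_SIZES, batch_size)
    if i == len(CAPTURED_SIZES):
        raise ValueError(f"no CUDA graph bucket covers batch size {batch_size}")
    return CAPTURED_SIZES[i]
```

The design point is that a replayed CUDA graph has a fixed shape, so every possible decode batch size must map to some captured size.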
fix(model_runner): all_reduce num_kvcache_blocks to MIN across TP ranks (#215, opened Apr 24, 2026 by Anai-Guo)
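A minimal stand-in for the reduction named in #215. Each tensor-parallel rank sizes its KV cache from its own free GPU memory, so the counts can disagree; reducing with MIN makes every rank use the same, safe count. The real code would use torch.distributed.all_reduce with ReduceOp.MIN; the pure-Python version and the block counts below are illustrative:

```python
# If TP ranks disagree on num_kvcache_blocks, a rank that allocated fewer
# blocks can be asked to index past the end of its cache. Reducing with MIN
# pins all ranks to the smallest (safe) count.

def agree_on_kvcache_blocks(per_rank_blocks: list[int]) -> list[int]:
    """Simulate all_reduce(MIN): every rank ends up with the global minimum."""
    agreed = min(per_rank_blocks)
    return [agreed for _ in per_rank_blocks]
```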
fix: normalize legacy rope_scaling to rope_parameters (#214, opened Apr 24, 2026 by cuber726579)
fix(scheduler): resolve out-of-bounds error during prefix cache hit after sequence preemption (#211, opened Apr 22, 2026 by wangyuzhuo116)
fix: recompute scheduled tokens after prefix-cache allocation (#210, opened Apr 22, 2026 by tj1235)
feat: add PD disaggregated prefill stub implementation based on vllm-… (#202, opened Apr 12, 2026 by SunChenxiang123)
feat: add CPU offload connector stubs for KV cache GPU↔CPU transfer (#201, opened Apr 12, 2026 by SunChenxiang123)
fix: pass rope_scaling=None for Qwen3 to avoid unhashable dict error (#198, opened Apr 9, 2026 by CruxZhou)
[Feat] Add PyTorch Profiler support for performance analysis (#193, opened Mar 31, 2026 by RagingSilence)
fix: update download command for model weights in README (#185, opened Mar 12, 2026 by SYaoJun)
feat: INT8 KV cache quantization (~48% memory reduction) (#184, opened Mar 9, 2026 by dzhengAP)
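The arithmetic behind #184's headline number: FP16 KV entries take 2 bytes each; storing them as INT8 (1 byte) plus a per-tensor scale roughly halves KV cache memory, consistent with the ~48% figure. A symmetric per-tensor scheme is sketched below; the actual quantization scheme in the PR may differ:

```python
# Symmetric per-tensor INT8 quantization sketch. Illustrative only -- #184
# may use per-channel or per-token scales instead.

def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Map floats to [-127, 127] using a single absmax-derived scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid scale == 0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q: list[int], scale: float) -> list[float]:
    """Recover approximate float values from INT8 codes and the scale."""
    return [x * scale for x in q]
```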
fix: rope_scaling unhashable dict error with transformers>=5.1.0 (#182, opened Mar 8, 2026 by chenwenxiaolive)
refactor(block_manager): replace numpy with array for token ID hashing (#180, opened Mar 7, 2026 by fly1989)
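The idea in #180 can be sketched without numpy: pack a block's token IDs into a typed C array from the standard-library array module and hash its raw bytes. The hash function and signature below are assumptions; nano-vllm's block_manager may use a different hash and chaining scheme:

```python
import array
from hashlib import sha256

def hash_token_block(token_ids: list[int], prefix_hash: int = -1) -> int:
    """Hash a block of token IDs, optionally chained to a prefix block's hash.

    array.array('q', ...) packs the IDs as signed 64-bit ints with no numpy
    dependency; hashing the raw bytes gives a stable prefix-cache key.
    """
    h = sha256()
    if prefix_hash != -1:
        h.update(prefix_hash.to_bytes(32, "little"))
    h.update(array.array("q", token_ids).tobytes())
    return int.from_bytes(h.digest(), "little")
```

Chaining the previous block's hash into the next makes equal token blocks at different prefix positions hash differently, which is what prefix caching needs.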
Avoid per-step allocations in CUDA-graph decode (fix #175) (#176, opened Feb 23, 2026 by MrAnayDongre)
fix: clean up hash_to_block_id mapping when deallocating blocks (#153, opened Jan 6, 2026 by ggboooy)