Pull requests: GeeeekExplorer/nano-vllm
fix(model_runner): cover all decode batch sizes in CUDA graph buckets (#216, opened Apr 25, 2026 by voidborne-d)
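The bucket lookup that #216's title refers to can be sketched as follows. The bucket sizes and function name here are assumptions for illustration, not taken from nano-vllm's model_runner:

```python
import bisect

# Hypothetical CUDA-graph capture sizes: decode batches are padded up to the
# nearest captured graph size. The values below are assumed, not nano-vllm's.
CAPTURED_SIZES = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

def pick_graph_bucket(batch_size: int) -> int:
    """Return the smallest captured graph size that covers batch_size.

    A batch size that no bucket covers forces a fallback (or a crash) --
    a gap of that kind is what the PR title describes fixing.
    """
    i = bisect.bisect_left(CAPTURED_SIZES, batch_size)
    if i == len(CAPTURED_SIZES):
        raise ValueError(f"no CUDA graph bucket covers batch size {batch_size}")
    return CAPTURED_SIZES[i]
```

The design point is that a replayed CUDA graph has a fixed shape, so every possible decode batch size must map to some captured size.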
fix(model_runner): all_reduce num_kvcache_blocks to MIN across TP ranks (#215, opened Apr 24, 2026 by Anai-Guo)
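A minimal stand-in for the reduction named in #215. Each tensor-parallel rank sizes its KV cache from its own free GPU memory, so the counts can disagree; reducing with MIN makes every rank use the same, safe count. The real code would use torch.distributed.all_reduce with ReduceOp.MIN; the pure-Python version and the block counts below are illustrative:

```python
# If TP ranks disagree on num_kvcache_blocks, a rank that allocated fewer
# blocks can be asked to index past the end of its cache. Reducing with MIN
# pins all ranks to the smallest (safe) count.

def agree_on_kvcache_blocks(per_rank_blocks: list[int]) -> list[int]:
    """Simulate all_reduce(MIN): every rank ends up with the global minimum."""
    agreed = min(per_rank_blocks)
    return [agreed for _ in per_rank_blocks]
```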
fix: normalize legacy rope_scaling to rope_parameters (#214, opened Apr 24, 2026 by cuber726579)
fix(scheduler): resolve out-of-bounds error during prefix cache hit after sequence preemption (#211, opened Apr 22, 2026 by wangyuzhuo116)
fix: recompute scheduled tokens after prefix-cache allocation (#210, opened Apr 22, 2026 by tj1235)
feat: add PD disaggregated prefill stub implementation based on vllm-… (#202, opened Apr 12, 2026 by SunChenxiang123)
feat: add CPU offload connector stubs for KV cache GPU↔CPU transfer (#201, opened Apr 12, 2026 by SunChenxiang123)
fix: pass rope_scaling=None for Qwen3 to avoid unhashable dict error (#198, opened Apr 9, 2026 by CruxZhou)
[Feat] Add PyTorch Profiler support for performance analysis (#193, opened Mar 31, 2026 by RagingSilence)
fix: update download command for model weights in README (#185, opened Mar 12, 2026 by SYaoJun)
feat: INT8 KV cache quantization (~48% memory reduction) (#184, opened Mar 9, 2026 by dzhengAP)
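The arithmetic behind #184's headline number: FP16 KV entries take 2 bytes each; storing them as INT8 (1 byte) plus a per-tensor scale roughly halves KV cache memory, consistent with the ~48% figure. A symmetric per-tensor scheme is sketched below; the actual quantization scheme in the PR may differ:

```python
# Symmetric per-tensor INT8 quantization sketch. Illustrative only -- #184
# may use per-channel or per-token scales instead.

def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Map floats to [-127, 127] using a single absmax-derived scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid scale == 0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q: list[int], scale: float) -> list[float]:
    """Recover approximate float values from INT8 codes and the scale."""
    return [x * scale for x in q]
```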
fix: rope_scaling unhashable dict error with transformers>=5.1.0 (#182, opened Mar 8, 2026 by chenwenxiaolive)
refactor(block_manager): replace numpy with array for token ID hashing (#180, opened Mar 7, 2026 by fly1989)
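The idea in #180 can be sketched without numpy: pack a block's token IDs into a typed C array from the standard-library array module and hash its raw bytes. The hash function and signature below are assumptions; nano-vllm's block_manager may use a different hash and chaining scheme:

```python
import array
from hashlib import sha256

def hash_token_block(token_ids: list[int], prefix_hash: int = -1) -> int:
    """Hash a block of token IDs, optionally chained to a prefix block's hash.

    array.array('q', ...) packs the IDs as signed 64-bit ints with no numpy
    dependency; hashing the raw bytes gives a stable prefix-cache key.
    """
    h = sha256()
    if prefix_hash != -1:
        h.update(prefix_hash.to_bytes(32, "little"))
    h.update(array.array("q", token_ids).tobytes())
    return int.from_bytes(h.digest(), "little")
```

Chaining the previous block's hash into the next makes equal token blocks at different prefix positions hash differently, which is what prefix caching needs.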
Avoid per-step allocations in CUDA-graph decode (fix #175) (#176, opened Feb 23, 2026 by MrAnayDongre)
fix: clean up hash_to_block_id mapping when deallocating blocks (#153, opened Jan 6, 2026 by ggboooy)