vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
Stars: 83,551
Forks: 18,317
Open issues: 2,015
24h: +118 (+0.1%)
7d: +777 (+1.0%)
Star history (7 days)
Last checked: 18m ago
Last pushed: 19m ago
Next check: just now