xlite-dev/awesome-llm-inference
steady
📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
Python
Stars: 5,368
Forks: 416
Open issues: 1
24h: +3 (+0.1%)
7d: +22 (+0.4%)
Star history (7 days)
Last checked: 1h ago
Last pushed: 23 Jun 2026