openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Python
Stars
18,839
Forks
3,006
Open issues
126
24h
+1
+0.0%
7d
+59
+0.3%
Refresh
1h
Star history (7 days)
Last checked
39m ago
Last pushed
14 Apr 2026
Next check
just now