openai/evals

steady

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python

Stars: 18,839

Forks: 3,006

Open issues: 126

24h: +1 (+0.0%)

7d: +59 (+0.3%)

Star history (7 days)

Last checked: 39m ago

Last pushed: 14 Apr 2026
