-
227
MMLU-Pro Leaderboard
More advanced and challenging multi-task evaluation
-
51
Stick To Your Role! Leaderboard
Benchmarking LLMs on the stability of simulated populations
-
53
ZeroEval Leaderboard
Embed and use ZeroEval for evaluation tasks
-
26
Decentralized Arena Leaderboard
View and compare LLM evaluations across various domains
Hristo Panev
hppdqdq
AI & ML interests
None yet
Recent Activity
liked a model 3 days ago: moondream/moondream3-preview
liked a model 10 days ago: Jinx-org/Jinx-gpt-oss-20b-GGUF
liked a Space 11 days ago: TheDrummer/directory
Organizations
None yet