MathArena: Evaluating LLMs on Uncontaminated Math Competitions

MathArena is a benchmark for evaluating LLMs on math competitions, designed to address data contamination and to assess proof-writing capabilities...

Level: advanced

By Unknown

Category: research