A Rosetta Stone for AI Benchmarks

This research introduces a statistical framework that places heterogeneous AI benchmarks on a single numerical scale, enabling robust cross-task compariso...

Level: advanced

By Anson Ho, Jean-Stanislas Denain, David Atanasov, Samuel Albanie, Rohin Shah

Category: research