SurfaceBench introduces a rigorous benchmark to test whether self-evolving LLMs can derive 3D scientific equations, revealing significant gaps in their compositio...
Level: advanced
By Sanchit Kabra, Shobhnik Kriplani, Parshin Shojaee, Chandan K. Reddy
Category: research