Explore FeynmanBench, a new benchmark that tests multimodal LLMs on Feynman diagram reasoning and on enforcing physical constraints.
Level: advanced
By Zeyu Wang
Category: research