FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

Explore FeynmanBench, a benchmark that tests multimodal LLMs on Feynman diagram reasoning and on enforcing physical constraints.

Level: advanced

By Zeyu Wang

Category: research