AstroMMBench: A Benchmark for Evaluating Multimodal Large Language Models Capabilities in Astronomy
Explore AstroMMBench, a rigorous benchmark designed to evaluate how well Multimodal Large Language Models interpret complex astronomical imagery across six d...