The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs

Explore the MUSE Benchmark, a rigorous framework evaluating how well audio LLMs perceive music and reason about auditory relationships. This research highlig...

Level: advanced

By Brandon James Carone, Iran R. Roman, Pablo Ripollés

Category: research