VisuRiddles: Fine-grained Perception is a Primary Bottleneck for Multimodal Large Language Models in Abstract Visual Reasoning

This research identifies fine-grained visual perception as a critical bottleneck for multimodal large language models in abstract visual reasoning. It int...

Level: advanced

By Hao Yan and 13 other authors

Category: research