M3-Bench introduces a rigorous, model-agnostic framework for evaluating multimodal agents through multi-hop, multi-threaded workflows. This research details ...
Level: advanced
By Unknown
Category: research