RO-Bench: Large-scale robustness evaluation of MLLMs with text-driven counterfactual videos
Explore RO-Bench, a framework for evaluating the robustness of Multimodal Large Language Models (MLLMs) using counterfactual video data in dynamic out-of-distr...