Reward Design for Physical Reasoning in Vision-Language Models

This research investigates how specific reward designs enhance physical reasoning in Vision-Language Models trained with GRPO, revealing that attention-bas...

Level: advanced

By Derek Lilienthal

Category: research