Reinforcing Structured Chain-of-Thought for Video Understanding

This research introduces Summary-Driven Reinforcement Learning (SDRL), a novel framework designed to overcome thinking drift and weak temporal comprehension ...

Level: advanced

By Peiyao Wang

Category: research