Reinforcement Fine-Tuning of Flow-Matching Policies for Vision-Language-Action Models

This research introduces FPO, an online reinforcement fine-tuning framework for Vision-Language-Action models that leverages stable conditional flow...

Level: expert

By Unknown

Category: research