G$^2$RPO: Granular GRPO for Precise Reward in Flow Models

Explore G$^2$RPO, a novel framework leveraging singular stochastic sampling and multi-granularity advantage integration to solve sparse reward challenges in fl...

Level: advanced

By Unknown

Category: research