Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?

This research introduces the Oracle Performance Gap to expose critical flaws in current RL benchmarks, proposing three principles for designing evaluations t...

Level: advanced

By Unknown

Category: research