HardcoreLogic: Challenging Large Reasoning Models with Long-tail Logic Puzzle Games

Explore HardcoreLogic, a rigorous benchmark designed to test the true reasoning capabilities of Large Reasoning Models through complex, long-tail logic puzzl...

Level: advanced

By Unknown

Category: research