RegexPSPACE: A Benchmark for Evaluating LLM Reasoning on PSPACE-complete Regex Problems

Explore RegexPSPACE, a benchmark designed to test the reasoning limits of large language models (LLMs) and large reasoning models (LRMs) on PSPACE-complete regex problems. This research highlights ...

Level: advanced

By Unknown

Category: research