Explore TetrisBench, a novel benchmark revealing how Large Language Models struggle with long-horizon planning and dynamic adaptation in real-time spatial ta...
Level: advanced
By Yoko Li
Category: research