This research introduces PHANTOM RECALL, a benchmark exposing how Large Language Models fail at logic puzzles despite linguistic fluency, highlighting critic...
Level: advanced
By Unknown
Category: research