SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors

SimBench is a rigorous benchmark for evaluating how well large language models simulate human behavior, revealing critical insights into architectura...

Level: advanced

By Unknown

Category: research