DSGym introduces an execution-verified framework for training and evaluating data science agents, addressing critical gaps in benchmark validity and dat...
Level: advanced
By Fan Nie and 8 other authors
Category: research