Explore the Token Games framework, an unsupervised method using pairwise programming duels and Elo ratings to rigorously evaluate large language model reason...
Level: advanced
By Unknown
Category: research