This research explores a novel approach to evaluating Large Language Models with 200x less data, using genetic algorithms to compress benchmarks. ...
Level: advanced
By Unknown
Category: research