Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

This research introduces a Computerized Adaptive Testing framework grounded in Item Response Theory to efficiently evaluate Large Language Models in medical ...

Level: advanced

By Tianpeng Zheng

Category: research