Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models

Explore VivaBench, a benchmark that tests the ability of large language models to perform sequential clinical reasoning through multi-turn dialogue, revealin...

Level: advanced

By Unknown

Category: research