Explore CGBench, a specialized benchmark that evaluates how well language models interpret clinical genetics data and highlights critical reasoning gaps and hallu...
Level: advanced
By Unknown
Category: research