RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

Explore RefusalBench, a novel framework designed to evaluate how grounded language models selectively refuse queries, revealing critical gaps in their reliab...

Level: advanced

By Unknown

Category: research