Explore RefusalBench, a framework for evaluating how grounded language models selectively refuse queries, revealing critical gaps in their reliab...
Level: advanced
By Unknown
Category: research