Models Know Their Shortcuts: Deployment-Time Shortcut Mitigation

This research introduces Shortcut Guardrail, a deployment-time framework that mitigates shortcut learning in pretrained language models using gradient-based ...

Level: advanced

By Jiayi Li

Category: research