Mitigating Many-Shot Jailbreaking

This research investigates Many-Shot Jailbreaking, a sophisticated attack exploiting long context windows in LLMs, and evaluates combined mitigation strategi...

Level: advanced

By Christopher M. Ackerman, Nina Panickssery

Category: discussion