Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover

This research establishes a theoretical framework using spin-glass systems to explain the polynomial-to-exponential scaling of adversarial attacks on Large L...

Level: expert

By Indranil Halder, Annesya Banerjee, Cengiz Pehlevan

Category: research