Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover
This research establishes a theoretical framework using spin-glass systems to explain the polynomial-to-exponential scaling of adversarial attacks on Large L...
Level: expert
By Indranil Halder, Annesya Banerjee, Cengiz Pehlevan