Exact Attention Sensitivity and the Geometry of Transformer Stability

This research establishes a rigorous stability theory for transformers by deriving the exact operator norm of the softmax Jacobian, revealing that stability ...

Level: expert

By Seyed Morteza Emadi

Category: research
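
As a small numerical illustration of the object named in the summary, the sketch below builds the softmax Jacobian, J(z) = diag(p) - p p^T with p = softmax(z), and measures its operator (spectral) norm with NumPy. It does not reproduce the exact norm formula derived in the research, which the summary does not state. The function names `softmax`, `softmax_jacobian`, and `spectral_norm` and the sample logits are assumptions chosen for illustration only.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax of a 1-D logit vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift by the max to avoid overflow
    return e / e.sum()


def softmax_jacobian(z):
    """Jacobian of softmax at z: diag(p) - p p^T, where p = softmax(z)."""
    p = softmax(z)
    return np.diag(p) - np.outer(p, p)


def spectral_norm(J):
    """Operator 2-norm, i.e. the largest singular value of J."""
    return np.linalg.norm(J, 2)


# Hypothetical logits, used only to exercise the functions above.
z = np.array([2.0, -1.0, 0.5, 0.0])
J = softmax_jacobian(z)
norm_J = spectral_norm(J)

# Compare against the standard worst-case bound of 1/2 for this Jacobian.
print(f"spectral norm = {norm_J:.6f}, within 1/2 bound: {norm_J <= 0.5}")
```

The Jacobian is symmetric with rows summing to zero, and Gershgorin's theorem bounds its largest eigenvalue by 2 p_i (1 - p_i) <= 1/2. Two equal logits attain that bound, so any measured norm can be read against 1/2 as the worst case.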