Rank-Aware Spectral Bounds on Attention Logits for Stable Low-Precision Training

This research establishes rank-aware spectral bounds to stabilize low-precision training in large language models, eliminating overflow risks through geometr...

Level: expert

By Seyed Morteza Emadi

Category: research