Direct Quantized Training of Language Models with Stochastic Rounding

This research introduces a method for training large language models using stochastic rounding to achieve high performance with low-precision ternary weights...
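
As a rough illustration of the idea named in the summary, the sketch below shows how stochastic rounding can map real-valued weights onto the ternary set {-1, 0, +1} so that the rounded value equals the original in expectation. It is a minimal sketch under stated assumptions and not the method from this research: the function names `stochastic_round` and `ternarize`, the `scale` parameter, and the clipping to [-1, 1] are all hypothetical choices made for the example.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x to one of its two neighbouring integers.

    The probability of rounding up equals the fractional part of x,
    so the expected value of the result is x itself.
    """
    lower = math.floor(x)
    frac = x - lower  # probability of rounding up
    return lower + (1 if rng.random() < frac else 0)


def ternarize(weights, scale, rng):
    """Map each weight to {-1, 0, +1} (hypothetical scheme for illustration).

    Assumption: weights are divided by `scale` and clipped to [-1, 1]
    before rounding; a real method may choose the scale differently.
    """
    out = []
    for w in weights:
        clipped = max(-1.0, min(1.0, w / scale))
        out.append(stochastic_round(clipped, rng))
    return out


rng = random.Random(0)  # fixed seed so the run is reproducible
weights = [0.30, -0.75, 0.05, 1.40]
ternary = ternarize(weights, scale=1.0, rng=rng)

# Averaging many stochastic roundings of 0.30 should land close to 0.30,
# whereas round-to-nearest would always return 0.
samples = [stochastic_round(0.30, rng) for _ in range(20000)]
mean_030 = sum(samples) / len(samples)
```

The point of the design is that small values are not lost: a weight that round-to-nearest would always send to 0 still rounds to +1 or -1 part of the time, so its average contribution is preserved.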

Level: advanced

By Unknown

Category: research