AdaPM: a Partial Momentum Algorithm for LLM Training

Explore AdaPM, a novel memory-efficient optimizer that leverages partial momentum and bias correction to drastically reduce training costs for large language...

Level: advanced

By Unknown

Category: research