Explore Xmodel-2.5, a 1.3B-parameter small language model (SLM) that uses maximal-update parameterization (μP) and the Muon optimizer to achieve high reasoning accuracy with minimal da...
Level: advanced
By Unknown
Category: research