L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts

Explore L-MoE, a novel architecture merging Mixture of Experts (MoE) with Low-Rank Adaptation (LoRA) to achieve end-to-end training with 10% of the parameters of dense models.

Level: advanced

By Shihao Ji, Zihui Song

Category: research
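
To make the core idea from the abstract concrete, below is a minimal, hypothetical sketch of an MoE layer whose experts are LoRA adapters over a frozen base projection, with a router trained jointly (end-to-end) with the experts. The class names (`LoRAExpert`, `LMoELayer`) and hyperparameters (`rank`, `num_experts`, `top_k`) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: not the paper's code. A frozen dense layer is
# augmented by a router-gated mixture of low-rank (LoRA) experts.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRAExpert(nn.Module):
    """One LoRA adapter: a rank-r update delta(x) = x @ A^T @ B^T, r << d."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # up-projection, zero-init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.A.T @ self.B.T


class LMoELayer(nn.Module):
    """A frozen dense layer plus a router-gated mixture of LoRA experts."""

    def __init__(self, d_in: int, d_out: int, num_experts: int = 4,
                 rank: int = 8, top_k: int = 2):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.requires_grad_(False)              # pretrained weights stay frozen
        self.experts = nn.ModuleList(
            LoRAExpert(d_in, d_out, rank) for _ in range(num_experts))
        self.router = nn.Linear(d_in, num_experts)   # trained jointly with experts
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.router(x)                      # (batch, num_experts)
        topv, topi = logits.topk(self.top_k, dim=-1)
        gate = torch.zeros_like(logits).scatter_(-1, topi, F.softmax(topv, dim=-1))
        # For clarity, every expert is evaluated and masked by the sparse gate;
        # an efficient implementation would dispatch inputs only to selected experts.
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, d_out)
        return self.base(x) + (gate.unsqueeze(-1) * expert_out).sum(dim=1)


if __name__ == "__main__":
    layer = LMoELayer(d_in=64, d_out=64)
    y = layer(torch.randn(8, 64))
    print(y.shape)  # torch.Size([8, 64])
```

Because only the router and the low-rank `A`/`B` matrices receive gradients, the trainable parameter count scales with the adapter rank rather than the full weight matrices, which is the sense in which such a design can train with a small fraction of a dense model's parameters.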