MLPMoE: Zero-Shot Architectural Metamorphosis of Dense LLM MLPs into Static Mixture-of-Experts

Explore MLPMoE, a training-free method that transforms dense LLM feed-forward layers into static Mixture-of-Experts configurations using tensor operations an...

Level: advanced

By Ivan Novikov

Category: research
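
To make the teaser concrete before diving in: below is a minimal sketch of the core idea, a zero-shot, tensor-level split of a dense feed-forward layer into static experts whose summed outputs reproduce the dense layer exactly. The SwiGLU layout, the contiguous neuron partition, and names like `split_dense_to_static_moe` are illustrative assumptions for this sketch, not necessarily the article's exact scheme.

```python
# Minimal sketch: slicing a dense SwiGLU MLP (Llama-style) into static
# experts with pure tensor operations, no training. The partition scheme
# and function names here are assumptions for illustration.
import torch
import torch.nn.functional as F

def dense_mlp(x, w_gate, w_up, w_down):
    # Standard SwiGLU feed-forward: down( silu(gate(x)) * up(x) )
    return (F.silu(x @ w_gate.T) * (x @ w_up.T)) @ w_down.T

def split_dense_to_static_moe(w_gate, w_up, w_down, num_experts):
    # Partition the intermediate (hidden) dimension into contiguous shards.
    # Each expert owns a slice of the neurons: rows of gate/up projections
    # and the matching columns of the down projection.
    return list(zip(w_gate.chunk(num_experts, dim=0),
                    w_up.chunk(num_experts, dim=0),
                    w_down.chunk(num_experts, dim=1)))

def static_moe_forward(x, experts):
    # With every expert active, summing expert outputs is a tensor
    # identity with the original dense layer (hence "zero-shot").
    return sum(dense_mlp(x, g, u, d) for (g, u, d) in experts)

d_model, d_ff, num_experts = 64, 256, 4
w_gate = torch.randn(d_ff, d_model)
w_up = torch.randn(d_ff, d_model)
w_down = torch.randn(d_model, d_ff)
x = torch.randn(2, d_model)

experts = split_dense_to_static_moe(w_gate, w_up, w_down, num_experts)
assert torch.allclose(dense_mlp(x, w_gate, w_up, w_down),
                      static_moe_forward(x, experts), atol=1e-5)
```

The exactness of the reconstruction falls out of block-matrix algebra: the down projection applied to the full intermediate activation equals the sum of its column blocks applied to the corresponding activation shards. Any sparsification (activating fewer than all experts per token) then trades that exactness for compute.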