Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models

This research investigates how to accelerate masked diffusion language models by identifying which denoising steps are robust enough to be handled by smaller models, a...

Level: advanced

By Ivan Sedykh

Category: research