Re-Mask and Redirect: Exploiting Denoising Irreversibility in Diffusion Language Models

This research introduces TrajHijack, a novel trajectory-level attack that exploits denoising irreversibility to compromise safety alignment in diffusion lang...

Level: advanced

By Arth Singh

Category: research