Thinking Diffusion: Penalize and Guide Visual-Grounded Reasoning in Diffusion Multimodal Language Models
This research addresses critical reasoning flaws in diffusion multimodal language models by introducing Position and Step Penalty and Visual Reasoning Guidance to enh...