Description
This is a great GitHub project, and I'm very grateful for your meticulous work. I'd like to recommend a paper from our ICLR 2026 DLLM project.
"Context Tokens are Anchors: Understanding the Repeat Curse in dMLLMs from an Information Flow Perspective"
Paper: https://arxiv.org/pdf/2601.20520
Code: https://github.com/ErikZ719/CoTA
Main Contribution: Analyzes, from an information-flow perspective, the repeated-token problem that caching-based acceleration introduces in diffusion MLLMs, and proposes a training-free mitigation method.
Current diffusion MLLMs face inference-latency challenges and rely on caching techniques to accelerate decoding. However, we found that current caching methods often introduce unnecessary repeated token generation, which we call the "Repeat Curse." We analyzed its causes from an information-flow perspective and arrived at three key findings: (1) Context tokens act as anchors that aggregate semantic information and guide the final prediction; (2) As information propagates across layers, the entropy of context tokens tends to converge in deeper layers, reflecting the model's steadily increasing confidence in its predictions; (3) Repeated tokens typically arise when the information-flow pattern of context tokens is disrupted and their entropy fails to converge in deeper layers. Based on these findings, we propose CoTA, a plug-and-play, training-free method for mitigating repeated tokens.
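To make finding (2)/(3) concrete, here is a minimal sketch of the kind of diagnostic the entropy-convergence observation suggests: compute the Shannon entropy of a context token's predictive distribution at each layer, then check whether it settles in the deeper layers. All function names, the convergence heuristic, and its thresholds are illustrative assumptions, not the actual CoTA implementation (see the linked repo for that).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def layer_entropies(layer_logits):
    """Shannon entropy (in nats) of a token's predictive distribution per layer.

    layer_logits: array of shape (num_layers, vocab_size), e.g. obtained by
    applying the unembedding head to each layer's hidden state (logit lens).
    """
    probs = softmax(layer_logits)
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

def entropy_converges(entropies, window=4, tol=0.05):
    # Hypothetical heuristic: call the entropy "converged" if its spread over
    # the last `window` layers stays below `tol`. Per finding (3), a token
    # whose entropy fails this check in deep layers is a repeat-risk signal.
    tail = entropies[-window:]
    return float(tail.max() - tail.min()) < tol
```

A distribution that sharpens layer by layer (entropy decreasing, then flattening out) passes the check, while one whose entropy keeps oscillating in deep layers does not — mirroring the disrupted-information-flow pattern the paper associates with repeated tokens.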