Difference between self-attention and cross-attention in diffusion model unet #8555
                  
                    
Ahmad-Omar-Ahsan started this conversation in General
            Replies: 1 comment 1 reply
-
Hi, I haven't worked with this exact implementation, but generally: if you only have a few discrete labels, self-attention is usually fine; the model will learn to condition on those. Cross-attention is really useful when your conditioning input has more structure (clinical features, text, etc.), since it lets the network attend to that input dynamically instead of treating the label as a simple embedding. Hope this helps :)
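To make the distinction concrete, here is a minimal sketch of the two conditioning routes, assuming MONAI's `DiffusionModelUNet` (parameter names such as `channels` vs. `num_channels` differ between MONAI core and MONAI Generative releases, and the channel sizes, label count, and `cross_attention_dim` below are purely illustrative):

```python
# Minimal sketch, assuming a recent MONAI with DiffusionModelUNet in core.
# Channel sizes, label count, and cross_attention_dim are illustrative values.
import torch
from monai.networks.nets import DiffusionModelUNet

# Route A: class-label conditioning. The label becomes a learned embedding added to
# the timestep embedding; the attention blocks remain self-attention.
unet_labels = DiffusionModelUNet(
    spatial_dims=2,
    in_channels=1,
    out_channels=1,
    channels=(64, 128, 256),
    attention_levels=(False, True, True),
    num_res_blocks=1,
    num_head_channels=32,
    num_class_embeds=4,              # number of discrete labels
)

# Route B: cross-attention conditioning. The attention blocks attend to an external
# context sequence whose last dimension must equal cross_attention_dim.
unet_context = DiffusionModelUNet(
    spatial_dims=2,
    in_channels=1,
    out_channels=1,
    channels=(64, 128, 256),
    attention_levels=(False, True, True),
    num_res_blocks=1,
    num_head_channels=32,
    with_conditioning=True,
    cross_attention_dim=64,          # size of each context vector
)

x = torch.randn(2, 1, 64, 64)        # noisy images
t = torch.randint(0, 1000, (2,))     # timesteps

# Route A forward pass: pass the integer labels directly.
out_a = unet_labels(x, timesteps=t, class_labels=torch.tensor([0, 3]))

# Route B forward pass: context has shape (batch, sequence_length, cross_attention_dim);
# one conditioning vector per image means sequence_length == 1.
context = torch.randn(2, 1, 64)
out_b = unet_context(x, timesteps=t, context=context)
```

In that setup, the "size" of the context embedding is whatever you set `cross_attention_dim` to; the context passed at each forward call then needs shape (batch, sequence_length, cross_attention_dim).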
                  
  
-
Hello everyone,
I am training a 2D conditional diffusion model on different labels. At the moment, I am only changing the number-of-classes parameter in the U-Net. I noticed that there is a context-embed argument, which goes along with the with_conditioning argument. Going through the code, it looks like if with_conditioning is set to True, the model uses cross-attention; otherwise, it uses self-attention.
Which would be better, cross-attention or self-attention? Secondly, if I decide to use cross-attention, what should the size of my context embedding be?
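On the second question: with cross-attention, the conditioning is supplied as a context tensor whose last dimension matches `cross_attention_dim`. A hypothetical sketch of producing such a tensor (the `nn.Embedding`/`nn.Linear` projections here are illustrative helpers, not part of the U-Net):

```python
# Hypothetical helpers: map conditioning inputs to the context tensor shape
# (batch, sequence_length, cross_attention_dim) expected by cross-attention.
import torch
import torch.nn as nn

cross_attention_dim = 64  # must match the value given to the U-Net constructor

# A few discrete labels -> one learned context token per sample.
label_embedding = nn.Embedding(num_embeddings=4, embedding_dim=cross_attention_dim)
labels = torch.tensor([0, 3])
context_from_labels = label_embedding(labels).unsqueeze(1)          # (2, 1, 64)

# Continuous conditioning (e.g. 10 clinical features) -> project to the same width.
feature_projection = nn.Linear(in_features=10, out_features=cross_attention_dim)
features = torch.randn(2, 10)
context_from_features = feature_projection(features).unsqueeze(1)   # (2, 1, 64)
```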