
+ learned attention, - fused loss #194

Merged

rufinv merged 33 commits into main from modules-pr-only on Jan 8, 2026

Conversation

@RolandBERTINJOHANNET (Collaborator) commented on Dec 8, 2025

Adding the changes from the attention paper, where they are relevant.

The general idea is that I:

  • removed the fused loss (it now counts as dcy),

  • kept logging all the losses individually (by encoded and decoded modalities),

  • separated the cycle loss from the broadcast loss function (the broadcast loss returns the elements required for cycling, so nothing is re-encoded),

  • added the paper's attention version, plus a helper that lets the user switch the attention to a trained one after training a RandomSelection gw,

  • added tests for stability.

We decided against including the branching between different selection mechanisms inside the broadcast loss, since it was only useful for the case where we were doing augmentation on MM-IMDB.
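To illustrate the idea behind the attention switch described above, here is a minimal stdlib-only sketch. It is not the repository's actual API; the function names `random_selection` and `learned_attention` are hypothetical. It contrasts fusing modality latents with uniform random weights (the RandomSelection regime used during training) against fusing them with softmax-normalized learned attention scores:

```python
import math
import random

def random_selection(latents):
    """Fuse modality latent vectors with random weights normalized to sum to 1
    (stand-in for the RandomSelection mechanism used while training the gw)."""
    w = [random.random() for _ in latents]
    total = sum(w)
    w = [x / total for x in w]
    dim = len(latents[0])
    return [sum(wi * vec[i] for wi, vec in zip(w, latents)) for i in range(dim)]

def learned_attention(latents, scores):
    """Fuse modality latent vectors with softmax-normalized attention scores
    (stand-in for the learned-attention mechanism switched in after training)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    w = [e / z for e in exps]
    dim = len(latents[0])
    return [sum(wi * vec[i] for wi, vec in zip(w, latents)) for i in range(dim)]
```

With equal scores, `learned_attention` reduces to a plain average of the modality latents, which is why swapping it in for a uniformly-trained RandomSelection model is a natural starting point before fine-tuning the scores.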

@rufinv rufinv merged commit 5cce857 into main Jan 8, 2026
3 checks passed
@RolandBERTINJOHANNET RolandBERTINJOHANNET deleted the modules-pr-only branch January 26, 2026 10:24
