Why would you consider continuing to use pre-training data like COYO and LAION while dropping LAION-COCO? I think it would be more reasonable to add more caption datasets, such as TextCaps, LAION-COCO, and COCO Captions. Besides, what happens if you do not use the pre-training data in the multi-task pre-training stage? Would the representation of the visual encoder be affected?
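For concreteness, here is a minimal sketch of what such a caption-heavy data mixture could look like as a weighted-sampling config. All dataset names and weights below are illustrative assumptions on my part, not the actual configuration used in this repo:

```python
# Hypothetical multi-task pre-training mixture: large noisy web-pair
# corpora kept at reduced weight, cleaner caption datasets upweighted.
# Weights are placeholders, not the authors' real setup.
import random

MIXTURE = {
    "coyo": 0.30,
    "laion": 0.30,
    "laion_coco": 0.20,
    "textcaps": 0.10,
    "coco_captions": 0.10,
}

def sample_dataset(rng: random.Random) -> str:
    """Draw the source dataset for the next training example by mixture weight."""
    names = list(MIXTURE)
    weights = list(MIXTURE.values())
    return rng.choices(names, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_dataset(rng) for _ in range(10)])
```

The question is essentially whether shifting weight from noisy web pairs toward curated caption sets like the above would help, or whether removing the pre-training data entirely from this stage would degrade the visual encoder's representations.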