In the network.py code, you do the following for the fidelity branch:
fidelity_score = self.fidelity_branch(src_video, edit_video)
However, as per the code in fidelity.py, this call would pass edit_video as prompts, which is not what is needed. Could you please confirm whether this is the correct code? If not, could you provide the corrected version?
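To illustrate the concern, here is a minimal sketch. It assumes a hypothetical forward signature in which `prompts` is the second positional parameter; the actual signature in fidelity.py may differ. `FidelityBranchSketch` and the `ref_video` parameter are made-up names used only for this illustration.

```python
class FidelityBranchSketch:
    """Hypothetical stand-in, NOT the real class from fidelity.py."""

    def forward(self, video, prompts=None, ref_video=None):
        # Report which parameter each argument was bound to.
        return {"video": video, "prompts": prompts, "ref_video": ref_video}

    # Make instances callable, as an nn.Module-style object would be.
    __call__ = forward


branch = FidelityBranchSketch()

# Positional call, as in network.py: the second argument lands in `prompts`.
positional = branch("src_video", "edit_video")

# Keyword call: the edited video goes to the intended (hypothetical) parameter.
keyword = branch("src_video", ref_video="edit_video")
```

If the real signature is similar, passing the second video by keyword (or reordering the parameters) would avoid it being treated as prompts.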
Also, in fidelity.py, when inference is True, you use:
for key in self.backbone_preserve_keys:...
Whereas when False, you use:
for key in vclips:...
If both loops iterate over the same keys (aesthetic and technical for the fidelity and traditional branches, text for the text branch), then why do the implementations differ? Could you please clarify?
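A small sketch of why I am asking, assuming `vclips` is a dict keyed by branch name and `backbone_preserve_keys` is a list of strings (both are assumptions; the helper names below are hypothetical). The two loops visit the same keys only when the two collections match, so they can diverge if the configuration differs.

```python
def keys_inference(backbone_preserve_keys):
    # Inference-style loop: iterate over the configured key list.
    return [key for key in backbone_preserve_keys]


def keys_training(vclips):
    # Training-style loop: iterate over whatever keys the input dict holds.
    return [key for key in vclips]


vclips = {"aesthetic": "clip_a", "technical": "clip_t"}

# Same keys in both places: the two loops agree.
same = keys_inference(["aesthetic", "technical"]) == keys_training(vclips)

# Configured list is a subset: the inference loop skips "aesthetic".
subset = keys_inference(["technical"])
```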