Hello, thanks for such great algorithms and code!
I noticed in your paper that you mentioned two control methods:
"UniAnimate supports human video animation using only a reference image and a target pose sequence, as well as the input of a first frame."
For long videos, you mentioned:
"For subsequent segments, we use the reference image along with the first frame of the previous segment to initiate the next generation."
Does this mean that during inference for subsequent windows, I should use the last frame of the previous result as the reference image, or should I just pass it in as local_image?
Additionally, I noticed in your code that long video processing still uses the overlap approach. What is the reasoning behind this choice?
Thank you