
Frames are not smooth #7

Open

sumansid opened this issue Nov 20, 2024 · 9 comments

@sumansid

Is there any way to smooth out the frames? When I run it, the change of frames is very visible and the cutoff from one expression to another is harsh.

@xuyangcao (Collaborator)

> Is there any way to smooth out the frames? When I run it, the change of frames is very visible and the cutoff from one expression to another is harsh.

Yes, in many cases, the continuity between frames is not smooth enough. Currently, we have added some smoothing loss during training and applied EMA smoothing to the inferred motion sequence during post-processing, but the results are not yet optimal. We are considering releasing the training code to facilitate community collaboration for further optimization.
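For readers wondering what "EMA smoothing applied to the inferred motion sequence" might look like in post-processing, here is a minimal illustrative sketch (not the repository's actual code): an exponential moving average over a `(T, D)` array of per-frame motion parameters, where `alpha` is an assumed smoothing weight.

```python
import numpy as np

def ema_smooth(motion, alpha=0.8):
    """Exponential moving average over a (T, D) motion sequence.

    Hypothetical post-processing sketch: `motion` holds per-frame motion
    parameters (T frames, D dimensions); `alpha` weights the previous
    smoothed frame, so a higher alpha gives smoother but laggier output.
    """
    motion = np.asarray(motion, dtype=float)
    out = np.empty_like(motion)
    out[0] = motion[0]
    for t in range(1, len(motion)):
        out[t] = alpha * out[t - 1] + (1 - alpha) * motion[t]
    return out
```

The trade-off mentioned in the comment shows up directly in `alpha`: pushing it toward 1 hides frame-to-frame cutoffs but can lag behind fast expression changes.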

@sumansid (Author)

Cool, thanks for the reply. Are the frames not smooth because your model generates too much motion in every frame before it runs through LivePortrait? Have you tried lip-sync only, or lip-sync and eyes only?

@xuyangcao (Collaborator)

> Cool, thanks for the reply. Are the frames not smooth because your model generates too much motion in every frame before it runs through LivePortrait? Have you tried lip-sync only, or lip-sync and eyes only?

Good question. We smoothed the head motions before running the LivePortrait model.

I think another possible cause of the residual inconsistency is that, during training and inference, we treated each dimension of the motion sequence as an independent unit rather than as part of an integrated sequence, even though the dimensions are likely correlated with one another.

We have tried lip-sync only, using head motions from real-life videos, and it works better than predicting all motions.

@wvinzh

wvinzh commented Nov 27, 2024

I have also done work similar to JoyVASA, and my results are likewise not smooth enough; sometimes there are even abrupt jumps in pose. I tried adding smoothing losses, but almost none of them improved this effectively. Further experiments revealed that the problem is not caused by pose changes but by the expression deltas: even when I used a predefined pose and only varied the expression deltas, the jumps still occurred. I suspect the motion data extracted by LivePortrait is not very good, and that certain keypoints in LP may be flawed, which needs further verification. Looking forward to the author solving this problem.

@sumansid (Author)

Have you tried smoothing the keypoints?

@wvinzh

wvinzh commented Nov 27, 2024

> Have you tried smoothing the keypoints?

Yes, during training I tried several levels of smoothing, including coefficient smoothing, keypoint smoothing, and pose smoothing, but the problem still persists. During inference, I directly reuse the smoothing from LivePortrait. [video] Here are my results.
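As a point of reference for the "keypoint smoothing" discussed above, a simple temporal smoothing of keypoint trajectories can be sketched as a moving average along the time axis. This is an illustration under assumed shapes (T frames, K keypoints, 3 coordinates), not the filter LivePortrait actually ships:

```python
import numpy as np

def smooth_keypoints(kp_seq, window=5):
    """Temporal moving average over a (T, K, 3) keypoint sequence.

    Illustrative sketch: each keypoint trajectory is convolved with a
    box window along time; edges are padded by repeating the first and
    last frames so the output keeps the input length.
    """
    assert window % 2 == 1, "use an odd window so output length matches"
    kp_seq = np.asarray(kp_seq, dtype=float)
    pad = window // 2
    padded = np.concatenate([kp_seq[:1].repeat(pad, axis=0),
                             kp_seq,
                             kp_seq[-1:].repeat(pad, axis=0)], axis=0)
    kernel = np.ones(window) / window
    flat = padded.reshape(padded.shape[0], -1)
    sm = np.stack([np.convolve(flat[:, d], kernel, mode='valid')
                   for d in range(flat.shape[1])], axis=1)
    return sm.reshape(kp_seq.shape)
```

A box filter like this attenuates high-frequency jitter but also dulls fast, legitimate motion, which may be why the commenters found simple smoothing insufficient for the expression-delta jumps.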

@sumansid (Author)

sumansid commented Nov 27, 2024

Wow, your results actually look much better than mine.

@sumansid (Author)

I think the main issue is sudden pose or keypoint changes; I wonder if there's a way to smooth them out without affecting lip-sync.
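One way to attempt "smooth pose without affecting lip-sync" is to split the motion vector and filter only the pose dimensions. The sketch below assumes, purely for illustration, that the first 6 dimensions of a `(T, D)` motion sequence hold head pose and the rest hold expression deltas (including the lips); the slice indices are hypothetical, not the actual JoyVASA/LivePortrait layout.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical layout: dims 0-5 = head pose, dims 6+ = expression deltas.
POSE_DIMS = slice(0, 6)

def smooth_pose_only(motion, window=9, order=2):
    """Smooth only the pose dimensions of a (T, D) motion sequence.

    Applies a Savitzky-Golay filter along time to the pose slice alone,
    removing pose jitter while leaving the lip-sync (expression)
    dimensions byte-for-byte untouched.
    """
    out = np.array(motion, dtype=float, copy=True)
    out[:, POSE_DIMS] = savgol_filter(out[:, POSE_DIMS], window, order, axis=0)
    return out
```

A Savitzky-Golay filter preserves low-order trends (it reproduces any polynomial up to `order` exactly), so deliberate head turns survive while frame-to-frame jitter is suppressed; the untouched expression slice keeps lip-sync timing intact.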

@xuyangcao (Collaborator)

> I have also done work similar to JoyVASA, and my results are likewise not smooth enough; sometimes there are even abrupt jumps in pose. I tried adding smoothing losses, but almost none of them improved this effectively. Further experiments revealed that the problem is not caused by pose changes but by the expression deltas: even when I used a predefined pose and only varied the expression deltas, the jumps still occurred. I suspect the motion data extracted by LivePortrait is not very good, and that certain keypoints in LP may be flawed, which needs further verification.

Exactly, I also think the motion data is not good enough. See these issues:

1. KwaiVGI/LivePortrait#439
2. KwaiVGI/LivePortrait#433
