-
I want to train a model, but it is not in our immediate plans. However, if you'd like to do it, I'd be more than happy to help. Previously we trained some models with @Edresson and got good results, but VCTK was easier to train and gave better results, so we did not release a LibriTTS model.
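For anyone who wants to pick this up: below is a minimal sketch of adapting the existing VCTK VITS recipe to LibriTTS in Coqui TTS. The dataset path is a placeholder, and some of the API details (e.g. `formatter=` vs. the older `name=` argument of `BaseDatasetConfig`, and how `SpeakerManager` is populated) vary across TTS releases, so treat it as a starting point rather than a tested recipe.

```python
# Sketch: multi-speaker VITS on LibriTTS, adapted from the VCTK VITS recipe.
# Paths and hyperparameters are placeholders; check them against your TTS version.
import os

from trainer import Trainer, TrainerArgs

from TTS.tts.configs.shared_configs import BaseDatasetConfig
from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.datasets import load_tts_samples
from TTS.tts.models.vits import Vits, VitsArgs
from TTS.tts.utils.speakers import SpeakerManager
from TTS.tts.utils.text.tokenizer import TTSTokenizer
from TTS.utils.audio import AudioProcessor

output_path = os.path.dirname(os.path.abspath(__file__))

# "libri_tts" is a built-in formatter; the path below is a placeholder.
# Older TTS releases use name="libri_tts" instead of formatter=.
dataset_config = BaseDatasetConfig(
    formatter="libri_tts", meta_file_train="", path="/data/LibriTTS/train-clean-100"
)

config = VitsConfig(
    model_args=VitsArgs(use_speaker_embedding=True),  # multi-speaker VITS
    run_name="vits_libritts",
    batch_size=32,
    eval_batch_size=16,
    epochs=1000,
    text_cleaner="english_cleaners",
    use_phonemes=True,
    phoneme_language="en-us",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    output_path=output_path,
    datasets=[dataset_config],
)

ap = AudioProcessor.init_from_config(config)
tokenizer, config = TTSTokenizer.init_from_config(config)

train_samples, eval_samples = load_tts_samples(dataset_config, eval_split=True)

# Collect speaker IDs so the model learns one embedding per LibriTTS speaker.
speaker_manager = SpeakerManager()
speaker_manager.set_ids_from_data(train_samples + eval_samples, parse_key="speaker_name")
config.model_args.num_speakers = speaker_manager.num_speakers

model = Vits(config, ap, tokenizer, speaker_manager=speaker_manager)

trainer = Trainer(
    TrainerArgs(), config, output_path,
    model=model, train_samples=train_samples, eval_samples=eval_samples,
)
trainer.fit()
```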
-
Hi, I guess FastSpeech2 and HiFi-GAN are sensitive to data quality and scale. When I fine-tuned the CSMSC single-speaker model (FS2 + HiFi-GAN) with 3 hours of data from a single speaker, the synthesized speech sounded metallic and buzzy...
-
Any estimate on how many hours it would take to train VITS on VCTK and LibriTTS using an NVIDIA A100 80GB?
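No benchmark to offer, but you can bound it yourself: wall-clock time is roughly the total optimizer steps divided by measured steps per second. Both numbers below are placeholders, not A100 measurements; time a few hundred steps of your own run and read the step budget from your config to fill them in.

```python
# Back-of-envelope wall-clock estimate for a training run.
# Both inputs are assumptions to replace with your own measurements.
total_steps = 1_000_000   # assumed step budget (check your recipe/config)
steps_per_sec = 2.0       # assumed throughput; measure on your own A100

hours = total_steps / steps_per_sec / 3600
print(f"~{hours:.0f} GPU-hours under these assumptions")  # ~139
```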
-
Hi,
I came from the TensorFlowTTS project and tried their LibriTTS recipe for multi-speaker model training. However, the results I got from their FastSpeech2 model were not convincing; the speech sounds metallic and buzzy. I wonder whether it is a dataset problem (e.g., LibriTTS being hard to train on) or a multi-speaker problem.
Is there any plan in this project to provide a LibriTTS model?