Synthesizing Audio with Unseen Speakers Using Pre-trained VITS Model #3741
adil-ahmed asked this question in General Q&A (unanswered)
I've been using a pre-trained VITS model (VCTK dataset) for text-to-speech synthesis. I've successfully obtained a list of available speakers using the command:
!tts --model_name tts_models/en/vctk/vits --list_speaker_idxs
Additionally, I've synthesized audio from one of the speakers (p234) using the following code:
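The snippet looked roughly like the sketch below, using the Coqui TTS Python API (the text and output path here are placeholders, not the exact values I used):

```python
from TTS.api import TTS

# Load the same pre-trained multi-speaker VITS model as in the CLI command above.
tts = TTS(model_name="tts_models/en/vctk/vits")

# "p234" is one of the speaker IDs returned by --list_speaker_idxs.
tts.tts_to_file(
    text="Hello, this is a test sentence.",  # placeholder text
    speaker="p234",
    file_path="p234_output.wav",  # placeholder output path
)
```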
Now I'm facing a challenge: I need to synthesize audio from the same pre-trained model, but in the voice of a speaker who was not present in the training dataset. I understand that I need to provide a reference audio clip for this purpose (zero-shot voice cloning).
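For context, the call pattern I'm imagining is sketched below. It is borrowed from voice-cloning models such as YourTTS, which accept a `speaker_wav` argument; I'm not sure whether the plain VCTK VITS checkpoint supports this at all, so the model name and file paths here are assumptions:

```python
from TTS.api import TTS

# Sketch only: zero-shot cloning via a reference clip, as exposed by
# voice-cloning models such as YourTTS. Not verified for the plain
# VCTK VITS checkpoint, which may lack a speaker encoder.
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts")

tts.tts_to_file(
    text="Hello from an unseen speaker.",
    speaker_wav="reference_speaker.wav",  # hypothetical path to the reference audio
    language="en",
    file_path="cloned_output.wav",  # hypothetical output path
)
```

Is this the right direction, or is there a way to do it with the VCTK VITS model directly?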
Can someone guide me on how to achieve this? Any suggestions or code examples would be highly appreciated.
Thank you!