Hello @hcy71o,

I liked your work on Transfer TTS and SC VITS. I trained a model for 350,000 steps using only the LibriTTS train-clean-100 dataset, but when I synthesize speech using a random reference audio file, the output is not clear.

So, my questions are:

1. How many steps did you train your model for?
2. What should the length (duration) of the audio files be when passing them to inference.py?
3. Should the reference audio come from a speaker in the training data, or can it be from an unseen speaker?
4. Do you have a demo page where we can compare Transfer TTS-generated audio with VITS output?

Thanks
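One possible cause of unclear synthesis is a mismatch between the reference clip and the training configuration (e.g. a wrong sample rate or an extremely short or long clip). Below is a minimal stdlib sketch for inspecting a reference WAV before passing it to inference.py; the 22050 Hz target is an assumption, so check the sampling rate in the repo's config file.

```python
import math
import struct
import wave

# Assumed target sample rate -- VITS-style models are commonly trained at
# 22050 or 24000 Hz; verify against the repo's config before relying on this.
TARGET_SR = 22050

def wav_info(path):
    """Return (sample_rate, duration_seconds) of a WAV file."""
    with wave.open(path, "rb") as w:
        sr = w.getframerate()
        duration = w.getnframes() / sr
    return sr, duration

def write_sine(path, seconds=3.0, sr=TARGET_SR, freq=440.0):
    """Write a mono 16-bit sine tone, used here only as a stand-in reference clip."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(sr)
        n = int(seconds * sr)
        frames = b"".join(
            struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / sr)))
            for i in range(n)
        )
        w.writeframes(frames)

if __name__ == "__main__":
    write_sine("ref.wav")
    sr, dur = wav_info("ref.wav")
    if sr != TARGET_SR:
        print(f"warning: {sr} Hz reference, expected {TARGET_SR} Hz -- resample first")
    print(f"sample rate: {sr} Hz, duration: {dur:.2f} s")
```

If the rates differ, resampling the clip (e.g. with librosa or sox) before inference is usually safer than letting the model consume mismatched audio.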