Hi all,
I'm generating zero-shot synthetic speech, but the output is not even close to the reference speaker's voice (sometimes it is even a different gender).
I pass the name of the reference speaker file to the tts.synthesise function. The generated audio differs from one reference speaker to another, but it is never similar to the target.
Any idea what could be wrong?
Just for reference: I use more than 1 min of audio from the Multilingual LibriSpeech dataset (English part).
I found the problem: the path to the reference speaker audio must be absolute. In my case the code could not find the reference file and, without any warning, fell back to a random speaker voice.
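In case it helps anyone else, here is a minimal sketch of how I now guard against this. It only uses the Python standard library: it converts the path to an absolute one and fails loudly if the file is missing, so a bad path cannot silently fall back to a random voice. The helper name `resolve_reference_audio` is my own and not part of the library, and the exact arguments of tts.synthesise are not shown because they depend on your version.

```python
from pathlib import Path


def resolve_reference_audio(path_str: str) -> str:
    """Return an absolute path to the reference audio, or raise if it is missing.

    Hypothetical helper (not part of the TTS library): call it before
    passing the reference speaker file to the synthesis function.
    """
    # Expand "~" and turn a relative path into an absolute one,
    # relative to the current working directory.
    path = Path(path_str).expanduser().resolve()

    # Fail loudly instead of letting the synthesis code fall back silently.
    if not path.is_file():
        raise FileNotFoundError(f"Reference speaker audio not found: {path}")

    return str(path)
```

Pass the returned string wherever you previously passed the relative file name. If the file is not where you expect, you get a FileNotFoundError right away instead of speech in an unrelated voice.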