-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Document how to build speech synthesis system for new languages #18
Comments
Hi thank you for this work. |
Yes. I think you will need to annotate the accent information (rising/failing tones) as the HTS-style label. |
Hi, 山本さん。I'm trying to synthesize mandarin using your tool. To my knowledge, i need to do forced alignment manually in previous. And then writing a frontend that adapts the language to extract linguistic features. So does that mean i only need to replace the frontend part? And could i using other forced alignment tools such as "montreal" at other alignment level which is neither 'state' nor 'phone', for example, 'syllable'? |
こんにちは、 @attitudechunfeng !
Yes, you can reuse other parts. You can also reuse a part of frontend (https://r9y9.github.io/nnmnkwii/latest/references/frontend.html#frontend) to convert your linguistic features to its numeric representation at either phone, state or frame-level if you use the HTS-style label format.
You could, but then you cannot reuse https://r9y9.github.io/nnmnkwii/latest/references/frontend.html#frontend, since it assumes state or phone-level alignment. |
本当にありがとう!I'll try it. |
Alternatively, you could consider end-to-end approach, which doesn't require alignment as well as linguistic feature extraction (the hard part of the TTS!). See https://github.com/r9y9/deepvoice3_pytorch if you are interested. |
Thank u. In fact, i'm also following your other excellent tts projects. However, i'm now trying do some work about offline usage, end-to-end models are not convenient to be transferred to mobile devices and its speed on cpu is also can't be guaranteed. So i have to use traditional method. |
I see. I hope you find something useful. Let me know if you find something should be improved. |
okay, if there're something interesting, i'll report it. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I have wav files of punjabi language. Please guide me to generate full context labels and HTS-style question file |
All you need is that
With those all prepared, it should be very straightforward to implement.
The text was updated successfully, but these errors were encountered: