This repository demonstrates cascaded speech-to-speech translation (STST): mapping source speech in any language to target speech in German. The demo uses the following models:
- Whisper Base: OpenAI's model for speech translation
- My German TTS: My text-to-speech model for generating German speech
The cascaded STST process involves two steps:
- Speech Translation (Source Language to German Text): The Whisper Base model translates source speech in any language into German text.
- Text-to-Speech (German Text to Target Speech): The German text generated by Whisper Base is then passed to the My German TTS model to produce the final target speech in German.
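
The two steps above can be sketched as a simple function composition. This is a minimal illustration, not the code behind the demo: the helper names `make_whisper_translator` and `speech_to_speech` are hypothetical, the checkpoint is assumed to be `openai/whisper-base` from the Hugging Face Hub, and the TTS step is left as a caller-supplied function because the German TTS model's identifier is not given here. Whisper's built-in `translate` task targets English, so the sketch assumes the common workaround of requesting the `transcribe` task with the language set to German.

```python
from typing import Any, Callable, Tuple


def make_whisper_translator(model_id: str = "openai/whisper-base") -> Callable[[Any], str]:
    """Build a speech -> German text function (assumed setup, not the demo's code)."""
    # Imported lazily so the rest of the sketch works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)

    def translate(audio: Any) -> str:
        # Assumption: forcing the German language token with the "transcribe"
        # task makes Whisper emit German text for non-German input speech.
        result = asr(audio, generate_kwargs={"task": "transcribe", "language": "de"})
        return result["text"]

    return translate


def speech_to_speech(
    audio: Any,
    translate: Callable[[Any], str],
    synthesize: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Cascade the two steps: speech -> German text -> German speech."""
    german_text = translate(audio)          # step 1: speech translation
    german_speech = synthesize(german_text)  # step 2: text-to-speech
    return german_text, german_speech
```

To try it, pass `make_whisper_translator()` as `translate` and wrap your own TTS model in a function that takes a string and returns audio. Keeping the two stages as separate callables means either model can be swapped without touching the other.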
You can try the demo directly in my Hugging Face Space: https://huggingface.co/spaces/Salama1429/speech-to-speech-translation