Live Demo on Hugging Face Spaces
Note: Due to limitations on cloud server deployments, the microphone functionality is not available on the hosted version. The app can only transcribe and translate uploaded audio files. For full functionality, please follow the local setup instructions below.
Python, Streamlit, Google Cloud Speech-to-Text, Google Cloud Text-to-Speech
OpenAI GPT, Hugging Face Spaces, Git
The Real-Time Healthcare Translation App helps healthcare professionals communicate in multilingual environments. It transcribes speech, translates the transcript into a target language, and converts the translated text back into speech in real time.
- Speech-to-Text: Transcribes spoken language into text using Google Cloud Speech-to-Text.
- Text Translation: Translates the transcribed text into a target language using OpenAI GPT.
- Text-to-Speech: Converts the translated text back into speech using Google Cloud Text-to-Speech.
- Audio File Upload: Upload MP3 or WAV files to be transcribed and translated.
- Language Selection: Supports multiple source and target languages for both transcription and translation.
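The features above form a three-stage pipeline: transcribe, translate, synthesize. The following is a minimal sketch of how those stages could be chained, not the app's actual code. The names `run_pipeline` and `TranslationResult` are hypothetical, and the three stage functions are passed in as plain callables so that the Google Cloud and OpenAI clients can be swapped or mocked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    transcript: str   # text recognized from the source audio
    translation: str  # transcript rendered in the target language
    audio: bytes      # synthesized speech for the translation


def run_pipeline(
    audio: bytes,
    source_lang: str,
    target_lang: str,
    transcribe: Callable[[bytes, str], str],
    translate: Callable[[str, str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> TranslationResult:
    """Chain the three stages; each stage is injected by the caller."""
    transcript = transcribe(audio, source_lang)
    if not transcript.strip():
        # Nothing was recognized, so skip the translation and synthesis calls.
        return TranslationResult("", "", b"")
    translation = translate(transcript, source_lang, target_lang)
    speech = synthesize(translation, target_lang)
    return TranslationResult(transcript, translation, speech)
```

In a real deployment, `transcribe` and `synthesize` would presumably wrap the Google Cloud clients and `translate` would wrap an OpenAI call. Keeping them as parameters lets the flow be tested without network access or API keys.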
This app is designed to help healthcare workers communicate more effectively across different languages, making it an invaluable tool in diverse and fast-paced medical environments.
You can try the Healthcare Translation App hosted on Hugging Face Spaces:
Healthcare Translation App - Live Demo
Important Note: Due to the limitations of cloud server environments, the hosted version does not support real-time microphone input and can only process uploaded audio files. To explore the full functionality, follow the instructions below to set up the app locally.
To run the Healthcare Translation App locally and access all features (including microphone support), follow these steps:
- Python 3.7+: Ensure Python is installed on your local machine.
- Google Cloud Account: You will need a Google Cloud account with access to the Speech-to-Text and Text-to-Speech APIs. Download your credentials (in JSON format).
- OpenAI API Key: You'll need an API key from OpenAI to handle text translation.
- Streamlit: Used to create the app’s web interface.
- Clone the Repository:
Open your terminal and run the following command to clone the repository:
git clone https://github.com/your-username/Healthcare-Translation-App.git
cd Healthcare-Translation-App
- Create a Virtual Environment:
It's best to create a virtual environment to manage dependencies. Run:
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install Dependencies:
Install the required Python libraries by running:
pip install -r requirements.txt
- Set Up Environment Variables:
In the project directory, create a .env file to store your API keys and credentials. Add the following variables to the .env file:
- OPENAI_API_KEY: Your OpenAI API key.
- GCP_KEY: The path to your Google Cloud service account credentials (JSON file).
Example .env file:
OPENAI_API_KEY=your_openai_api_key_here
GCP_KEY='your_google_cloud_service_account_json_here'
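As an illustration of how these variables might be checked at startup, here is a small sketch that reads them from the process environment and fails early if either is missing. The function name `load_settings` is hypothetical. The sketch assumes the .env file has already been loaded into the environment (for example with python-dotenv's `load_dotenv()`, which is an assumption about the project, not something this README confirms).

```python
import os
from typing import Mapping

# Variable names taken from the example .env file above.
REQUIRED_VARS = ("OPENAI_API_KEY", "GCP_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise naming every missing variable."""
    # Strip whitespace and any surrounding quotes left over from the .env file.
    values = {name: env.get(name, "").strip().strip("'\"") for name in REQUIRED_VARS}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values
```

Failing at startup with the names of the missing variables is usually easier to debug than an authentication error raised later by one of the API clients.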
- Run the Application: Launch the app using Streamlit by running the following command:
streamlit run app.py
- Access the App: After running the above command, open your browser and visit http://localhost:8501 to access the app.
- Select the source language: This is the language you want to transcribe from. You can choose from a variety of languages.
- Select the target language: Choose the language you want to translate the transcribed text into. The app supports a wide range of languages for translation.
- Click "Start Real-Time Recording" to begin transcribing audio from your microphone. The app listens to the speech and transcribes it into text as you talk.
- Once the speech is transcribed, the app translates the text into your selected target language.
- The translated text is then converted back into speech using the Google Cloud Text-to-Speech API, and you hear the translated audio in real time.
- Click "Stop Real-Time Recording" to end the recording session.
- Click "Upload an audio file" to upload an MP3 or WAV file for transcription and translation.
- Once the file is uploaded, the app will transcribe the audio, translate the transcribed text into the target language, and convert the translated text back into speech.
- Click "Play Audio" to listen to the translated speech.
- Click "Clear" to reset both the source and target language transcripts. This will erase any previously transcribed or translated text.
- After the translation is complete, you can click the "Play Audio" button to hear the translated text spoken aloud.
- The app uses Google Cloud Text-to-Speech to convert the translated text into a natural-sounding voice, allowing you to hear the translation as it would be spoken by a native speaker.
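For the audio file upload flow described above, the transcription request generally needs to know how the audio is encoded. The sketch below picks an encoding name from the uploaded file's extension. The helper `encoding_for_upload` and the mapping are hypothetical, the names are assumed to match the Speech-to-Text audio encoding enum of the client version in use, and WAV files are assumed to contain 16-bit linear PCM.

```python
from pathlib import Path

# Assumed mapping from upload extension to a Speech-to-Text encoding name.
# WAV uploads are assumed to hold 16-bit linear PCM samples.
ENCODING_BY_EXTENSION = {".wav": "LINEAR16", ".mp3": "MP3"}


def encoding_for_upload(filename: str) -> str:
    """Return the encoding name for an uploaded MP3 or WAV file."""
    ext = Path(filename).suffix.lower()
    try:
        return ENCODING_BY_EXTENSION[ext]
    except KeyError:
        # Reject anything other than the two formats the app accepts.
        raise ValueError(f"Unsupported audio format: {ext or filename!r}") from None
```

For example, `encoding_for_upload("visit.WAV")` returns `"LINEAR16"`, while a file such as `notes.ogg` raises `ValueError`.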
- Google Speech-to-Text: For transcribing spoken language into text.
- OpenAI API: For translating text from the source language to the target language.
- Google Text-to-Speech: For converting the translated text back into audio.
- Streamlit: The app's front-end framework for building interactive web applications.
- Hugging Face Spaces: For hosting the application.
- Streamlit Cloud: For deployment of the app.
- Git: For version control.
- GitHub: For code hosting and collaboration.
- Multi-threading: The application uses threading to handle real-time audio transcription and translation simultaneously.
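To illustrate the multi-threading point, here is one way transcription and translation could run concurrently: two worker threads connected by queues, so a new audio chunk can be transcribed while the previous one is being translated. This is a sketch under assumptions, not the app's implementation. The names `run_stream`, `_worker`, and `_STOP` are hypothetical, and error handling is omitted (an exception in a stage would stall the pipeline).

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker that no more items will arrive


def _worker(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_stream(chunks, transcribe, translate):
    """Run both stages in their own threads and collect results in order."""
    audio_q, text_q, out_q = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=_worker, args=(transcribe, audio_q, text_q), daemon=True),
        threading.Thread(target=_worker, args=(translate, text_q, out_q), daemon=True),
    ]
    for w in workers:
        w.start()
    for chunk in chunks:
        audio_q.put(chunk)
    audio_q.put(_STOP)

    results = []
    while True:
        item = out_q.get()
        if item is _STOP:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results
```

Because each stage has a single worker and the queues are first-in, first-out, translations come out in the same order the audio chunks went in.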
The app has been deployed on Hugging Face Spaces.
Note: Due to cloud server limitations, microphone access is not available, and thus the app can only process uploaded audio files on Hugging Face Spaces. For full functionality, including real-time speech-to-text and translation, we recommend setting up the app locally.
To run the app locally and access all its features, follow these steps:
- Python 3.x
- A virtual environment (recommended)
- Dependencies listed in the requirements.txt file
- Clone the Repository:
git clone https://github.com/your-repository-url.git
cd your-repository-folder
- Set Up a Virtual Environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install Dependencies:
pip install -r requirements.txt
- Set Up Environment Variables:
  - You will need to set up environment variables for your Google Cloud and OpenAI API keys.
  - Create a .env file in the project directory and add the following variables:
OPENAI_API_KEY_MEDICAL_TRANSLATOR=your_openai_api_key
GCP_KEY_MEDICAL_TRANSLATOR=your_google_cloud_credentials_json_string
You can obtain the GCP_KEY_MEDICAL_TRANSLATOR from the Google Cloud Console.
- Run the App:
streamlit run app.py
- Open your browser and visit the URL shown in the terminal to interact with the app locally.
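Since GCP_KEY_MEDICAL_TRANSLATOR holds the credentials as a JSON string rather than a file path, the app presumably has to parse that string before creating the Google Cloud clients. The sketch below shows one way to parse and sanity-check it. The function name `parse_service_account` and the choice of which fields to check are assumptions made for illustration.

```python
import json

# Fields that a Google Cloud service account key file contains; checking a
# few of them catches truncated or mis-pasted values early.
REQUIRED_FIELDS = ("type", "private_key", "client_email")


def parse_service_account(raw: str) -> dict:
    """Parse the credentials JSON string and check that key fields exist."""
    try:
        info = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("GCP_KEY_MEDICAL_TRANSLATOR is not valid JSON") from exc
    if not isinstance(info, dict):
        raise ValueError("GCP_KEY_MEDICAL_TRANSLATOR must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in info]
    if missing:
        raise ValueError("Credentials are missing fields: " + ", ".join(missing))
    # The returned dict could then be passed to
    # google.oauth2.service_account.Credentials.from_service_account_info(info).
    return info
```

A typical call would be `parse_service_account(os.environ["GCP_KEY_MEDICAL_TRANSLATOR"])` after the .env file has been loaded.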
- Thanks to Google Cloud for their Speech-to-Text and Text-to-Speech APIs.
- Thanks to OpenAI for providing the GPT model for translation.
- Thanks to Streamlit for making web app development so easy and interactive.
Enjoy using the Healthcare Translation App and feel free to contribute!