In this project, we trained and fine-tuned a text classification model that predicts the medical specialty from transcription text entered by the user. The application is deployed on Streamlit.
You can find the Video Demonstration and the Presentation here
- Clone the repository and go into the directory.
git clone https://github.com/Fennec2000GH/StarHack-medical-classification.git
- Install the required Python dependencies.
pip install -r requirements.txt
- Download the required classifier models as Pickle (.pkl) files. The links to each are given below.
- Make sure the three (3) Pickle files are directly in the cloned repository root (StarHack-medical-classification/).
- Run the app.
streamlit run app.py
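
Before launching the app, it can help to confirm that the model files are where the setup steps say they should be. The sketch below is a hypothetical helper, not part of the repository: `find_model_files` is an invented name, and it assumes only that the models use the `.pkl` extension and sit directly in the given directory. The actual file names are not assumed.

```python
from pathlib import Path


def find_model_files(repo_root, expected=3):
    """Return the sorted .pkl files located directly under repo_root.

    Raises FileNotFoundError when fewer than `expected` files are found.
    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(repo_root)
    # Non-recursive glob: files inside subdirectories do not count.
    found = sorted(p for p in root.glob("*.pkl") if p.is_file())
    if len(found) < expected:
        raise FileNotFoundError(
            f"expected {expected} .pkl files in {root}, found {len(found)}"
        )
    return found
```

Calling `find_model_files(".")` from the repository root would list the model files, or raise an error if any of the three are missing or were placed in a subfolder.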
- Downloaded the Medical Transcriptions Dataset from Kaggle
- Split the data into random train and test subsets
- Preprocessed the text in the transcription column with tokenization
- Trained SVM, KNN, and Random Forest models from scikit-learn
- Created the application using the Streamlit framework and deployed it
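
The workflow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual training code: the toy sentences and labels are invented stand-ins for the Kaggle dataset, `TfidfVectorizer` is an assumed choice for the tokenization step, `LinearSVC` is an assumed SVM variant, and all hyperparameters are illustrative.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for the transcription column and specialty labels (invented).
texts = [
    "patient reports chest pain and shortness of breath",
    "echocardiogram shows reduced ejection fraction",
    "ecg reveals atrial fibrillation with rapid rate",
    "cardiac catheterization showed coronary artery stenosis",
    "mri of the knee shows torn meniscus",
    "patient underwent total hip replacement",
    "fracture of the distal radius treated with cast",
    "arthroscopy of the shoulder for rotator cuff tear",
]
labels = ["Cardiology"] * 4 + ["Orthopedic"] * 4

# Random train/test split, stratified so both classes appear in each subset.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# The three model families named in the list above.
candidates = {
    "svm": LinearSVC(),
    "knn": KNeighborsClassifier(n_neighbors=3),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

models = {}
for name, clf in candidates.items():
    # The vectorizer tokenizes the text and builds TF-IDF features (assumed).
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(X_train, y_train)
    models[name] = model
    print(name, list(model.predict(X_test)))

# Serialize each fitted pipeline with pickle and load it back,
# mirroring how an app could consume .pkl model files.
blobs = {name: pickle.dumps(m) for name, m in models.items()}
restored = {name: pickle.loads(b) for name, b in blobs.items()}
```

Bundling the vectorizer and classifier in one pipeline means a single pickled object can take raw text at prediction time, which keeps the app-side code small.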
- Check out the Devpost project page for Dr.Jarvis
- Link to Trained Models (RandomForest, Support Vector Machine, K Nearest Neighbour)