This project is developed for CS889: Advanced Topics in HCI, Interfaces for Human-AI Interaction at the University of Waterloo in Winter 2025.
Project Contributors:
- Mohammad Abolnejadian
- Shakiba Amirshahi
This Streamlit-based prototype helps doctors make more informed decisions by generating on-the-fly insights grounded in historical data during doctor-patient conversations. The system detects key information in the conversation, including:
- The patient's medical problems
- Additional contextual information
- Solutions proposed by the doctor
The sample data is collected from Open Government Canada. The system stores embeddings in a ChromaDB database. A sample database is provided, and it can be augmented by running the preprocessing script.
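At its core, retrieval from an embedding database is a nearest-neighbor search by cosine similarity. The sketch below illustrates that lookup with the standard library alone, using toy three-dimensional vectors in place of real model embeddings; ChromaDB performs this search internally, and the function and document names here are illustrative, not part of this codebase.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_documents(query_embedding, store, top_k=2):
    """Rank stored (doc_id, embedding) pairs by similarity to the query."""
    ranked = sorted(store.items(),
                    key=lambda item: cosine_similarity(query_embedding, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy 3-dimensional "embeddings" standing in for real model output.
store = {
    "asthma_note": [0.9, 0.1, 0.0],
    "fracture_note": [0.0, 0.2, 0.9],
    "allergy_note": [0.8, 0.3, 0.1],
}
print(nearest_documents([1.0, 0.0, 0.0], store))
# → ['asthma_note', 'allergy_note']
```

In the actual prototype, the query vector would come from the Azure embedding model and the store from the ChromaDB database described above.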
The backend uses a pipeline built with LangChain that integrates several Azure AI services:
- Speech-to-Text for transcribing conversations
- Embedding models for semantic understanding
- Chat completion models for generating insights
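Conceptually, those three services compose into one transcript-to-insight function. The sketch below shows that composition with each Azure call replaced by a placeholder; all names are illustrative rather than taken from the actual agent code.

```python
def transcribe(audio_chunk: bytes) -> str:
    # Placeholder for the Azure Speech-to-Text call.
    return audio_chunk.decode("utf-8")

def retrieve_context(transcript: str) -> list[str]:
    # Placeholder for the embedding lookup against the ChromaDB database.
    return [f"historical case related to: {transcript}"]

def generate_insight(transcript: str, context: list[str]) -> str:
    # Placeholder for the Azure chat-completion call.
    return f"Insight for '{transcript}' using {len(context)} retrieved case(s)"

def pipeline(audio_chunk: bytes) -> str:
    """Run the three stages in sequence, as the LangChain pipeline does."""
    transcript = transcribe(audio_chunk)
    context = retrieve_context(transcript)
    return generate_insight(transcript, context)

print(pipeline(b"patient reports chest pain"))
# → Insight for 'patient reports chest pain' using 1 retrieved case(s)
```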
To run this code, you need an Azure account with the following models deployed:
- Speech-to-Text (STT)
- Embedding model
- Chat completion model
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
cp .env.example .env
# Edit .env with your Azure configuration
Required environment variables include:
- Azure OpenAI endpoints and API keys
- Azure Speech-to-Text configuration
- Azure Chat Completion model settings
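A filled-in .env might look like the fragment below. The variable names are illustrative, not necessarily the ones the code reads; consult .env.example for the authoritative list.

```
# Illustrative values only; see .env.example for the real variable names
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-api-key>
AZURE_SPEECH_KEY=<your-speech-key>
AZURE_SPEECH_REGION=<your-region>
AZURE_CHAT_DEPLOYMENT=<your-chat-deployment-name>
AZURE_EMBEDDING_DEPLOYMENT=<your-embedding-deployment-name>
```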
To augment the sample embedding database with additional data:
python preprocess.py
This will process text and CSV files from the sample_data directory and add them to the embedding database.
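The gist of that preprocessing step, collecting plain-text passages from .txt and .csv files before embedding them, can be sketched with the standard library alone; the function name and row-flattening scheme are illustrative, not the actual preprocess.py API.

```python
import csv
import tempfile
from pathlib import Path

def load_documents(data_dir: str) -> list[str]:
    """Collect plain-text passages from .txt and .csv files in a directory."""
    docs = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix == ".txt":
            docs.append(path.read_text(encoding="utf-8").strip())
        elif path.suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as f:
                for row in csv.DictReader(f):
                    # Flatten each row into one "column: value" passage.
                    docs.append("; ".join(f"{k}: {v}" for k, v in row.items()))
    return docs

# Demo on a throwaway directory standing in for sample_data/.
with tempfile.TemporaryDirectory() as d:
    Path(d, "note.txt").write_text("Patient presented with a persistent cough.")
    Path(d, "cases.csv").write_text("problem,solution\ncough,rest and fluids\n")
    print(load_documents(d))
# → ['problem: cough; solution: rest and fluids',
#    'Patient presented with a persistent cough.']
```

Each returned passage would then be embedded and written into the ChromaDB database.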
Start the Streamlit application:
streamlit run run.py
- Start Screen: The application begins with a welcome screen where users can start a new session.
- Interaction Screen: During the session, the application:
  - Listens to the doctor-patient conversation
  - Transcribes the audio in real time
  - Identifies medical problems and relevant context
  - Generates insights based on historical data
  - Captures solutions proposed by the doctor
- Complete Conversation: As the conversation evolves, different parts of the UI are populated with detected information and insights.
(Screenshot: the interaction screen after conversation analysis is complete.)
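One way to picture how the UI fills in is as a session-state object whose fields grow as each kind of information is detected. The dataclass below is a sketch of that idea, not the prototype's actual state model; the field and example values are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """UI state populated incrementally as the conversation is analyzed."""
    transcript: list[str] = field(default_factory=list)
    problems: list[str] = field(default_factory=list)
    context: list[str] = field(default_factory=list)
    insights: list[str] = field(default_factory=list)
    solutions: list[str] = field(default_factory=list)

state = SessionState()
state.transcript.append("Doctor: How long has the headache lasted?")
state.problems.append("recurring headache")
state.solutions.append("hydration and sleep-schedule adjustment")
print(state.problems)
# → ['recurring headache']
```

Each detected item appends to the matching field, and the Streamlit view re-renders from this state on every update.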
- agent/: Contains the LangChain pipeline and conversation processing logic
- view/: UI components and Streamlit interface code
- sample_data/: Example medical data for testing
- sample_embedding_db/: Pre-built database of embeddings
- preprocess.py: Script for processing additional data into embeddings
- run.py: Main application entry point
When running locally, the application is accessible at http://localhost:8501, Streamlit's default address.