This project implements a Retrieval-Augmented Generation (RAG) Voice Assistant composed of three main services. Each service plays a distinct role in the overall functionality of the assistant. Below, you will find a brief overview of these services, followed by detailed setup and usage instructions.
- Livekit Service (Agent):
  - A Python-based WebRTC agent that connects to the Livekit cloud.
  - Handles real-time voice communication and retrieves document embeddings from Pinecone to answer user questions.
  - Integrates Speech-to-Text (STT), Text-to-Speech (TTS), and a Large Language Model (LLM).
- FastAPI Service (Embeddings API):
  - A backend server that processes uploaded files, converts text to embeddings using OpenAI, and stores them in Pinecone.
  - Summarizes the document for use in the voice assistant's system prompt.
  - Stores embedding namespaces in a database for easy document retrieval.
- Next.js Frontend:
  - A web interface based on the LiveKit Next.js template, enhanced with ShadCN components.
  - Allows users to upload files, select a document namespace, and interact with the assistant in real time.
The project relies on the following external services:

- Livekit:
  - Documentation: Livekit Docs
  - To set up a project and get environment variables: Livekit Cloud
- Deepgram:
  - For Speech-to-Text (STT) and Text-to-Speech (TTS): Deepgram
- Pinecone:
  - For vector database management: Pinecone
The Livekit Service serves as the WebRTC agent connected to the Livekit cloud. This service is implemented in Python and is responsible for facilitating voice interactions and retrieving embeddings to answer user questions about uploaded documents.
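For orientation, the sketch below shows roughly what such an agent looks like when built with the `livekit-agents` Python framework and its Deepgram, OpenAI, and Silero plugins. It only illustrates how the STT, LLM, and TTS stages are wired together; the plugin choices are assumptions and the framework API varies between versions, so treat it as a sketch rather than the project's actual `main.py`.

```python
from livekit.agents import JobContext, WorkerOptions, cli, llm
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero


async def entrypoint(ctx: JobContext):
    # Join the LiveKit room this worker was dispatched to.
    await ctx.connect()

    # Wire voice activity detection, STT, the LLM, and TTS into one voice pipeline.
    agent = VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=llm.ChatContext().append(
            role="system",
            text="Answer questions using the retrieved document context.",
        ),
    )
    agent.start(ctx.room)


if __name__ == "__main__":
    # `python main.py dev` and related subcommands are handled by the CLI helper.
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```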
The service dependencies can be installed using either `poetry` or `requirements.txt`.
- Using Poetry:
  - Install Poetry if not already installed: `pip install poetry`
  - Install dependencies: `poetry install`
  - Activate the virtual environment: `poetry shell`
- Using pip and a virtual environment:
  - Create a virtual environment: `python -m venv venv`
  - Activate it: `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
  - Install dependencies: `pip install -r requirements.txt`
The `.env` file must be configured with the following variables:
- LIVEKIT_URL: The URL of the Livekit server.
- LIVEKIT_API_KEY: The API key for authenticating with Livekit.
- LIVEKIT_API_SECRET: The secret key for authenticating with Livekit.
- DEEPGRAM_API_KEY: The API key for Deepgram (if used as the STT/TTS provider).
- CARTESIA_API_KEY: The API key for Cartesia (if used as the TTS provider).
- ELEVENLABS_API_KEY: The API key for ElevenLabs (if used as the TTS provider).
- OPENAI_API_KEY: The API key for OpenAI (used to process embeddings and generate responses).
- PINECONE_API_KEY: The API key for Pinecone.
- PINECONE_INDEX_NAME: The name of the Pinecone index used to store embeddings.
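As a quick sanity check before starting the agent, you can confirm the required variables are present. This sketch assumes `python-dotenv` is installed, which may or may not match this project's dependency list.

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is available

# Only the always-required variables; add the STT/TTS key for your chosen provider.
REQUIRED = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
]

load_dotenv()  # read variables from the local .env file
missing = [name for name in REQUIRED if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("Environment looks complete.")
```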
To start the Livekit Service, run the following command: `python main.py dev`
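At runtime the agent answers questions by embedding the user's query and searching the Pinecone namespace of the selected document. The exact retrieval code lives inside the agent, but the step looks roughly like the sketch below; the embedding model name and the `text` metadata field are assumptions and must match whatever the Embeddings API used when indexing.

```python
import os

from openai import OpenAI
from pinecone import Pinecone


def retrieve_context(question: str, namespace: str, top_k: int = 3) -> str:
    """Embed the question and return the closest document chunks from Pinecone."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index(
        os.environ["PINECONE_INDEX_NAME"]
    )

    # The question must be embedded with the same model used at indexing time.
    embedding = client.embeddings.create(
        model="text-embedding-3-small",  # assumption; match the Embeddings API
        input=question,
    ).data[0].embedding

    results = index.query(
        vector=embedding, top_k=top_k, namespace=namespace, include_metadata=True
    )
    return "\n\n".join(match.metadata["text"] for match in results.matches)
```

The retrieved text can then be placed into the LLM prompt before the assistant responds.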
The Embeddings API is a FastAPI server that processes uploaded files, generates embeddings using OpenAI, and stores them in Pinecone. It also generates a document summary for use in the assistant's system prompt.
- Accepts file uploads in PDF, DOCX, and HTML formats.
- Converts text to embeddings using OpenAI and stores them in Pinecone.
- Generates a unique namespace for each file and stores namespace references in a PostgreSQL database.
- Summarizes the document's first chunk for context.
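Reduced to its essentials, the upload flow described above embeds each text chunk, writes the vectors into a freshly generated namespace, and returns that namespace for storage in PostgreSQL. The sketch below illustrates the idea; the embedding model name, ID scheme, and metadata layout are assumptions rather than the service's actual implementation.

```python
import os
from uuid import uuid4

from openai import OpenAI
from pinecone import Pinecone


def index_document(chunks: list[str]) -> str:
    """Embed text chunks with OpenAI and upsert them into a new Pinecone namespace."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    index = Pinecone(api_key=os.environ["PINECONE_API_KEY"]).Index(
        os.environ["PINECONE_INDEX_NAME"]
    )

    namespace = uuid4().hex  # unique namespace per uploaded file

    response = client.embeddings.create(
        model="text-embedding-3-small",  # assumption; any model works if used consistently
        input=chunks,
    )
    vectors = [
        {"id": f"{namespace}-{i}", "values": item.embedding, "metadata": {"text": chunk}}
        for i, (item, chunk) in enumerate(zip(response.data, chunks))
    ]
    index.upsert(vectors=vectors, namespace=namespace)
    return namespace  # stored in the database so the agent can scope its queries later
```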
The service requires a PostgreSQL database to store namespaces and document metadata. A `docker-compose.yml` file is provided to set up the database.
- Using Poetry:
  - Install Poetry if not already installed: `pip install poetry`
  - Install dependencies: `poetry install`
  - Activate the virtual environment: `poetry shell`
- Using pip and a virtual environment:
  - Create a virtual environment: `python -m venv venv`
  - Activate it: `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
  - Install dependencies: `pip install -r requirements.txt`
The `.env` file for this service must include the following variables:
- OPENAI_API_KEY: The API key for OpenAI (used to generate embeddings and summaries).
- PINECONE_API_KEY: The API key for Pinecone.
- PINECONE_INDEX_NAME: The name of the Pinecone index used to store embeddings.
- POSTGRES_USER: The username for the PostgreSQL database.
- POSTGRES_PASSWORD: The password for the PostgreSQL database.
- POSTGRES_DB: The name of the PostgreSQL database.
- POSTGRES_HOST: The host where the PostgreSQL database is running.
- POSTGRES_PORT: The port for connecting to the PostgreSQL database.
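For reference, the five `POSTGRES_*` variables typically combine into a single connection URL. A sketch of how that could be assembled is shown below, assuming `python-dotenv`; the actual service may construct its database connection differently.

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is available

load_dotenv()

# Standard PostgreSQL connection URL built from the variables above.
DATABASE_URL = (
    f"postgresql://{os.environ['POSTGRES_USER']}:{os.environ['POSTGRES_PASSWORD']}"
    f"@{os.environ['POSTGRES_HOST']}:{os.environ['POSTGRES_PORT']}"
    f"/{os.environ['POSTGRES_DB']}"
)
```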
- Set up the PostgreSQL database using Docker Compose: `docker-compose up -d`
- Start the FastAPI server: `python main.py`
The Embeddings API provides the following endpoints:
- `POST /textfiles`: Upload a file and process its embeddings.
- `GET /textfiles`: List all uploaded files and their namespaces.
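As a quick smoke test, both endpoints can be exercised with `requests`. The port and the multipart field name `file` below are assumptions, so adjust them to match the running service.

```python
import requests

API_URL = "http://localhost:8000"  # assumed default FastAPI address

# Upload a document so the service can chunk, embed, and store it.
with open("example.pdf", "rb") as f:
    upload = requests.post(f"{API_URL}/textfiles", files={"file": f})
upload.raise_for_status()
print(upload.json())

# List all uploaded files and their namespaces.
listing = requests.get(f"{API_URL}/textfiles")
listing.raise_for_status()
print(listing.json())
```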
The Next.js frontend is based on the LiveKit Next.js base template, enhanced with ShadCN components. It serves as the user interface for interacting with the Voice Assistant and Embeddings API.
The frontend uses `pnpm` for package management. Ensure `pnpm` is installed before proceeding.
The `.env` file for the frontend must include the following variables:
- LIVEKIT_URL: The WebSocket URL for connecting to the Livekit server.
- LIVEKIT_API_KEY: The API key for Livekit authentication.
- LIVEKIT_API_SECRET: The secret key for Livekit authentication.
- API_URL: The base URL for the Embeddings API.
- Install dependencies: `pnpm install`
- Start the development server: `pnpm dev`
- Open your browser and navigate to `http://localhost:3000` to view the application.