GitHub - phaniteja5789/Intelligent-Document-Retrieval-and-Question-Answering-System-with-RAG-Approach

This project implements document search using vector embeddings and a vector database, with an LLM answering questions over the retrieved documents.

Frameworks Used

1.) LangChain ==> To connect to the LLM
2.) Streamlit ==> For developing the UI

Models Used

OpenAI and Hugging Face

Embedding Models

**text-embedding-ada-002** (OpenAI)

**all-MiniLM-L6-v2** (Hugging Face)

OpenAI

**gpt-3.5-turbo**

Vector Database

Chroma

This project has 4 stages:

1.) Stage-1 ==> Validation of the API key ==> The user can choose between open-source models from Hugging Face and OpenAI models accessed with an OpenAI API key
2.) Stage-2 ==> Based on the selected model, the user selects documents of any supported type to embed, and the embeddings are stored in the vector database
3.) Stage-3 ==> Once the vectors have been stored, the user enters a query, and the most similar documents are fetched from the database
4.) Stage-4 ==> The user can also ask questions over the stored documents using the RAG (Retrieval-Augmented Generation) pattern

Stage-1

In the left pane, the user selects either OpenAI or Hugging Face. Based on the selection, the text box and button below change to accept either an OpenAI API key or a Hugging Face API key.

If an OpenAI key is provided, it is validated; if it is valid, the right-pane controls are enabled.

If a Hugging Face key is provided, it is validated; if it is valid, the right-pane controls are enabled.

The screenshot below is for reference.
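The validation step described above can be sketched as follows. This is a minimal illustration, not the code in `ValidateOpenAIKey.py` (which is not shown here). It assumes a key counts as valid when an authenticated GET to the provider returns HTTP 200, and it uses OpenAI's list-models endpoint as the probe. The names `fetch_status` and `is_key_valid` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: listing models is a cheap authenticated call to probe the key with.
OPENAI_MODELS_URL = "https://api.openai.com/v1/models"


def fetch_status(url: str, api_key: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request sent with a Bearer token."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 401 for a bad key).
        return err.code


def is_key_valid(api_key: str, url: str = OPENAI_MODELS_URL, fetch=fetch_status) -> bool:
    """Treat the key as valid only if the provider answers with HTTP 200."""
    if not api_key or not api_key.strip():
        return False
    try:
        return fetch(url, api_key.strip()) == 200
    except OSError:
        # Network failures (URLError is an OSError) are reported as "not validated".
        return False
```

In a Streamlit app, the result of `is_key_valid` could be stored in session state and used to enable or disable the right-pane controls. The `fetch` parameter exists so the check can be exercised without network access, and a different `url` can be passed for another provider.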

Stage-2

In the right pane, the user can browse for files of different types (pptx, pdf, csv, txt). Once the files are selected, their names are listed below the control. Once the documents have been uploaded, they need to be embedded and the embeddings stored in the database.

The user can then embed the documents. Once the documents have been embedded, the message below is shown.
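Before embedding, each uploaded file has to be routed to a loader that understands its type. The sketch below shows one way to do that for the four file types listed above. The mapping to LangChain community loader class names is an assumption (the loaders used in `LoadDocumentsIntoVectorDB.py` are not shown here), and `plan_loading` is a hypothetical helper.

```python
from pathlib import Path

# Assumed mapping from file suffix to a LangChain community loader class name.
LOADER_BY_SUFFIX = {
    ".pdf": "PyPDFLoader",
    ".csv": "CSVLoader",
    ".txt": "TextLoader",
    ".pptx": "UnstructuredPowerPointLoader",
}


def plan_loading(file_names):
    """Split uploaded file names into (supported, rejected).

    `supported` is a list of (file_name, loader_name) pairs;
    `rejected` is a list of file names with an unsupported suffix.
    """
    supported, rejected = [], []
    for name in file_names:
        suffix = Path(name).suffix.lower()  # case-insensitive match on the extension
        loader = LOADER_BY_SUFFIX.get(suffix)
        if loader is None:
            rejected.append(name)
        else:
            supported.append((name, loader))
    return supported, rejected


supported, rejected = plan_loading(["report.PDF", "sales.csv", "notes.txt", "logo.png"])
print(supported)
print(rejected)
```

Keeping the mapping in one dictionary means that supporting a new file type only requires adding one entry.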

Stage-3

On the Retrieve Document Similarity page, the UI below is shown.

Here the user enters a query, and the top 3 most similar documents are fetched based on it, using cosine similarity.

Once the user enters the query, the UI below shows the retrieved documents for reference.
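To make the ranking step concrete, here is a small pure-Python sketch of top-3 retrieval by cosine similarity. In the project itself this lookup is delegated to the vector database; the functions `cosine_similarity` and `top_k` and the toy 2-dimensional vectors are illustrative assumptions only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs with the highest cosine similarity to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real ones have hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [-1.0, 0.0]}
for doc_id, score in top_k([1.0, 0.0], docs):
    print(doc_id, round(score, 3))
```

Cosine similarity depends only on the direction of the vectors, not their length, so documents of different sizes can be compared on the same scale.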

Stage-4

On the Query RAG page, the user asks questions and receives answers based on the stored documents.

With this project, we can retrieve the documents most similar to a user-entered query and also answer the questions the user asks.
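The RAG pattern combines the retrieved documents with the user's question in a single prompt for the LLM. The sketch below shows only that prompt-assembly step. The function `build_rag_prompt` and the prompt wording are assumptions for illustration, not the project's actual prompt.

```python
def build_rag_prompt(question, retrieved_chunks, max_chunks=3):
    """Assemble an LLM prompt from the question and the top retrieved text chunks."""
    # Number each chunk so the model (and the reader) can tell the sources apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(retrieved_chunks[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


prompt = build_rag_prompt(
    "Which vector database is used?",
    ["The embeddings are stored in Chroma.", "The UI is built with Streamlit."],
)
print(prompt)
```

The resulting string would then be sent to the chat model. Restricting the model to the supplied context is what ties the answer to the user's own documents.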

Repository Files

DocumentSearch
__pycache__
db1
pages
.gitattributes
CountDown.py
DemoVideo.mp4
InitializationOfSessionVariables.py
KeysValidation.py
LoadDocumentsIntoVectorDB.py
README.md
RetrievalOfFullPath.py
SourceCode.py
ValidateHuggingFaceKey.py
ValidateOpenAIKey.py
requirements.txt
