rag_llama_demo

Retrieval Augmented Generation (RAG) for Llama models, using the e5 vector embedding model, for private LLM usage. Original blog post on brandonharris.io.

This repo includes a notebook, a raw .py script, and requirements.txt for reference. The code uses llama_index with ehartford/Wizard-Vicuna-13B-Uncensored for interactive retrieval and e5 for embeddings.
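For readers new to RAG, the retrieval step can be sketched in plain Python. This is a minimal illustration, not the code in this repo: the `embed` function is a toy word-count stand-in for the e5 embedding model, the sample passages are made up rather than taken from the book, and the names `retrieve` and `build_prompt` are hypothetical. In the real pipeline, llama_index handles indexing and the resulting prompt is sent to the Llama model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (ASSUMPTION: not e5).

    Returns a sparse vector of lowercase word counts.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=2):
    """Rank passages by similarity to the query and keep the best top_k."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context):
    """Place the retrieved passages ahead of the question for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical sample passages, for illustration only.
passages = [
    "The whale surfaced near the ship at dawn.",
    "The captain ordered the crew to lower the boats.",
    "Bread was baked in the galley every morning.",
]
question = "What did the captain order the crew to do?"

top = retrieve(question, passages, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Swapping `embed` for a real embedding model changes the quality of the ranking but not the overall shape: embed the passages, embed the question, keep the closest matches, and hand them to the model as context.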

The book used as an example in this code and hosted in this repo is in the public domain and was sourced from Project Gutenberg.