Compress LLM Prompts and save 80%+ on GPT-4 in Python
Updated Jan 17, 2024 - Python
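The savings in the title come from sending the model fewer tokens: since API cost scales with prompt length, dropping low-information content before the call reduces the bill proportionally. As a minimal, self-contained sketch of the idea (an illustrative word-frequency heuristic, not the LLMLingua method or any specific library's API; `compress_prompt` and its scoring are assumptions for illustration), one can rank sentences by how rare their words are and keep only the highest-scoring ones within a word budget:

```python
import re
from collections import Counter

def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    """Keep the highest-information sentences within a word budget.

    Sentences are scored by the average inverse frequency of their words,
    a rough proxy for information content. This is a hypothetical
    illustration of extractive prompt compression, not a production method.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", prompt) if s.strip()]
    words = re.findall(r"\w+", prompt.lower())
    freq = Counter(words)

    def score(sentence: str) -> float:
        toks = re.findall(r"\w+", sentence.lower())
        if not toks:
            return 0.0
        # Rare words contribute more to the score than common ones.
        return sum(1.0 / freq[t] for t in toks) / len(toks)

    budget = int(len(words) * keep_ratio)
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        n = len(re.findall(r"\w+", sentences[i]))
        # Always keep at least one sentence, then fill up to the budget.
        if used + n <= budget or not kept:
            kept.add(i)
            used += n
    # Re-emit the kept sentences in their original order.
    return " ".join(sentences[i] for i in sorted(kept))
```

Halving the token count this way roughly halves the prompt cost; the 80%+ figure would require more aggressive, model-aware compression than this sketch performs.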
This repository is the official implementation of Generative Context Distillation.
Enhance the performance and cost-efficiency of large-scale Retrieval Augmented Generation (RAG) applications. Learn to integrate vector search with traditional database operations and apply techniques like prefiltering, postfiltering, projection, and prompt compression.
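The prefiltering/postfiltering distinction above can be shown with a toy vector search over a few documents (a minimal sketch; the `docs` data, function names, and the `year` metadata field are made up for illustration). Prefiltering applies the metadata predicate before the similarity scan, while postfiltering over-fetches by similarity and then drops non-matching results:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus: 2-d embeddings plus a metadata field to filter on.
docs = [
    {"id": 1, "year": 2023, "vec": [0.9, 0.1]},
    {"id": 2, "year": 2021, "vec": [0.8, 0.2]},
    {"id": 3, "year": 2024, "vec": [0.1, 0.9]},
    {"id": 4, "year": 2024, "vec": [0.85, 0.15]},
]

def prefilter_search(query_vec, min_year, k=2):
    # Prefiltering: apply the metadata predicate BEFORE ranking,
    # so similarity is only computed over candidates that qualify.
    pool = [d for d in docs if d["year"] >= min_year]
    pool.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in pool[:k]]

def postfilter_search(query_vec, min_year, k=2, overfetch=3):
    # Postfiltering: rank ALL documents first, over-fetch k * overfetch,
    # then drop results failing the predicate; may return fewer than k.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    hits = [d for d in ranked[: k * overfetch] if d["year"] >= min_year]
    return [d["id"] for d in hits[:k]]
```

Prefiltering guarantees `k` matching results when enough exist; postfiltering lets the vector index do less work per query but can come up short after the predicate is applied, which is why production systems over-fetch. Projection (returning only the fields the application needs) and prompt compression then shrink what is actually sent to the LLM.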