⚡ Build your chatbot within minutes on your favorite device, apply SOTA compression techniques to LLMs, and run LLMs efficiently on Intel platforms ⚡
retrieval
chatbot
rag
habana
large-language-model
chatpdf
llm-inference
4-bits
speculative-decoding
llm-cpu
streamingllm
intel-optimized-llamacpp
neural-chat
neural-chat-7b
autoround
gaudi3
Updated Oct 8, 2024 - Python