Possible privateGPT integration ? #1592

darkstorm2150 · 2023-05-25T07:39:17Z

I wonder whether it would be possible to integrate https://github.com/imartinez/privateGPT with llama.cpp. I ask because inference with cuBLAS is extremely fast on the current 13B models, and the ability to ingest arbitrary documents would make llama.cpp far more useful for real-time applications.

SlyEcho · 2023-05-26T07:27:12Z

privateGPT uses llama-cpp-python, so follow its installation instructions: https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast

SlyEcho closed this as completed May 26, 2023

Bearsaerker mentioned this issue Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Open
