
Add info about CUDA_VISIBLE_DEVICES #1682


Merged

merged 1 commit into master from docs-update on Jun 3, 2023
Conversation

@SlyEcho (Collaborator) commented on Jun 3, 2023

Add a sentence about GPU selection on CUDA.

Relevant: #1546

SlyEcho merged commit d8bd001 into master on Jun 3, 2023
SlyEcho deleted the docs-update branch on June 3, 2023 at 13:35
@roperscrossroads

@SlyEcho

Does this actually work? I struggled with this a few days ago using a slightly older version of llama.cpp. It kept loading into my internal mobile 1050 Ti and running out of memory instead of using my 3090 (eGPU). I was doing something like this:

CUDA_VISIBLE_DEVICES=1 ./main -ngl 60 -m models/model.bin

It always went to the internal GPU with id 0.

Tomorrow morning I will give it another try, but I do not think it works if you use the ID reported by nvidia-smi (in my case: 1050 Ti: 0, 3090: 1).
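
One common cause of this kind of mismatch (not discussed further in this thread) is that CUDA enumerates GPUs fastest-first by default, so the 3090 may be CUDA device 0 even though nvidia-smi lists it as 1, because nvidia-smi uses PCI bus order. A minimal sketch of making the two orderings agree before selecting a device; the model path and -ngl value are just placeholders:

# list GPUs in PCI bus order, as nvidia-smi numbers them
nvidia-smi -L

# force CUDA to use the same PCI bus ordering, then expose only GPU 1 (the 3090 in this setup)
CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 ./main -ngl 60 -m models/model.bin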

@JohannesGaessler (Collaborator)

CUDA_VISIBLE_DEVICES does work on my test machine using the master branch. In any case, it should soon be possible to control this directly via CLI arguments. My current plan is to add something like a --tensor-split argument for the compute-heavy matrix multiplication tensors and a --main-gpu argument for all other tensors where multi-GPU wouldn't be worthwhile.
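
For illustration only, an invocation like the following is how that plan might look if the arguments land with the names described above; the flag names, the ratio syntax for --tensor-split, and the model path are assumptions here, not an existing interface at the time of this comment:

# hypothetical: keep most tensors on GPU 1 and split the heavy matrix multiplications 1:3 across the two GPUs
./main -m models/model.bin -ngl 60 --main-gpu 1 --tensor-split 1,3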
