Running Intro Notebook on WSL #122
Comments
Hi @ashleemilton, I haven't tried working with CUDA through WSL but it seems like there are some setup steps that NVIDIA lists here in case you haven't seen these before: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
I did complete the additional steps for working with CUDA through WSL, and torch's is_available() returns True. My Windows environment is Windows 11, in case that matters. Here is the output from conda list in the colbert environment: And nvidia-smi: +-----------------------------------------------------------------------------+
Ah, it seems from this thread that single-GPU support on WSL is only available on NCCL version 2.10.3, and multi-GPU support is only available with NCCL version 2.11.4: NVIDIA/nccl#442
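For anyone comparing their own setup against those version numbers, here is a minimal sketch. It assumes the two versions quoted above act as minimum versions, which is my reading of the thread rather than something confirmed here. The helper names are hypothetical, and the lookup through `torch.cuda.nccl.version()` is best-effort: it returns None if PyTorch or its NCCL bindings are unavailable.

```python
# Thresholds quoted in the comment above, treated as minimum versions (an assumption).
WSL_SINGLE_GPU_MIN = (2, 10, 3)
WSL_MULTI_GPU_MIN = (2, 11, 4)


def parse_version(text):
    """Turn a string such as '2.7.8' into the tuple (2, 7, 8)."""
    return tuple(int(part) for part in text.strip().split("."))


def decode_version_code(code):
    """Decode NCCL's integer version code, e.g. 2708 -> (2, 7, 8), 21003 -> (2, 10, 3)."""
    if code >= 10000:  # encoding used from NCCL 2.9 onward
        return (code // 10000, (code % 10000) // 100, code % 100)
    return (code // 1000, (code % 1000) // 100, code % 100)


def wsl_nccl_support(version):
    """Return 'none', 'single-gpu', or 'multi-gpu' for an NCCL version tuple."""
    if version >= WSL_MULTI_GPU_MIN:
        return "multi-gpu"
    if version >= WSL_SINGLE_GPU_MIN:
        return "single-gpu"
    return "none"


def installed_nccl_version():
    """Best-effort lookup of the NCCL version bundled with PyTorch, or None."""
    try:
        import torch
        found = torch.cuda.nccl.version()
    except Exception:
        return None
    # Some PyTorch releases return an int code, others a tuple.
    if isinstance(found, int):
        return decode_version_code(found)
    return tuple(found)[:3]


if __name__ == "__main__":
    reported = parse_version("2.7.8")  # version printed in the error report
    print("reported:", reported, "->", wsl_nccl_support(reported))
    print("installed:", installed_nccl_version())
```

Under that reading, the 2.7.8 build from the error report falls below both thresholds.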
Closing due to inactivity, please re-open if this is still an issue.
Hello,
I am having issues trying to run the provided intro notebook for ColBERTv2. I am working in an Anaconda environment, created using the commands provided, in an Ubuntu WSL virtual machine. I am running a single CUDA-compatible NVIDIA GPU. When I try to index the collection with nranks=1, I encounter the error:
NCCL version 2.7.8
ncclSystemError: System call (socket, malloc, munmap, etc) failed.
I have tried tracking down the cause of the error, but the only information I can find is that it could be due to trying to run it on a single GPU. I am stuck on finding a fix for this and would greatly appreciate any guidance on resolving the issue. Further, I am ultimately trying to fine-tune ColBERT, so I am also interested in the response to issue #121.
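One way to get more detail than the bare ncclSystemError is NCCL's own logging, controlled by the NCCL_DEBUG environment variable. The sketch below is only an illustration: the helper name is hypothetical, and it assumes the variable is set in the notebook process before any NCCL communicator is created, so that spawned workers inherit it.

```python
import os


def enable_nccl_debug(env=None):
    """Set NCCL_DEBUG=INFO unless it is already set; return the effective value."""
    env = os.environ if env is None else env
    # setdefault keeps any value the user has already chosen.
    env.setdefault("NCCL_DEBUG", "INFO")
    return env["NCCL_DEBUG"]


# Call this at the top of the notebook, before the indexing step runs.
enable_nccl_debug()
```

With this in place, NCCL should print its initialization steps, which may show which system call failed.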