Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11 #4

Open
selfAndrewKB opened this issue Feb 24, 2024 · 7 comments


@selfAndrewKB

selfAndrewKB commented Feb 24, 2024

A warning upon first running the whisper model clued me in to it not using hardware acceleration:

UserWarning: FP16 is not supported on CPU; using FP32 instead

All I had to do in order to enable CUDA support was first uninstall Torch:
pip3 uninstall torch

And reinstall with this command:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Confirm that CUDA is available in Python by running:
import torch
print(torch.cuda.is_available())  # True when PyTorch can use the GPU
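If the check above returns False, it can help to see which PyTorch build is actually installed. The following is a minimal sketch, not part of monkeyplug or whisper; the helper name `describe_torch_cuda` is made up for illustration. It relies on `torch.version.cuda`, which is `None` for builds without CUDA support such as a CPU-only wheel.

```python
import importlib.util


def describe_torch_cuda():
    """Summarize whether the installed PyTorch build can see a CUDA GPU.

    Hypothetical helper for illustration only.
    """
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    info = {
        "installed": True,
        "torch_version": torch.__version__,
        # None means this build was compiled without CUDA support.
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        # Name of the first visible GPU.
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

A `cuda_build` of `None` would suggest that a CPU-only wheel is installed, which is the situation the reinstall command above is meant to fix.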

monkeyplug/whisper should now correctly use your GPU to significantly speed up operations. A YouTube video with a runtime of 10:42 took 13 minutes and 42 seconds to process on my CPU with the medium.en model. After successfully enabling CUDA support, that same video took 3 minutes and 13 seconds to process on an RTX 3070. The medium.en model was also noticeably more accurate than the default base.en.
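For reference, the timings quoted above work out to a GPU speedup of roughly 4.3x. A quick sketch of the arithmetic, using only the three durations from this comment (the helper `to_seconds` is made up for illustration):

```python
def to_seconds(stamp):
    """Convert an 'MM:SS' string to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


audio = to_seconds("10:42")  # runtime of the video
cpu = to_seconds("13:42")    # processing time on the CPU
gpu = to_seconds("3:13")     # processing time on the RTX 3070

speedup = cpu / gpu          # how many times faster the GPU run was
cpu_rtf = cpu / audio        # real-time factor: above 1 is slower than playback
gpu_rtf = gpu / audio        # real-time factor: below 1 is faster than playback

print(f"GPU speedup over CPU: {speedup:.2f}x")
print(f"CPU real-time factor: {cpu_rtf:.2f}")
print(f"GPU real-time factor: {gpu_rtf:.2f}")
```

In other words, the CPU run took longer than the video itself, while the GPU run finished in under a third of the video's runtime.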

I caught several warning messages that were raised during a job (might be related to generating timestamps?), but they don't seem to affect the operation at all:

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation...
warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation...
warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation...
warnings.warn(

C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\site-packages\whisper\timing.py:146: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower DTW implementation...
warnings.warn(
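If these warnings turn out to be harmless for your use and you call whisper from your own Python script, they can be filtered with the standard `warnings` module. This is only a sketch: the function name is made up, it does not fix the missing Triton kernels (whisper still uses the slower fallback), and it has no effect unless it runs in the same Python process before transcription starts.

```python
import warnings


def silence_triton_fallback_warnings():
    """Hide only the 'Failed to launch Triton kernels' UserWarning.

    Hypothetical helper; other warnings are still shown.
    """
    warnings.filterwarnings(
        "ignore",
        # Regex matched against the start of the warning message.
        message=r"Failed to launch Triton kernels",
        category=UserWarning,
    )


if __name__ == "__main__":
    silence_triton_fallback_warnings()
    # This one is suppressed by the filter above.
    warnings.warn("Failed to launch Triton kernels, likely due to missing CUDA toolkit", UserWarning)
    # Unrelated warnings still come through.
    warnings.warn("some other warning", UserWarning)
```

Matching on the message text keeps the filter narrow, so unrelated `UserWarning`s (including the FP16 one from the top of this issue) remain visible.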

Noticed that #3 might be in the works, which might help, but I thought it could be wise/helpful to share my findings regardless in the meantime.

PS: Whisper really is another tier of accuracy and is much appreciated.

@mmguero
Owner

mmguero commented Feb 25, 2024

Interesting, on my Linux machine it was using the GPU right out of the gate just with pip install openai-whisper without any other steps on my end (double-checked with nvidia-smi during processing).
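For anyone who wants to repeat the nvidia-smi check from a script while a job is running, here is a rough sketch. The function name `gpu_snapshot` is made up; it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` and `--format` options.

```python
import shutil
import subprocess


def gpu_snapshot():
    """Return one line per GPU (name, utilization, memory used), or None.

    Hypothetical helper; returns None when nvidia-smi is missing or fails.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [
                exe,
                "--query-gpu=name,utilization.gpu,memory.used",
                "--format=csv,noheader",
            ],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    snapshot = gpu_snapshot()
    print(snapshot if snapshot is not None else "nvidia-smi not available")
```

Running this in a second terminal during transcription should show non-zero utilization and memory use if whisper is really on the GPU.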

@selfAndrewKB
Author

Oh, and if it helps: this is a fresh install of Windows 11, and I used that very same command to install whisper after installing Python 3.12. Strange indeed.

@selfAndrewKB changed the title from "Whisper defaults to CPU instead of utilizing Nvidia GPU" to "Whisper defaults to CPU instead of utilizing Nvidia GPU on Windows 11" on Feb 25, 2024
@bradyj04

Are you still having this issue at all? I tried your steps and mine persisted.

@mmguero
Owner

mmguero commented Apr 25, 2024

Right now I don't have access to a Windows machine with a GPU, so I don't have any way to confirm or look into this.

@selfAndrewKB
Author

> Are you still having this issue at all? I tried your steps and mine persisted.

Sorry to hear. It's been working just fine ever since. Could you provide more info about your setup? Operating system, whether you tried torch.cuda.is_available(), what it returns, any error messages you might've seen, etc.

@bradyj04

Windows 11, getting the exact same error messages as in your original post. I'm currently just using a separate Whisper program instead, so no big deal. And yes, torch.cuda.is_available() returns True.

@therealmichaelberna

therealmichaelberna commented Nov 15, 2024

@selfAndrewKB @bradyj04

For Windows, I had to install a Windows build of the Triton compiler from here: https://huggingface.co/madbuda/triton-windows-builds
Command (this wheel targets CPython 3.12 on 64-bit Windows, per its cp312 and win_amd64 tags):
pip install https://huggingface.co/madbuda/triton-windows-builds/resolve/main/triton-3.0.0-cp312-cp312-win_amd64.whl

If you have CUDA 12.6 or higher, this bugfix also needs to be applied:

https://github.com/triton-lang/triton/pull/4588/files (see the "Files changed" tab and note the added and removed lines)

For me, the file I had to edit was located in
C:\Users\User\.conda\envs\monkeyplug_312\Lib\site-packages\triton\backends\nvidia\compiler.py
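The location will differ per environment. A small sketch for finding the same file in whichever environment is active (the function name is made up; it only assumes the `backends\nvidia\compiler.py` layout shown in the path above):

```python
import importlib.util
from pathlib import Path


def find_triton_nvidia_compiler():
    """Return the path to triton's backends/nvidia/compiler.py, or None.

    Hypothetical helper; returns None if triton is not installed or the
    file is not where this layout expects it.
    """
    spec = importlib.util.find_spec("triton")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at triton/__init__.py, so its parent is the package dir.
    candidate = Path(spec.origin).parent / "backends" / "nvidia" / "compiler.py"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_triton_nvidia_compiler()
    print(path if path is not None else "triton compiler.py not found")
```

Run it with the same Python interpreter that monkeyplug uses, so the path it prints belongs to the right environment.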

After this and the PyTorch CUDA reinstall, it worked on Windows.

Thanks for creating this and I hope this info can help someone.
