[QST] #384
Comments
I tried the same code on a V100 GPU on Google Colab, and it is still not using the GPU and is extremely slow. It has been running on my laptop since last week, and a few hours ago Colab stated clearly that the GPU is not being used and that I should switch to a standard runtime. Can you please advise how I can use a CUDA dataframe to read the 62 GB dataset and train RAPIDS algorithms on it? Thank you.
I was able to run the notebook on a T4 in Colab with no issues: https://colab.research.google.com/drive/11hrLRui_Mi11mfe3V7LyrF2su-RHHGxD?usp=sharing I would check your GPU software setup.
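For reference, a quick sanity check along those lines (my own sketch, not part of the notebook) is to confirm cuDF can actually see and use the NVIDIA GPU before attempting the full workflow, e.g. after checking `nvidia-smi`:

```python
# Minimal sketch: verify cuDF is installed and can execute on the GPU.
import cudf

print(cudf.__version__)

# This raises a CUDA/driver error if no usable GPU is available.
s = cudf.Series([1, 2, 3])
print(s.sum())  # executes on the GPU; expect 6
```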
Okay, so first, your GPU is far too small: it has only 1.5 GB of usable GPU memory (probably the 2 GB variant of the T500). This notebook was meant to run on a 32 GB or larger GPU. In fact, we recommend a 16 GB GPU to run our examples, although we try to make accommodations for the 11 GB x080s. HOWEVER, I made this Colab with a much smaller dataset a while back (just updated it with the new pip install): https://colab.research.google.com/drive/1DnzZk42PNc_Y-bItYJSvjLyWkx4-6jKE. Other thoughts:
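For what it's worth, on a GPU that small a cuDF workflow only stands a chance with spilling to host memory enabled. A minimal sketch, assuming a recent cuDF release where the `spill` option exists (worth verifying against the installed version):

```python
# Hypothetical sketch: let cuDF spill device buffers to host RAM when the
# ~1.5 GB of GPU memory runs out. Option name assumed from recent cuDF releases.
import cudf

cudf.set_option("spill", True)  # or set CUDF_SPILL=on before importing cudf

# Placeholder path: start with a single file that fits comfortably.
df = cudf.read_csv("2023-01-01.csv")
print(df.memory_usage(deep=True).sum())
```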
Hi
Following the example in: https://github.com/rapidsai-community/notebooks-contrib/blob/main/community_tutorials_and_guides/census_education2income_demo.ipynb
I have a laptop with an 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz and 64 GB RAM. In addition to the Intel TigerLake-LP GT2 [Iris Xe Graphics], there is an NVIDIA GPU as follows:
3D controller TU117GLM [Quadro T500 Mobile]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress bus_master cap_list rom
configuration: driver=nvidia latency=0
When I create a cluster I get only one worker, and when I compute anything the dashboard shows only the CPU working.
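For context on why only the CPU shows activity: a Dask cluster only uses the GPU if it is started with GPU workers, and a default `LocalCluster`/`Client()` gives plain CPU workers. A minimal sketch of a GPU-backed cluster, assuming `dask-cuda` is installed alongside RAPIDS:

```python
# Sketch: start a Dask cluster whose worker owns the GPU. LocalCUDACluster
# creates one worker per visible GPU, so a single Quadro T500 still means
# one worker -- but a GPU worker, and its activity shows on the dashboard.
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()
client = Client(cluster)
print(client.dashboard_link)
```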
I am trying to read the Backblaze 2022/2023 dataset: 730 CSV files totaling 62.6 GB on disk, using this code:
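The snippet itself is not shown in the thread; below is only a hypothetical sketch of lazily reading a directory of CSVs with `dask_cudf` (the path is a placeholder):

```python
# Hypothetical sketch of lazily reading the ~730 Backblaze CSV files on GPU.
# Path is a placeholder; use `import dask.dataframe as dd` for a CPU-only run.
import dask_cudf

df = dask_cudf.read_csv("backblaze_2022_2023/*.csv")
print(df.npartitions)  # nothing is parsed until .compute() or .persist()
```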
With one CPU worker and 8 threads, I am hitting impossible bottlenecks on any computation; for example, value_counts takes a few hours:
counts = df.model.value_counts(dropna=False).compute()
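One note on this pattern: every `.compute()` on a lazily-read CSV collection re-parses the files from disk, so each statistic pays the full 62 GB scan. Where a projected subset fits in memory, persisting it first (or converting the CSVs to Parquet once) avoids repeating that cost. A sketch with hypothetical column names:

```python
# Sketch: materialize only the columns that are needed, once, so repeated
# aggregations do not re-read and re-parse all 730 CSV files.
needed = df[["model", "failure"]]  # hypothetical column selection
needed = needed.persist()          # hold the parsed partitions in memory

counts = needed["model"].value_counts(dropna=False).compute()
```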
Computing the min and max of the columns took a whole day to finish one column, and is still running for the others.
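Part of the slowness is that each separate `.compute()` call triggers its own pass over all the CSVs. Dask can evaluate several reductions in a single pass via `dask.compute`; a sketch with a placeholder column name:

```python
# Sketch: compute min and max together so the data is scanned once,
# not once per statistic.
import dask

col = df["smart_9_raw"]  # placeholder column name
col_min, col_max = dask.compute(col.min(), col.max())
print(col_min, col_max)
```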
I need advice on how to get the GPU cores working to speed up the processing. I also need advice on the cheapest option for a home GPU cluster, using something along these lines and in this price range:
An external PCIe chassis to connect to my laptop (although this one does not seem suitable for NVIDIA GPUs, please advise):
https://www.amazon.co.uk/gp/product/B0BJL7HKD8/ref=ox_sc_act_image_2?smid=A3P5ROKL5A1OLE&psc=1
and GPUs such as (or please advise on the best value-for-money alternatives):
https://www.amazon.co.uk/gp/product/B0C8ZQTRD7/ref=ox_sc_act_image_1?smid=A20CAXEAQ9QIMK&psc=1
Thank you very much in advance,
Manal