CUDA/CuDNN related errors occur in Titan-RTX environments #39

Open
dogyoonlee opened this issue Sep 10, 2020 · 0 comments
Hello.

I have tried many different environment configurations,
but I still cannot get your code to run.

My GPU is a Titan RTX,
and my attempts are as follows.

I also tried running the code with CUDA 8.0 earlier, but the errors were
almost the same as with CUDA 9.0.


  1. ---environment---
    ubuntu 18.04
    CUDA 9.0
    CuDNN 7.1
    torch 0.3.1 / 0.4.0
    ==>
    error message :
    Found GPU0 TITAN RTX which requires CUDA_VERSION >= 9000 for
    optimal performance and fast startup time, but your PyTorch was compiled
    with CUDA_VERSION 8000. Please install the correct PyTorch binary
    using instructions from http://pytorch.org

warnings.warn(incorrect_binary_warn % (d, name, 9000, CUDA_VERSION))

The process is then "Killed" when the data are loaded onto the GPU, specifically at the conv2d() call on
line 55 of pointnet2_modules.py (self.mlp[i] in _PointnetSAModuleBase).
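
Note that the warning above reports the CUDA version the PyTorch binary was compiled with (8000, i.e. CUDA 8.0), not the CUDA 9.0 toolkit installed on the system. A minimal sketch of that compatibility check follows. The table and the helper names `encode_cuda_version` and `binary_supports_gpu` are illustrative assumptions, not PyTorch functions; the table assumes Turing cards such as the TITAN RTX (compute capability 7.5) need CUDA 10.0 or newer.

```python
# Minimum CUDA toolkit version able to target each GPU compute capability,
# encoded the way the warning prints it (9000 means CUDA 9.0).
# Illustrative table only; it is not read from PyTorch.
MIN_CUDA_FOR_CAPABILITY = {
    (6, 0): 8000,   # Pascal
    (6, 1): 8000,   # Pascal
    (7, 0): 9000,   # Volta
    (7, 5): 10000,  # Turing, e.g. TITAN RTX
}


def encode_cuda_version(version):
    """Turn a string such as '9.0.176' into the integer form 9000."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return major * 1000 + minor * 10


def binary_supports_gpu(capability, compiled_cuda):
    """Return True if a binary built with `compiled_cuda` can target the GPU."""
    return compiled_cuda >= MIN_CUDA_FOR_CAPABILITY[tuple(capability)]


# Hypothetical hard-coded inputs for a compute capability 7.5 card.
print(binary_supports_gpu((7, 5), encode_cuda_version("8.0")))      # False
print(binary_supports_gpu((7, 5), encode_cuda_version("9.0.176")))  # False
print(binary_supports_gpu((7, 5), encode_cuda_version("10.0")))     # True
```

On a machine with PyTorch installed, `torch.version.cuda` and `torch.cuda.get_device_capability(0)` should supply the two inputs instead of the hard-coded values.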

  2. ---environment---
    ubuntu 18.04
    CUDA 9.0
    CuDNN 7.1
    torch 0.3.1 / 0.4.1
    ==>
    error message :
    RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:663

  3. ---environment---
    ubuntu 18.04
    CUDA 9.0
    CuDNN 7.1
    torch 0.3.1 / 0.4.1

I additionally modified train_cls.py to set

torch.backends.cudnn.benchmark = False

==>
Traceback (most recent call last):
  File "train_cls.py", line 217, in <module>
    main()
  File "train_cls.py", line 125, in main
    train(train_dataloader, test_dataloader, model, criterion, optimizer, lr_scheduler, bnm_scheduler, args, num_batch)
  File "train_cls.py", line 167, in train
    pred = model(points)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/SSD1/dogyoon/Relation-Shape-CNN-master/models/rscnn_ssn_cls.py", line 102, in forward
    return self.FC_layer(features.squeeze(-1))
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/modules/batchnorm.py", line 66, in forward
    exponential_average_factor, self.eps)
  File "/home/mvpserverone/.conda/envs/rscnn/lib/python3.5/site-packages/torch/nn/functional.py", line 1251, in batch_norm
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size [1, 512]
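
The input size [1, 512] in this last error means the batch-norm layer received a single sample while in training mode. One common cause (an assumption here, not confirmed for this run) is a final incomplete batch of size 1. The sketch below uses a hypothetical helper, `batch_sizes`, with made-up sample counts to show how such a batch arises and how dropping the last incomplete batch avoids it.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """List the size of every batch produced when iterating a dataset once."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # The leftover samples form one smaller batch unless it is dropped.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# Hypothetical numbers: 65 samples with a batch size of 32.
print(batch_sizes(65, 32))                  # [32, 32, 1]
print(batch_sizes(65, 32, drop_last=True))  # [32, 32]
```

If this is the cause, passing `drop_last=True` to `torch.utils.data.DataLoader` for the training loader, or choosing a batch size that does not leave a remainder of 1, should avoid the single-sample batch.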


I really hope to find a solution to this problem as soon as possible.
Thank you very much.
