Requiring CUDA10+ precludes running racon-gpu on the (current?) ONT PromethION #2

Closed
SamStudio8 opened this issue Jul 19, 2019 · 9 comments
Labels
enhancement New feature or request

Comments

@SamStudio8

A recent change in the README specifies a requirement for CUDA10+, but this precludes racon-gpu being run on our ONT PromethION as the driver it ships with (v384.130) does not support the CUDA toolkit beyond v9.0 on Linux x86_64 (doc).

With a little hacking I've managed to compile racon-gpu with CUDA 9.0.176 and run it on our PromethION, so I was wondering if this version constraint was intentional?
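For anyone wanting to check a machine before building, here is a minimal sketch of the driver-versus-toolkit comparison described above. The function names are made up for illustration, and the minimum-driver table is an assumption recalled from NVIDIA's CUDA release notes for Linux x86_64, so it should be checked against the current notes before being relied on. It also assumes GNU `sort -V` is available.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# max_cuda_for_driver DRIVER: print the newest toolkit in the table that DRIVER supports.
# Table rows are "toolkit minimum_driver", newest first. ASSUMPTION: values recalled
# from NVIDIA's release notes (Linux x86_64); verify them before use.
max_cuda_for_driver() {
    local driver="$1" toolkit min
    while read -r toolkit min; do
        if version_ge "$driver" "$min"; then
            echo "$toolkit"
            return 0
        fi
    done <<'EOF'
10.1 418.39
10.0 410.48
9.2 396.26
9.1 390.46
9.0 384.81
EOF
    echo "none"
    return 1
}

# The PromethION driver version quoted in this issue.
max_cuda_for_driver 384.130
```

On a live system the driver version could be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to the function in place of the hard-coded value.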

@vellamike

Thanks for reporting this, it's something we are aware of. What exactly did you change to run with CUDA 9.0.176?

@SamStudio8
Author

SamStudio8 commented Jul 19, 2019

@vellamike It involved a load of trial and error but in summary:

  • I provisioned a VM with Ubuntu 16.04 and compiled cmake v3.15.0 from source, because the cmake shipped with the OS image is too old.
  • I installed gcc-6 and g++-6 from PPA as CUDA 9.0 needs at least gcc-6 to install and the default gcc for the image was too old.
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-6 g++-6
  • I installed CUDA 9 via local deb following these instructions. Note that the downloaded file name ends with -deb, not .deb, so use that name when running the dpkg command. I didn't install the patches.
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
  • I downloaded the racon-gpu repository according to the instructions and set cmake to use gcc-6.
git clone --recursive https://github.com/clara-genomics/racon-gpu.git
cd racon-gpu
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON -DCMAKE_C_COMPILER=gcc-6 -DCMAKE_CXX_COMPILER=g++-6 ..
make
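A quick sanity check that can be run after the configure step: a minimal sketch that looks in the CMake cache for the `racon_enable_cuda` option passed on the command line above. The function name is hypothetical, and the check assumes CMake's usual `NAME:TYPE=VALUE` cache line format. It only confirms what was recorded at configure time, not that the resulting binary contains CUDA code.

```shell
#!/usr/bin/env bash
# cuda_option_enabled CACHE_FILE: succeeds if the cache records racon_enable_cuda as ON.
# ASSUMPTION: cache entries follow CMake's usual "NAME:TYPE=VALUE" layout.
cuda_option_enabled() {
    grep -Eq '^racon_enable_cuda:[A-Z]+=(ON|on|1|TRUE|true)$' "$1"
}

# Demonstration against a throwaway file; a real check would point at build/CMakeCache.txt.
demo_cache="$(mktemp)"
printf '%s\n' 'CMAKE_BUILD_TYPE:STRING=Release' 'racon_enable_cuda:BOOL=ON' > "$demo_cache"

if cuda_option_enabled "$demo_cache"; then
    echo "CUDA option recorded as ON"
else
    echo "CUDA option missing or OFF"
fi
```

From inside the build directory, the equivalent call would be `cuda_option_enabled CMakeCache.txt`.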

Oddly, the first run of make seemed to build a version of racon without the CUDA functionality enabled, so I ran make again. This time, the process aborted with the following compiler errors:

/home/ubuntu/racon-gpu/vendor/ClaraGenomicsAnalysis/3rdparty/spdlog/include/spdlog/details/pattern_formatter.h:665:66: error: expected primary-expression before ‘const’
     log_clock::time_point last_update_{std::chrono::seconds(0)};
                                                                  ^    
/home/ubuntu/racon-gpu/vendor/ClaraGenomicsAnalysis/3rdparty/spdlog/include/spdlog/details/pattern_formatter.h:665:66: error: expected ‘}’ before ‘const’
/home/ubuntu/racon-gpu/vendor/ClaraGenomicsAnalysis/3rdparty/spdlog/include/spdlog/details/pattern_formatter.h:665:66: error: could not convert ‘{<expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’
/home/ubuntu/racon-gpu/vendor/ClaraGenomicsAnalysis/3rdparty/spdlog/include/spdlog/details/pattern_formatter.h:665:66: error: expected ‘;’ before ‘const’
CMake Error at cudapoa_generated_cudapoa_kernels.cu.o.Release.cmake:279 (message):
  Error generating file
  /home/ubuntu/racon-gpu/build/ClaraGenomicsAnalysis/cudapoa/CMakeFiles/cudapoa.dir/src/./cudapoa_generated_cudapoa_kernels.cu.o

Out of sheer frustration and morbid curiosity, I simply commented pattern_formatter.h:665 out of the source, as the variable it declares seemed to be used only in the #ifdef _WIN32 block anyway (at least in the scope of the file). I ran make again and it seems I've gotten away with it. Update: It also seems to work if you remove just the std::chrono::seconds(0) bit.
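The second variant of the workaround (dropping only the `std::chrono::seconds(0)` initializer) could be scripted roughly as follows. This is a sketch, not the exact edit made in this thread: the function name is invented, and it assumes the line in the header matches the one shown in the compiler error. The demonstration runs on a scratch file rather than the real header.

```shell
#!/usr/bin/env bash
# strip_seconds_init FILE: replace the explicit seconds(0) initializer with an empty one.
# The result is written through a temp file so the original permissions are kept.
strip_seconds_init() {
    local file="$1" tmp
    tmp="$(mktemp)"
    sed 's/last_update_{std::chrono::seconds(0)};/last_update_{};/' "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch copy of the offending line, not on the vendored header.
demo_header="$(mktemp)"
printf '%s\n' '    log_clock::time_point last_update_{std::chrono::seconds(0)};' > "$demo_header"
strip_seconds_init "$demo_header"
cat "$demo_header"
```

For a real build the target would be the vendored spdlog `pattern_formatter.h` named in the error output. Keeping a backup copy of the header first makes the change easy to revert.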

  • I copied the new racon binary to the prom, along with the C++ standard library files (libstdc++), and set the LD_LIBRARY_PATH.
scp sam-cuda:/usr/lib/gcc/x86_64-linux-gnu/6/libstdc++* /home/prom/sam/lib/
export LD_LIBRARY_PATH=/home/prom/sam/lib/:$LD_LIBRARY_PATH
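One detail worth noting about the export above: if LD_LIBRARY_PATH is empty or unset beforehand, the result ends in a trailing colon, and the glibc loader treats an empty entry as the current directory. A minimal sketch of a prepend that avoids this, using a hypothetical helper name:

```shell
#!/usr/bin/env bash
# prepend_lib_dir DIR: put DIR at the front of LD_LIBRARY_PATH.
# The ${VAR:+...} expansion adds the ":" separator only when the variable is non-empty,
# so no empty (current-directory) entry is created.
prepend_lib_dir() {
    export LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

prepend_lib_dir /home/prom/sam/lib
echo "$LD_LIBRARY_PATH"
```

Afterwards, something like `ldd ./racon | grep libstdc++` (with `./racon` standing in for wherever the binary was copied) should show which libstdc++ the loader actually resolves.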

My racon-gpu test hasn't finished yet, but it's using all the GPU and seems to be working so far!

Update: The job finished successfully, but there were some debugging messages in my FASTA output. I've tracked down and patched the printf in question.
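Stray messages like these can be spotted mechanically. The sketch below lists every line of a FASTA file that is neither a `>` header nor a run of IUPAC nucleotide codes. The function name and the sample `[debug]` line are invented for illustration; the actual messages seen in this thread are not shown in the issue.

```shell
#!/usr/bin/env bash
# find_stray_lines FILE: print, with line numbers, each non-empty line that is neither
# a ">" header nor made up solely of IUPAC nucleotide codes (or "-").
# Exit status follows grep: 0 if stray lines were found, 1 if the file looks clean.
find_stray_lines() {
    grep -nEv '^(>.*|[ACGTURYSWKMBDHVNacgturyswkmbdhvn-]*)$' "$1"
}

# Demonstration on a made-up file; the "[debug]" line is hypothetical.
demo_fasta="$(mktemp)"
printf '%s\n' '>contig_1' 'ACGTTGCA' '[debug] window 3 done' 'GGCATN' > "$demo_fasta"
find_stray_lines "$demo_fasta"
```

For the demonstration file this reports line 3 only. A message consisting purely of nucleotide letters would slip through, so this is a rough filter rather than a validator.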

@SamStudio8
Author

If it's useful to anyone, here's my binary (2.6M, 4ae25c5c2323f220917cf2912e4d2280).
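Anyone downloading the binary may want to compare it against the digest quoted above. The 32-character hex string looks like an MD5 sum, but that is an assumption, as is the helper name below. A minimal sketch using GNU coreutils `md5sum`:

```shell
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED: succeeds when FILE's MD5 digest equals EXPECTED.
verify_md5() {
    local actual
    actual="$(md5sum "$1" | cut -d' ' -f1)"
    [ "$actual" = "$2" ]
}

# Demonstration on a scratch file so the sketch runs without the real download.
demo_file="$(mktemp)"
printf 'hello\n' > "$demo_file"
expected="$(md5sum "$demo_file" | cut -d' ' -f1)"

if verify_md5 "$demo_file" "$expected"; then
    echo "checksum OK"
else
    echo "checksum MISMATCH"
fi
```

With the real download the call would look like `verify_md5 racon 4ae25c5c2323f220917cf2912e4d2280`, where `racon` is a placeholder for the downloaded file name.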

@tijyojwad

@SamStudio8 can you try it out with the fix in NVIDIA-Genomics-Research/GenomeWorks#46?

@SamStudio8
Author

I've now containerised this for anyone following along at home. The recipe is on gist.

@tijyojwad

Thanks, @SamStudio8! This is very useful indeed. We're working on adding official support for CUDA 9.0 (with CI support), and expect it to be available within a week or two.

@tijyojwad tijyojwad transferred this issue from another repository Aug 15, 2019
@tijyojwad

@SamStudio8 The latest release of racon-gpu works with CUDA 9.0. Can we close this issue?

@tijyojwad tijyojwad added the enhancement New feature or request label Aug 29, 2019
@SamStudio8
Author

@tijyojwad Sounds good! Presumably this means we don't have to worry about pattern_formatter.h now?

tijyojwad pushed a commit that referenced this issue Sep 2, 2019
[packaging] add deb packaging support for racon
@tijyojwad

@SamStudio8 That's right, no need to worry about pattern_formatter.h in CUDA 9.0 builds anymore.

3 participants