Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

GPU installation fix #463

Merged
merged 12 commits into from
Sep 29, 2023
Merged

GPU installation fix #463

merged 12 commits into from
Sep 29, 2023

Conversation

tsampazk
Copy link
Collaborator

@tsampazk tsampazk commented Sep 18, 2023

Fixes #461

This PR fixes GPU detectron2 and torch installation in install.sh and introduces a temporary fix by removing problematic tools (Continual SLAM and Panoptic Segmentation). Detectron2 installation taken from here (CUDA 11.3) and PyTorch installation taken from here (v1.13.1 -> Linux and Windows -> CUDA 11.6).

GPU installation was tested on develop branch with:

  • NVIDIA drivers 460.106.00:
    1. Software & Updates
    2. Additional Drivers
  • CUDA 11.2 (required for mxnet):
    1. cuda_11.2.0_460.27.04_linux.run from here
    2. Linux -> x86_64 -> Ubuntu-> 20.04 -> runfile local
    3. Skip driver installation
  • cuDNN installed through here
    1. Download cuDNN v8.1.0 (January 26th, 2021), for CUDA 11.0,11.1 and 11.2
    2. cuDNN Runtime Library for Ubuntu20.04 x86_64 (Deb)

Installation of Panoptic Segmentation and Continual SLAM fail during building probably due to mismatches between CUDA versions (system 11.2 vs torch 11.6), so install.sh entirely removes their src directories.

Detectron2 installs fine but seems to fail the tests:

Spoiler warning

ERROR: test_single_demo_grasp (unittest.loader._FailedTest)

ImportError: Failed to import test module: test_single_demo_grasp
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/unittest/loader.py", line 436, in _find_test_path
module = self._get_module_from_name(name)
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/unittest/loader.py", line 377, in _get_module_from_name
import(name)
File "/home/runner/work/opendr/opendr/tests/sources/tools/control/single_demo_grasp/test_single_demo_grasp.py", line 20, in
from detectron2.modeling import build_model
File "/home/runner/work/opendr/opendr/venv/lib/python3.8/site-packages/detectron2/modeling/init.py", line 2, in
from detectron2.layers import ShapeSpec
File "/home/runner/work/opendr/opendr/venv/lib/python3.8/site-packages/detectron2/layers/init.py", line 3, in
from .deform_conv import DeformConv, ModulatedDeformConv
File "/home/runner/work/opendr/opendr/venv/lib/python3.8/site-packages/detectron2/layers/deform_conv.py", line 11, in
from detectron2 import _C
ImportError: /home/runner/work/opendr/opendr/venv/lib/python3.8/site-packages/detectron2/_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefIlEENS2_8optionalINS2_10ScalarTypeEEENS5_INS2_6LayoutEEENS5_INS2_6DeviceEEENS5_IbEE

Temporarily disabled single demo grasp tests.

Any suggestions and/or independent testing are welcome.

@tsampazk tsampazk added the bug Something isn't working label Sep 18, 2023
@tsampazk tsampazk added test sources Run style checks test tools Test the toolkit methods labels Sep 18, 2023
@tsampazk tsampazk marked this pull request as ready for review September 18, 2023 10:24
@tsampazk tsampazk mentioned this pull request Sep 19, 2023
@tsampazk tsampazk linked an issue Sep 25, 2023 that may be closed by this pull request
@tsampazk
Copy link
Collaborator Author

Test failure is unrelated to this PR, more information #462 (comment).

Copy link
Collaborator

@omichel omichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can go ahead and merge this PR now.

Copy link
Collaborator

@passalis passalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@tsampazk tsampazk merged commit f1bb55a into develop Sep 29, 2023
@tsampazk tsampazk deleted the fix_installation branch September 29, 2023 12:01
lucamarchionni pushed a commit to lucamarchionni/opendr that referenced this pull request Jun 10, 2024
* Added qt5-default, modified torch and detectron2 installation

* Modified detectron2 installation in single demo grasp

* Added version for torch in hyperparameter tuner

* Temporary fix removing tools that conflict with installation

* Added pip installation of lark which is required for building ROS2 workspace with colcon

* Disable single_demo_grasp test

* Disable single demo grasp test

---------

Co-authored-by: Olivier Michel <Olivier.Michel@cyberbotics.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working test sources Run style checks test tools Test the toolkit methods
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GPU installation broken
3 participants