Installation issues #88

Closed
Rollinrolan opened this issue Feb 12, 2020 · 4 comments

Rollinrolan commented Feb 12, 2020

Hi!

I've tried to install MinkowskiEngine via pip and have hit different issues on every machine I've tried. To isolate this install from any previous installs/uninstalls, I'm trying to build a Docker container that runs MinkowskiEngine. Below is my current Dockerfile.

FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
LABEL maintainer=""

RUN apt-get update

RUN DEBIAN_FRONTEND=noninteractive apt-get install keyboard-configuration -y

RUN apt-get install -y\
      build-essential \
      software-properties-common \
      apt-utils \
      ca-certificates \
      wget \
      git \
      vim \
      libssl-dev \
      curl \
      unzip \
      unrar

RUN apt-get install software-properties-common -y
RUN add-apt-repository ppa:deadsnakes/ppa -y
RUN apt install python3.7 -y
RUN apt install python3-pip -y
RUN python3.7 -m pip install pip

RUN apt-get install -y libsm6 libxrender1 libfontconfig1 libpython3.7-dev libopenblas-dev

RUN python3.7 -m pip install numpy \
 && python3.7 -m pip install scipy \
 && python3.7 -m pip install cython \
 && python3.7 -m pip install scikit-image \
 && python3.7 -m pip install sklearn \
 && python3.7 -m pip install opencv-python==4.1.1.26 \
 && python3.7 -m pip install torch==1.4.0 \
 && python3.7 -m pip install torchvision==0.5.0

RUN python3.7 -m pip install -U MinkowskiEngine

WORKDIR /root

EXPOSE 8888
ARG USER_NAME
ARG USER_ID
RUN adduser --uid ${USER_ID} --gecos '' ${USER_NAME}
RUN adduser ${USER_NAME} sudo
RUN passwd -de ${USER_NAME}

CMD ["/bin/bash"]

Building this without RUN python3.7 -m pip install -U MinkowskiEngine works fine, and nvidia-smi and nvcc -V report the expected results (Driver Version: 418.87.00, CUDA Version: 10.1). Running the ME pip install, however, gives errors (the middle part is removed, as it is just loads of similar 'In file included from...' errors):

Step 15/23 : RUN python3.7 -m pip install -U MinkowskiEngine
 ---> Running in 99a69db68df6
Collecting MinkowskiEngine
  Downloading https://files.pythonhosted.org/packages/dd/cd/0c763e1be0bebe50552bccb8196052c4d9ec4e028304857bcbed76f1efc0/MinkowskiEngine-0.4.1.tar.gz (106kB)
Requirement already up-to-date: numpy in /usr/local/lib/python3.7/dist-packages (from MinkowskiEngine)
Requirement already up-to-date: torch in /usr/local/lib/python3.7/dist-packages (from MinkowskiEngine)
Building wheels for collected packages: MinkowskiEngine
  Running setup.py bdist_wheel for MinkowskiEngine: started
  Running setup.py bdist_wheel for MinkowskiEngine: finished with status 'error'
  Complete output from command /usr/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-gdp352tc/MinkowskiEngine/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpf9kbaochpip-wheel- --python-tag cp37:
  Failed building wheel for MinkowskiEngine
  No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
  
  Using openblas
  make: python: Command not found
  make: python: Command not found
  make: python: Command not found
  make: python: Command not found
  make: python: Command not found
  make: python: Command not found
  make: python: Command not found
  CXX src/pooling_max.cpp
  CXX src/convolution_transpose.cpp
  CXX src/convolution.cpp
  CXX src/coords_manager.cpp
  CXX src/broadcast.cpp
  CXX src/union.cpp
  CXX src/pooling_avg.cpp
  CXX src/region.cpp
  In file included from /usr/include/c++/7/utility:68:0,
                   from /usr/include/c++/7/array:38,
                   from src/common.hpp:27,
                   from src/pooling_avg.cpp:25:
  /usr/include/x86_64-linux-gnu/c++/7/bits/c++config.h:250:27: error: #if with no expression
   #if _GLIBCXX_USE_CXX11_ABI
                             ^
  In file included from /usr/include/c++/7/utility:68:0,
                   from /usr/include/c++/7/array:38,
                   from src/common.hpp:27,
                   from src/convolution.cpp:25:
  /usr/include/x86_64-linux-gnu/c++/7/bits/c++config.h:250:27: error: #if with no expression
   #if _GLIBCXX_USE_CXX11_ABI
                             ^
  In file included from /usr/include/c++/7/utility:68:0,
                   from /usr/include/c++/7/array:38,
                   from src/common.hpp:27,
                   from src/broadcast.cpp:25:
  /usr/include/x86_64-linux-gnu/c++/7/bits/c++config.h:250:27: error: #if with no expression
   #if _GLIBCXX_USE_CXX11_ABI
  ...
    In file included from /usr/include/c++/7/list:64:0,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/core/dispatch/OperatorEntry.h:6,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/core/dispatch/Dispatcher.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/core/TensorMethods.h:10,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/Tensor.h:12,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/Context.h:4,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/ATen/ATen.h:5,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include/torch/all.h:4,
                     from /usr/local/lib/python3.7/dist-packages/torch/include/torch/extension.h:4,
                     from src/common.hpp:32,
                     from src/broadcast.cpp:25:
    /usr/include/c++/7/bits/list.tcc:178:27: error: #if with no expression
     #if _GLIBCXX_USE_CXX11_ABI
                               ^
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/convolution_transpose.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/coords_manager.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/broadcast.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/union.hpp:28,
                     from src/union.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/convolution.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/pooling_avg.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/include/cublas_v2.h:65:0,
                     from src/gpu.cuh:28,
                     from src/gpu_memory_manager.hpp:8,
                     from src/coords_manager.hpp:46,
                     from src/common.hpp:34,
                     from src/pooling_max.cpp:25:
    /usr/include/cublas_api.h:72:10: fatal error: driver_types.h: No such file or directory
     #include "driver_types.h"
              ^~~~~~~~~~~~~~~~
    compilation terminated.
    Makefile:161: recipe for target 'objs/convolution.o' failed
    make: *** [objs/convolution.o] Error 1
    make: *** Waiting for unfinished jobs....
    Makefile:161: recipe for target 'objs/convolution_transpose.o' failed
    make: *** [objs/convolution_transpose.o] Error 1
    Makefile:161: recipe for target 'objs/coords_manager.o' failed
    make: *** [objs/coords_manager.o] Error 1
    Makefile:161: recipe for target 'objs/pooling_max.o' failed
    make: *** [objs/pooling_max.o] Error 1
    Makefile:161: recipe for target 'objs/pooling_avg.o' failed
    make: *** [objs/pooling_avg.o] Error 1
    Makefile:161: recipe for target 'objs/union.o' failed
    make: *** [objs/union.o] Error 1
    Makefile:161: recipe for target 'objs/broadcast.o' failed
    make: *** [objs/broadcast.o] Error 1
    Makefile:161: recipe for target 'objs/region.o' failed
    make: *** [objs/region.o] Error 1
    /usr/lib/python3.7/distutils/dist.py:274: UserWarning: Unknown distribution option: 'long_description_content_type'
      warnings.warn(msg)
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.7
    creating build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiUnion.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiBroadcast.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiPruning.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiOps.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/Common.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiNormalization.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiConvolution.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiFunctional.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/__init__.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiCoords.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiPooling.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiNetwork.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/MinkowskiNonlinearity.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    copying ./MinkowskiEngine/SparseTensor.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine
    creating build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/gradcheck.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/init.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/collation.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/__init__.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/quantization.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    copying ./MinkowskiEngine/utils/coords.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/utils
    creating build/lib.linux-x86_64-3.7/MinkowskiEngine/modules
    copying ./MinkowskiEngine/modules/resnet_block.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/modules
    copying ./MinkowskiEngine/modules/__init__.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/modules
    copying ./MinkowskiEngine/modules/senet_block.py -> build/lib.linux-x86_64-3.7/MinkowskiEngine/modules
    running build_ext
    building 'MinkowskiEngineBackend' extension
    C compiler: x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fdebug-prefix-map=/build/python3.7-WA8NgD/python3.7-3.7.6=. -fstack-protector-strong -Wformat -Werror=format-security -g -fdebug-prefix-map=/build/python3.7-WA8NgD/python3.7-3.7.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC
    
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/pybind
    compile options: '-I./ -I/usr/include/python3.7m/.. -I/usr/local/lib/python3.7/dist-packages/torch/include -I/usr/local/lib/python3.7/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.7/dist-packages/torch/include/TH -I/usr/local/lib/python3.7/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.7m -c'
    extra options: '-Wno-deprecated-declarations -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MinkowskiEngineBackend -D_GLIBCXX_USE_CXX11_ABI=0'
    x86_64-linux-gnu-gcc: pybind/minkowski.cpp
    x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fdebug-prefix-map=/build/python3.7-WA8NgD/python3.7-3.7.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.7/pybind/minkowski.o -Lobjs -L/usr/local/cuda/lib64 -lminkowski -lcusparse -lopenblas -lopenblas -lcudart -o build/lib.linux-x86_64-3.7/MinkowskiEngineBackend.cpython-37m-x86_64-linux-gnu.so
    /usr/bin/ld: cannot find -lminkowski
    collect2: error: ld returned 1 exit status
    error: Command "x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fdebug-prefix-map=/build/python3.7-WA8NgD/python3.7-3.7.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.7/pybind/minkowski.o -Lobjs -L/usr/local/cuda/lib64 -lminkowski -lcusparse -lopenblas -lopenblas -lcudart -o build/lib.linux-x86_64-3.7/MinkowskiEngineBackend.cpython-37m-x86_64-linux-gnu.so" failed with exit status 1
    
    ----------------------------------------
Command "/usr/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-gdp352tc/MinkowskiEngine/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-0nsl_ai8-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-gdp352tc/MinkowskiEngine/
The command '/bin/sh -c python3.7 -m pip install -U MinkowskiEngine' returned a non-zero code: 1

Any ideas? I'd love the help!

Contributor

chrischoy commented Feb 17, 2020

It seems that python is not available: make: python: Command not found. Make sure python is available on the system, for instance by making a symlink from python3.7.

Also, the log says: No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'.
Make sure the CUDA runtime is available and define CUDA_HOME.
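
The two checks above can be scripted before starting the build. This is only a minimal sketch: `preflight` is a hypothetical helper name, and the assumption that `driver_types.h` lives under `$CUDA_HOME/include` comes from the usual CUDA toolkit layout, not from this thread.

```python
import os
import shutil


def preflight(cuda_home=None):
    """Report the conditions the build log above trips over.

    cuda_home defaults to $CUDA_HOME, then to /usr/local/cuda
    (the fallback path that appears in the log).
    """
    cuda_home = cuda_home or os.environ.get("CUDA_HOME", "/usr/local/cuda")
    # Assumed location of the header that the compiler could not find.
    header = os.path.join(cuda_home, "include", "driver_types.h")
    return {
        # The Makefile calls a bare `python`, so it must resolve on PATH.
        "python_on_path": shutil.which("python") is not None,
        "cuda_home": cuda_home,
        "driver_types_h_found": os.path.isfile(header),
    }


for name, value in preflight().items():
    print(f"{name}: {value}")
```

If `python_on_path` is False or `driver_types_h_found` is False, the compile step is likely to fail in the same way as the log in the first post.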

Author

Rollinrolan commented Feb 27, 2020

I've found that I can get a working install if I build this Dockerfile without the pip3 install MinkowskiEngine step, let the build finish, start the container, git clone the repository, and edit the Makefile as follows:

diff --git a/Makefile b/Makefile
index 86d8267..f9cdac8 100644
--- a/Makefile
+++ b/Makefile
@@ -8,8 +8,7 @@ Q ?= @
 # CPU_ONLY := 1
 
 CXX ?= g++
-PYTHON ?= python
-
+PYTHON ?= python3.7
 EXTENSION_NAME := minkowski
 
 # BLAS choice:
@@ -38,7 +37,7 @@ INCLUDE_DIRS += $(PYTORCH_INCLUDES)
 LIBRARY_DIRS := $(PYTORCH_LIBRARIES)
 
 # Determine ABI support
-WITH_ABI := $(shell python -c 'import torch; print(int(torch._C._GLIBCXX_USE_CXX11_ABI))')
+WITH_ABI := $(shell $(PYTHON) -c 'import torch; print(int(torch._C._GLIBCXX_USE_CXX11_ABI))')
 
 # Determine platform
 UNAME := $(shell uname -s)
@@ -58,8 +57,8 @@ endif
 ifneq ($(CPU_ONLY), 1)
        # CUDA ROOT DIR that contains bin/ lib64/ and include/
        # CUDA_DIR := /usr/local/cuda
-       CUDA_DIR := $(shell python -c 'from torch.utils.cpp_extension import _find_cuda_home; print(_find_cuda_home())')
-       
+       CUDA_DIR := $(shell $(PYTHON) -c 'from torch.utils.cpp_extension import _find_cuda_home; print(_find_cuda_home())')
+
        INCLUDE_DIRS += ./ $(CUDA_DIR)/include
        LIBRARY_DIRS += $(CUDA_DIR)/lib64
 endif
@@ -105,7 +104,7 @@ else ifeq ($(BLAS), blas)
 else
        # ATLAS
        LIBRARIES += atlas
-       ATLAS_PATH := $(shell python -c "import numpy.distutils.system_info as si; ai = si.atlas_info(); [print(p) for p in ai.get_lib_dirs()]")
+       ATLAS_PATH := $(shell $(PYTHON) -c "import numpy.distutils.system_info as si; ai = si.atlas_info(); [print(p) for p in ai.get_lib_dirs()]")
        LIBRARY_DIRS += $(ATLAS_PATH)
 endif

then I can run python3.7 setup.py install and it all works fine.
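
A plausible reading of why the patch helps: GNU make's `$(shell ...)` expands to the command's standard output, so when `python` does not exist the variable ends up empty. The sketch below mimics that behaviour in Python. `shell_query` and `abi_flag` are hypothetical names, and it assumes the Makefile forwards `WITH_ABI` as `-D_GLIBCXX_USE_CXX11_ABI=$(WITH_ABI)`, which is not visible in the diff above. An empty value there would match the `#if with no expression` errors in the original log.

```python
import subprocess
import sys

# The query string used by the Makefile line in the diff above.
ABI_QUERY = "import torch; print(int(torch._C._GLIBCXX_USE_CXX11_ABI))"


def shell_query(python_exe, code):
    """Roughly mimic make's $(shell PYTHON -c CODE): return stdout, stripped.

    A missing interpreter yields an empty string, just as make expands
    the variable to nothing when the command is not found.
    """
    try:
        result = subprocess.run(
            [python_exe, "-c", code], capture_output=True, text=True
        )
    except OSError:  # e.g. FileNotFoundError when the interpreter is absent
        return ""
    return result.stdout.strip()


def abi_flag(python_exe=sys.executable, query=ABI_QUERY):
    """Build the compiler define as the Makefile is assumed to build it."""
    return "-D_GLIBCXX_USE_CXX11_ABI=" + shell_query(python_exe, query)
```

With an interpreter that has torch installed, `abi_flag("python3.7")` would end in `0` or `1`. With a name that does not resolve, the flag ends in a bare `=`, which defines the macro as empty.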

When I googled symlinking python3.7 to python, I saw Stack Exchange posts warning against it, since it could break some low-level Ubuntu tooling. I should mention that I am only using Python 3.7 rather than 3.6 because I want this install to be able to run FCGF, which states that it needs 3.7. What do you think about making these Makefile changes in the pip package, since they seem to pick up the correct version of python?

@chrischoy
Contributor

Thanks for the updated Makefile.

I'll update the pip package in a few days.

Contributor

rancheng commented Mar 4, 2020

Strange. When I run this line:

python -c 'from torch.utils.cpp_extension import _find_cuda_home; print(_find_cuda_home())'

it outputs:

/usr

My python-3.6, cuda-toolkit-10.1, pytorch and torchvision are all managed inside conda.

After uncommenting CUDA_DIR=/usr/local/cuda in the Makefile, I got the following compile error:

Using mkl
CXX src/region.cpp
...
NVCC src/pooling_avg.cu
nvcc fatal   : Path to libdevice library not specified
Makefile:165: recipe for target 'objs/cuda/union.o' failed

I solved the issue by adding the CUDA bin directory to PATH. Adding the following line to .bashrc worked for me:

export PATH="/usr/local/cuda-10.1/bin:$PATH"

The reason it can't be found is that I installed the CUDA library from the .run file. When I tried in Docker with a CUDA environment, it worked perfectly fine.
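
For context on the `/usr` result and why the PATH change matters, here is a simplified approximation of the lookup order that `_find_cuda_home` appears to follow in PyTorch releases of this period: environment variables first, then the location of `nvcc` on PATH, then `/usr/local/cuda`. The function name `guess_cuda_home` and its injectable parameters are hypothetical, and the real function also handles Windows and prints the "No CUDA runtime is found" notice, so treat this as an illustration rather than the library's code.

```python
import os
import shutil


def guess_cuda_home(env=None, which=shutil.which, exists=os.path.exists):
    """Approximate the CUDA home lookup order described above.

    env, which and exists are injectable so the logic can be tried
    without touching the real environment.
    """
    env = os.environ if env is None else env
    # 1. An explicit environment variable wins.
    home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if home:
        return home
    # 2. Otherwise take the directory two levels above nvcc.
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))
    # 3. Fall back to the conventional install location, if present.
    fallback = "/usr/local/cuda"
    return fallback if exists(fallback) else None


# An nvcc found at /usr/bin/nvcc gives "/usr", as in the output above.
print(guess_cuda_home({}, which=lambda name: "/usr/bin/nvcc"))
```

Under this reading, putting `/usr/local/cuda-10.1/bin` first on PATH makes step 2 resolve to `/usr/local/cuda-10.1` instead of `/usr`.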

Next time, just install from the .deb package; it is less hassle:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
sudo mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1604-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
