Install error "relocation R_X86_64_PC32" when "python setup.py build develop" #94

Open
WesternHill opened this issue Nov 5, 2022 · 0 comments

I'm trying to run SE-SSD in Docker, but I have been stuck on the same issue for more than a week.
Could I have some advice?
Alternatively, I would appreciate it if you could tell me about an environment in which the build succeeded.

Goal

  • Train and run inference with SE-SSD in Docker

Issue

$ docker compose up --build
.... OMITTED because it is so long ...
se-ssd-env  | nvcc: det3d/ops/pointnet2/src/sampling_gpu.cu
se-ssd-env  | creating build/lib.linux-x86_64-3.6/det3d/ops/pointnet2
se-ssd-env  | g++ -pthread -shared build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/interpolate.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/ball_query.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/group_points.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/bindings.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/interpolate_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/ball_query_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/group_points_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling_gpu.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.6/det3d/ops/pointnet2/PN2.cpython-36m-x86_64-linux-gnu.so
se-ssd-env  | /usr/bin/ld: build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling.o: relocation R_X86_64_PC32 against symbol `_ZN3c10eqERKNS_12TensorTypeIdES2_' can not be used when making a shared object; recompile with -fPIC
se-ssd-env  | /usr/bin/ld: final link failed: Bad value
se-ssd-env  | collect2: error: ld returned 1 exit status
se-ssd-env  | error: Command "g++ -pthread -shared build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/interpolate.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/ball_query.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/group_points.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/bindings.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/interpolate_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/ball_query_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/group_points_gpu.o build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling_gpu.o -L/usr/local/cuda/lib64 -lcudart -o build/lib.linux-x86_64-3.6/det3d/ops/pointnet2/PN2.cpython-36m-x86_64-linux-gnu.so" failed with exit status 1
se-ssd-env exited with code 1
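
To make the linker message easier to read, here is a minimal sketch that pulls the object file, relocation type and symbol out of the `ld` line above. The helper name `parse_ld_fpic_error` and the regular expression are illustrative and not part of SE-SSD. It relies on one assumption taken from the log itself: objects built from `.cu` files end in `_gpu.o`, so an object without that suffix was presumably built from a host C++ source.

```python
import re

# The linker line copied from the log above (container-name prefix removed).
LD_LINE = (
    "/usr/bin/ld: build/temp.linux-x86_64-3.6/det3d/ops/pointnet2/src/sampling.o: "
    "relocation R_X86_64_PC32 against symbol `_ZN3c10eqERKNS_12TensorTypeIdES2_' "
    "can not be used when making a shared object; recompile with -fPIC"
)

# Hypothetical pattern: object file, relocation type, mangled symbol.
PATTERN = re.compile(
    r"ld: (?P<obj>\S+\.o): relocation (?P<reloc>\S+) against symbol `(?P<sym>[^']+)'"
)


def parse_ld_fpic_error(line):
    """Return a dict with 'obj', 'reloc' and 'sym', or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


info = parse_ld_fpic_error(LD_LINE)
print("object file :", info["obj"])
print("relocation  :", info["reloc"])
print("symbol      :", info["sym"])

# Assumption from the log: .cu sources produce "*_gpu.o" objects.
is_cuda_object = info["obj"].endswith("_gpu.o")
print("built from a .cu file (by naming convention):", is_cuda_object)
```

If that naming assumption holds, the object the linker complains about is `sampling.o`, which comes from the host C++ compile step and not from nvcc, so the host compiler's flags for that file are the ones worth inspecting.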

How can it be reproduced?

  • Write docker-compose.yml and Dockerfile as below, then run "docker-compose up --build"
  • docker-compose.yml
version: "3.8"
services:
  se-ssd_env:
    build:
      context: .
      dockerfile: Dockerfile
    image: se-ssd-env
    container_name: se-ssd-env
    network_mode: host
    tty: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  • Dockerfile
FROM nvidia/cuda:10.0-devel-ubuntu18.04
ENV DEBIAN_FRONTEND noninteractive
RUN rm -rf /etc/apt/sources.list.d/cuda.list /etc/apt/sources.list.d/nvidia-ml.list
WORKDIR "/root/"

# Install cudnn 7.5.0 as officially tested
# Cudnn .deb file can be obtained from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
# Reference: https://medium.com/repro-repo/install-cuda-10-1-and-cudnn-7-5-0-for-pytorch-on-ubuntu-18-04-lts-9b6124c44cc
COPY materials/libcudnn7_7.5.0.56-1+cuda10.0/* ./
RUN dpkg -i libcudnn7_7.5.0.56-1+cuda10.0_amd64.deb libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb libcudnn7-doc_7.5.0.56-1+cuda10.0_amd64.deb

# Install python3.6 and its pip
# I followed the gist below, but the tests that run during a plain "make" got stuck, so I changed it to "make build_all" to skip them.
# https://gist.github.com/sonhmai/57bca33dc5d03dafb82bafe334f3dd21
ENV PYTHON_VERSION=3.6.5
ENV PYTHON_DOWNLOAD_URL=https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz
RUN apt clean && apt update && apt install -y software-properties-common libssl-dev libreadline-dev libbz2-dev libsqlite3-dev zlib1g-dev python-minimal wget
RUN wget "$PYTHON_DOWNLOAD_URL" -O python.tar.tgz && tar -zxvf python.tar.tgz
RUN ls && cd Python-$PYTHON_VERSION && ./configure --enable-optimizations --enable-loadable-sqlite-extensions && make -j8 build_all && make -j8 install
RUN pip3 install --upgrade pip
RUN python3 --version && pip3 --version

# Install pytorch 1.1
RUN pip3 install torch==1.1

# Install spconv
RUN apt install -y libboost-all-dev git
RUN pip3 install cmake --upgrade
RUN pip3 install wheel
RUN git clone https://github.com/poodarchu/spconv --recursive && cd spconv && git checkout 73427720a539caf9a44ec58abe3af7aa9ddb8e39 #3f5b4b716c5d40d
RUN cd spconv && python3 setup.py bdist_wheel
RUN cd spconv/dist && pip3 install *

# Install nuscenes-devkit
RUN pip3 install nuscenes-devkit

# Install dependencies for installing SE-SSD
RUN apt update && apt install -y ninja-build
RUN pip3 install Cython pythran pytest-runner cmake==3.17.3
RUN pip3 install setuptools==39.1.0
RUN git clone https://github.com/Vegeta2020/SE-SSD.git
RUN git clone https://github.com/jackd/ifp-sample.git && pip3 install -e ifp-sample

# Build iou3d in a subshell so that the next command still starts from /root
RUN echo "(cd SE-SSD/det3d/core/iou3d && python3 setup.py install)" >> docker-entrypoint.sh
RUN echo "cd SE-SSD && python3 setup.py build develop" >> docker-entrypoint.sh
ENTRYPOINT ["/bin/bash","docker-entrypoint.sh"]
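
Because the Dockerfile above builds Python 3.6.5 from source, one thing that may be worth checking inside the container is which compiler settings that interpreter recorded at build time. The sketch below only uses the standard-library `sysconfig` module; the function name `extension_build_flags` is illustrative. On Linux builds of CPython, `CCSHARED` is normally where `-fPIC` comes from when distutils compiles extension sources, but whether that applies to this particular build is an assumption to verify, not a diagnosis.

```python
import sysconfig

# Variables distutils consults when compiling and linking C/C++ extensions.
NAMES = ("CC", "CXX", "CFLAGS", "CCSHARED", "LDSHARED")


def extension_build_flags():
    """Return the interpreter's recorded build settings as a dict."""
    return {name: sysconfig.get_config_var(name) for name in NAMES}


flags = extension_build_flags()
for name, value in flags.items():
    print(f"{name:9} = {value!r}")

# A value may be None or empty on some platforms, so guard before searching.
has_fpic = "-fPIC" in (flags["CCSHARED"] or "")
print("CCSHARED contains -fPIC:", has_fpic)
```

It can be run with `python3` in the same image, for example by temporarily adding it to `docker-entrypoint.sh` before the SE-SSD build step.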

What I already tried that didn't work

  • Added "-Xcompiler", "-fPIC" inside nvcc: [ ] in every "extra_compile_args", like the one here
    • This made no difference to the result.
  • Added "-fPIC" inside cxx: [ ]
    • This causes nvcc fatal : Unknown option "-fPIC", even though the option is for cxx, not for nvcc.
  • Changed the Docker base image from "nvidia/cuda:10.0" to 10.1 or 10.2. Neither changed the result.
  • Changed the PyTorch version among 1.1, 1.3, 1.4 and 1.6. None of these versions changed the result.
  • Changed the setuptools version
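
For reference, the first two attempts above can be written down as data. PyTorch's `CUDAExtension` accepts `extra_compile_args` as a dict with `cxx` and `nvcc` keys, and nvcc forwards host-compiler options through `-Xcompiler`. The helper `make_extra_compile_args` below is hypothetical and only shows the shape of the arguments that were tried; it does not import PyTorch and is not a confirmed fix.

```python
def make_extra_compile_args(fpic_for_cxx=False, fpic_for_nvcc=False):
    """Build a per-compiler flag dict in the {'cxx': [...], 'nvcc': [...]} shape.

    Hypothetical helper for illustration; SE-SSD's own setup.py may list
    additional nvcc flags that are not reproduced here.
    """
    args = {"cxx": [], "nvcc": []}
    if fpic_for_cxx:
        # Passed directly to the host C++ compiler.
        args["cxx"].append("-fPIC")
    if fpic_for_nvcc:
        # nvcc does not take -fPIC itself; -Xcompiler forwards it to the host compiler.
        args["nvcc"] += ["-Xcompiler", "-fPIC"]
    return args


attempt_nvcc = make_extra_compile_args(fpic_for_nvcc=True)
attempt_cxx = make_extra_compile_args(fpic_for_cxx=True)
print("attempt 1:", attempt_nvcc)
print("attempt 2:", attempt_cxx)
```

The second attempt puts a bare `-fPIC` only under `cxx`, yet the reported error comes from nvcc. That suggests the `cxx` list may be reaching nvcc somewhere in this build, which is a guess that would need to be checked against the build command lines in the full log.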