[BUG]build failed on gtest! #442

Closed
SeekPoint opened this issue Feb 12, 2024 · 5 comments
@SeekPoint

Describe the bug
A clear and concise description of what the bug is.

root@83e8355fd506:/share/yk_repo/HugeCTR/HugeCTR# git branch
* (HEAD detached at v23.08.00)

root@83e8355fd506:/share/yk_repo/HugeCTR/HugeCTR# cmake -DCMAKE_BUILD_TYPE=Release -DSM="80;90" -DENABLE_MULTINODES=ON

==== OK (cmake configuration succeeded)

root@83e8355fd506:/share/yk_repo/HugeCTR/HugeCTR# make -j

...

[ 61%] Built target rdkafka++
[ 61%] Linking CXX static library ../../../lib/libgtest.a
[ 61%] Built target gtest
[ 61%] Building CXX object third_party/googletest/googletest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[ 61%] Building CXX object third_party/googletest/googlemock/CMakeFiles/gmock.dir/src/gmock-all.cc.o
In file included from /share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/include/gmock/gmock.h:59,
                 from /share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/src/gmock-all.cc:39:
/share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/include/gmock/gmock-actions.h:342:5: error: ISO C++ forbids declaration of 'GTEST_DISALLOW_COPY_AND_ASSIGN_' with no type [-fpermissive]
  342 |     GTEST_DISALLOW_COPY_AND_ASSIGN_(FixedValueProducer);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/include/gmock/gmock-actions.h:353:5: error: ISO C++ forbids declaration of 'GTEST_DISALLOW_COPY_AND_ASSIGN_' with no type [-fpermissive]
  353 |     GTEST_DISALLOW_COPY_AND_ASSIGN_(FactoryValueProducer);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/include/gmock/gmock-actions.h:427:3: error: ISO C++ forbids declaration of 'GTEST_DISALLOW_COPY_AND_ASSIGN_' with no type [-fpermissive]
  427 |   GTEST_DISALLOW_COPY_AND_ASSIGN_(ActionInterface);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/share/yk_repo/HugeCTR/HugeCTR/third_party/googletest/googlemock/include/gmock/gmock-actions.h:686:27: error: expected identifier before '!' token

...
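
GCC typically reports "ISO C++ forbids declaration of ... with no type" when a macro invoked inside a class body is not defined at that point, so the call is parsed as a member declaration without a return type. One thing worth checking, therefore, is which gtest header trees visible to the build actually define GTEST_DISALLOW_COPY_AND_ASSIGN_. The following is a minimal sketch of such a check; the helper name find_macro_def is hypothetical and not part of HugeCTR, and the directories in the usage note are assumptions about where gtest headers might be installed.

```shell
# Hypothetical helper (not part of HugeCTR): print the header files under the
# given directories that contain a "#define" for the given macro name.
# Usage: find_macro_def MACRO DIR [DIR...]
find_macro_def() {
  local macro="$1"
  shift
  local dir
  for dir in "$@"; do
    # Silently skip directories that do not exist on this machine.
    [ -d "$dir" ] || continue
    # -r: recurse, -l: print matching file names only, -E: extended regex.
    # The trailing group stops the match from hitting longer macro names.
    grep -rlE "^[[:space:]]*#[[:space:]]*define[[:space:]]+${macro}([^A-Za-z0-9_]|\$)" "$dir" || true
  done
}
```

For example, `find_macro_def GTEST_DISALLOW_COPY_AND_ASSIGN_ third_party/googletest /usr/include/gtest /usr/local/include/gtest` lists every header that defines the macro. If the bundled copy defines it but a system-wide copy does not, a mismatch between the gmock and gtest headers picked up by the compiler is a plausible cause, though this thread does not confirm it.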

To Reproduce
Steps to reproduce the behavior:

sudo docker build --build-arg BASE_IMAGE=merlinbase -f dockerfile.ctr .

sudo docker run -it --entrypoint=/bin/bash -v /home/amd00:/share -v /data:/data --name hugectr_dev_c --shm-size="50G" hugectr_dev

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • OS: [e.g. Ubuntu xx.yy]
  • Graphic card: [e.g. a single NVIDIA V100 or NVIDIA DGX A100]
  • CUDA version: [e.g. CUDA 11.x]
  • Docker image

Additional context
Add any other context about the problem here.

@zehuanw zehuanw added the bug It's a bug / potential bug and need verification label Feb 12, 2024
@zehuanw
Collaborator

zehuanw commented Feb 12, 2024

Thank you for your feedback! @minseokl and @shijieliu will check this when they come back from the Lunar New Year holiday.

@EmmaQiaoCh
Collaborator

Hi @SeekPoint, thanks for reporting this.
Could you give more information? How did you build the 'merlinbase' image, and which Merlin branch did you use for 'dockerfile.merlin' and 'dockerfile.ctr'?
I can't reproduce it with merlin-base:23.08 and HugeCTR 'v23.08.00'.

@SeekPoint
Author

SeekPoint commented Feb 22, 2024

amd00@MZ32-00:~/yk_repo/HugeCTR/Merlin/docker$ git branch
* (HEAD detached at v23.08.00)
  main

Because of network issues in China, I made the following changes:
diff --git a/docker/dockerfile.merlin b/docker/dockerfile.merlin
index 8f9aa3df..59b3abb6 100644
--- a/docker/dockerfile.merlin
+++ b/docker/dockerfile.merlin
@@ -102,10 +102,10 @@ RUN pip install --no-cache-dir --upgrade pip; pip install --no-cache-dir "cmake<
         xgboost==1.6.2 lightgbm \
         lightfm implicit \
         numba "cuda-python>=11.5,<12.0" fsspec==2022.5.0 llvmlite \
-        pynvml==11.4.1
-RUN pip install --no-cache-dir treelite==2.4.0 treelite_runtime==2.4.0
-RUN pip install --no-cache-dir numpy==1.22.4 protobuf==3.20.3 onnx onnxruntime pycuda
-RUN pip install --no-cache-dir dask==${DASK_VER} distributed==${DASK_VER} dask[dataframe]==${DASK_VER}
+        pynvml==11.4.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
+RUN pip install --no-cache-dir treelite==2.4.0 treelite_runtime==2.4.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
+RUN pip install --no-cache-dir numpy==1.22.4 protobuf==3.20.3 onnx onnxruntime pycuda -i https://pypi.tuna.tsinghua.edu.cn/simple
+RUN pip install --no-cache-dir dask==${DASK_VER} distributed==${DASK_VER} dask[dataframe]==${DASK_VER} -i https://pypi.tuna.tsinghua.edu.cn/simple
 RUN pip install --no-cache-dir onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com
 
 # Triton Server
@@ -299,7 +299,7 @@ COPY --chown=1000:1000 --from=dlfw /usr/local/lib/python${PYTHON_VERSION}/dist-p
 COPY --chown=1000:1000 --from=dlfw /usr/local/lib/python${PYTHON_VERSION}/dist-packages/numba-*.dist-info /usr/local/lib/python${PYTHON_VERSION}/dist-packages/numba.dist-info/
 COPY --chown=1000:1000 --from=dlfw /usr/local/lib/python${PYTHON_VERSION}/dist-packages/cubinlinker-*.dist-info /usr/local/lib/python${PYTHON_VERSION}/dist-packages/cubinlinker.dist-info/
 
-RUN pip install --no-cache-dir jupyterlab notebook pydot testbook numpy==1.22.4
+RUN pip install --no-cache-dir jupyterlab notebook pydot testbook numpy==1.22.4 -i https://pypi.tuna.tsinghua.edu.cn/simple
 
 ENV JUPYTER_CONFIG_DIR=/tmp/.jupyter
 ENV JUPYTER_DATA_DIR=/tmp/.jupyter
amd00@MZ32-00:~/yk_repo/HugeCTR/Merlin/docker$

In other words, I added '-i https://pypi.tuna.tsinghua.edu.cn/simple' to the 'pip install' commands.

amd00@MZ32-00:~/yk_repo/HugeCTR/Merlin/docker$ sudo docker build --pull -t merlinbase -f dockerfile.merlin .

then:
sudo docker build --build-arg BASE_IMAGE=merlinbase -f dockerfile.ctr .

@EmmaQiaoCh
Collaborator

Hi @SeekPoint, sorry, I still can't reproduce it, even after checking out Merlin v23.08.00 and building with the commands you provided.
Could you check or provide the following:

  1. Were these lines (https://github.com/NVIDIA-Merlin/Merlin/blob/release-23.08/docker/dockerfile.ctr#L57-L58) executed when you ran docker build with dockerfile.ctr?
  2. Could you try passing these args to docker build: --build-arg 'HUGECTR_VER=v23.08.00' --build-arg 'HUGECTR_BACKEND_VER=v23.08.00'
  3. Could you attach dockerfile.merlin, dockerfile.ctr, and the build output of the command 'sudo docker build --build-arg BASE_IMAGE=merlinbase -f dockerfile.ctr .'?

Thanks a lot!
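
Items 2 and 3 of the list above can be combined into a single build invocation that pins both versions and saves the output for attaching to the issue. This is only a sketch: the wrapper function build_ctr, its DRY_RUN switch, and the log file name build.log are illustrative assumptions, while the image name, build args, and Dockerfile name are the ones quoted in this thread.

```shell
# Hypothetical wrapper around the docker build command discussed above.
# With DRY_RUN=1 it only prints the command instead of running it.
# Usage: build_ctr [VERSION_TAG]
build_ctr() {
  local ver="${1:-v23.08.00}"
  # Rebuild the positional parameters as the full command line.
  set -- sudo docker build \
    --build-arg "BASE_IMAGE=merlinbase" \
    --build-arg "HUGECTR_VER=${ver}" \
    --build-arg "HUGECTR_BACKEND_VER=${ver}" \
    -f dockerfile.ctr .
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    # Show the build output and keep a copy in build.log (assumed name).
    "$@" 2>&1 | tee build.log
  fi
}
```

Running `DRY_RUN=1 build_ctr` first is a cheap way to confirm the exact command line before starting a long build.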

@SeekPoint
Author

@EmmaQiaoCh

I tried again and the gtest build passed, but the build then failed with a different error:

[ 28%] Building CXX object third_party/rocksdb/CMakeFiles/rocksdb-shared.dir/env/fs_remap.cc.o
/usr/include/rmm/logger.hpp(116): error: namespace "fmt" has no member class "ostream_formatter"
  struct fmt::formatter<rmm::detail::bytes> : fmt::ostream_formatter {

I was able to fix it by building and installing fmt and spdlog from source:

git clone https://github.com/fmtlib/fmt
cd fmt/
mkdir build
cd build/
cmake ..
make -j32
make install

git clone https://github.com/gabime/spdlog.git
cd spdlog && mkdir build && cd build
cmake ..
make -j32
make install

Thank you ;)
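
The fmt error in this comment is consistent with an fmt installation older than 9.0 being picked up by rmm's logger.hpp, since fmt::ostream_formatter was introduced in fmt 9.0. One way to see which version a given install prefix provides is to read the FMT_VERSION macro from its headers. The sketch below assumes the macro is defined in include/fmt/core.h or, in newer releases, include/fmt/base.h; the function name fmt_version is hypothetical.

```shell
# Hypothetical helper: print the fmt version (major.minor.patch) found under
# an install prefix, or return 1 if no FMT_VERSION definition is found.
# Usage: fmt_version [PREFIX]   (PREFIX defaults to /usr/local)
fmt_version() {
  local prefix="${1:-/usr/local}"
  local header v
  for header in "$prefix/include/fmt/base.h" "$prefix/include/fmt/core.h"; do
    [ -f "$header" ] || continue
    # Extract the number from the first "#define FMT_VERSION <n>" line.
    v=$(sed -n 's/^[[:space:]]*#[[:space:]]*define[[:space:]]\{1,\}FMT_VERSION[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p' "$header" | head -n 1)
    if [ -n "$v" ]; then
      # FMT_VERSION encodes major * 10000 + minor * 100 + patch.
      echo "$((v / 10000)).$((v % 10000 / 100)).$((v % 100))"
      return 0
    fi
  done
  return 1
}
```

Comparing `fmt_version /usr` with `fmt_version /usr/local` before and after the rebuild shows whether the newly installed copy is the one that is 9.0 or later; which prefix the compiler actually searches first depends on the build configuration.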
