SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr (only CUDA fails, while C++ succeeds) #627

Closed
valassi opened this issue Apr 3, 2023 · 3 comments · Fixed by #625
@valassi
Member

valassi commented Apr 3, 2023

SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr

See for instance susy gg to tt in PR #625, and in particular 586195d

The log says

The HRDCOD=1 build now fails with

ccache /usr/local/cuda-12.0/bin/nvcc  -O3  -lineinfo -I. -I../../src -I../../../../../tools -I/usr/local/cuda-12.0/include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /usr/lib64/ccache/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -Xcompiler -fPIC -c gcheck_sa.cu -o gcheck_sa.o
../../src/Parameters_MSSM_SLHA2.h(310): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(310): note #2703-D: cannot call non-constexpr function "atan(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(4137): here
../../src/Parameters_MSSM_SLHA2.h(725): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(725): note #2703-D: cannot call non-constexpr function "cos(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(553): here
../../src/Parameters_MSSM_SLHA2.h(726): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(726): note #2703-D: cannot call non-constexpr function "sin(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(520): here
3 errors detected in the compilation of "gcheck_sa.cu".
@valassi valassi changed the title SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr (only CUDA fails, while C++ succeeds) Feb 14, 2024
@valassi
Member Author

valassi commented Feb 14, 2024

Update Feb 2024. Actually, it is only nvcc that complains that sin/cos/atan are not constexpr. The C++ only build (CUDA_HOME=none using the old setup) succeeds, and the runtime tests are ok.

Maybe we can try to hardcode the gcc implementation of constexpr sin/cos/atan into our nvcc code?

See #625 (comment)

valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 14, 2024
…for mdl_G__exp__2 in FPTYPE=f

'make HRDCOD=0 FPTYPE=f -j' builds now succeed and the tests succeed too.

'make HRDCOD=0 FPTYPE=m -j' builds now also succeed and the tests succeed too.

'CUDA_HOME=none make HRDCOD=1' builds and tests also succeed with FPTYPE=d,f,m

The only pending problem is CUDA HRDCOD=1 builds, which fail because of missing constexpr sin/cos/atan (madgraph5#627)

This is probably ready to be backported to CODEGEN and eventually merged
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…ons for HRDCOD=1 cuda builds (madgraph5#627)

Builds still fail because assert is not constexpr (I need to extend the function to cover any value and not just 0 to pi/2)
@valassi valassi self-assigned this Feb 28, 2024
@valassi
Member Author

valassi commented Feb 28, 2024

I have made a first prototype in PR #625 here 46694af

Still need some fixes (extend the range beyond 0 to pi/2, add arctan).

The implementation is very simply based on a Taylor series expansion, which is easily done in constexpr.

@valassi valassi linked a pull request Feb 28, 2024 that will close this issue
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…ons for HRDCOD=1 cuda builds (madgraph5#627)

Builds still fail because assert is not constexpr (I need to extend the function to cover any value and not just 0 to pi/2).
Also missing is atan.
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…rig functions for HRDCOD=1 cuda builds (madgraph5#627)

Extend coverage of sin/cos/tan to any value, and add an atan function.

Now the build fails with the following:
HRDCOD=1 make
ccache /usr/local/cuda-12.0/bin/nvcc   -Xcompiler -O3 -lineinfo -I. -I../../src -I/usr/local/cuda-12.0/include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /usr/lib64/ccache/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -Xcompiler -fPIC -c -x cu CPPProcess.cc -o CPPProcess_cu.o
../../src/Parameters_MSSM_SLHA2.h(818): error: identifier "mg5amcGpu::Parameters_MSSM_SLHA2::mdl_I51x11" is undefined in device code
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…builds (madgraph5#627): it was enough to define mdl_I51x11 as a "__device__" constexpr.

NB: Now the build fully succeeds and runTest.exe also succeeds for HRDCOD=1!
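
A small sketch of the declaration pattern described in this commit message, written so that it also compiles as plain C++. The macros `SKETCH_DEVICE` and `SKETCH_HOST_DEVICE`, the namespace `sketch_params` and the function `read_param` are hypothetical; only the name `mdl_I51x11` and its value 1. come from the generated code.

```cpp
// Hypothetical macros: expand to CUDA qualifiers only when compiled by nvcc
#ifdef __CUDACC__
#define SKETCH_DEVICE __device__
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_DEVICE
#define SKETCH_HOST_DEVICE
#endif

namespace sketch_params // hypothetical stand-in for the parameters namespace
{
  // Qualifying the constant with __device__ makes it defined in device code
  // while it remains a compile-time constant
  SKETCH_DEVICE constexpr double mdl_I51x11 = 1.;
}

// Reads the constant; under nvcc this function is usable on host and device
SKETCH_HOST_DEVICE inline double read_param()
{
  return sketch_params::mdl_I51x11;
}
```

Without the qualifier, the earlier build stopped with the 'identifier ... is undefined in device code' error quoted above.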
@valassi
Member Author

valassi commented Feb 28, 2024

The fix for HRDCOD=1 builds in SUSY is now complete

(SUSY itself is almost complete; only some minor pieces are still missing in CODEGEN)

cc @oliviermattelaer @roiser

@valassi valassi closed this as completed Feb 28, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
…mplementation of constexpr trig functions for HRDCOD=1 cuda builds (madgraph5#627)

Also add it to output.py in CODEGEN
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
… used madgraph5#627

Also add a comment about std::complex (replace is done twice)

The differences in generated code are now the following:

______________________________________________________________________________
git diff /data/avalassi/GPU2023/madgraph4gpuBis/epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc /tmp/git-blob-dFUXLJ/CPPProcess.cc 0651b576632e9d50c3ed2bf9461bd41309c90821 100644 epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc 0000000000000000000000000000000000000000 100644
79c79
<   __device__ const fptype cIPD[3] = { (fptype)Parameters_MSSM_SLHA2::mdl_MT, (fptype)Parameters_MSSM_SLHA2::mdl_WT, (fptype)Parameters_MSSM_SLHA2::mdl_I51x11 };
---
>   __device__ const fptype cIPD[2] = { (fptype)Parameters_MSSM_SLHA2::mdl_MT, (fptype)Parameters_MSSM_SLHA2::mdl_WT };
83c83
<   __device__ __constant__ fptype cIPD[3];
---
>   __device__ __constant__ fptype cIPD[2];
86c86
<   static fptype cIPD[3];
---
>   static fptype cIPD[2];
505c505
<     const fptype tIPD[3] = { (fptype)m_pars->mdl_MT, (fptype)m_pars->mdl_WT, (fptype)m_pars->mdl_I51x11 };
---
>     const fptype tIPD[2] = { (fptype)m_pars->mdl_MT, (fptype)m_pars->mdl_WT };
508c508
<     gpuMemcpyToSymbol( cIPD, tIPD, 3 * sizeof( fptype ) );
---
>     gpuMemcpyToSymbol( cIPD, tIPD, 2 * sizeof( fptype ) );
511c511
<     memcpy( cIPD, tIPD, 3 * sizeof( fptype ) );
---
>     memcpy( cIPD, tIPD, 2 * sizeof( fptype ) );
514c514
<     //for ( i=0; i<3; i++ ) std::cout << std::setprecision(17) << "tIPD[i] = " << tIPD[i] << std::endl;
---
>     //for ( i=0; i<2; i++ ) std::cout << std::setprecision(17) << "tIPD[i] = " << tIPD[i] << std::endl;

______________________________________________________________________________
git diff /data/avalassi/GPU2023/madgraph4gpuBis/epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h /tmp/git-blob-fVhC4J/Parameters_MSSM_SLHA2.h 2805ef7d81bd8340b2a77b2ec16df7639165e30c 100644 epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h 0000000000000000000000000000000000000000 100644
23,24d22
< #include "constexpr_math.h"
<
295c293
<     __device__ constexpr double mdl_I51x11 = 1.;
---
>     constexpr double mdl_I51x11 = 1.;
370c368
<     constexpr cxsmpl<double> mdl_bb = ( ( -mdl_mHd2 + mdl_mHu2 - mdl_MZ__exp__2 * constexpr_cos( 2. * mdl_beta ) ) * constexpr_tan( 2. * mdl_beta ) ) / 2.;
---
>     constexpr cxsmpl<double> mdl_bb = ( ( -mdl_mHd2 + mdl_mHu2 - mdl_MZ__exp__2 * constexpr_cos( 2. * mdl_beta ) ) * tan( 2. * mdl_beta ) ) / 2.;
801,802c799
<       //const double mdl_I51x11 = Parameters_MSSM_SLHA2::getInstance()->mdl_I51x11; // fix HRDCOD=0 susy builds
<       const fptype mdl_I51x11 = cIPD[2]; // fix HRDCOD=0 susy builds
---
>       const fptype mdl_I51x11;