Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

valassi · 2023-12-15T12:30:31Z

Hi @hageboeck @roiser (and @oliviermattelaer) this is a WIP PR which takes Jorgen's PR #775 and updates it to upstream/master.

As discussed with Stephan and Stefan via email and partly in PR #797 and PR #753, I would suggest to split the refactoring of makefiles in two parts:

First we simply define the new targets (splitting cuda from cpp builds, and paving the way for HIP), and here we could add a few of Stephan's changes like using native as default (and maybe forcing the use of builddirs? maybe later): this is needed soon, it blocks HIP, it blocks adding SYCL like HIP etc. This is is this WIP PR.
Second, later, camly, we do the full refactoring in PR Test refactoring of Makefiles #753, meaning we move stuff over to make_opts and improve the passing of external flags. The problem I see now with PR Test refactoring of Makefiles #753 is that it has no support for .sa and it is very difficult to put it back: I would probably suggest to rethink it a bit. Essentially, cudacpp.mk and cudacpp_src.mk should continue IMO to be fully functional (in teh .sa dirs) when there is no fortran makefile. Note that these .sa are neded by @zeniheisser for reweighting IIUC.

What I have done so far is only to complete the merge of upstream/master into Jorgen's branch, for CODEGEN and for gg_tt.mad. There are still a few issues to solve (eg 'make' does nothing, 'make all' builds fortran but no cudacpp, 'make cuda' builds cudacpp but no fortran?).

But I already put it here as a basis for concrete discussion.

… branch: go back to *.mad (except gg_tt.mad) and *.sa code before Jorgen's changes In practice, keep only Jorgen's changes in CODEGEN and in gg_tt.mad dirs=$(git ls-tree --name-only HEAD *.mad *.sa | grep -v gg_tt.mad) \rm -rf $dirs git checkout e21d346 $dirs This is the last common commit of upstream/master and of Jorgen's branch

Fix all conflicts: epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/cudacpp.mk epochX/cudacpp/gg_tt.mad/Cards/me5_configuration.txt epochX/cudacpp/gg_tt.mad/Cards/run_card.dat epochX/cudacpp/gg_tt.mad/Source/make_opts epochX/cudacpp/gg_tt.mad/SubProcesses/MatrixElementKernels.cc epochX/cudacpp/gg_tt.mad/SubProcesses/cudacpp.mk epochX/cudacpp/gg_tt.mad/SubProcesses/makefile epochX/cudacpp/gg_tt.mad/bin/internal/banner.py epochX/cudacpp/gg_tt.mad/bin/internal/gen_ximprove.py epochX/cudacpp/gg_tt.mad/bin/internal/launch_plugin.py epochX/cudacpp/gg_tt.mad/bin/internal/madevent_interface.py For some of these conflicts, just check out the version in upstream/master: git checkout upstream/master gg_tt.mad/Cards git checkout upstream/master gg_tt.mad/Source git checkout upstream/master gg_tt.mad/SubProcesses/MatrixElementKernels.cc git checkout upstream/master gg_tt.mad/bin/internal/ For the CODEGEN conflicts: take the upstream/master version, except for the indented cuda block (take Jorgen's) For the gg_tt.mad makefile conflict: take the upstream/master version in case of changes, but include both versions in one occasion For the gg_tt.mad cudacpp.mk conflict: take the upstream/master version, except for the indented cuda block (take Jorgen's, but do include the curand fix from upstream/master)

…ster's version instead of Jorgen's for those 5 lines) NB1: currently, Jorgen's version differs from upstream/master only in three CODEGEN files git diff --name-only upstream/master epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/cudacpp.mk epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/cudacpp_src.mk NB2: code generation for ggtt.mad currently fails on itscrd90 (patchMad..sh fails for makefile - same as seen in PR madgraph5#797?)

…sue seen in makefile.rej

…created by merging and fixing conflicts This confirms that code generation is now succeding (on itscrd90)

…adgraph5#706) into jtmk Fix conflicts in CODEGEN log by checking it out from upstream/master git checkout upstream/master gg_tt.mad/CODEGEN_mad_gg_tt_log.txt

…nges only in codegen log

…efore testing gg_tt.mad

…n of make_opts

…uda-only block to ease comparison with upstream/master

…of cuda-only block to ease comparison with upstream/master

…pp-only block to ease comparison with upstream/master

…boeck to authors): prepare to use BACKEND=cuda/none/sse4... as the main build switch

…mplementation to go closer to the previous version and fix various issues The main changes are the following (use cuda and avx2 as examples) - use 'make BACKEND=avx2' instead of 'make cppavx2': this is closer to the original 'make AVX=avx2', the difference is the addition of 'make BACKEND=cuda' - in particular, remove the use of MAKECMDGOALS in cudacpp.mk - fix parallel builds by restoring the original logic ('make cppall' was not building all AVX modes as expected) - in addition, export AVXFLAGS so that it can be used as-is from cudacpp_src.mk [eventually more variables will be handled in this way]

…go closer to the previous version and sync with cudacpp.mk The file is now almost the same as in upstream/master. Apart from changes in variable names, the main change is the following: - AVXFLAGS is now exported from cudacpp.mk instead of being locally redefined [eventually more variables will be handled in this way]

…e conditional about GTESTLIBS

Now that C++ and cuda build folders are separate, the test cases have to be instantiated on every compiler pass.

…m Stephan (remove the protection against CPU/GPU multiply defined symbols)

…ction against CPU/GPU multiply defined symbols, rather than removing it In practice, no code has been changed, only some comments have been added (including commented out #ifndef/#endif directives)

…ent out the protection against CPU/GPU multiply defined symbols, rather than removing it In practice, no code has been changed, only some comments have been added (including commented out #ifndef/#endif directives)

…and GPU - this also requires the patch in MadgraphTest.h from a previous commit

…in upstream/master) for multi-backend builds... this is no longer needed because fbridge.inc is no longer copied to an INC directory! This completes the patches for gg_tt.sa: now 'make -j bldall' works as expected

…parate C++/CUDA builds using BACKEND variable)

…d in gg_tt.sa

…tion to go closer to the previous version and sync with cudacpp.mk This completes the patches for gg_tt.mad: now 'make -j bldall' works as expected

…FPTYPE=f from 2E-3 to 3E-3 (see madgraph5#843) - now all ok ./tput/teeThroughputX.sh -smeftggtttt -fltonly -makeclean -makej ./tput/teeThroughputX.sh -smeftggtttt -fltonly -makeclean -makej -hrdonly

…plus susy and heft as usual) The pending issues are - gq_ttq (only on HIP) madgraph5#806 - heft_gg_bb (for FPTYPE=f everywhere) madgraph5#833 - susy_gg_tt (everywhere) madgraph5#825 - susy_gg_t1t1 (everywhere) madgraph5#826 (1) all tests but ggttggg STARTED AT Wed 15 May 2024 05:48:13 PM EEST (SM tests) ENDED(1) AT Wed 15 May 2024 06:09:49 PM EEST [Status=0] (BSM tests) ENDED(1) AT Wed 15 May 2024 06:18:39 PM EEST [Status=0] (2) ggttggg tests only STARTED AT Wed 15 May 2024 05:48:49 PM EEST (SM tests) ENDED(1) AT Wed 15 May 2024 08:24:46 PM EEST [Status=0] (BSM tests) ENDED(1) AT Wed 15 May 2024 08:32:16 PM EEST [Status=0] 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt 12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt 12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt 12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt 1 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt 16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt 0 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt

…pdates in PR madgraph5#842) into jtmk Fix conflicts: epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/cudacpp.mk epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/cudacpp_src.mk

…oleranceMEs for FPTYPE=f from 2E-3 to 3E-3 (see madgraph5#843)

… previously failing on LUMI where cuda is not installed)

…dated copyright year range, and increased toleranceME in MadgraphTest.h) This essentially completes my latest jtmk patches in the cudacpp code. The only pending issues are Olivier's comments about madevent_interface.py and launch_plugin.py

valassi · 2024-05-15T20:40:35Z

I have rerun all tests and all looks ok.

The only pending issues are Olivier's original two comments on madevent_interface.py and launch_plugin.py.

valassi · 2024-05-15T21:09:04Z

Just one small issue that modifying madevent_interface.py is (really) a bad idea since user might not always use the modified file depending on how they run the code (and/or environment variable), I'm actually not even sure that such change is needed. Can you check without? If you have trouble I can have a look obviously

Hi @oliviermattelaer thanks. As mentioned in the two comments, I have opened a new issue #844 (which is really a fragment of your own issue #756) about madevent_interface.py.

In my opinion, this is a problem but it should not be fixed here in #798, where I am addressing other problems. Rephrase: I would not block the merge of #798 because of that.

Anyway, I requested a new review from you - can you please have a look and approve or suggest something else to do?

Thanks!
Andrea

git checkout 5fec65c tput/logs_* tmad/logs_* This essentially completes my latest jtmk2 patches in the cudacpp code. This will only need a merge of jtmk into master before (assuming nothing else in needed in jtmk PR madgraph5#798).

valassi · 2024-05-17T07:07:05Z

(Note: I have added additional cleanup of makefiles in PR #841 - but let's first get this #798 out of the way)

…de 16 to Node 20

valassi · 2024-05-17T07:53:49Z

I have updated github actions cache to v4 (fixing #848)

…g5amcnlo#97

…d by Olivier), after fixing pkl bug mg5amcnlo#97

…lly no change, but 'G*G' is now '(G)*(G)' in Parameters.h files

valassi · 2024-05-21T21:28:55Z

Hi @oliviermattelaer (I upgraded to the latest gpucpp and removed the special handling of G^2),
do you have any further feedback on this?

Are you satisfied with my suggestion to move the issue with madevent_interface.py to another ticket, and go ahead merging this one? Thanks

oliviermattelaer · 2024-05-22T06:54:13Z

Perfect, thanks Andrea! You can go ahead and merge this,

Thanks a lot,

Olivier

valassi · 2024-05-22T07:15:52Z

Thanks Olivier! I am merging this then.

…graph5#798) into jtmk2

…madgraph5#798 and madgraph5#841) into ewdim6 Fix conflicts: epochX/cudacpp/tmad/allTees.sh epochX/cudacpp/tput/throughputX.sh

Jooorgen and others added 7 commits September 28, 2023 16:00

[CODEGEN] Correctly sets target if madevent is in $MAKECMDGOALS

cf1b21b

[CODEGEN] Regenerated all .sa/.mad processes

75c5be6

[jtmk] in CODEGEN, manually change patch.common to try and fix the is…

b5af0c1

…sue seen in makefile.rej

[jtmk] regenerate gg_tt.mad: this looks more correct than what I had …

e9a7a59

…created by merging and fixing conflicts This confirms that code generation is now succeding (on itscrd90)

valassi marked this pull request as draft December 15, 2023 12:31

valassi and others added 22 commits December 16, 2023 09:45

Merge remote-tracking branch 'upstream/master' (including merged PR m…

7ca6dbe

…adgraph5#706) into jtmk Fix conflicts in CODEGEN log by checking it out from upstream/master git checkout upstream/master gg_tt.mad/CODEGEN_mad_gg_tt_log.txt

[jtmk] regenerate gg_tt.mad (after merging upstream/master) - ok, cha…

e791a5c

…nges only in codegen log

[jtmk] regenerate gg_tt.sa and test the new makefile targets there, b…

5e55e66

…efore testing gg_tt.mad

[jtmk] in gg_tt.sa makefiles, reorder author names in alphabetical order

11d423e

[jtmk] in gg_tt.sa makefiles, improve the comments about the inclusio…

7f65c14

…n of make_opts

[jtmk] in gg_tt.sa cudacpp.mk, (temporarely?) remove indentation of c…

5302064

…uda-only block to ease comparison with upstream/master

[jtmk] in gg_tt.sa cudacpp_src.mk, (temporarely?) remove indentation …

4f55af9

…of cuda-only block to ease comparison with upstream/master

[jtmk] in gg_tt.sa cudacpp.mk, (temporarely?) remove indentation of c…

3293256

…pp-only block to ease comparison with upstream/master

[jtmk] in gg_tt.sa makefiles, replace AVX by BACKEND (and add S. Hage…

1e52888

…boeck to authors): prepare to use BACKEND=cuda/none/sse4... as the main build switch

[jtmk] in gg_tt.sa cudacpp.mk, use the $(if..) statement to inline th…

4af2086

…e conditional about GTESTLIBS

[CODEGEN] Instantiate madgraph test cases also for CUDA builds.

a567cd0

Now that C++ and cuda build folders are separate, the test cases have to be instantiated on every compiler pass.

[jtmk] in gg_tt.sa MadgraphTest.h, manually copy the new template fro…

8c48e50

…m Stephan (remove the protection against CPU/GPU multiply defined symbols)

[jtmk] in gg_tt.sa MadgraphTest.h, add back and comment out the prote…

60f22a0

…ction against CPU/GPU multiply defined symbols, rather than removing it In practice, no code has been changed, only some comments have been added (including commented out #ifndef/#endif directives)

[jtmk] in gg_tt.sa cudacpp.mk, fix the builds of runTest.exe for CPU …

30ec2f1

…and GPU - this also requires the patch in MadgraphTest.h from a previous commit

[jtmk] backport gg_tt.sa cudacpp.mk and cudacpp_src.mk to CODEGEN (se…

10aaf2b

…parate C++/CUDA builds using BACKEND variable)

[jtmk] regenerate gg_tt.sa - all ok, changes only in codegen log

07168da

[jtmk] regenerate gg_tt.mad, using the new cudacpp makefiles develope…

6206480

…d in gg_tt.sa

[jtmk] in gg_tt.mad SubProcesses/makefile, change Jorgen's implementa…

e76656f

…tion to go closer to the previous version and sync with cudacpp.mk This completes the patches for gg_tt.mad: now 'make -j bldall' works as expected

[jtmk] rerun two smeft_gg_tttt tests with increased toleranceMEs for …

70b6b16

…FPTYPE=f from 2E-3 to 3E-3 (see madgraph5#843) - now all ok ./tput/teeThroughputX.sh -smeftggtttt -fltonly -makeclean -makej ./tput/teeThroughputX.sh -smeftggtttt -fltonly -makeclean -makej -hrdonly

valassi mentioned this pull request May 15, 2024

smeft_gg_tttt runTest fails on HIP for FPTYPE=f (must increase precision) #843

Closed

valassi added 5 commits May 15, 2024 22:04

[jtmk] in CODEGEN (backport smeft_gg_tttt) MadgraphTest.h, increase t…

0502af5

…oleranceMEs for FPTYPE=f from 2E-3 to 3E-3 (see madgraph5#843)

[jtmk] bug fix in the default bblds in tput/throughputX.sh (which was…

1b07dcf

… previously failing on LUMI where cuda is not installed)

valassi linked an issue May 15, 2024 that may be closed by this pull request

smeft_gg_tttt runTest fails on HIP for FPTYPE=f (must increase precision) #843

Closed

valassi mentioned this pull request May 15, 2024

Remove gen_ximprove.py and madevent_interface.py from patch.common? #844

Closed

valassi requested a review from oliviermattelaer May 15, 2024 21:05

[jtmk] update to github actions v4 (see madgraph5#848) moving from No…

3138c81

…de 16 to Node 20

valassi linked an issue May 17, 2024 that may be closed by this pull request

Update to actions/cache v4 (GitHub actions: transition from Node 16 to Node 20) #848

Closed

valassi mentioned this pull request May 17, 2024

Update to actions/cache v4 (GitHub actions: transition from Node 16 to Node 20) #848

Closed

valassi added 3 commits May 21, 2024 23:02

[jtmk] Upgrade mg5amcnlo to the latest gpucpp including the fix for m…

93f9564

…g5amcnlo#97

[jtmk] in CODEGEN, remove special handling of mdl_G__exp__2 (now fixe…

49a4ba0

…d by Olivier), after fixing pkl bug mg5amcnlo#97

[jtmk] regenerate all processes after fixing mg5amcnlo#97 - functiona…

c72bbe0

…lly no change, but 'G*G' is now '(G)*(G)' in Parameters.h files

oliviermattelaer approved these changes May 22, 2024

View reviewed changes

valassi merged commit 9770d2a into madgraph5:master May 22, 2024
91 checks passed

valassi added a commit to valassi/madgraph4gpu that referenced this pull request May 22, 2024

Merge remote-tracking branch 'upstream/master' (including jtmk PR mad…

3ab0a88

…graph5#798) into jtmk2

This was referenced May 22, 2024

Includes all the changes to the makefiles, adding new targets for CUDA/cppavx builds #775

Merged

WIP: refactor makefiles (additions to Stephan's and Joorgen's PRs) #797

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

valassi commented Dec 15, 2023

valassi commented May 15, 2024

valassi commented May 15, 2024

valassi commented May 17, 2024

valassi commented May 17, 2024

valassi commented May 21, 2024

oliviermattelaer commented May 22, 2024

valassi commented May 22, 2024

Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

Conversation

valassi commented Dec 15, 2023

valassi commented May 15, 2024

valassi commented May 15, 2024

valassi commented May 17, 2024

valassi commented May 17, 2024

valassi commented May 21, 2024

oliviermattelaer commented May 22, 2024

valassi commented May 22, 2024