
My "vecsize" patches have broken mg5amcnlo "launch" for the default Fortran madevent #629

Closed
valassi opened this issue Apr 3, 2023 · 6 comments · Fixed by #638 or #654

@valassi

valassi commented Apr 3, 2023

Hi @oliviermattelaer, this is a bad one. Sorry, it seems that I had not tested my vecsize patches for the simplest case, namely the plain Fortran madevent. This is totally broken.

Using gpucpp 4a54a9af4c883b4384e4d754c550d9bb120bd415 (so before your latest patches, but I guess it makes no difference), something as simple as this fails

MG5_aMC>generate g g > t t~
MG5_aMC>output madevent CODEGEN_gg_tt
MG5_aMC>launch
...
Error detected in "generate_events run_01"
write debug file /data/avalassi/GPU2023/MG5aMC/ghav-mg5amcnlo/CODEGEN_gg_tt/run_01_tag_1_debug.log 
If you need help with this issue please contact us on https://answers.launchpad.net/mg5amcnlo
str : A compilation Error occurs when trying to compile /data/avalassi/GPU2023/MG5aMC/ghav-mg5amcnlo/CODEGEN_gg_tt/SubProcesses/P1_gg_ttx.
        The compilation fails with the following output message:
            gfortran -w -fPIC -O -fbounds-check -ffixed-line-length-132 -w -c symmetry.f -I../../Source/ -I../../Source/PDF/gammaUPC
            coupl.inc:34:43:
            
               34 |       DOUBLE COMPLEX GC_10(VECSIZE_MEMMAX), GC_11(VECSIZE_MEMMAX)
                  |                                           1
            Error: Explicit array shape at (1) must be constant of INTEGER type and not UNKNOWN type
            coupl.inc:36:29:
            
               36 |       COMMON/COUPLINGS/ GC_10, GC_11
                  |                             1
            Error: Symbol ‘gc_10’ at (1) has no IMPLICIT type
            coupl.inc:36:36:
            
               36 |       COMMON/COUPLINGS/ GC_10, GC_11
                  |                                    1
            Error: Symbol ‘gc_11’ at (1) has no IMPLICIT type
            coupl.inc:34:41:
            
               34 |       DOUBLE COMPLEX GC_10(VECSIZE_MEMMAX), GC_11(VECSIZE_MEMMAX)
                  |                                         1
            Error: Symbol ‘vecsize_memmax’ at (1) has no IMPLICIT type
            coupl.inc:34:41:
            
               34 |       DOUBLE COMPLEX GC_10(VECSIZE_MEMMAX), GC_11(VECSIZE_MEMMAX)
                  |                                         1
            Error: Symbol ‘vecsize_memmax’ at (1) has no IMPLICIT type
            make: *** [makefile:80: symmetry.o] Error 1
            
        Please try to fix this compilations issue and retry.
        Help might be found at https://answers.launchpad.net/mg5amcnlo.
        If you think that this is a bug, you can report this at https://bugs.launchpad.net/mg5amcnlo

The problem ultimately lies in vector.inc and the absence of include guards in Fortran (#458):

  • I wanted to define a parameter VECSIZE_MEMMAX in vector.inc
  • And then use this in various places including coupl.inc, clusters.inc, driver.f etc
  • The problem is that include 'vector.inc' is a Fortran statement which does not go through the cpp preprocessor, so it is a total nightmare to make sure that the parameter VECSIZE_MEMMAX is included exactly once...

Before my patch, the value (e.g. 16384) was hardcoded in each file. I guess we need to go back to something like that...
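
A minimal sketch of the problem (editorial illustration only; the real repository files are larger, and the symmetry.f body below is an assumption based on the error log above):

C vector.inc (sketch): defines the vector size parameter once
      INTEGER VECSIZE_MEMMAX
      PARAMETER (VECSIZE_MEMMAX=16384)

C coupl.inc (sketch): uses VECSIZE_MEMMAX, i.e. relies on vector.inc being included first
      DOUBLE COMPLEX GC_10(VECSIZE_MEMMAX), GC_11(VECSIZE_MEMMAX)
      COMMON/COUPLINGS/ GC_10, GC_11

C symmetry.f (sketch): if 'vector.inc' is not included before 'coupl.inc', gfortran
C reports "Symbol 'vecsize_memmax' has no IMPLICIT type" as in the log above; if it
C is included twice in the same unit, the duplicate declaration fails instead,
C because the Fortran INCLUDE statement has no preprocessor-style include guards
      PROGRAM SYMMETRY
      IMPLICIT NONE
      INCLUDE 'vector.inc'
      INCLUDE 'coupl.inc'
      WRITE(*,*) 'VECSIZE_MEMMAX = ', VECSIZE_MEMMAX
      END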

@valassi

valassi commented Apr 3, 2023

(Just one word of explanation: I am still working on nprocesses>1 #272 and split_nonidentical_grouping #619. I wanted to reproduce Olivier's plot from #272 (comment), discussed at the meeting last week. This needs "launch". So I tried launch, and it failed not only for my more complex processes but even for the simplest ggtt process... This is not a blocker for the other things we are doing in parallel, but it is a blocker for the Fortran madevent used by all default production users.)

@valassi

valassi commented Apr 6, 2023

Assigning this to myself - I will work on this after the Easter break

@valassi

valassi commented Apr 7, 2023

Starting to have a quick look at this.

Analysing the two upstream patches, vecsize (mg5amcnlo/mg5amcnlo#23) and vecsize2 (mg5amcnlo/mg5amcnlo#24):

In reverse order, for vecsize2:

commit c1c609bd86af5d0810de4a0095946d556a3ec3bb
Merge: 2c1aef62f 0cc72b321
Author: oliviermattelaer <33414646+oliviermattelaer@users.noreply.github.com>
Date:   Wed Jan 25 16:48:48 2023 +0100
    Merge pull request #24 from valassi/vecsize2

git log 2c1aef62f..0cc72b321 --oneline
0cc72b321 [vecsize2] ** COMPLETE VECSIZE PART 2 ** add NCOMB in auto_dsig_v4.inc to allow sanity check against cudacpp
c75d440de [vecsize2] improve the comment in vector.inc to also cover VECSIZE_MEMMAX=1
9c917c001 [vecsize2] add a comment that DATA VECSIZE_USED/VECSIZE_MEMMAX/ can be changed at runtime
5d2a28acf [vecsize2] In driver.f, set the value of VECSIZE_USED equal to VECSIZE_MEMMAX via a DATA statement Eventually, the value of VECSIZE_USED will be read from user input at runtime
1ea7835cb [vecsize2] add comments about VECSIZE_MEMMAX and VECSIZE_USED in vector.inc
619741428 [vecsize2] remove definition of parameter VECSIZE_USED=VECSIZE_MEMMAX, now VECSIZE_USED is a local variable and function argument
6075a7fbc [vecsize2] add VECSIZE_USED to function calls, turn it into a local variable
3859ef90a [vecsize2] in super_auto_dsig_group_v4.inc replace "IF (VECSIZE_USED.LE.1)" by "IF (VECSIZE_MEMMAX.LE.1)" to detect no-vector mode
10c054120 [vecsize2] in cuts.f,  use VECSIZE_MEMMAX instead of VECSIZE_USED to avoid the need to pass the latter
2c3f3a3b6 [vecsize2] in reweight.f, rename VECSIZE_USED_in as VECSIZE_USED, as VECSIZE_USED will become a function argument everywhere

And for vecsize:

commit 2c1aef62f600d04d52be46ea010af4d686aa9b11
Merge: 90fba36f3 291a2fdb9
Author: Andrea Valassi <Andrea.Valassi@cern.ch>
Date:   Wed Jan 25 14:30:33 2023 +0100
    Merge pull request #23 from valassi/vecsize

git log 90fba36f3..291a2fdb9 --oneline
291a2fdb9 [vecsize] ** COMPLETE VECSIZE PART 1 ** fix a comment about WGT as suggested by Olivier
4e9a0d7af [vecsize] add comments to all 'include vector.inc' as agreed with Olivier
29853eb4c [vecsize] add back 'max' logic in export_v4.py as suggested by Olivier
76feadd2e [vecsize] fix a few more includes in madgraph/iolibs/template_files/auto_dsig_v4.inc
ad1ae4b3a [vecsize] improve comments in the dsample.f and genps.inc templates
7ae11db39 [vecsize] Add "PARAMETER (VECSIZE_USED=VECSIZE_MEMMAX)" to complete the initial implementation of VECSIZE_USED
e06ccb7cb [vecsize] Replace VECSIZE_MEMMAX by VECSIZE_USED where needed
5cf6657dd [vecsize] rename VECSIZE_MAX as VECSIZE_MEMMAX as discussed today
49cd498c0 [vecsize] generate a comment in coupl.inc that vector.inc must be included before it
b729653dd [vecsize] add "include 'vector.inc'" in all files where it is needed (except for reweight.f that needs a special handling)
11906a378 [vecsize] in reweight.f rename VECSIZE_MAX as VECSIZE_MAX_in when it is a function argument variable
4afb5e66d [vecsize] rename NB_PAGE as VECSIZE_MAX in all templates for code generation
53430b0e2 [vecsize] in export_v4.py rename all occurrences of NB_PAGE by VECSIZE_MAX
bc7e9c7bf [vecsize] in export_v4.py replace hardcoded vector_size value by "VECSIZE_MAX" variable name
79367502e [vecsize] add *~ and __pycache__ to new file .gitignore
020cd7af8 [vecsize] cleanup: remove madgraph/iolibs/template_files/gpu/mgOnGpuConfig.h~

I should try to remove only the include 'vector.inc' statements, adding a new parameter elsewhere

@valassi

valassi commented Apr 7, 2023

The vector.inc changes are quite encapsulated. I should go along the lines of

git revert 4e9a0d7af # succeeds
git revert 76feadd2e # succeeds
#git revert 49cd498c0 # not as such, but something to change here - did not try it
git revert b729653dd # fails with conflicts... and in any case must be replaced by something else

Will check next week

valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…h4gpu#629)

Revert "[vecsize] add comments to all 'include vector.inc' as agreed with Olivier"
This reverts commit 4e9a0d7.
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…h4gpu#629)

Revert "[vecsize] fix a few more includes in madgraph/iolibs/template_files/auto_dsig_v4.inc"
This reverts commit 76feadd.
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…h4gpu#629)

Revert "[vecsize] add "include 'vector.inc'" in all files where it is needed"
This reverts commit b729653.
Fix conflicts: madgraph/iolibs/template_files/auto_dsig_v4.inc
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…adgraph4gpu#629)

Use VECSIZE_MEMMAX_COUPL instead of VECSIZE_MEMMAX, add its definition, and add a comment about it
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…5/madgraph4gpu#629)

Add vector.inc defining VECSIZE_MEMMAX, in addition to coupl.inc defining VECSIZE_MEMMAX_COUPL
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…dgraph4gpu#629)

Add vector.inc defining VECSIZE_MEMMAX, in addition to coupl.inc defining VECSIZE_MEMMAX_COUPL
valassi added a commit to valassi/mg5amcnlo that referenced this issue Apr 11, 2023
…dgraph4gpu#629)

Add a sanity check that VECSIZE_MEMMAX == VECSIZE_MEMMAX_COUPL

This COMPLETES the relevant fixes for madgraph5/madgraph4gpu#629
Now "launch" for a simple process in the standard Fortran madevent backend works again as expected
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
… successful builds after reverting many vecsize changes

In coupl.inc, add VECSIZE_MEMMAX_COUPL duplicating VECSIZE_MEMMAX from vector.inc

In driver.f and auto_dsig1.f, add vector.inc to provide VECSIZE_MEMMAX
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
…at VECSIZE_MEMMAX == VECSIZE_MEMMAX_COUPL
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
…, driver.f to make them easier for upstream codegen
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
…_COUPL instead of VECSIZE_MEMMAX for coupl_write.inc
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
…g the unpatched upstream codegen

(The patches do not change - it is the reference over which they are applied that changed)

./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/dsample.f gg_tt.mad/Source/genps.inc gg_tt.mad/Source/vector.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 11, 2023
valassi added a commit to valassi/mg5amcnlo that referenced this issue May 16, 2023
…upl.inc in reweight.f

NB: Olivier's patch by itself was enough to fix madgraph5/madgraph4gpu#629
(this is part of commit c1d10f4)

git checkout 8512c16  madgraph/iolibs/template_files/madevent_symmetry.f
@valassi valassi reopened this May 16, 2023
@valassi

valassi commented May 16, 2023

I had closed this, assuming that the fix would come from
mg5amcnlo/mg5amcnlo#49

In the end, @oliviermattelaer fixed this specific issue with the simpler mg5amcnlo/mg5amcnlo@c1d10f4

So most of mg5amcnlo/mg5amcnlo#49 becomes unnecessary. I am rolling it back. I am upgrading the codegen and all generated code to that upstream. Thanks Olivier!

Keeping this open while working on it.

valassi added a commit to valassi/madgraph4gpu that referenced this issue May 17, 2023
…for CMS Drell-Yan (madgraph5#645)

NB: Olivier's patch also includes one commit fixing madgraph5#629, making vecsizeFIX unnecessary (at least for the moment)
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 17, 2023
….sh, after removing VECSIZE_MEMMAX_COUPL

Revert "[vecsizeFIX] madgraph5#629 in codegen patchMad.sh, use VECSIZE_MEMMAX_COUPL instead of VECSIZE_MEMMAX for coupl_write.inc"
This reverts commit a541807.
@valassi

valassi commented May 17, 2023

This is now fixed by #654, which is based on upstream mg5amcnlo/mg5amcnlo#54
Trying again with my vecsize3 branch upstream from there:

generate g g > t t~
output madevent CODEGEN_gg_tt
launch

This is building and running OK, so the issue can be closed
