generate_events fails for the FORTRAN backend #690

Closed
valassi opened this issue Jun 11, 2023 · 3 comments · Fixed by #688
@valassi
Member

valassi commented Jun 11, 2023

I am continuing to debug my MR #688 based on Stefan's #620.

For the cuda and cpp backend, generate_events is ok in tlau/lauX.sh. For the fortran backend it fails.

    INFO: Running Survey
    Creating Jobs
    Working on SubProcesses
    INFO:     P1_gg_ttx
    INFO: Building madevent in madevent_interface.py with 'FORTRAN' matrix elements
    INFO:  Idle: 1,  Running: 0,  Completed: 0 [ current time: 19h44 ]
    INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.041s  ]
    INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.041s  ]
    INFO: End survey
    refine 10000
    Creating Jobs
    INFO: Refine results to 10000
    INFO: Generating 10000.0 unweighted events.
    Error when reading /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/G1/results.dat
    Command "generate_events -f" interrupted with error:
    FileNotFoundError : [Errno 2] No such file or directory: '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/G1/results.dat'
    Please report this bug on https://bugs.launchpad.net/mg5amcnlo
    More information is found in '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/run_01_tag_1_debug.log'.
    Please attach this file to your report.
    quit
    INFO:

This needs more debugging than I expected, so I am opening a ticket.

@valassi valassi self-assigned this Jun 11, 2023
@valassi
Member Author

valassi commented Jun 11, 2023

I think I understand it now: the 'bridge' mode must be set to 0 for fortran. I will change that.

@valassi
Member Author

valassi commented Jun 11, 2023

Essentially, the bridge_mode currently should be 1 in cudacpp and 0 in fortran. This means that there is an interplay between the value of the cudacpp_backend (formerly exec_mode) card and the fbridgemode card, which is an issue.

In particular, these parameters must currently be set in two places:

  • gen_ximprove.py, where they are read from the runcards (so in principle the fbridge mode can be adjusted there, because the backend is also known)
  • refine.sh, where @roiser told me that we should use env variables (the current version, which I took from his "WIP: towards a workflow" #620, is hardcoded to 1)

As a very quick workaround, I will change driver.f to accept bridge mode = 1 also in fortran.

Eventually, as suggested by @oliviermattelaer, I should actually remove these extra parameters from the input file (#658). I can add them as env variables instead.
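
The interplay described in this comment can be sketched as a small lookup from backend to bridge mode. This is only an illustration: the function name and the dictionary are hypothetical and are not taken from gen_ximprove.py. Only the values (0 for fortran, 1 for cuda and cpp) come from the discussion in this issue.

```python
# Hypothetical sketch: neither the names nor the structure come from gen_ximprove.py.
# Only the mapping itself (fortran -> 0, cuda/cpp -> 1) is taken from this issue.
FBRIDGE_MODE_BY_BACKEND = {
    "fortran": 0,  # fortran matrix elements: bridge mode must be 0
    "cuda": 1,     # cudacpp matrix elements: bridge mode should be 1
    "cpp": 1,
}


def fbridge_mode_for(backend: str) -> int:
    """Return the bridge mode matching the given backend name (case-insensitive)."""
    key = backend.strip().lower()
    if key not in FBRIDGE_MODE_BY_BACKEND:
        raise ValueError(f"unknown backend: {backend!r}")
    return FBRIDGE_MODE_BY_BACKEND[key]
```

Deriving the bridge mode from the backend in one place would avoid the situation where the two values are set independently and can disagree, which is what made the fortran run fail here.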

valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
…led as madgraph5#690)

INFO: Running Survey
Creating Jobs
Working on SubProcesses
INFO:     P1_gg_ttx
INFO: Building madevent in madevent_interface.py with 'FORTRAN' matrix elements
INFO:  Idle: 1,  Running: 0,  Completed: 0 [ current time: 19h44 ]
INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.041s  ]
INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.041s  ]
INFO: End survey
refine 10000
Creating Jobs
INFO: Refine results to 10000
INFO: Generating 10000.0 unweighted events.
Error when reading /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/G1/results.dat
Command "generate_events -f" interrupted with error:
FileNotFoundError : [Errno 2] No such file or directory: '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/G1/results.dat'
Please report this bug on https://bugs.launchpad.net/mg5amcnlo
More information is found in '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/run_01_tag_1_debug.log'.
Please attach this file to your report.
quit
INFO:

For comparison, this was CPP

INFO: Running Survey
Creating Jobs
Working on SubProcesses
INFO:     P1_gg_ttx
INFO: Building madevent in madevent_interface.py with 'CPP' matrix elements
INFO:  Idle: 1,  Running: 0,  Completed: 0 [ current time: 19h40 ]
INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.48s  ]
INFO:  Idle: 0,  Running: 0,  Completed: 1 [  0.48s  ]
INFO: End survey
refine 10000
Creating Jobs
INFO: Refine results to 10000
INFO: Generating 10000.0 unweighted events.
sum of cpu time of last step: 1 seconds
INFO: Effective Luminosity 27.314506051301194 pb^-1
INFO: need to improve 2 channels
- Current estimate of cross-section: 439.327 +- 3.240257989049637
    P1_gg_ttx
Building madevent in madevent_interface.py with 'CPP' matrix elements
INFO:  Idle: 8,  Running: 5,  Completed: 0 [ current time: 19h40 ]
INFO:  Idle: 0,  Running: 0,  Completed: 13 [  1.6s  ]
INFO: Combining runs
sum of cpu time of last step: 11 seconds
INFO: finish refine
refine 10000 --treshold=0.9
No need for second refine due to stability of cross-section
INFO: Combining Events
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
…ace bridge_mode by 0 hardcoded to show that this fixes lauX.sh for fortran madgraph5#690 - will revert because this is a ugly hack
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
… and vecsizeused from the runcard and madevent input file

Revert "[runcard] in ggtt.mad refine.sh and gen_ximprove.py, TEMPORARILY replace bridge_mode by 0 hardcoded to show that this fixes lauX.sh for fortran madgraph5#690 - will revert because this is a ugly hack"
This reverts commit 5a3cc1a.
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
… fbridge_mode and vecsize_used and replace them by env variables (madgraph5#658, also related to madgraph5#690)

The two env variables are CUDACPP_RUNTIME_FBRIDGEMODE and CUDACPP_RUNTIME_VECSIZEUSED.
These are meant to be used only by expert developers, not for general users.
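
As a rough sketch of how such expert-only overrides could be read on the Python side: the two variable names below are the ones given in this commit message, but the helper, its defaults, and the assumption that the values are integers are hypothetical and are not taken from the actual code.

```python
import os


def runtime_override(name, default, environ=None):
    """Return the integer value of env variable `name`, or `default` if unset or empty.

    Hypothetical helper for illustration only. `environ` defaults to os.environ
    and can be replaced by a plain dict when testing.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)  # raises ValueError if the value is not an integer


# Example: the default passed here (1) is an assumption, not the real default.
fbridge_mode = runtime_override("CUDACPP_RUNTIME_FBRIDGEMODE", 1)
```

With no variable set, the default is used, so general users are unaffected; a developer can export CUDACPP_RUNTIME_FBRIDGEMODE or CUDACPP_RUNTIME_VECSIZEUSED before launching to change the value.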
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
@valassi valassi linked a pull request Jun 12, 2023 that will close this issue
@valassi
Member Author

valassi commented Jun 12, 2023

This is fixed by MR #688, where in the end I removed the two extra parameters (#658).

I am closing this.
