Implement helicity recycling in our CUDA/C++ #279

valassi · 2021-10-26T12:18:44Z

This is a followup of issue #276.

In order to compare C++/CUDA and Fortran throughputs, we should make sure that they use the same algorithm. This is presently not the case: we are comparing a faster Fortran with helicity recycling to a slower C++ without helicity recycling.

In issue #276, I will follow up on getting a better estimate of the slower Fortran throughput without helicity recycling, which can be directly compared to C++.

But what we really need to do is implement helicity recycling in the CUDA/C++. @oliviermattelaer is this something that would be complicated (and/or maybe is already underway)? Thanks

oliviermattelaer · 2021-11-16T22:08:55Z

I would actually doubt that doing helicity recycling for GPU is a good idea, since this blows up the size of the code and the memory requirement. For vectorised CPU, that is obviously an option.
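
To illustrate the trade-off, here is a minimal toy sketch, not the MadGraph implementation. All names (`toyWavefunction`, `sumNoRecycling`, `sumWithRecycling`, `g_calls`) are hypothetical, and the "amplitude" is simply a product of external wavefunctions. Without recycling, every helicity combination recomputes its wavefunctions; with recycling, each (particle, helicity) wavefunction is computed once and kept in a cache, which trades computation for storage.

```cpp
#include <array>
#include <complex>
#include <cstddef>

using cxtype = std::complex<double>;

// Hypothetical stand-in for an external wavefunction routine.
// The global counter records how many times it is evaluated.
static std::size_t g_calls = 0;
cxtype toyWavefunction( int ipar, int hel )
{
  ++g_calls;
  return cxtype( 1.0 + ipar, hel == 0 ? -0.5 : 0.5 );
}

constexpr int npar = 4;          // external particles (assumption: 2 helicities each)
constexpr int ncomb = 1 << npar; // number of helicity combinations

// No recycling: recompute every wavefunction for every helicity combination.
// Only one wavefunction is alive at a time, but the routine runs npar*ncomb times.
double sumNoRecycling()
{
  double sum = 0;
  for( int ihel = 0; ihel < ncomb; ++ihel )
  {
    cxtype amp( 1, 0 );
    for( int ipar = 0; ipar < npar; ++ipar )
      amp *= toyWavefunction( ipar, ( ihel >> ipar ) & 1 );
    sum += std::norm( amp ); // |amp|^2
  }
  return sum;
}

// Recycling: compute each (particle, helicity) wavefunction once and reuse it.
// The routine runs only 2*npar times, but all 2*npar results must be stored.
double sumWithRecycling()
{
  std::array<std::array<cxtype, 2>, npar> cache;
  for( int ipar = 0; ipar < npar; ++ipar )
    for( int hel = 0; hel < 2; ++hel )
      cache[ipar][hel] = toyWavefunction( ipar, hel );
  double sum = 0;
  for( int ihel = 0; ihel < ncomb; ++ihel )
  {
    cxtype amp( 1, 0 );
    for( int ipar = 0; ipar < npar; ++ipar )
      amp *= cache[ipar][( ihel >> ipar ) & 1];
    sum += std::norm( amp );
  }
  return sum;
}
```

Calling both functions (resetting `g_calls` in between) gives the same sum, with far fewer wavefunction evaluations in the recycled version. In this toy the cache is tiny, but on a GPU such a cache would be needed per event, which is one way to read the memory concern above.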

valassi · 2022-03-11T09:13:20Z

Within PR #401, note that I had to introduce this fix while moving to v311:
04d4b8e
This indeed picks up a new feature of v311 (#360), but it is only relevant to helicity recycling (#279).

In practice:

initially I started generating code (madevent only, or madevent+cudacpp) without disabling helicity recycling
under this configuration, code generation of madevent+cudacpp did not work, and I had to fix it as above
this was especially puzzling because I had to change some code generation lines (for generating cudacpp) which were not used in my normal cudacpp-only version
after fixing it, I got code with helicity recycling, which however did not build (not clear why, but not an issue today): to fix the build I disabled helicity recycling
in retrospect, the reason why I had to fix those lines is thus the helicity recycling code, which in any case I later decided to disable: if I had disabled helicity recycling before trying to generate madevent+cudacpp code, that patch would probably not have been necessary

valassi mentioned this issue Oct 26, 2021

Reassess ggttgg (and eemumu) throughput in Fortran without helicity recycling #276

Open

valassi mentioned this issue Nov 16, 2021

gridpackX (reassess Fortran throughputs with/without helicity recycling) #298

Merged

This was referenced Jan 21, 2022

cudacpp plugin hardcode some change that modifies code outside of the sandbox of the plugin #341

Closed

Port epochX CODEGEN from 270gpu to the 311lovec branch (and pick up the new features there!) #360

Closed

This was referenced Mar 10, 2022

Prepare Makefile to combine MadEvent + Cudacpp plugin #400

Closed

Incomplete patches ("v311") with progress towards madevent+cudacpp integration #401

Merged
