Recompute GPU branch efficiency with a non-dummy random helicity/color selection. This is a sub-issue of #608 (itself related to #607), and follows from a chat with @zeniheisser.
The random choices of color and helicity intrinsically introduce some stochastic branching. This is expected to degrade data-parallel performance: on GPUs, a branch efficiency lower than 100% should appear.
Note that all tests in the tput directory currently report 100% branch efficiency for gcheck.exe; an example log is referenced further down.
This might be because 100% is printed as an integer approximation (unlikely). More likely, it is because check.cc always uses 0 as the input random number, so all GPU threads go through exactly the same "stochastic" branches (there is no randomness, everything is 0) and are effectively in lockstep.
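The lockstep argument can be illustrated with a small host-side toy model. This is only a sketch under stated assumptions: `selectIndex` and `uniformWarpFraction` are hypothetical helpers, not the cudacpp selection code; the selection is assumed to pick the first index whose cumulative weight exceeds the random number; and the "fraction of uniform warps" it computes is a crude proxy, not the branch efficiency metric reported by the profiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy selection (hypothetical, not the cudacpp code): return the first index
// whose cumulative weight fraction exceeds the random number rnd in [0,1).
inline std::size_t selectIndex( const std::vector<double>& weights, double rnd )
{
  assert( !weights.empty() );
  double total = 0;
  for( double w : weights ) total += w;
  double cumulative = 0;
  for( std::size_t i = 0; i < weights.size(); i++ )
  {
    cumulative += weights[i];
    if( rnd < cumulative / total ) return i;
  }
  return weights.size() - 1; // guard against rounding when rnd is close to 1
}

// Fraction of warps in which every thread selects the same index.
// rnds holds one random number per thread; warps are consecutive groups of warpSize.
inline double uniformWarpFraction( const std::vector<double>& weights,
                                   const std::vector<double>& rnds,
                                   std::size_t warpSize = 32 )
{
  assert( warpSize > 0 && !rnds.empty() && rnds.size() % warpSize == 0 );
  const std::size_t nWarps = rnds.size() / warpSize;
  std::size_t nUniform = 0;
  for( std::size_t iw = 0; iw < nWarps; iw++ )
  {
    const std::size_t first = selectIndex( weights, rnds[iw * warpSize] );
    bool uniform = true;
    for( std::size_t it = 1; it < warpSize && uniform; it++ )
      uniform = ( selectIndex( weights, rnds[iw * warpSize + it] ) == first );
    if( uniform ) nUniform++;
  }
  return static_cast<double>( nUniform ) / static_cast<double>( nWarps );
}
```

With an all-zero `rnds` vector every thread selects index 0, so every warp is uniform, which mirrors the situation described above. As soon as the numbers differ within a warp (for instance alternating 0.1 and 0.9 with four equal weights), threads select different indices and the fraction drops.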
Two things could be done here:
First, some randomness could be introduced (on demand) in gcheck.exe when running the profiles: it would be enough to populate the allrndhel and allrndcol arrays with real random numbers, rather than using 0. One could do one run with all 0 and one run with random numbers, and compare the performance and the branch efficiency profiles.
Second, and perhaps more interesting, gmadevent_cudacpp itself could be profiled.
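For the first option, a minimal host-side sketch of the on-demand switch could look as follows. The function name `fillSelectionRandoms`, the `useRandom` flag, the default seed and the use of `double` with `std::vector` are all assumptions made for illustration; only the array names allrndhel and allrndcol come from the discussion above, and the actual buffers and floating-point type in check.cc may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fill the per-event random numbers used for helicity
// and color selection (one entry per event in each array).
// - useRandom=false keeps the all-zero inputs (every event takes the same branches)
// - useRandom=true draws uniform numbers in [0,1) from a seeded generator,
//   so that two runs with the same seed are reproducible
inline void fillSelectionRandoms( std::vector<double>& allrndhel,
                                  std::vector<double>& allrndcol,
                                  bool useRandom,
                                  unsigned int seed = 1 )
{
  assert( allrndhel.size() == allrndcol.size() ); // one pair of numbers per event
  if( !useRandom )
  {
    allrndhel.assign( allrndhel.size(), 0. );
    allrndcol.assign( allrndcol.size(), 0. );
    return;
  }
  std::mt19937_64 rng( seed );
  std::uniform_real_distribution<double> flat( 0., 1. );
  for( double& r : allrndhel ) r = flat( rng );
  for( double& r : allrndcol ) r = flat( rng );
}
```

Running the same profile once with `useRandom=false` and once with `useRandom=true` would then give the two data points to compare. Keeping the all-zero mode as the default would leave the existing tput logs unchanged.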
Example log reporting 100% branch efficiency: madgraph4gpu/epochX/cudacpp/tput/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt (line 55 in 93bee87)