madeventfortran vs madeventcpp vs madeventcuda #674
Comments
Absolutely, I wanted to be as unintrusive as possible.
yes this was the idea, it works for e.g. generating the gridpacks for the vectorCPU and Cuda versions
that was also my reasoning behind those changes
Yes, good idea. I also just wanted to be as unintrusive as possible towards the current cuda/cpp version. Please go ahead, it will also make it more usable / understandable for people outside ;-). I understand why you want to include the "ME" in the name, but given that the software will evolve further and include more on the cpp/cuda side, one may consider removing ME from the executable names?
Yes, I think it's a good idea.
If that is possible, yes. I was just following what I thought were the conventions in the run cards. IIRC, in some cases (but not all) the values of the run card options are converted to numbers in the python code. You may want to check my diffs: I explicitly convert to a number in the python.
My proposal relies on the "Makefile intelligence" to build the correct vector size version, building all of them is a good idea (if we want to build also cuda/cpp/fortran in parallel).
I just wanted to extend the previous question into this direction ;-). My initial idea was to keep the "madevent" (via the symlink) fixed for a given architecture/language/vector size. Thinking about it again, and with all the directions our discussion is taking above (i.e. build all possible versions), I am wondering if we need a small layer on top which, before running, finds out the capabilities of the platform and sets the symlink madevent->binary version accordingly?
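A minimal sketch of what such a layer on top could look like, purely as an illustration. It assumes a Linux host where CPU capabilities can be read from `/proc/cpuinfo`-style text; the executable names in `PREFERENCE` (other than the `madevent` link itself) and all function names are hypothetical placeholders, not what the plugin actually ships.

```python
import os

# Hypothetical preference order: (required CPU flag, executable name).
# A flag of None means "no special capability needed" (fallback).
PREFERENCE = [
    ("avx512f", "madevent_cpp512z"),
    ("avx2", "madevent_cppavx2"),
    (None, "madevent_fortran"),
]

def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

def choose_binary(flags, available, preference=PREFERENCE):
    """Pick the first preferred binary that the CPU supports and that was built."""
    for flag, name in preference:
        if (flag is None or flag in flags) and name in available:
            return name
    raise RuntimeError("no usable madevent binary found")

def point_symlink(directory, target, link_name="madevent"):
    """(Re)create directory/link_name as a symlink to target."""
    link = os.path.join(directory, link_name)
    if os.path.islink(link) or os.path.exists(link):
        os.remove(link)
    os.symlink(target, link)
    return link
```

On a real node one would pass `open("/proc/cpuinfo").read()` to `cpu_flags` and the list of built executables to `choose_binary`. A GPU check would need its own probe, which is not sketched here.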
Hi Stefan, thanks :-) Ok I will remove the ME from names, easier. About AVX "intelligence", that intelligence (that I put there) is a bit limited! Also, we do need to be able to build the "not best" version (according to that intelligence), to make comparisons. At least I want to make those comparisons. Will think about it a bit. So we need a bit more than you implemented so far. The last point is the most delicate, because essentially it is about destroying assumption "1" that we just use madevent and that's it. One option is to have different build directories (this would also depend a bit on #502, trying to separate the .f from the .o files in different directories, but that would be a big change). But I am just wondering if it is not better/cleaner (even if it has a LOT of file overhead) to just treat the cuda, fortran and avx builds in separate source directories. Not yet sure, I need to have a look. In case, @oliviermattelaer: would you agree if I try to go in the direction of #502, and separate the build products in different subdirectories, also for the Fortran files? I'll see what I can do ... I need to do some tests/prototyping to understand what makes sense.
PS Just to dump it explicitly on the table: one issue (in favor of #502 or of having separate source directories, and against using a single .o build of fortran files for all cases) is the usual vecsize_memmax problem. It is true that I added this horrible hack of vecsize_memmax vs vecsize_used, and in principle we can always use memmax=16k for both cuda and cpp, even if in cuda we have used=16k and in cpp we have used=8 for simd. But I think this is probably inefficient. For c++, where we only need 8-32 events in parallel, it is really cumbersome to have arrays dimensioned for 16k events, which are unused. In my ideal world maybe it is like this?
So again I see only two options: moving the .o files away from the .f files into a separate directory (at least for the fortran code that depends on vector.inc), or otherwise duplicating the source code, so that you have gg_tt.mad_cuda with vector inc 16k and gg_tt.mad_cpp with vector inc 32, or similar... Am I missing other options? PS anyway, maybe this is lower priority: for the very first release we can have memmax=16k in all cases, and then cpp will be a bit less efficient, using too much memory and maybe a bit slower than it could be.
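To put a rough number on the memory concern above, here is a small sketch that computes which fraction of the event slots stays unused when arrays are dimensioned for memmax but only a smaller number of events is used. It assumes "16k" means 16384 and says nothing about the actual array layout or element sizes in the Fortran code; `unused_fraction` is a hypothetical helper name.

```python
def unused_fraction(memmax, used):
    """Fraction of event slots allocated (memmax) but never filled (used)."""
    if used > memmax:
        raise ValueError("used must not exceed memmax")
    return 1.0 - used / memmax

MEMMAX = 16384  # assumption: "16k" in the discussion means 16384 events

for label, used in [("cuda", 16384), ("cpp, 8 events", 8), ("cpp, 32 events", 32)]:
    print("%-15s unused slots: %.4f%%" % (label, 100 * unused_fraction(MEMMAX, used)))
```

With these assumed numbers, almost all slots are idle in the cpp case, which is the inefficiency described in the comment.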
I am taking a different route to this. I started working on the removal of patchMad.sh (#656) in MR #675. The idea is that the modified Fortran makefiles will now (still, but permanently) belong to the cudacpp plugin. So it becomes less important how 'polluted' the makefile becomes with cudacpp stuff, because the changes will ONLY be for cudacpp and not in general for unmodified Fortran.
(One tiny related issue here in build cleanups is #680: better to avoid building CUDA in each C++ AVX mode.)
Do we have a mechanism to "rebase" on top of the proper madgraph repo?
Another comment, you may not like it ;-). Those ideas about separate build directories are already an integral part of CMake. Maybe we should consider the move before going down the route with GNU Makefiles?
Hi, boh cmake I am always open to discuss eventually :-) However:
Now, moving ALL of Madgraph to cmake is something I would even like at some point. But it's a lot of work, it demands a complete rewriting of the build infrastructure there, and it requires all of the core team to agree with it. So I would say, to be discussed...
My idea here is to include the original file from the madgraph repo in the plugin, and write a simple script that creates the equivalent of patch.P1 and patch.common. So
To be rediscussed later on, but this is definitely the fastest way I see. No need to worry about how intrusively we are modifying the common madgraph for everyone: we just modify the files we need with complete freedom. This is the same freedom we have had so far with patch.P1 and patch.common, BUT in addition things work out of the box in the plugin without a patchMad.sh a posteriori.
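As an illustration of the "simple script" idea above, here is a minimal sketch that produces a unified diff between an original madgraph file and the modified copy kept in the plugin. It uses only Python's standard `difflib`; the function name `make_patch` and the file name used below are hypothetical, and this is not the script actually used in the repository.

```python
import difflib

def make_patch(original_text, modified_text, name):
    """Return a unified diff (as one string) turning original into modified."""
    original = original_text.splitlines(keepends=True)
    modified = modified_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        original, modified, fromfile="a/" + name, tofile="b/" + name
    )
    return "".join(diff)

# Toy example with made-up makefile content
print(make_patch("all: madevent\n", "all: madevent_cpp\n", "makefile"))
```

Concatenating such diffs over the relevant files would give something comparable to a patch file, under the assumption that the originals are stored next to the modified versions.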
Ok, IIRC we can fix the plugin to only work with a given (set of) version(s) of madgraph? That may give some additional security in that case.
Ah I had not thought of that yet. I am focusing on getting one version ready now. Anyway, it will HAVE to use the latest gpucpp branch. So the experiments are forced to upgrade to that whether they like it or not. We need and they need the vector interface in fortran to have any cpu vectorization or gpu...
Hi @roiser (and @oliviermattelaer) this is one part of #658, with WIP by Stefan in MR #620
There are rather different parts in the #658 "remove non standard features in madevent". What I want to discuss here is: how does the launch machinery choose whether to build/run fortran, cpp or cuda MEs.
Stefan has some good points in MR #620.
For instance
237620c
(Two suggestions above: fix in the lst dependency should be g; keep a copy of madevent fortran anyway?)
And then
15b5974
eb1bcb1
If I understand well, the strategy here is:
@roiser, I have not tried this out but I assume this is the general idea right?
To me, this looks very nice :-)
I mean, we need to make a choice whether the "switch" between implementations is at runtime or at build time. It is easier if it is only in one of the two. Having it at runtime is handy, but in production you want the option to ONLY build what you need (or at least, aim for that). So definitely it is better to have the logic at build time - and it looks quite simple here.
I just have a few questions/suggestions. This is mainly/only about the build in point 2 (as point 1 is essentially a no-op).
madevent_MEfortran, madevent_MEcpp and madevent_MEcuda or something similar. That is to say: replace cmadevent_cudacpp by madevent_MEcpp for instance.
madevent_MEfortran.
madevent_MEsycl (same for cpp and gpu? hm not sure) or madevent_MEhip. Therefore I wonder if it is better/possible to modify Stefan's run inc to use labels rather than numbers: not 0 for fortran, 1 for cpp and 2 for cuda, but 'fortran', 'cpp', 'cuda' explicitly? Then it would become natural to extend it to sycl or hip etc - AND it may look nicer, more flexible and less intrusive in the generic template (the goal is to write that piece of code WITHOUT mentioning cuda cpp etc, just generic labels?)
madevent_MEfortran, madevent_MEcuda, madevent_MEcppnone, madevent_cpp512z?
Voila... any thoughts? I am sure there may also be other questions I am forgetting. I think it's better to agree on these goals first.
Thanks
Andrea
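The suggestion above to use labels rather than numbers could be sketched roughly as follows. This is only an illustration: the registry `BACKENDS`, the function `executable_for` and the executable names are hypothetical placeholders, not the names or code adopted in the plugin.

```python
# Hypothetical registry: label -> executable name (placeholder names).
BACKENDS = {
    "fortran": "madevent_fortran",
    "cpp": "madevent_cpp",
    "cuda": "madevent_cuda",
}

def executable_for(label, registry=BACKENDS):
    """Map a run-card style label (quoted or not) to an executable name."""
    key = str(label).strip().strip("'\"").lower()
    try:
        return registry[key]
    except KeyError:
        raise ValueError(
            "unknown backend %r, expected one of %s" % (key, sorted(registry))
        )

# Extending to a new backend is one more entry, with no renumbering.
extended = dict(BACKENDS, hip="madevent_hip")
print(executable_for("'cuda'"))
print(executable_for("hip", extended))
```

The generic code only sees labels, so adding sycl or hip would not require touching the lookup logic, only the registry.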