Fix segfaults for MPCD with large particle counts under MPI #1897
Description
This PR refactors parts of the MPCD code that relied on generic serialized MPI routines to instead use custom MPI datatypes. Because the MPI count then refers to whole particles rather than bytes, much more data can be sent before the signed-int count limit is reached. This is an attempt to fix segfaults reported when either initializing from or taking a snapshot with a large number of MPCD particles.
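To illustrate the idea, here is a minimal sketch of the layout bookkeeping a custom MPI datatype needs. The `ParticleSnapshot` struct and the helper below are hypothetical, not HOOMD's actual internals: a real implementation would pass these offsets to `MPI_Type_create_struct` so each MPI element is one whole particle.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical MPCD particle layout (illustrative only, not HOOMD's
// actual struct). A custom MPI datatype is built from the byte offsets
// of each member, so that MPI counts refer to particles, not bytes.
struct ParticleSnapshot
    {
    double position[3];
    double velocity[3];
    unsigned int type;
    };

// Byte displacements of each member, as would be passed to
// MPI_Type_create_struct alongside block lengths and member types.
struct MemberOffsets
    {
    std::size_t position;
    std::size_t velocity;
    std::size_t type;
    };

MemberOffsets snapshotOffsets()
    {
    MemberOffsets off;
    off.position = offsetof(ParticleSnapshot, position);
    off.velocity = offsetof(ParticleSnapshot, velocity);
    off.type = offsetof(ParticleSnapshot, type);
    return off;
    }
```

With such a datatype, broadcasting N particles uses a count of N elements instead of N * sizeof(ParticleSnapshot) bytes, so the signed-int count limit is reached far later.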
I replaced most uses of `bcast` with `MPI_Bcast` while I was at it, since this is a relatively simple MPI call. I also added exceptions when the serialized methods exceed the byte count that can be stored in a signed int.
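The size guard can be sketched as follows. The function name is hypothetical; the point is that MPI routines such as `MPI_Bcast` take the message size as a signed `int`, so a serialized buffer larger than `INT_MAX` bytes must raise an error rather than silently overflow.

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>

// Sketch of the overflow guard (name is hypothetical): reject serialized
// buffers whose byte count cannot be represented in a signed int, since
// that is the type MPI uses for message counts.
void checkSerializedSize(std::size_t nbytes)
    {
    if (nbytes > static_cast<std::size_t>(INT_MAX))
        {
        throw std::runtime_error(
            "Serialized MPI buffer exceeds INT_MAX bytes");
        }
    }
```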
Motivation and context
Large numbers of particles seem to lead to overflow of the serialized MPI methods.
Resolves #1895
How has this been tested?
Existing tests pass. I will confirm with the reporting user (or test on a cluster myself) that this fixes the segfaults they were observing.
Change log
Checklist:
I have updated the credits (`sphinx-doc/credits.rst`) in the pull request source branch.