-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Use simpler rule for deinit point #15041
Conversation
295c00a
to
dce71f9
Compare
…royCalls Some temps are added by the compiler after normalize (e.g. in our out intent temps) and so it makes sense to mark unmarked ones after normalize.
This is a workaround - the better solution is to move lifetime checking after callDestructors.
See e.g. the test test/analysis/alias/array-arguments-const-ref-refid.chpl Otherwise, runtime type temps were being destroyed early, because resolution lowers PRIM_DEFAULT_INIT_VAR for runtime temps later.
See e.g. classes/initializers/promotion/new-promoted.chpl
See test/statements/swap/swapRecordWrappingOwned.chpl test/studies/shootout/nbody/sidelnik/nbody_orig_1.chpl
Shows errors from PR chapel-lang#15041 and others
Shows errors from PR chapel-lang#15041 and others
Noting for posterity / @mppf : This test increased the measured comm counts for the assignSlice.chpl tests in the distribution robustness suite because compiler-introduced temps at module scope were being destroyed within the measured statement rather than at the end of the program. @e-kayrakli and I verified that these comm count increases occurred on master prior to this PR if the statement in question was placed within curly brackets (forcing the destruction to occur within the measured code portion). Engin already had a branch that incorporated these new counts with some other work he was doing, so we're rolling it together rather than updating all the .goods separately. |
Allow bulk transfer of some wide pointer arrays This PR allows some wide pointer arrays to be bulk transferred. The main motivation for this is to reduce the communication while creating distributed domains and arrays. During privatization, arrays of e.g. `Locale`, `LocBlock`, `LocBlockDom`, `LocBlockArr` are copied across locales. All of these are classes and the only stuff that get copied is wide pointers to their instances. This communication, when done one-by-one creates huge amount of communication. Allowing such arrays to be bulk transferred reduces the communication and the time it takes to create distributed domains and arrays significantly. This behavior can be controlled by `config param useBulkPtrTransfer = useBulkTransfer`. It is on by default. This optimization currently has a bunch of limitations that we hope to relax in the near future. There is no particular motivation for any of these other than the fact that we are close to the release and want to limit the optimization to the case we care most about at the point. With this PR, bulk transfer only happens: - between wide pointer arrays and not narrow pointers - when two domains are equal - if two arrays are DefaultRectangulars Some performance tests: #15049 (comment) The last commit on this PR makes some comm count adjustments that are side effects of #15041. [Reviewed by @benharsh] Test: - [x] standard - [x] gasnet - [x] gasnet `-multilocale-only -memleaks`
Final comm count changes from this PR and from #15049 are in this commit |
@e-kayrakli @bradcray - thanks for sorting it out |
Fix comm counts in assignSlice PR #15041 caused comm count changes for the assignSlice tests because a temporary was being destroyed in the measured section. Putting curly braces around the measured section caused the same behavior on master before that PR. This PR updates the comm counts to account for that PR and others yesterday. It does so in two commits - first, it updates the test to put curly braces around the measured section and accepts the comm count increases (this was done with the version just before PR #15041). Then, the 2nd commit accepts improvements from other PRs (most likely #15049 and #15044 but I did not narrow it down). Trivial and not reviewed. - [x] assignSlice tests pass on a ugni with replicated, cyclic, block - [x] assignSlice tests pass on a gasnet with replicated, cyclic, block
Fix two problems in the Futures package module This PR fixes two memory errors in the Futures package module. Resolves #15117. The first is described in issue #15117. This PR fixes that case by waiting for the future to complete if it was not completed when the future is trying to delete the class managing it. A reasonable alternative would be to increment the reference count to that class in the task running the future. The second issue is a problem that surfaces due to PR #15041 that appeared as a test failure for futures-doc-validity.chpl. The problem is with constructions like this: ``` chapel // from assign-then-get.chpl var G: Future(int); assert(G.isValid() == false); G = async(lambda(x: int) { return 10 + x; }, 1); // HERE const g = G.get(); ``` The problem is that the first-class-function is deinited at the end of the statement in which it was declared, at the `HERE` comment above. To fix that, made the futures functions accept task function arguments with the `in` intent and pass them to the begin also with `in` intent. Reviewed by @vasslitvinov - thanks! - [x] test/library/packages/Futures passes with valgrind - [x] full local testing
Move tuple assignment destructuring after end-of-statements are added After PR #15041, two tests were failing under valgrind testing due to issues with tuples and deinit points: release/examples/spec/Tuples/aliasing types/tuple/deitz/assignment/test_tuple_assignment_order3 The problem was that temporaries added for tuple assignment destructuring were being destroyed before they could be used. To fix, this PR moves the tuple destructuring from cleanup to normalize, and in particular the tuple destructuring runs after the end-of-statement markers were added. Additionally the fix required adjusting the logic for finding the last-mention deinit point to ignore scopeless blocks. Reviewed by @lydia-duncan - thanks! - [x] full local testing - [x] primers pass verify and valgrind testing and do not leak
Don't deinit formal temps before end of block Formal temps are more like user variables than normal temps. It makes sense not to deinit them early since they will be used throughout the function, generally speaking. Resolves valgrind failures with test/types/records/intents/out-intent.chpl after PR #15041. Trivial and not reviewed. - [x] full local testing - [x] primers pass under valgrind+verify and do not leak
This PR updates the compiler to use simpler rules for when a variable is
dead. The expiring value analysis (proposed in #13704) did not end up
being as useful as initially expected for copy elision. As a result we
are shifting to a simpler design for when copy elision can occur -
#14874.
That leaves the expiring value analysis impacting the deinitialization
point (#11534 #11492) and the ownership transfer from non-nilable
checking (#13600). However the ownership transfer checking is not
significantly improved with expiring value analysis (it is approximately
as effective when handled by the must-alias analysis in the nil checker).
That leaves the deinit points.
Since a copy elision can cause a variable to be effectively dead in the
current scope after it occurs, there is some relationship between deinit
points and copy elision. That is why #13704 tried to address these two
together.
For example:
However since the copy elision rule is relatively simple, this PR changes
the rule for when a variable is dead to simply include the copy elision
rule. Besides that, it is based upon the simpler rule in #13704 /
#11534 (comment) .
Here is the rule as of this PR:
There are two important things to note about the above rule:
through split-init
variables) as well as local variables. For global variables
Also note that
--no-early-deinit
is available to make alltemps deinited end-of-block.
This PR takes the following steps to implement the change to
deinitialization points:
possiblyInitsDestroyedVariable
(instead of making them global variables) so that this can be
determined later (after split-inits are known)
markTempsDeadLastMention which are called from fixPrimInitsAndAddCasts
in resolveFunction.cpp. This analyzes temp variables by how they are
used to determine if they are used to initialize a user variable.
Besides adding
FLAG_DEAD_LAST_MENTION
orFLAG_DEAD_END_OF_BLOCK
,if the variable is in a module initialization function, it will pull
it out into global scope.
MarkTempsVisitor
rules:
Reviewed by @benharsh - thanks!