-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Should split-init be allowed within a try block? #14748
Comments
For now the compiler will disallow it (option 1) but changing this is arguably a breaking change (it changes the pattern of default-init / assign / deinit calls so could lead to programs behaving differently). |
Options 2 and 3 seem reasonable to me, but I'll admit I haven't thought through it that much. This usage, however, is something that I have wanted a few times. (That's not to say it's a good idea, but it certainly helped motivate #14271.) For example: try {
var x = funcMayThrow();
// ... use x ...
// lots of lines of code ...
}
catch {
// handler for funcMayThrow and nothing else
} There isn't a clean way to denote that the catch logic really only handles the variable initialization despite all the intervening code in the The alternative is to do something like: var x: SomeType = sentinelValue; // Take a default init value.
try {
x = funcMayThrow();
} catch {
// ...
}
if x == sentinelValue {
// handle illegal state
}
// ... lots of lines of code ... Maybe this idiom is sufficient because it makes the Options 2 and 3 cases more explicit in the user code instead of some implicit design decision of the language. But |
Just to make the example more concrete, you're saying perhaps one would want to write: var x;
try {
x = funcMayThrow();
}
catch {
// handler for funcMayThrow and nothing else
x = makeReplacementValue();
}
// ... use x ...
// lots of lines of code ... But in that event don't we still have a "sentinel" value of some sort?
I think that's not necessarily the case. For example, we might have var response: Response = new FailureResponse();
try {
response = processMessage(msg);
} catch e {
response = new FailureResponse(e);
}
sendResponse(response); Here I wouldn't think of Here is a potential split-init version of the same thing: var response;
try {
response = processMessage(msg);
} catch e {
response = new FailureResponse(e);
}
sendResponse(response); |
At the moment I am leaning towards option 3 above (Only allow such split-init with try or try! if the catch blocks also initialize the variable.) My thinking here is:
var response: Response = new FailureResponse(); // wish to leave this init out!
try {
response = processMessage(msg);
} catch e {
response = new FailureResponse(e);
}
sendResponse(response); BTW the "use a wrapper function" strategy would be something like this: var response = generateResponse(msg);
proc generateResponse(msg) {
try {
return processMessage(msg);
} catch e {
return new FailureResponse(e);
}
} I think that option 2 (Only allow such split-init with try or try! if there are no catch blocks) has the drawback of being arguably surprising since Although option 1 (Don't allow split-init to have an initialization point within a nested try or try! block) is implemented now, that's just because it is the easiest to do near-term. It's not the strategy I'd like to see long-term, as I was surprised not to see split-init when working with a code example with a try without catches. |
I'm okay with all of that. I realized that the wrapper-function strategy would also work as a lambda: var response = lambda(msg) {
try {
return processMessage(msg);
} catch e {
return new FailureResponse(e);
}
}(msg); That might be a useful stop-gap for users with this usage pattern but the feature isn't implemented yet. |
Adjust point of deinitialization based on expiring value analysis This PR updates the compiler to use expiring value analysis in order to choose the point at which `deinit` for a local record variable is run. It does this according to the discussion in #13704. This PR includes several minor modification to the rules described in #13704. 1. Variables referred to in `begin` blocks are considered to expire at the end of the function (and issue #14750 discusses this and how it should interact with`sync` blocks) 2. An exception for `=` and `init=` for managed types. Because ownership transfer from non-nil is one of the use cases, we need the analysis to not consider `x` to potentially alias `y` in the below code, when both are `owned` - `var x = y;`. In general, if these are classes or records, they could share a reference to a field. For `owned` however, the right-hand-side of the initialization is left `nil` after the statement, and so there is no potential for aliasing between `x` and `y` after the statement. Additionally, for `shared`, while `x` and `y` refer to the same memory, this has no impact on when `x` or `y` can be deinitialized or a copy could be avoided. Therefore, when considering if a potential alias is created, the analysis ignores assignment and copy-initialization of managed types. Additionally, it improves the split-init implementation. Previous to this PR, an initialization point like `x = functionReturningByValue()` would cause the compiler to emit `init=` when it did not need to. Additionally, the compiler was emitting `deinit` calls for variables that were declared but not yet initialized if there was a return from the block. Lastly, this PR extends split-init to work with arrays (where before it was disabled in resolution). Lastly, it adjusts the lifetime checker to now require a lifetime clause on `=` or `<=>` overloads for types that contain borrows. Implementation steps: * Remove FnSymbol::replaceReturnSymbol - it is unused * rename `_statementLevelSymbol` to `chpl_statementLevelSymbol` and add `astr_chpl_statementLevelSymbol` to the compiler * Print out variables stored directly in ForallStmts as "other variables" in the AST logs * Add flags to disable split init and early deinit; change fLifetimeChecking to fNoLifetimeChecking to reflect the current default * Move addAutoDestroyCalls to after ret-arg transform so it can run after lifetime checking (where the expiring value analysis runs) and make several adjustments to handle ret-args in addAutoDestroyCalls. * Add code to normalize to mark the end-of-statement. Normalize adds a `PRIM_END_OF_STATEMENT` call after the end of each statement that isn't already at the end of a block or that refers to user variables. `PRIM_END_OF_STATEMENT` arguments are SymExprs referring to user variables that should be preserved until the end of that statement, which matters for code that is removed before callDestructors, such as `x.type`. Adjusted the unordered forall optimization and flatten functions to ignore these `PRIM_END_OF_STATEMENT` calls. These `PRIM_END_OF_STATEMENT` calls are removed in callDestructors. * Updated the normalizer code for split-init to properly handle `x = makeRecord()` in addition to `x = new MyRecord()` and to store the type-expr in a temporary that is passed to both `PRIM_INIT_VAR_SPLIT_DECL` and `PRIM_INIT_VAR_SPLIT_INIT` to enable split init of arrays * Updated the normalizer code for split init to explicitly handle TryStmts even though they currently disable split-init * Updated functionResolution to handle split-init of arrays * addAutoDestroyCalls creates a map from a statement to which variables should be destroyed there in order to implement the early deinit. Any variables not in the map will be destroyed according to previous rules (i.e. at the end of the block). Once deinit calls for these are added, the variable is added to the ignore set. * updated addAutoDestroyCalls and AutoDestroyScope to track which variables have been initialized (in case they are split-inited) and to propagate that information to parent blocks (in case a variable is initialized with split-init in a nested block). Test updates: * updating tests that check deinit occurs for end-of-block variables on return/throw/etc to force some variables to be end-of-block * work-arounds for #14746 * updating tests for new deinit behavior and in some cases test both last-mention and end-of-block * new tests of new behavior Reviewed by @vasslitvinov - thanks! - [x] full local testing Future Work: * document deinit points in the language specification - Cray/chapel-private#690 * deinitialize in reverse initialization order rather than reverse declaration order - Cray/chapel-private#691 Related Design Questions: * Expiring value analysis interaction with `=` and `init=` for managed types (described above) * Lifetime checking and = overloads #14757 * Split-init interactions with deinitialization order #14747 * Should split-init be allowed within a try block? #14748 * Should last mention deinitialization point go inside blocks? #14749 * Expiring value analysis and begin blocks #14750 * Split-init interaction with warning about local-from-distributed #14746 * Should unused record variables be removed? #14758
Overall I like option 3, with a fallback to option 1. My concern is the cases where it is unknown at compile time whether initialization will happen. var x;
try {
mayThrow1();
x = initExpr;
mayThrow2();
}
catch {
x = anotherExpr; // initialization if mayThrow1() throws, otherwise assignment
} This issue proposes to deinitialize At the same time I agree that we want to distinguish initialization from assignment at compile time. To resolve this, I propose making the uncertain cases like this one a compile-time error "currently not supported". |
Well, technically, it will deinitalizes x when control flow leaves the
What exactly would you propose as the rule the compiler would use to know the boundary condition here? The compiler could check if there are additional calls (such as The catch blocks will always initialize the outer variable, in that case. Examplesvar x;
try {
mayThrow1();
x = initExpr;
mayThrow2();
}
catch {
x = anotherExpr;
} Error: mayThrow2 might move control flow to catch block after var x;
try {
mayThrow1();
x = initExpr;
}
catch {
x = anotherExpr;
} OK var x;
try {
mayThrow1();
x = mayThrowInitExpr();
}
catch {
x = anotherExpr;
} OK var x;
try {
mayThrow1();
x = mayThrowInitExpr();
cannotThrow();
}
catch {
x = anotherExpr;
} OK |
Here is how I see it. For each By the time the control flow reaches a An example of this without a try block would be: var x;
if (a nonparam condition) {
x = 1;
}
x = 2; // when execution reaches this,
// 'x=1' may or may not have been executed |
It might just be a definitions-of-terms issue, but if the compiler observes "that another Anyway I find it hard to understand what you're advocating for here in a way that would allow me to implement anything. Could you show some try/catch examples along with what they would do and why? (You could start with my examples in my previous comment). |
Hm maybe I misunderstand the current rules. What happens in this case? var x;
if (a nonparam condition) {
x = 1;
}
x = 2; My examples would be the same as your examples (addressing @mppf here), with the same outcomes. I think we are just defining the rule(s) that produce those outcomes. |
[Continuing my previous comment.] My rule applies to the example with a conditional because by the time the control flow reaches Here are some other examples that I can think of: var x;
try {
mayThrow();
x = 1;
} catch {
// does not mention x
} Error: x may be left uninitialized right after the try. var x;
try {
throw ....;
} catch {
x = 1;
} OK -- x is initialized in the catch block -- if this is a catch-all, error otherwise. var x;
try {
throw ...;
} catch MyError {
// does not mention x
} catch {
// catch-all
x = 1;
} Error because x may or may not end up getting initialized. I suspect we can postpone support for split-initialization within a catch-block. For now we could mark those cases with an error "currently unsupported". |
This is a compilation error. If it were
They are not the same and the rules are actually quite different. The split-init rules that are in the specification don't use arbitrary control flow analysis. Instead, they define within which constructs a split-init can occur if there are no mentions of the variable before something that looks like assignment. Here "before" generally means "appears before in the program text" (except for conditionals). Split-init can go into a regular nested block. It can also go into a conditional if both I hope this is clearer in the spec itself which is already updated for split init.
Isn't that Option 2 "Only allow such split-init with try or try! if there are no catch blocks"? Choosing Option 2 now and relaxing it later would be a breaking change. Consider this example: var x: MyRecord;
try {
x = createRecordMaybeThrowing(); // 1
} catch {
x = new MyRecord();
} In choosing 2 or 3, we are saying whether or not We could make this example be a compilation error for now - but that seems to me to unnecessarily prevent code that worked before from working today. I have a different answer about one of your examples:
I don't think this one should compile. I don't think the split-init logic should do arbitrary control flow analysis; rather I think it should look at the structure of the program as written and where What about this example? var x;
try {
x = cannotThrow();
maybeThrow();
} catch {
// no mention of x at all
} I think you would argue that this should compile because a control-flow analysis indicates x is always initialized. I don't think split-init rules should work this way. I don't think it should compile. But anyway I think the main issue is this example: var x: MyRecord;
try {
maybeThrow();
x = new MyRecord(); // initialization
maybeThrow();
} catch {
x = new MyRecord(); // point 2: initialization or assignment?
} I know of the following options: I continue to prefer 3b for this. |
Thank you for the clarifications @mppf. I hope I can now speak your language. Here is how your example is handled under Option 3a, which is my preference: var x: MyRecord;
try {
maybeThrow(); // 1
x = new MyRecord(); // 2
maybeThrow(); // 3
} catch {
x = new MyRecord(); // 4
} The compiler would examine the program using the before-after relationship, making an exception for the catch block, like it does for conditionals. The steps of examining the program are:
I expect it to be possible to formulate the corresponding rules that match the style of the spec.
I meant to say "split-initialization ONLY within a catch block". Sorry for the ensuing confusion. I think if for now we issue an error when it happens, we can lift that error later without it being a breaking change. Or, we can allow it right away, imposing sufficient constraints for it to be sound. For this example: var x;
try {
x = cannotThrow();
maybeThrow();
} catch {
// no mention of x at all
} I am OK if it is an error for now. We can lift this restriction later. |
@vasslitvinov - thanks for your post, yes we are more on the same page now language-wise.
There is a pretty serious problems with this approach. The split-init logic happens in a type-independent way by looking at the structure of the code. That is by design (but also means it's implemented in normalize). It doesn't have the ability to look at a function call and decide if it can possibly throw or not. The result of that is that while we could make the above example result in compilation error, we would only be able to issue that error after the compiler had already made a decision about whether or not to use split-init. That would be surprising for users because code that used to work with default-init / assign would now be a compilation error. In other words, we would like the split-init rules to allow programs to continue to work (even if sometimes default-init/assign is replaced by init) and that happens today by the split-init rules deciding not to split-init in cases that would introduce an error. The way I see it, that leaves us with two options: 2 Only allow such split-init with try or try! if there are no catch blocks 3b In the event that x is inited in the try block and then an error is thrown to the catch blocks, deinit x and then re-initialize it. Separately, I'm looking at extending the split-init logic to tolerate conditionals or catch blocks that throw or return. So 2 would become "Only allow such split-init with try or try! if there are no catch blocks or if all catch blocks throw/return." |
I created a split-init example inspired by earlier work on migrating Arkouda to non-nilable class types. This example depends on being able to do split-init with the initialization in a nested chapel/test/types/records/split-init/split-init-examples-inspired-by-ak.chpl Lines 114 to 121 in f66b7c6
|
I've created #14815 about split-init with conditionals where one side throws or returns. |
Michael and I talked some more about these two "finalists" :
The leading motivation for 3b is that it enables this example: var response; // we want this to be a non-nilable class
try {
response = processMessage(msg);
} catch e {
response = new FailureResponse(e);
}
sendResponse(response); where Given that we allow the compiler to replace TWO init+deinit pairs with ONE by replacing an (initialization + assignment) with a move, in some cases, we can also allow the compiler to replace ONE init+deinit with TWO. The key motivations for 2' are:
The above example would have the desired behavior under 3b. It would be rejected by 2' because of potentially accessing an uninitialized variable We could accept 2' simply on the grounds that the questionable scenario is currently rejected and we can enable it later when there is a compelling use case, hopefully giving us a better idea of what the semantics should be. However, this argument does not hold for a slight modification of the above code: var r: MyRecord;
try {
something;
r = makeMyRecord();
somethingElse;
} catch e {
r = alsoCompute();
} Both 2' and 3b accept this code, however produce different behavior when 2' postulates that 3b possibly initializes We are leaning towards 2' on the grounds of "it is weird to use the catch-clause to initialize an outer variable" and because the other behavior can still be obtained when desired, for example by using a helper function. |
Improve out intent This PR improves out intent in several ways: * out intent transfers a value from the called function to the call site, just like return does * a variable passed to an out intent formal can be split-inited While there, it improves split-init to not consider an unconditional `return` to prevent split-init. PR #14814 and related issue #14748 made split-init allow things like `if something then return` between the declaration and the applicable assignment. But a simple `return` would disable split-init which is counter-intuitive and reflected an implementation shortcoming. Note though that when the out intent formal at the called function is split inited, a `return` before the applicable assignment needs to disable split init. E.g. ``` chapel proc f(out arg) { if something then return; arg = f(); // not a split init because out needs to be inited when return occurs } ``` As a result, it was necessary to adjust the split init implementation to differentiate between throwing and returning and to include the flag `allowReturns` to toggle this behavior for local variables vs. out intent formals. Previous to this PR, all split-init logic is in normalize. However, since split-init for `out` formals is not possible to identify until calls have been resolved to functions, this PR makes implementation changes to allow function resolution to identify split-init. One might be tempted to move all split-init logic into resolution, but split-init cases that identify the type of the initialized variable cannot be handled in resolution, because the calls involving that variable (that possibly involve the `out` intent, say) cannot be resolved until the type is known. For example: ``` chapel { var x; // type not established until split-init is considered x = 1; } ``` As a result, identifying split-init happens both in normalize and resolution. However this PR factors the code so that the same code applies in both cases. In order to do handle split-init in resolution, this PR splits `PRIM_INIT_VAR` resolution into two phases. The first phase is implemented in `resolvePrimInit` and is invoked in `resolveCall` and at that point only types are established. The second phase is called `lowerPrimInit` and it is invoked in `resolveFunction` after split-init determinations are made. This change ran into a design question about whether a default-init with a generic type e.g. `var x: MyGenericRecord;` which is discussed in #8499. This PR makes this pattern an error for the time being. Split-init for `out` intent has a slight difference from split-init for other variables. The type of the initialized variable cannot be determined by the `out` formal because the type of the initialized variable will be used in order to resolve the call. Consider for example: ``` chapel proc f(out arg) { } proc foo() { var x; // x = 1 would be allowed and would set the type of x, but f(x); // results in error because `f` cannot be used to establish the type of `x` } ``` As a result, for `out` intent split-init, the type information is given by the variable declaration and propagated to a generic function. For array formals this adds a further complication with runtime types. For example, consider the following program: ``` chapel proc g(out arg) { writeln(arg.domain); } proc bar() { var A:[1..10] int; g(A); } ``` Here, the type of `arg` in the instantiation of `g` is given by the call site (and A specifically in this case). Therefore, for an array, which has a runtime type (representing the domain `{1..10}` in this case), the runtime type needs to be communicated to the function `g`. The compiler takes the approach of adding a hidden type formal before the `out arg` formal and fixing it up after split inits are determined in resolveFunction. Other implementation details: * migrated split init logic from normalize into splitInit.cpp and adjusts it to return a use that prevented split init (for error reporting). Adjusted the split init logic to include a `bool allowReturns` to behave differently for variables that need to be initialized even with early returns. Added `FOUND_THROW` to the `found_init_t` enum and added handleReturn to look for uses or inits after a return. * in normalize, factors several repeated pieces of error handling into an improved errorIfSplitInitializationRequired function * renamed containsSymExprFor to findSymExprFor for better error messages in split init code * added `GOTO_ERROR_HANDLING_RETURN` which is a variant of `GOTO_RETURN` indicating a `throw` return from a function * moved the removal of elided `on` blocks (that is, `on` blocks in a local compilation configuration) from createTaskFunctions to callDestructors, so that elided on blocks are present when split-init for out is identified during resolution. * renamed shouldAddFormalTempAtCallSite to shouldAddInFormalTempAtCallSite now that we add temps for out intent formals at call sites * adjusted AutoDestroyScope / addAutoDestroyCalls to gather the write-backs to formal temporaries and add these write-back calls just before destroying them. Other than that, callDestructors now treats formal temps as regular variables. addFormalTemps and findFormalTempAssignBack gather the write-back calls. The write-back for formals is added, in declaration order, in variablesDestroy, just before the destructors are called on a given path. By moving this from the function epilogue, the code can now treat returns for error-handling differently from regular returns. For regular returns, the write-back needs to occur; but for a `throw`, it should not. * In generics.cpp, added fixupUntypedOutArgRTTs which adds a type formal marked with `FLAG_TYPE_FORMAL_FOR_OUT` to functions accepting an `out` intent argument without a type. Adjusted ResolutionCandidate.cpp to ignore the hidden type arguments for untyped out formals (marked with `FLAG_TYPE_FORMAL_FOR_OUT`). * adjusted wrappers.cpp to add handleOutIntents which adds some logic at the call site for out intent formals. This also includes logic for passing runtime types to untyped out formals but mainly amounts to adding a dummy argument that will be fixed up in AddOutIntentTypeArgs after split-init handling * adjusted resolveFunction to include a new function fixPrimInitsAndAddCasts (to be called now instead of insertAndResolveCasts) which runs a new SplitInitVisitor, AddOutIntentTypeArgs, insertAndResolveCasts, and FixPrimInitsVisitor. Adjusted addLocalCopiesAndWritebacks to transfer to `out` formals, similarly to `return`. * adjusted lvalue checking to ignore the formal temps for `out` at the call site * adjusted resolveInitVar to allow avoiding a copy for `FLAG_EXPR_TEMP` values * adjusted findZeroArgInitFn to handle more cases that lowerPrimInit does * adjusts checkParsed to add an error for param/type returning functions using out or inout intents Related Future Work: * task to update spec is tracked in Cray/chapel-private#719 * split-init for out intent and folded param conditional code #14996 * Partial Instantiations: expressing the concrete version of 'range' #13566 * Generate type constructors based on initializers #8499 * missing compilation error for record initializer changing type #14945 * typed array argument with in or out intent does not accept a Block distributed array #15014 Reviewed by @benharsh - thanks! - [x] full local futures testing - [x] primers pass with valgrind and do not leak - [x] primers pass with --verify
This is a spin-off from #14271.
Should split-init be able to find an initialization point within a nested
try
ortry!
block?For example:
An error thrown in this situation is similar to an early return and the compiler can know that the variable hasn't been initialized in that case.
If there are catch blocks, the situation becomes more complex. It becomes possible for an error to be thrown and leave the variable uninitialized. For example:
The compiler could take the approach of ensuring that, in the event of an error caught, all variables being initialized in the try section are deinitialized; and then that they are initialized again in the
catch
blocks. It would have to check for uses of split-init variables in catch blocks before they are initialized.So I currently see 3 options:
try
ortry!
block.try
ortry!
if there are nocatch
blockstry
ortry!
if thecatch
blocks also initialize the variable.The text was updated successfully, but these errors were encountered: