Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Should split-init be allowed within a try block? #14748

Closed
mppf opened this issue Jan 14, 2020 · 18 comments
Closed

Should split-init be allowed within a try block? #14748

mppf opened this issue Jan 14, 2020 · 18 comments

Comments

@mppf
Copy link
Member

mppf commented Jan 14, 2020

This is a spin-off from #14271.

Should split-init be able to find an initialization point within a nested try or try! block?

For example:

  var r;
  try {
    r = new R();
  }

An error thrown in this situation is similar to an early return and the compiler can know that the variable hasn't been initialized in that case.

If there are catch blocks, the situation becomes more complex. It becomes possible for an error to be thrown and leave the variable uninitialized. For example:

  var r;
  try {
    r = returnsR(throw=true); // throws an error
  } catch {
    // r not initialized if returnsR threw an error
  }

The compiler could take the approach of ensuring that, in the event of an error caught, all variables being initialized in the try section are deinitialized; and then that they are initialized again in the catch blocks. It would have to check for uses of split-init variables in catch blocks before they are initialized.

So I currently see 3 options:

  1. Don't allow split-init to have an initialization point within a nested try or try! block.
  2. Only allow such split-init with try or try! if there are no catch blocks
  3. Only allow such split-init with try or try! if the catch blocks also initialize the variable.
@mppf
Copy link
Member Author

mppf commented Jan 14, 2020

For now the compiler will disallow it (option 1) but changing this is arguably a breaking change (it changes the pattern of default-init / assign / deinit calls so could lead to programs behaving differently).

@BryantLam
Copy link

Options 2 and 3 seem reasonable to me, but I'll admit I haven't thought through it that much. This usage, however, is something that I have wanted a few times. (That's not to say it's a good idea, but it certainly helped motivate #14271.) For example:

try {
  var x = funcMayThrow();
  // ... use x ...
  // lots of lines of code ...
}
catch {
  // handler for funcMayThrow and nothing else
}

There isn't a clean way to denote that the catch logic really only handles the variable initialization despite all the intervening code in the try block. Split-init would help to bring the var declaration out of the try block while also minimizing the catch logic to only where funcMayThrow is called.

The alternative is to do something like:

var x: SomeType = sentinelValue; // Take a default init value.

try {
  x = funcMayThrow();
} catch {
  // ...
}

if x == sentinelValue {
  // handle illegal state
}

// ... lots of lines of code ...

Maybe this idiom is sufficient because it makes the Options 2 and 3 cases more explicit in the user code instead of some implicit design decision of the language. But var x is technically now an optional because it needs some None state.

@mppf
Copy link
Member Author

mppf commented Jan 15, 2020

Split-init would help to bring the var declaration out of the try block while also minimizing the catch logic to only where funcMayThrow is called.

Just to make the example more concrete, you're saying perhaps one would want to write:

var x;
try {
  x = funcMayThrow();
}
catch {
  // handler for funcMayThrow and nothing else
  x = makeReplacementValue();
}
// ... use x ...
// lots of lines of code ...

But in that event don't we still have a "sentinel" value of some sort?

But var x is technically now an optional because it needs some None state.

I think that's not necessarily the case. For example, we might have

var response: Response = new FailureResponse();
try {
  response = processMessage(msg);
} catch e {
  response = new FailureResponse(e);
}
sendResponse(response);

Here I wouldn't think of FailureResponse as equivalent to None.

Here is a potential split-init version of the same thing:

var response;
try {
  response = processMessage(msg);
} catch e {
  response = new FailureResponse(e);
}
sendResponse(response);

@mppf
Copy link
Member Author

mppf commented Jan 16, 2020

At the moment I am leaning towards option 3 above (Only allow such split-init with try or try! if the catch blocks also initialize the variable.)

My thinking here is:

  • error handling was one of the motivations for split-init (as @BryantLam said above)
  • try/catch is philisophically similar to if/else which we already support
  • the implementation seems achievable
  • it would allow one to avoid using a nilable type, creating an extra allocation, or using a wrapper function in the below example:
var response: Response = new FailureResponse(); // wish to leave this init out!
try {
  response = processMessage(msg);
} catch e {
  response = new FailureResponse(e);
}
sendResponse(response);

BTW the "use a wrapper function" strategy would be something like this:

var response = generateResponse(msg);

proc generateResponse(msg) {
  try {
    return processMessage(msg);
  } catch e {
    return new FailureResponse(e);
  }
}

I think that option 2 (Only allow such split-init with try or try! if there are no catch blocks) has the drawback of being arguably surprising since try without a catch block is similar to try with a catch block - the catch block is optional but doesn't fundamentally change what try is. In other words it would make the distinction between try+catch and try without catch more relevant but that's not necessarily good.

Although option 1 (Don't allow split-init to have an initialization point within a nested try or try! block) is implemented now, that's just because it is the easiest to do near-term. It's not the strategy I'd like to see long-term, as I was surprised not to see split-init when working with a code example with a try without catches.

@BryantLam
Copy link

I'm okay with all of that. I realized that the wrapper-function strategy would also work as a lambda:

var response = lambda(msg) {
  try {
    return processMessage(msg);
  } catch e {
    return new FailureResponse(e);
  }
}(msg);

That might be a useful stop-gap for users with this usage pattern but the feature isn't implemented yet.

mppf added a commit that referenced this issue Jan 16, 2020
Adjust point of deinitialization based on expiring value analysis

This PR updates the compiler to use expiring value analysis in order to
choose the point at which `deinit` for a local record variable is run. It
does this according to the discussion in #13704.

This PR includes several minor modification to the rules described in
#13704.
1. Variables referred to in `begin` blocks are considered to expire at
   the end of the function (and issue #14750 discusses this and how it
   should interact with`sync` blocks)
2. An exception for `=` and `init=` for managed types. Because ownership
   transfer from non-nil is one of the use cases, we need the analysis to
   not consider `x` to potentially alias `y` in the below code, when both
   are `owned` - `var x = y;`. In general, if these are classes or
   records, they could share a reference to a field. For `owned` however,
   the right-hand-side of the initialization is left `nil` after the
   statement, and so there is no potential for aliasing between `x` and
   `y` after the statement. Additionally, for `shared`, while `x` and `y`
   refer to the same memory, this has no impact on when `x` or `y` can be
   deinitialized or a copy could be avoided. Therefore, when considering
   if a potential alias is created, the analysis ignores assignment and
   copy-initialization of managed types.

Additionally, it improves the split-init implementation. Previous to this
PR, an initialization point like `x = functionReturningByValue()` would
cause the compiler to emit `init=` when it did not need to. Additionally,
the compiler was emitting `deinit` calls for variables that were declared
but not yet initialized if there was a return from the block. Lastly,
this PR extends split-init to work with arrays (where before it was
disabled in resolution).

Lastly, it adjusts the lifetime checker to now require a lifetime clause
on `=` or `<=>` overloads for types that contain borrows.

Implementation steps:
* Remove FnSymbol::replaceReturnSymbol - it is unused
* rename `_statementLevelSymbol` to `chpl_statementLevelSymbol` and add
  `astr_chpl_statementLevelSymbol` to the compiler
* Print out variables stored directly in ForallStmts as "other variables"
  in the AST logs
* Add flags to disable split init and early deinit; change
  fLifetimeChecking to fNoLifetimeChecking to reflect the current default
* Move addAutoDestroyCalls to after ret-arg transform so it can run after
  lifetime checking (where the expiring value analysis runs) and make
  several adjustments to handle ret-args in addAutoDestroyCalls.
* Add code to normalize to mark the end-of-statement. Normalize adds a
  `PRIM_END_OF_STATEMENT` call after the end of each statement that isn't
  already at the end of a block or that refers to user variables.
  `PRIM_END_OF_STATEMENT` arguments are SymExprs referring to user
  variables that should be preserved until the end of that statement,
  which matters for code that is removed before callDestructors, such as
  `x.type`. Adjusted the unordered forall optimization and flatten
  functions to ignore these `PRIM_END_OF_STATEMENT` calls. These
  `PRIM_END_OF_STATEMENT` calls are removed in callDestructors.
* Updated the normalizer code for split-init to properly handle `x =
  makeRecord()` in addition to `x = new MyRecord()` and to store the
  type-expr in a temporary that is passed to both
  `PRIM_INIT_VAR_SPLIT_DECL` and `PRIM_INIT_VAR_SPLIT_INIT` to enable split
  init of arrays
* Updated the normalizer code for split init to explicitly handle
  TryStmts even though they currently disable split-init
* Updated functionResolution to handle split-init of arrays
* addAutoDestroyCalls creates a map from a statement to which variables
  should be destroyed there in order to implement the early deinit. Any
  variables not in the map will be destroyed according to previous rules
  (i.e. at the end of the block). Once deinit calls for these are added,
  the variable is added to the ignore set.
* updated addAutoDestroyCalls and AutoDestroyScope to track which
  variables have been initialized (in case they are split-inited) and to
  propagate that information to parent blocks (in case a variable is
  initialized with split-init in a nested block).

Test updates:
 * updating tests that check deinit occurs for end-of-block variables on
   return/throw/etc to force some variables to be end-of-block
 * work-arounds for #14746
 * updating tests for new deinit behavior and in some cases test both
   last-mention and end-of-block
 * new tests of new behavior

Reviewed by @vasslitvinov - thanks!

- [x] full local testing

Future Work:
 * document deinit points in the language specification -
   Cray/chapel-private#690
 * deinitialize in reverse initialization order rather than reverse
   declaration order - Cray/chapel-private#691

Related Design Questions:
 * Expiring value analysis interaction with `=` and `init=` for managed
   types (described above)
 * Lifetime checking and = overloads #14757 
 * Split-init interactions with deinitialization order #14747 
 * Should split-init be allowed within a try block? #14748 
 * Should last mention deinitialization point go inside blocks? #14749 
 * Expiring value analysis and begin blocks #14750 
 * Split-init interaction with warning about local-from-distributed
   #14746
 * Should unused record variables be removed? #14758
@vasslitvinov
Copy link
Member

Overall I like option 3, with a fallback to option 1.

My concern is the cases where it is unknown at compile time whether initialization will happen.
For example:

var x;
try {
  mayThrow1();
  x = initExpr;
  mayThrow2();
}
catch {
  x = anotherExpr;   // initialization if mayThrow1() throws, otherwise assignment
}

This issue proposes to deinitialize x upon entering the catch block if it has been initialized. This way, the = in the catch block is surely initialization. However, on users' behalf, I do not like such extra deinitialization. If I go through the trouble of initializing and populating my data structure, I may not want it to disappear mid-way in my computation. Even if it happens because of an exception.

At the same time I agree that we want to distinguish initialization from assignment at compile time.

To resolve this, I propose making the uncertain cases like this one a compile-time error "currently not supported".

@mppf
Copy link
Member Author

mppf commented Jan 22, 2020

This issue proposes to deinitialize x upon entering the catch block if it has been initialized.

Well, technically, it will deinitalizes x when control flow leaves the try block.

To resolve this, I propose making the uncertain cases like this one a compile-time error "currently not supported".

What exactly would you propose as the rule the compiler would use to know the boundary condition here? The compiler could check if there are additional calls (such as mayThrow2 above) after the initialization in the try block that may throw. Is that what you had in mind?

The catch blocks will always initialize the outer variable, in that case.

Examples
var x;
try {
  mayThrow1();
  x = initExpr;
  mayThrow2();
}
catch {
  x = anotherExpr;
}

Error: mayThrow2 might move control flow to catch block after x is initialized.

var x;
try {
  mayThrow1();
  x = initExpr;
}
catch {
  x = anotherExpr;
}

OK

var x;
try {
  mayThrow1();
  x = mayThrowInitExpr();
}
catch {
  x = anotherExpr;
}

OK

var x;
try {
  mayThrow1();
  x = mayThrowInitExpr();
  cannotThrow();
}
catch {
  x = anotherExpr;
}

OK

@vasslitvinov
Copy link
Member

vasslitvinov commented Jan 22, 2020

Here is how I see it. For each x = ...; in the source code, the compiler needs to tell whether it is an assignment or a split-initialization. This is not possible, so is a compile-time error, when:

By the time the control flow reaches a x = ...; that could be a split-init, the compiler observes that another x = ...; may or may not have been executed.

An example of this without a try block would be:

var x;
if (a nonparam condition) {
  x = 1;
}
x = 2;  // when execution reaches this,
        // 'x=1' may or may not have been executed

@mppf
Copy link
Member Author

mppf commented Jan 23, 2020

It might just be a definitions-of-terms issue, but if the compiler observes "that another x = ...; may or may not have been executed" then it won't be split-init.

Anyway I find it hard to understand what you're advocating for here in a way that would allow me to implement anything. Could you show some try/catch examples along with what they would do and why? (You could start with my examples in my previous comment).

@vasslitvinov
Copy link
Member

if the compiler observes "that another x = ...; may or may not have been executed" then it won't be split-init.

Hm maybe I misunderstand the current rules. What happens in this case?

var x;
if (a nonparam condition) {
  x = 1;
}
x = 2; 

My examples would be the same as your examples (addressing @mppf here), with the same outcomes. I think we are just defining the rule(s) that produce those outcomes.

@vasslitvinov
Copy link
Member

[Continuing my previous comment.]

My rule applies to the example with a conditional because by the time the control flow reaches x=2, x=1 may (if the conditional is true) or may not (otherwise) have been executed.

Here are some other examples that I can think of:

var x;
try {
  mayThrow();
  x = 1;
} catch {
  // does not mention x
}

Error: x may be left uninitialized right after the try.

var x;
try {
  throw ....;
} catch {
  x = 1;
}

OK -- x is initialized in the catch block -- if this is a catch-all, error otherwise.

var x;
try {
  throw ...;
} catch MyError {
  // does not mention x
} catch {
  // catch-all
  x = 1;
}

Error because x may or may not end up getting initialized.

I suspect we can postpone support for split-initialization within a catch-block. For now we could mark those cases with an error "currently unsupported".

@mppf
Copy link
Member Author

mppf commented Jan 23, 2020

if the compiler observes "that another x = ...; may or may not have been executed" then it won't be split-init.

Hm maybe I misunderstand the current rules. What happens in this case?

var x;
if (a nonparam condition) {
  x = 1;
}
x = 2; 

This is a compilation error. If it were var x: MyRecord, it would be default initialized if possible. The compiler interprets x = 1 and x = 2 above as assignments.

My examples would be the same as your examples (addressing @mppf here), with the same outcomes. I think we are just defining the rule(s) that produce those outcomes.

They are not the same and the rules are actually quite different.

The split-init rules that are in the specification don't use arbitrary control flow analysis. Instead, they define within which constructs a split-init can occur if there are no mentions of the variable before something that looks like assignment. Here "before" generally means "appears before in the program text" (except for conditionals). Split-init can go into a regular nested block. It can also go into a conditional if both then and else branches have a valid split-init point (an assignment with no previous mentions).

I hope this is clearer in the spec itself which is already updated for split init.

I suspect we can postpone support for split-initialization within a catch-block. For now we could mark those cases with an error "currently unsupported".

Isn't that Option 2 "Only allow such split-init with try or try! if there are no catch blocks"? Choosing Option 2 now and relaxing it later would be a breaking change.

Consider this example:

var x: MyRecord;
try {
  x = createRecordMaybeThrowing(); // 1
} catch {
  x = new MyRecord();
}

In choosing 2 or 3, we are saying whether or not x will be default-init and then assigned or initialized on line 1. This difference is observable.

We could make this example be a compilation error for now - but that seems to me to unnecessarily prevent code that worked before from working today.

I have a different answer about one of your examples:

var x;
try {
  throw ....;
} catch {
  x = 1;
}

OK -- x is initialized in the catch block -- if this is a catch-all, error otherwise.

I don't think this one should compile. I don't think the split-init logic should do arbitrary control flow analysis; rather I think it should look at the structure of the program as written and where x is mentioned and assigned to.

What about this example?

var x;
try {
  x = cannotThrow();
  maybeThrow();
} catch {
  // no mention of x at all
}

I think you would argue that this should compile because a control-flow analysis indicates x is always initialized. I don't think split-init rules should work this way. I don't think it should compile.

But anyway I think the main issue is this example:

var x: MyRecord;
try {
  maybeThrow();
  x = new MyRecord(); // initialization
  maybeThrow();
} catch {
  x = new MyRecord(); // point 2: initialization or assignment?
}

I know of the following options:
1,2 Don't allow split-init for this example (which is what happens on master with this now)
3a Make this example a compilation error (because it is not knowable at compile time whether or not x = ... at point 2 is initialization or assignment).
3b In the event that x is inited in the try block and then an error is thrown to the catch blocks, deinit x and then re-initialize it.

I continue to prefer 3b for this.

@vasslitvinov
Copy link
Member

Thank you for the clarifications @mppf. I hope I can now speak your language.

Here is how your example is handled under Option 3a, which is my preference:

var x: MyRecord;
try {
  maybeThrow();       // 1
  x = new MyRecord(); // 2
  maybeThrow();       // 3
} catch {
  x = new MyRecord(); // 4
}

The compiler would examine the program using the before-after relationship, making an exception for the catch block, like it does for conditionals. The steps of examining the program are:

  • Statement 1 - does not affect x directly. Since it can throw, examine the catch block. Since x has not been initialized before, Statement 4 is an initialization.

  • Statement 2 - since x has not been initialized before, it is an initialization.

  • Statement 3 - since it can throw, examine the catch block. Since Statement 4 is an initialization and x has already been initialized before (as appears in the program text), this is an error of double-initialization.

I expect it to be possible to formulate the corresponding rules that match the style of the spec.

I suspect we can postpone support for split-initialization within a catch-block. For now we could mark those cases with an error "currently unsupported".

I meant to say "split-initialization ONLY within a catch block". Sorry for the ensuing confusion. I think if for now we issue an error when it happens, we can lift that error later without it being a breaking change. Or, we can allow it right away, imposing sufficient constraints for it to be sound.

For this example:

var x;
try {
  x = cannotThrow();
  maybeThrow();
} catch {
  // no mention of x at all
}

I am OK if it is an error for now. We can lift this restriction later.

@mppf
Copy link
Member Author

mppf commented Jan 24, 2020

@vasslitvinov - thanks for your post, yes we are more on the same page now language-wise.

Statement 3 - since it can throw, examine the catch block. Since Statement 4 is an initialization and x has already been initialized before (as appears in the program text), this is an error of double-initialization.

There is a pretty serious problems with this approach.

The split-init logic happens in a type-independent way by looking at the structure of the code. That is by design (but also means it's implemented in normalize). It doesn't have the ability to look at a function call and decide if it can possibly throw or not.

The result of that is that while we could make the above example result in compilation error, we would only be able to issue that error after the compiler had already made a decision about whether or not to use split-init. That would be surprising for users because code that used to work with default-init / assign would now be a compilation error. In other words, we would like the split-init rules to allow programs to continue to work (even if sometimes default-init/assign is replaced by init) and that happens today by the split-init rules deciding not to split-init in cases that would introduce an error.

The way I see it, that leaves us with two options:

2 Only allow such split-init with try or try! if there are no catch blocks

3b In the event that x is inited in the try block and then an error is thrown to the catch blocks, deinit x and then re-initialize it.

Separately, I'm looking at extending the split-init logic to tolerate conditionals or catch blocks that throw or return. So 2 would become "Only allow such split-init with try or try! if there are no catch blocks or if all catch blocks throw/return."

@mppf
Copy link
Member Author

mppf commented Jan 24, 2020

I created a split-init example inspired by earlier work on migrating Arkouda to non-nilable class types.

This example depends on being able to do split-init with the initialization in a nested try block that has a catch that returns.

proc test2(t: Table, numerators:[] string, denominator:string): string {
var denominatorEntry: borrowed Entry;
try {
denominatorEntry = t.lookup(denominator);
} catch e {
return e.message();
}

@mppf
Copy link
Member Author

mppf commented Jan 24, 2020

I've created #14815 about split-init with conditionals where one side throws or returns.

@vasslitvinov
Copy link
Member

Michael and I talked some more about these two "finalists" :

  • 2' An occurrence of an outer variable x in a catch-block is never an initialization.

  • 3b An x = ...; in a catch block is an initialization if it would be an initialization in simpler code -- when that catch block were just a block occurring instead of the entire try+catch statement. In the event that x is already initialized in the try block and an error is thrown, x is deinitialized then re-initialized in the catch block.

The leading motivation for 3b is that it enables this example:

var response; // we want this to be a non-nilable class
try {
  response = processMessage(msg);
} catch e {
  response = new FailureResponse(e);
}
sendResponse(response);

where response would have to be default-initialized, and so be nilable, without 3b.

Given that we allow the compiler to replace TWO init+deinit pairs with ONE by replacing an (initialization + assignment) with a move, in some cases, we can also allow the compiler to replace ONE init+deinit with TWO.

The key motivations for 2' are:

  • This is a simple rule.

  • The catch block is intended for handling exceptional situations. Therefore it is weird to use it to initialize an outer variable.

  • Vass does not like the idea of suddenly deinitializing something. What if the outer variable in question obtains some resource upon initialization. Then the effect of 3b would be dropping that resource upon the error then re-acquiring it within the catch-clause. If there is no matching catch-clause so that the control flow leaves the scope of the variable, the resource is dropped as well, however it is not re-acquired later.

The above example would have the desired behavior under 3b.

It would be rejected by 2' because of potentially accessing an uninitialized variable response in the catch-block. This happens when processMessage(msg) throws.

We could accept 2' simply on the grounds that the questionable scenario is currently rejected and we can enable it later when there is a compelling use case, hopefully giving us a better idea of what the semantics should be.

However, this argument does not hold for a slight modification of the above code:

var r: MyRecord;
try {
  something;
  r = makeMyRecord();
  somethingElse;
} catch e {
  r = alsoCompute();
}

Both 2' and 3b accept this code, however produce different behavior when something throws:

2' postulates that r = alsoCompute() is a use of r. So we need to default-initialize r at declaration so that it is already initialized if the catch block is reached when something throws.

3b possibly initializes r with makeMyRecord(), which is a different behavior than always default-initializing r.

We are leaning towards 2' on the grounds of "it is weird to use the catch-clause to initialize an outer variable" and because the other behavior can still be obtained when desired, for example by using a helper function.

@mppf
Copy link
Member Author

mppf commented Jan 28, 2020

I understand that @benharsh is on board with 2' and PR #14814 implements it. Therefore I am closing this issue.

@mppf mppf closed this as completed Jan 28, 2020
@mppf mppf mentioned this issue Feb 25, 2020
3 tasks
mppf added a commit that referenced this issue Feb 27, 2020
Improve out intent

This PR improves out intent in several ways:
 * out intent transfers a value from the called function to the call
   site, just like return does
 * a variable passed to an out intent formal can be split-inited

While there, it improves split-init to not consider an unconditional
`return` to prevent split-init. PR #14814 and related issue #14748 made
split-init allow things like `if something then return` between the
declaration and the applicable assignment. But a simple `return` would
disable split-init which is counter-intuitive and reflected an
implementation shortcoming.

Note though that when the out intent formal at the called function is
split inited, a `return` before the applicable assignment needs to
disable split init. E.g.

``` chapel
proc f(out arg) {
  if something then return;
  arg = f(); // not a split init because out needs to be inited when return occurs
}
```

As a result, it was necessary to adjust the split init implementation to
differentiate between throwing and returning and to include the flag
`allowReturns` to toggle this behavior for local variables vs. out intent
formals.

Previous to this PR, all split-init logic is in normalize. However, since
split-init for `out` formals is not possible to identify until calls have
been resolved to functions, this PR makes implementation changes to allow
function resolution to identify split-init. One might be tempted to move
all split-init logic into resolution, but split-init cases that identify
the type of the initialized variable cannot be handled in resolution,
because the calls involving that variable (that possibly involve the
`out` intent, say) cannot be resolved until the type is known. For
example:

``` chapel
{
  var x; // type not established until split-init is considered
  x = 1;
}
```

As a result, identifying split-init happens both in normalize and
resolution. However this PR factors the code so that the same code
applies in both cases.

In order to do handle split-init in resolution, this PR splits
`PRIM_INIT_VAR` resolution into two phases. The first phase is
implemented in `resolvePrimInit` and is invoked in `resolveCall` and at
that point only types are established. The second phase is called
`lowerPrimInit` and it is invoked in `resolveFunction` after split-init
determinations are made. This change ran into a design question about
whether a default-init with a generic type e.g. `var x: MyGenericRecord;`
which is discussed in #8499. This PR makes this pattern an error for the
time being.

Split-init for `out` intent has a slight difference from split-init for
other variables. The type of the initialized variable cannot be
determined by the `out` formal because the type of the initialized
variable will be used in order to resolve the call. Consider for example:

``` chapel
proc f(out arg) { }
proc foo() {
  var x;
  // x = 1 would be allowed and would set the type of x, but
  f(x); // results in error because `f` cannot be used to establish the type of `x`
}
```

As a result, for `out` intent split-init, the type information is given
by the variable declaration and propagated to a generic function.

For array formals this adds a further complication with runtime types.
For example, consider the following program:

``` chapel
proc g(out arg) { writeln(arg.domain); }
proc bar() {
  var A:[1..10] int;
  g(A);
}
```

Here, the type of `arg` in the instantiation of `g` is given by the call
site (and A specifically in this case). Therefore, for an array, which
has a runtime type (representing the domain `{1..10}` in this case), the
runtime type needs to be communicated to the function `g`. The compiler
takes the approach of adding a hidden type formal before the `out arg`
formal and fixing it up after split inits are determined in
resolveFunction.

Other implementation details:
 * migrated split init logic from normalize into splitInit.cpp and
   adjusts it to return a use that prevented split init (for error
   reporting). Adjusted the split init logic to include a `bool
   allowReturns` to behave differently for variables that need to be
   initialized even with early returns. Added `FOUND_THROW` to the
   `found_init_t` enum and added handleReturn to look for uses or inits
   after a return.
 * in normalize, factors several repeated pieces of error handling into
   an improved errorIfSplitInitializationRequired function
 * renamed containsSymExprFor to findSymExprFor for better error messages
   in split init code
 * added `GOTO_ERROR_HANDLING_RETURN` which is a variant of `GOTO_RETURN`
   indicating a `throw` return from a function
 * moved the removal of elided `on` blocks (that is, `on` blocks in a
   local compilation configuration) from createTaskFunctions to
   callDestructors, so that elided on blocks are present when split-init
   for out is identified during resolution.
 * renamed shouldAddFormalTempAtCallSite to
   shouldAddInFormalTempAtCallSite now that we add temps for out intent
   formals at call sites
 * adjusted AutoDestroyScope / addAutoDestroyCalls to gather the
   write-backs to formal temporaries and add these write-back calls just
   before destroying them. Other than that, callDestructors now treats
   formal temps as regular variables. addFormalTemps and
   findFormalTempAssignBack gather the write-back calls. The write-back
   for formals is added, in declaration order, in  variablesDestroy, just
   before the destructors are called on a given path. By moving this from
   the function epilogue, the code can now treat returns for
   error-handling differently from regular returns. For regular returns,
   the write-back needs to occur; but for a `throw`, it should not.
 * In generics.cpp, added fixupUntypedOutArgRTTs which adds a type formal
   marked with `FLAG_TYPE_FORMAL_FOR_OUT` to functions accepting an `out`
   intent argument without a type. Adjusted ResolutionCandidate.cpp to
   ignore the hidden type arguments for untyped out formals (marked with
   `FLAG_TYPE_FORMAL_FOR_OUT`).
 * adjusted wrappers.cpp to add handleOutIntents which adds some logic at
   the call site for out intent formals. This also includes logic for
   passing runtime types to untyped out formals but mainly amounts to
   adding a dummy argument that will be fixed up in AddOutIntentTypeArgs
   after split-init handling
 * adjusted resolveFunction to include a new function
   fixPrimInitsAndAddCasts (to be called now instead of
   insertAndResolveCasts) which runs a new SplitInitVisitor,
   AddOutIntentTypeArgs, insertAndResolveCasts, and FixPrimInitsVisitor.
   Adjusted addLocalCopiesAndWritebacks to transfer to `out` formals,
   similarly to `return`.
 * adjusted lvalue checking to ignore the formal temps for `out` at the
   call site
 * adjusted resolveInitVar to allow avoiding a copy for `FLAG_EXPR_TEMP`
   values
 * adjusted findZeroArgInitFn to handle more cases that lowerPrimInit
   does
 * adjusts checkParsed to add an error for param/type returning functions
   using out or inout intents

Related Future Work:
 * task to update spec is tracked in
   Cray/chapel-private#719
 * split-init for out intent and folded param conditional code #14996
 * Partial Instantiations: expressing the concrete version of 'range'
   #13566
 * Generate type constructors based on initializers #8499
 * missing compilation error for record initializer changing type #14945
 * typed array argument with in or out intent does not accept a Block
   distributed array #15014

Reviewed by @benharsh - thanks!

- [x] full local futures testing
- [x] primers pass with valgrind and do not leak
- [x] primers pass with --verify
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

3 participants