Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Bytecode generation for tailrec methods uses temporary variables which confuses debuggers #14773

Closed
ghost opened this issue Mar 25, 2022 · 11 comments · Fixed by #14865
Closed
Assignees
Milestone

Comments

@ghost
Copy link

ghost commented Mar 25, 2022

Compiler version

Scala 3.1.1

Minimized code

import scala.annotation.tailrec
object main {
  @tailrec
  def add(x: Int, y: Int): Int =
    if (x == 0) y else add(x - 1, y + 1)
}

Scala 2.13.8 compiles the code as follows (only the add method bytecode is shown, once in a more compact form, just the bytecodes, and then a verbose form with local variables as well).

// just the code

public int add(int, int);
  Code:
     0: iload_1
     1: iconst_0
     2: if_icmpne     9
     5: iload_2
     6: goto          20
     9: iload_1
    10: iconst_1
    11: isub
    12: iload_2
    13: iconst_1
    14: iadd
    15: istore_2
    16: istore_1
    17: goto          0
    20: ireturn

// same code, verbose with local variables

// access flags 0x1
public add(II)I
  // parameter final  x
  // parameter final  y
 L0
  LINENUMBER 5 L0
 FRAME SAME
  ILOAD 1
  ICONST_0
  IF_ICMPNE L1
  ILOAD 2
  GOTO L2
 L1
 FRAME SAME
  ILOAD 1
  ICONST_1
  ISUB
  ILOAD 2
  ICONST_1
  IADD
  ISTORE 2
  ISTORE 1
  GOTO L0
 L2
 FRAME SAME1 I
  IRETURN
 L3
  LOCALVARIABLE this Lmain$; L0 L3 0
  LOCALVARIABLE x I L0 L3 1
  LOCALVARIABLE y I L0 L3 2
  MAXSTACK = 3
  MAXLOCALS = 3

Please take a note that only 3 local variables are used to compile the method (a reference to this, which is the outer object, irrelevant for the issue, and a local variable for each of the x and y function params.

Now let's look at the bytecode generated by Scala 3.1.1, for the same exact code. Again, the code is shown twice, once in a more compact form, and once in a verbose form that shows the local variables.

// just the code

public int add(int, int);
  Code:
     0: iload_2
     1: istore_3
     2: iload_1
     3: istore        4
     5: iload         4
     7: iconst_0
     8: if_icmpne     15
    11: iload_3
    12: goto          36
    15: iload         4
    17: iconst_1
    18: isub
    19: istore        5
    21: iload_3
    22: iconst_1
    23: iadd
    24: istore        6
    26: iload         5
    28: istore        4
    30: iload         6
    32: istore_3
    33: goto          37
    36: ireturn
    37: goto          5
    40: athrow
    41: athrow

// same code, verbose with local variables

// access flags 0x1
public add(II)I
  // parameter final  x
  // parameter final  y
 L0
  LINENUMBER 5 L0
  ILOAD 2
  ISTORE 3
  ILOAD 1
  ISTORE 4
 L1
 FRAME APPEND [I I]
  ILOAD 4
  ICONST_0
  IF_ICMPNE L2
  ILOAD 3
  GOTO L3
 L2
 FRAME SAME
  ILOAD 4
  ICONST_1
  ISUB
  ISTORE 5
  ILOAD 3
  ICONST_1
  IADD
  ISTORE 6
  ILOAD 5
  ISTORE 4
  ILOAD 6
  ISTORE 3
  GOTO L4
 L3
 FRAME SAME1 I
  IRETURN
 L4
 FRAME APPEND [I I]
  GOTO L1
 L5
 FRAME FULL [] [java/lang/Throwable]
  ATHROW
 L6
 FRAME SAME1 java/lang/Throwable
  ATHROW
 L7
  LOCALVARIABLE this Lmain$; L0 L7 0
  LOCALVARIABLE x I L0 L7 1
  LOCALVARIABLE y I L0 L7 2
  MAXSTACK = 2
  MAXLOCALS = 7

As you can see, the same code is compiled using 7 local variables, 3 named ones (the same as in Scala 2) and 4 unnamed ones. Let's try to figure out a sort of mapping between them.

Right at the beginning of the function code, we have iload_2 and istore_3, which moves the value of the third (0 based indexing) local variable (named y) into an unnamed local variable with index 3.

Following that, we have iload_1 and istore 4, again, moving the value of the first local variable (named x) into an unnamed local variable with index 4. Thus, right at the beginning of the function, both function parameters were just copied over into new local variables.

Later down in the function, we can see that the results of x - 1 and y + 1 are stored again in these 2 unnamed variables (index 4 for x and index 3 for y), so, this means that effectively, the x and y function parameters are unused during the lifecycle of the function (they contain the original values that the function was called with), while the unnamed local variables with index 4 and index 3 serve as the effective stores of value for x and y during the recursive execution of the function.

Furthermore, we have a couple of istore 5 and istore 6 instructions which only transiently store the results of x - 1 and y + 1, right before they are moved to the local variables 4 and 3, respectively, and the function body is executed again. Compared to Scala 2, this function uses 4 variables and ignores 2, to do a job that can be done with just the 2 function parameters as the stores of value.

As a real world effect of this situation, the IntelliJ IDEA debugger for Scala 3 (any version) does not work correctly with tail-recursive functions. The debugger shows the function parameters x and y as stack frame values, and they never change their values, which is supported by the bytecode above. Of course, other local values can be shown, but the experience is not great.

My question thus is, is this an intentional change in Scala 3 (I personally cannot see that being the case, because the code generation scheme produces much larger bytecode with larger stack requirements), or is this an ommission in Scala 3 due to some changes in the bytecode generation in Scala 2 not being ported over to dotty during development?

In any case, how should tooling builders approach this issue? Should we try to issue a fix in Scala 3, or try to work around it in the tooling? I don't believe this is a problem for IntelliJ only, most debuggers would interpret the bytecode as it is produced.

Thanks in advance.

@ghost ghost added itype:bug stat:needs triage Every issue needs to have an "area" and "itype" label labels Mar 25, 2022
@smarter
Copy link
Member

smarter commented Mar 25, 2022

is this an ommission in Scala 3 due to some changes in the bytecode generation in Scala 2 not being ported over to dotty during development?

No idea. But the way we the tailrec phase works in Scala 3 is different from Scala 2: it emits Labeled blocks which can be easily interpreted by the scala.js backend, I don't know how this influences the bytecode generated by the jvm backend /cc @sjrd

I don't believe this is a problem for IntelliJ only, most debuggers would interpret the bytecode as it is produced.

@adpi2 is this also an issue in the metals debugger?

@smarter smarter added area:backend itype:question and removed stat:needs triage Every issue needs to have an "area" and "itype" label itype:bug labels Mar 25, 2022
@ghost
Copy link
Author

ghost commented Mar 25, 2022

I'm seeing the same issue in the Metals debugger with 3.1.1. Switching to 2.13.8 works as expected.

@ghost
Copy link
Author

ghost commented Mar 25, 2022

No idea. But the way we the tailrec phase works in Scala 3 is different from Scala 2: it emits Labeled blocks which can be easily interpreted by the scala.js backend, I don't know how this influences the bytecode generated by the jvm backend

This is confirmed by the bytecode. I understand that it was a conscious choice, but the effects on JVM bytecode might not have been anticipated.

@ghost
Copy link
Author

ghost commented Mar 25, 2022

Here's how the issue is manifested in IntelliJ IDEA.

First video is Scala 2.13.8. Second video is 3.1.1.

Screen.Recording.2022-03-25.at.11.49.37.mov
Screen.Recording.2022-03-25.at.11.50.21.mov

@ghost
Copy link
Author

ghost commented Mar 25, 2022

And here's the same setup in Metals, first video Scala 2.13.8, second video Scala 3.1.1.

Screen.Recording.2022-03-25.at.11.52.55.mov
Screen.Recording.2022-03-25.at.11.53.41.mov

@ghost
Copy link
Author

ghost commented Mar 25, 2022

As another curiosity, both debuggers stop 3 times in Scala 2 ((x, y) have the values (2,3), (1, 4), (0, 5)), while they only stop 2 times in Scala 3.

@sjrd
Copy link
Member

sjrd commented Mar 25, 2022

As @smarter said, tailrec functions are internally transformed in a much different way than in Scala 2. This is during tree transformations, several phases before emitting bytecode. Scala 2 had arbitrary labels and gotos, whereas Scala 3 only has "labeled blocks" (a form similar to label: { statements; } in JavaScript). Scala 2 also allowed its internal trees to set the values of immutable parameters in that case, which is something that Scala 3 does not allow anymore.

This change of scheme has an impact on the JVM bytecode that is emitted, indeed. This was expected. I did not anticipate that it would impair debuggers, however (the semantics are the same, obviously).

It would be possible to alter the bytecode generation to reuse the slots of x and y instead of the mutable temporaries that are used within the implementations (3 and 4 in your analysis). That would require some analysis to determine that x and y are only used to initialize those temporaries, which may not be trivial to do, but doable. IIUC, that should be enough to improve the debugging experience.

Regarding the extra temporaries used to change the values of the temporaries when doing the recursive call, that is a bit more complicated. In theory, a local bytecode optimizer could get rid of them, but this is more difficult to analyze. Unless fixing that would also be required to improve the debugging experience, I would not suggest investing time in that specific case.

@ghost
Copy link
Author

ghost commented Mar 25, 2022

The second case where only the temporary results of x-1 and y+1 are stored doesn't affect the debugger (actually, it would only affect the debugger if the debugger chooses to show them, and AFAIK, this is not the default in any debugger, IntelliJ and Metals, the problem with showing them of course comes down to user experience, since they are unnamed variables that don't exist in source).

@smarter smarter changed the title A question about compiled bytecode of tail recursive functions Bytecode generation for tailrec methods uses temporary variables which confuses debuggers Mar 25, 2022
@lrytz
Copy link
Member

lrytz commented Apr 6, 2022

@vasilmkd-jetbrains can you estimate how difficult it would be to teach the deubbger about this pattern? Maybe it would help if the synthetic local variables had names in the bytecode?

Intuitively, it seems to me Scala 3 is emitting fine bytecode for tailrec. But I see that by assigning to the locals of parameters the way it's done in Scala 2, the debuggers would just work out of the box.

@ghost
Copy link
Author

ghost commented Apr 6, 2022

I guess teaching the debugger is not too difficult, the question remains whether it's an unintended behavior that is going to be changed, or intended and will not change in future Scala versions. Before that is answered, I'm not sure it's wise to do anything on the debugger end.

@sjrd sjrd self-assigned this Apr 6, 2022
sjrd added a commit to dotty-staging/dotty that referenced this issue Apr 6, 2022
…vars.

The tailrec phase generates code that looks like

  class Bar {
    def foo(x1: Int, x2: Int): Int = {
      var this$: Bar = this;
      var taillocal$x1: Int = x1;
      var taillocal$x2: Int = x2;
      ...
      // body where `this`, `x1` and `x2` are replaced by
      // `this$`, `taillocal$x1` and `taillocal$x2`
    }
  }

This generates bytecode where the `this` value and the parameters
never actually change, and are never used. Instead, the synthetic
mutable variables are used instead.

As described in the linked issue, this confuses debuggers, which
only display the never-changing original `this` value and
parameters.

In this commit, we intercept this shape of code in the back-end.
We reliable identify tailrec-generated `ValDef`s from their
semantic names, with an additional safety check that they are
`Synthetic | Mutable`. When we find this shape, we do not allocate
local slots for the `var`s, and instead reuse the slots for `this`
and the parameters. We skip past the `ValDef`s so that code
generation does not re-emit useless assignments.
@sjrd
Copy link
Member

sjrd commented Apr 6, 2022

PR available: #14865.

smarter added a commit that referenced this issue Apr 7, 2022
…rec-methods

Fix #14773: Reuse the param slots for the tailrec local mutable vars.
@sjrd sjrd added this to the 3.2.0-RC1 milestone Apr 7, 2022
michelou pushed a commit to michelou/scala3 that referenced this issue Apr 25, 2022
…vars.

The tailrec phase generates code that looks like

  class Bar {
    def foo(x1: Int, x2: Int): Int = {
      var this$: Bar = this;
      var taillocal$x1: Int = x1;
      var taillocal$x2: Int = x2;
      ...
      // body where `this`, `x1` and `x2` are replaced by
      // `this$`, `taillocal$x1` and `taillocal$x2`
    }
  }

This generates bytecode where the `this` value and the parameters
never actually change, and are never used. Instead, the synthetic
mutable variables are used instead.

As described in the linked issue, this confuses debuggers, which
only display the never-changing original `this` value and
parameters.

In this commit, we intercept this shape of code in the back-end.
We reliable identify tailrec-generated `ValDef`s from their
semantic names, with an additional safety check that they are
`Synthetic | Mutable`. When we find this shape, we do not allocate
local slots for the `var`s, and instead reuse the slots for `this`
and the parameters. We skip past the `ValDef`s so that code
generation does not re-emit useless assignments.
@Kordyjan Kordyjan modified the milestones: 3.2.0-RC1, 3.2.0 Aug 1, 2023
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants