Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

JIT: Bad codegen with loop #75442

Closed
jakobbotsch opened this issue Sep 12, 2022 · 7 comments · Fixed by #89241
Closed

JIT: Bad codegen with loop #75442

jakobbotsch opened this issue Sep 12, 2022 · 7 comments · Fixed by #89241
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have regression-from-last-release
Milestone

Comments

@jakobbotsch
Copy link
Member

jakobbotsch commented Sep 12, 2022

// Generated by Fuzzlyn v1.5 on 2022-09-11 14:55:09
// Run on Arm64 MacOS
// Seed: 6196300087286376006
// Reduced from 63.7 KiB to 0.9 KiB in 00:00:14
// Debug: Outputs 1
// Release: Outputs 0
public class C0
{
    public sbyte F6;
}

public class C1
{
    public bool F0;
    public sbyte F1;
    public ushort F7;
    public C1(bool f0)
    {
        F0 = f0;
    }
}

public struct S0
{
    public C0 F2;
    public sbyte F5;
    public S0(C0 f2, sbyte f5): this()
    {
        F2 = f2;
        F5 = f5;
    }
}

public class Program
{
    public static S0 s_3 = new S0(new C0(), 1);
    public static S0 s_13 = new S0(new C0(), 0);
    public static void Main()
    {
        C1 vr0 = new C1(true);
        C1[] vr1 = new C1[]{new C1(true)};
        for (int vr2 = 0; vr2 < 2; vr2++)
        {
            vr1[0] = vr0;
            if (vr1[0].F0)
            {
                vr1[0].F1 = s_3.F5;
            }

            vr1[0].F7 = vr1[0].F7;
            System.Console.WriteLine(s_13.F2.F6);
            s_13.F2.F6 = vr0.F1;
        }
    }
}

Repros on win-x64 too. Does not repro with .NET 6, but repros with .NET 7 from two months ago, so not a recent regression it seems.

category:correctness
theme:value-numbering
skill-level:expert
cost:medium
impact:small

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 12, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 12, 2022
@ghost
Copy link

ghost commented Sep 12, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details
// Generated by Fuzzlyn v1.5 on 2022-09-11 14:55:09
// Run on Arm64 MacOS
// Seed: 6196300087286376006
// Reduced from 63.7 KiB to 0.9 KiB in 00:00:14
// Debug: Outputs 1
// Release: Outputs 0
public class C0
{
    public sbyte F6;
}

public class C1
{
    public bool F0;
    public sbyte F1;
    public ushort F7;
    public C1(bool f0)
    {
        F0 = f0;
    }
}

public struct S0
{
    public C0 F2;
    public sbyte F5;
    public S0(C0 f2, sbyte f5): this()
    {
        F2 = f2;
        F5 = f5;
    }
}

public class Program
{
    public static S0 s_3 = new S0(new C0(), 1);
    public static S0 s_13 = new S0(new C0(), 0);
    public static void Main()
    {
        C1 vr0 = new C1(true);
        C1[] vr1 = new C1[]{new C1(true)};
        for (int vr2 = 0; vr2 < 2; vr2++)
        {
            vr1[0] = vr0;
            if (vr1[0].F0)
            {
                vr1[0].F1 = s_3.F5;
            }

            vr1[0].F7 = vr1[0].F7;
            System.Console.WriteLine(s_13.F2.F6);
            s_13.F2.F6 = vr0.F1;
        }
    }
}

Repros on win-x64 too. Does not repro with .NET 6, but repros with .NET 7 from two months ago, so not a recent regression it seems.

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr, untriaged

Milestone: -

@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label Sep 12, 2022
@jakobbotsch jakobbotsch self-assigned this Sep 12, 2022
@jakobbotsch jakobbotsch added this to the 7.0.0 milestone Sep 12, 2022
@jakobbotsch
Copy link
Member Author

We seem to hoist the access of vr1[0] out of the loop, even though it is modified inside the loop. Looks pretty similar to #54118.
It was exposed by #58696 which allows us to now reason about this code in VN/hoisting.

@jakobbotsch
Copy link
Member Author

The problem seems to come when we value number

            vr1[0].F7 = vr1[0].F7;

We properly mark vr1[0] as having a loop dependency on the LHS, but on the RHS we hit the MapSelect cache and thus the tree on the right does not get the mark (annotated with some comments):

***** BB04, STMT00012(before)
N021 ( 17, 17) [000062] -A-XG------                           ASG       ushort
N010 (  8,  8) [000061] D--XG--N---                         ├──▌  IND       ushort
N009 (  5,  5) [000194] ----G--N---                           └──▌  ADD       byref 
N007 (  4,  4) [000205] ----G--N---                              ├──▌  COMMA     ref   
N001 (  0,  0) [000198] -----------                                ├──▌  NOP       void  
N006 (  4,  4) [000206] n---G------                                └──▌  IND       ref                       // vr1[0] on the LHS
N005 (  1,  2) [000204] -----------                                   └──▌  ARR_ADDR  byref ref[] $80
N004 (  1,  2) [000203] -------N---                                      └──▌  ADD       byref 
N002 (  1,  1) [000195] -----------                                         ├──▌  LCL_VAR   ref    V01 loc1         u:2
N003 (  1,  1) [000202] -----------                                         └──▌  CNS_INT   long   16
N008 (  1,  1) [000193] -----------                              └──▌  CNS_INT   long   8 Fseq[F7]
N020 (  8,  8) [000060] ---XG------                         └──▌  IND       ushort
N019 (  5,  5) [000208] ----G--N---                            └──▌  ADD       byref 
N017 (  4,  4) [000219] ----G--N---                               ├──▌  COMMA     ref   
N011 (  0,  0) [000212] -----------                                 ├──▌  NOP       void  
N016 (  4,  4) [000220] n---G------                                 └──▌  IND       ref                       // vr1[0] on the RHS
N015 (  1,  2) [000218] -----------                                    └──▌  ARR_ADDR  byref ref[] $80
N014 (  1,  2) [000217] -------N---                                       └──▌  ADD       byref 
N012 (  1,  1) [000209] -----------                                          ├──▌  LCL_VAR   ref    V01 loc1         u:2
N013 (  1,  1) [000216] -----------                                          └──▌  CNS_INT   long   16
N018 (  1,  1) [000207] -----------                               └──▌  CNS_INT   long   8 Fseq[F7]

N001 [000198]   NOP       => $384 {MemOpaque:L00}
N002 [000195]   LCL_VAR   V01 loc1         u:2 => $340 {JitNewArr($43, $1c1, $142)}
N003 [000202]   CNS_INT   16 => $1c2 {LngCns:  16}
N004 [000203]   ADD       => $202 {ADD($1c2, $340)}
    VNForHandle(arrElemType: ref) is $40
N005 [000204]   ARR_ADDR  => $81 {PtrToArrElem($40, $340, $1c3, $1c3)}
  Array element load: elemTypeEq is $40 for ref[]
      AX1: select([$101]store($2c3, $40, $285), $40) ==> $285.
      ==> Updating loop memory dependence of [000206] to L00                  // vr1[0] on the LHS gets the mark
      AX2: $40 != $48 ==> select([$2c4]store($2c3, $48, $286), $40) ==> select($2c3, $40) remaining budget is 97.
      AX1: select([$101]store($2c3, $40, $285), $40) ==> $285.
      ==> Not updating loop memory dependence of [000206]; alrady constrained to L00 nested in L00
    VNForMapSelect($601, $40):mem returns $285 {$243[$340 := $284]@L00}
  GcHeap[elemTypeEq: $40] is $285
      AX1: select([$243]store($285, $340, $284), $340) ==> $284.
      ==> Not updating loop memory dependence of [000206], memory $243 not defined in a loop
    VNForMapSelect($285, $340):mem returns $284 {$244[$1c3 := $180]@L00}
  GcHeap[elemTypeEq][array: $340] is $284
      AX1: select([$244]store($284, $1c3, $180), $1c3) ==> $180.
      ==> Not updating loop memory dependence of [000206], memory $244 not defined in a loop
    VNForMapSelect($284, $1c3):ref returns $180 {JitNew($41, $140)}
  GcHeap[elemTypeEq][array][index: $1c3] is $180
N006 [000206]   IND       => <l:$180 {JitNew($41, $140)}, c:$148 {MemOpaque:L00}>
N007 [000205]   COMMA     => <l:$180 {JitNew($41, $140)}, c:$148 {MemOpaque:L00}>
N008 [000193]   CNS_INT   8 Fseq[F7] => $1c7 {LngCns:  8}
N009 [000194]   ADD       => <l:$206 {ADD($180, $1c7)}, c:$207 {ADD($148, $1c7)}>
N010 [000061]   IND       => <l:$VN.Void, c:$18e {norm=$VN.Void, exc=$18d {NullPtrExc($148)}}>
N011 [000212]   NOP       => $385 {MemOpaque:L00}
N012 [000209]   LCL_VAR   V01 loc1         u:2 => $340 {JitNewArr($43, $1c1, $142)}
N013 [000216]   CNS_INT   16 => $1c2 {LngCns:  16}
N014 [000217]   ADD       => $202 {ADD($1c2, $340)}
    VNForHandle(arrElemType: ref) is $40
N015 [000218]   ARR_ADDR  => $81 {PtrToArrElem($40, $340, $1c3, $1c3)}
  Array element load: elemTypeEq is $40 for ref[]
    VNForMapSelect($601, $40):mem returns $285 {$243[$340 := $284]@L00}             // hits the cache, not marked loop dependent  GcHeap[elemTypeEq: $40] is $285
      AX1: select([$243]store($285, $340, $284), $340) ==> $284.
      ==> Not updating loop memory dependence of [000220], memory $243 not defined in a loop
    VNForMapSelect($285, $340):mem returns $284 {$244[$1c3 := $180]@L00}
  GcHeap[elemTypeEq][array: $340] is $284
      AX1: select([$244]store($284, $1c3, $180), $1c3) ==> $180.
      ==> Not updating loop memory dependence of [000220], memory $244 not defined in a loop
    VNForMapSelect($284, $1c3):ref returns $180 {JitNew($41, $140)}
  GcHeap[elemTypeEq][array][index: $1c3] is $180
N016 [000220]   IND       => <l:$180 {JitNew($41, $140)}, c:$14a {MemOpaque:L00}>
N017 [000219]   COMMA     => <l:$180 {JitNew($41, $140)}, c:$14a {MemOpaque:L00}>
N018 [000207]   CNS_INT   8 Fseq[F7] => $1c7 {LngCns:  8}
N019 [000208]   ADD       => <l:$206 {ADD($180, $1c7)}, c:$208 {ADD($14a, $1c7)}>
    VNForHandle(F7) is $4a, fieldType is ushort, size = 2
      AX2: $4a != $40 ==> select([$2c3]store($101, $40, $285), $4a) ==> select($101, $4a) remaining budget is 98.
      AX2: $4a != $48 ==> select([$2c4]store($2c3, $48, $286), $4a) ==> select($2c3, $4a) remaining budget is 96.
      AX2: $4a != $40 ==> select([$2c3]store($101, $40, $285), $4a) ==> select($101, $4a) remaining budget is 95.
    VNForMapSelect($601, $4a):mem returns $247 {$101[$4a]}
    VNForMapSelect($247, $180):ushort returns $640 {$247[$180]}
N020 [000060]   IND       => <l:$640 {$247[$180]}, c:$641 {norm=$680 {MemOpaque:L00}, exc=$18f {NullPtrExc($14a)}}>
    VNForHandle(F7) is $4a, fieldType is ushort, size = 2
    VNForMapSelect($601, $4a):mem returns $247 {$101[$4a]}
    VNForMapStore($247, $180, $640):mem in BB04 returns $287 {$247[$180 := $640]@L00}
    VNForMapStore($601, $4a, $287):heap in BB04 returns $2c5 {$601[$4a := $287]@L00}
  fgCurMemoryVN[GcHeap] assigned for StoreField at [000062] to VN: $2c5.
N021 [000062]   ASG       => <l:$VN.Void, c:$191 {norm=$VN.Void, exc=$190( {NullPtrExc($148)},  {NullPtrExc($14a)})}>

***** BB04, STMT00012(after)
N021 ( 17, 17) [000062] -A-XG------                           ASG       ushort <l:$VN.Void, c:$191>
N010 (  8,  8) [000061] D--XG--N---                         ├──▌  IND       ushort <l:$VN.Void, c:$18e>
N009 (  5,  5) [000194] ----G--N---                           └──▌  ADD       byref  <l:$206, c:$207>
N007 (  4,  4) [000205] ----G--N---                              ├──▌  COMMA     ref    <l:$180, c:$148>
N001 (  0,  0) [000198] -----------                                ├──▌  NOP       void   $384
N006 (  4,  4) [000206] n---G------                                └──▌  IND       ref    <l:$180, c:$148>
N005 (  1,  2) [000204] -----------                                   └──▌  ARR_ADDR  byref ref[] $81
N004 (  1,  2) [000203] -------N---                                      └──▌  ADD       byref  $202
N002 (  1,  1) [000195] -----------                                         ├──▌  LCL_VAR   ref    V01 loc1         u:2 $340
N003 (  1,  1) [000202] -----------                                         └──▌  CNS_INT   long   16 $1c2
N008 (  1,  1) [000193] -----------                              └──▌  CNS_INT   long   8 Fseq[F7] $1c7
N020 (  8,  8) [000060] ---XG------                         └──▌  IND       ushort <l:$640, c:$641>
N019 (  5,  5) [000208] ----G--N---                            └──▌  ADD       byref  <l:$206, c:$208>
N017 (  4,  4) [000219] ----G--N---                               ├──▌  COMMA     ref    <l:$180, c:$14a>
N011 (  0,  0) [000212] -----------                                 ├──▌  NOP       void   $385
N016 (  4,  4) [000220] n---G------                                 └──▌  IND       ref    <l:$180, c:$14a>
N015 (  1,  2) [000218] -----------                                    └──▌  ARR_ADDR  byref ref[] $81
N014 (  1,  2) [000217] -------N---                                       └──▌  ADD       byref  $202
N012 (  1,  1) [000209] -----------                                          ├──▌  LCL_VAR   ref    V01 loc1         u:2 $340
N013 (  1,  1) [000216] -----------                                          └──▌  CNS_INT   long   16 $1c2
N018 (  1,  1) [000207] -----------                               └──▌  CNS_INT   long   8 Fseq[F7] $1c7

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Sep 12, 2022

#54118 was in the product for years and this seems like an even harder to hit variant of that (that one was only found by fuzzing). I don't think this is needed for 7.0.

@jakobbotsch jakobbotsch modified the milestones: 7.0.0, 8.0.0 Sep 12, 2022
@AndyAyersMS
Copy link
Member

Seems like we just need to add similar logic to the caching side?

@jakobbotsch
Copy link
Member Author

Seems like we just need to add similar logic to the caching side?

Yes, we probably need to introduce a dedicated cache for MapSelect and communicate the memory dependency back from MapSelectWork (and cache it in the dedicated cache). I started working on it but it was not a priori clear to me that there always is a "most conservative" memory dependency to put in the cache. The other alternative is to walk the VN and call optRecordLoopMemoryDependence for all the loop dependencies in the cached case, but seems it would be expensive.
Might also be that we can early out when we hit the first optRecordLoopMemoryDependence, but it wasn't totally clear to me either.

@AndyAyersMS
Copy link
Member

We should fix this in .NET 8.

@jakobbotsch jakobbotsch added the Priority:3 Work that is nice to have label Jun 20, 2023
jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Jul 20, 2023
Hoisting relies on being able to look at which memory dependencies a
candidate is dependent upon. VN tracks these during the map select
logic; however, it fails to do so when the map selection hits the cache.
This changes the logic to make sure the memory dependencies are cached.

Fix dotnet#75442
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 20, 2023
jakobbotsch added a commit that referenced this issue Jul 20, 2023
Hoisting relies on being able to look at which memory dependencies a
candidate is dependent upon. VN tracks these during the map select
logic; however, it fails to do so when the map selection hits the cache.
This changes the logic to make sure the memory dependencies are cached.

Fix #75442
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 20, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 20, 2023
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have regression-from-last-release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants