Methods with struct parameters are not inlined #53783

Open
@atynagano

Description

Based on x64 in sharplab:

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABABgAJiBGAOgCUBXAOwwEt8YaBhCfAB1YAbGFADKIgG6swMXAG4AsAChlxAEzku5AN7Ly5PfsopyAMSoAKAGaCI2DOSsBKcgF4AfGbXWnipUeMva1t7RxcPMwBmHz8A4hNTaJs7B2c3TwBZGIDDfVzyAG0MmAwACwgAEwBJfkELYrLKmr5BAHk+NggmXBoAOQgqpkFWJhGAcycAXXz48izk0LTtcgBfZRWgA=

The following:

using System.Runtime.CompilerServices;

class C {
  
    void F1(float f) => F2(f);
    void F2(float f) => F3(f);
    void F3(float f) => M(f);    
    
    [MethodImpl(MethodImplOptions.NoInlining)]
    void M(float f) { }
}

is compiled as:

; Core CLR v5.0.621.22011 on amd64

C..ctor()
    L0000: ret

C.F1(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.F2(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.F3(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.M(Single)
    L0000: ret

Methods with primitive parameters are inlined: F1, F2, and F3 all compile to the same code, a direct jump to M.

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABABgAJiBGAOgCUBXAOwwEt8YaBhCfAB1YAbGFADKIgG6swMXAG4AsAChlxAEzku5AN7Ly+8noO4MUBmAzkAYoIjYM2ygGZyAM1v3yANWyCGMOXIAXyN9UMoUayoAChs7S1cASnIAXgA+azVopMUlAwjM2I8E5PTrJ2zE3PziSKsKuM8k1IyAWUrAg3DwgG1WmAwACwgAEwBJfkFo/qHRib5BAHk+NggmXBoAOQgxpkFWJgOAc0SAXXDa8nbGkp1g5SCgA

But this sample:

using System.Runtime.CompilerServices;

class C {
    
    struct Float{ public float Value; }
    
    void F1(Float f) => F2(f);
    void F2(Float f) => F3(f);
    void F3(Float f) => M(f);    
    
    [MethodImpl(MethodImplOptions.NoInlining)]
    void M(Float f) { }
}

is compiled as:

; Core CLR v5.0.621.22011 on amd64

C..ctor()
    L0000: ret

C.F1(Float)
    L0000: push rax
    L0001: mov [rsp+0x18], rdx
    L0006: mov edx, [rsp+0x18]
    L000a: mov [rsp], edx
    L000d: mov edx, [rsp]
    L0010: add rsp, 8
    L0014: jmp C.M(Float)

C.F2(Float)
    L0000: push rax
    L0001: mov [rsp+0x18], rdx
    L0006: mov edx, [rsp+0x18]
    L000a: mov [rsp], edx
    L000d: mov edx, [rsp]
    L0010: add rsp, 8
    L0014: jmp C.M(Float)

C.F3(Float)
    L0000: mov [rsp+0x10], rdx
    L0005: mov edx, [rsp+0x10]
    L0009: jmp C.M(Float)

C.M(Float)
    L0000: ret

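F1 and F2 are longer than F3 and appear to use stack space unnecessarily: each reserves 8 bytes and copies the argument through an extra stack slot before jumping to M. All three methods should behave identically, so why do they compile differently?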
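
To narrow this down, the sample could be varied in two ways and the disassembly compared in sharplab. The sketch below is only an experiment, not a confirmed fix: variant A adds an explicit `AggressiveInlining` hint to the wrappers, and variant B passes the struct by readonly reference with `in`. The `G1`/`G2`/`G3` methods, the `Calls` counter, and the `Program` class are hypothetical additions for illustration. Nothing in this report confirms that either variant changes the generated code.

```csharp
using System;
using System.Runtime.CompilerServices;

class C
{
    public struct Float { public float Value; }

    // Hypothetical counter: lets us check that both variants still reach M.
    public int Calls;

    // Variant A: same call chain as the sample above, with an explicit inlining hint.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void F1(Float f) => F2(f);
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void F2(Float f) => F3(f);
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void F3(Float f) => M(f);

    // Variant B: pass the struct by readonly reference until the final call.
    public void G1(in Float f) => G2(in f);
    public void G2(in Float f) => G3(in f);
    public void G3(in Float f) => M(f); // M still takes the struct by value

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void M(Float f) { Calls++; }
}

static class Program
{
    static void Main()
    {
        var c = new C();
        var f = new C.Float { Value = 1.5f };
        c.F1(f);      // variant A
        c.G1(in f);   // variant B
        Console.WriteLine(c.Calls); // prints 2
    }
}
```

Both variants call `M` exactly once per invocation, so any difference in the disassembly of `F1`/`F2`/`F3` versus `G1`/`G2`/`G3` would come from how the JIT handles the struct argument rather than from a change in behavior.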

category:cq
theme:inlining
skill-level:intermediate
cost:medium
impact:medium

Metadata

Assignees

No one assigned

    Labels

    area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

    Projects

    Status

    Backlog (General)
