Incorrect handling of lateout pairs in inline asm #57550

Open
newpavlov opened this issue Sep 4, 2022 · 3 comments

newpavlov commented Sep 4, 2022

This issue was discovered while investigating the cause of rust-lang/rust#101346.

The following Rust function:

use std::arch::asm;

pub fn foo() -> u32 {
    let t1: u32;
    let t2: u32;
    unsafe {
        asm!(
            "mov {0:e}, 1",
            "mov eax, 42",
            lateout(reg) t1,
            lateout("eax") t2,
            options(nostack),
        );
    }
    t1
}

is compiled into the following incorrect assembly, in which both outputs are assigned to eax, so the function returns 42 instead of 1:

example::foo:
        mov     eax, 1
        mov     eax, 42
        ret

Godbolt link: https://rust.godbolt.org/z/Yb9v7WobM

LLVM incorrectly assigns the same register to both lateout operands when it can see that one of them is not used afterwards.
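To make the effect concrete, here is a minimal sketch that interprets the two instruction sequences in plain Rust. It is only a toy model: the `Insn` type and the `run`, `emitted_sequence`, and `expected_sequence` helpers are hypothetical names introduced for illustration, and the "expected" sequence is just one valid allocation (first output in ecx, copied into eax after the asm block), assumed here rather than taken from any compiler output.

```rust
use std::collections::HashMap;

/// Hypothetical toy instruction set: just enough to replay the sequences.
#[derive(Clone, Copy)]
enum Insn {
    /// mov dst, imm
    MovImm(&'static str, u32),
    /// mov dst, src
    MovReg(&'static str, &'static str),
}

/// Execute the instructions in order and return the final value of eax,
/// which is the function's return value. Unwritten registers read as 0.
fn run(program: &[Insn]) -> u32 {
    let mut regs: HashMap<&'static str, u32> = HashMap::new();
    for insn in program {
        match *insn {
            Insn::MovImm(dst, imm) => {
                regs.insert(dst, imm);
            }
            Insn::MovReg(dst, src) => {
                let value = regs.get(src).copied().unwrap_or(0);
                regs.insert(dst, value);
            }
        }
    }
    regs.get("eax").copied().unwrap_or(0)
}

/// The sequence from the assembly above: both outputs share eax.
fn emitted_sequence() -> Vec<Insn> {
    vec![Insn::MovImm("eax", 1), Insn::MovImm("eax", 42)]
}

/// One valid allocation (an assumption): t1 lives in ecx during the asm
/// block and is copied into eax afterwards.
fn expected_sequence() -> Vec<Insn> {
    vec![
        Insn::MovImm("ecx", 1),
        Insn::MovImm("eax", 42),
        Insn::MovReg("eax", "ecx"),
    ]
}
```

In this model `run(&emitted_sequence())` yields 42, while `run(&expected_sequence())` yields 1, which is the value `foo` is supposed to return.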

asl (Collaborator) commented Sep 4, 2022

@newpavlov Please attach LLVM IR that could be used to reproduce the issue. Thanks!

newpavlov (Author) commented

https://llvm.godbolt.org/z/qxrfd7fj3

define i32 @foo() unnamed_addr #0 {
start:
  %0 = tail call { i32, i32 } asm inteldialect "mov ${0:k}, 1\0Amov eax, 42", "=r,={ax},~{dirflag},~{fpsr},~{flags}"() #1, !srcloc !2
  %1 = extractvalue { i32, i32 } %0, 0
  ret i32 %1
}

!0 = !{i32 7, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
!2 = !{i32 0, i32 108, i32 136}

nikic (Contributor) commented Sep 4, 2022

Just to be clear, the problem here is that with an =r,={ax} constraint string, both output registers are allocated to eax.

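It looks like we originally get a correct allocation to ecx followed by a copy to eax, but the copy is then removed by machine copy propagation, which rewrites the first output to eax even though the same instruction already defines eax: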

# *** IR Dump After Stack Slot Coloring (stack-slot-coloring) ***:
# Machine code for function foo: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten, TracksDebugUserValues

0B	bb.0.start:
16B	  INLINEASM &"mov ${0}, 1\0Amov ${1}, 42" [inteldialect], $0:[regdef:GR32], def renamable $ecx, $1:[regdef], implicit-def dead $eax
32B	  $eax = COPY killed renamable $ecx
48B	  RET 0, $eax

# End machine code for function foo.

# *** IR Dump After Machine Copy Propagation Pass (machine-cp) ***:
# Machine code for function foo: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten, TracksDebugUserValues

bb.0.start:
  INLINEASM &"mov ${0}, 1\0Amov ${1}, 42" [inteldialect], $0:[regdef:GR32], def $eax, $1:[regdef], implicit-def dead $eax
  RET 0, $eax
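The transformation visible in the two dumps can be sketched with a toy model in Rust. This is not LLVM's MachineCopyPropagation code: `Inst`, `propagate_backward`, `has_register_clash`, and `example_block` are hypothetical names, dead and killed flags are ignored, and the `check_other_defs` guard is only an illustration of the kind of legality check that would block the rewrite, not the actual fix.

```rust
/// Hypothetical, heavily simplified machine instructions.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// An instruction that writes every listed register
    /// (stands in for INLINEASM with several register defs).
    Def { defs: Vec<&'static str> },
    /// dst = COPY src
    Copy { dst: &'static str, src: &'static str },
    /// Return, reading the given register.
    Ret { uses: &'static str },
}

/// Fold `dst = COPY src` into the immediately preceding instruction by
/// renaming its def of `src` to `dst` and dropping the copy.
/// With `check_other_defs`, the rewrite is refused when that instruction
/// already defines `dst` through another operand.
fn propagate_backward(block: &[Inst], check_other_defs: bool) -> Vec<Inst> {
    let mut out: Vec<Inst> = Vec::new();
    for inst in block {
        if let Inst::Copy { dst, src } = inst {
            if let Some(Inst::Def { defs }) = out.last_mut() {
                let defines_src = defs.contains(src);
                let already_defines_dst = defs.contains(dst);
                if defines_src && !(check_other_defs && already_defines_dst) {
                    for d in defs.iter_mut() {
                        if *d == *src {
                            *d = *dst;
                        }
                    }
                    continue; // the copy is now redundant and is dropped
                }
            }
        }
        out.push(inst.clone());
    }
    out
}

/// True if some instruction writes the same register through two operands.
fn has_register_clash(block: &[Inst]) -> bool {
    block.iter().any(|inst| match inst {
        Inst::Def { defs } => defs
            .iter()
            .enumerate()
            .any(|(i, reg)| defs[..i].contains(reg)),
        _ => false,
    })
}

/// Mirrors the shape of the block before machine-cp in the first dump.
fn example_block() -> Vec<Inst> {
    vec![
        Inst::Def { defs: vec!["ecx", "eax"] },
        Inst::Copy { dst: "eax", src: "ecx" },
        Inst::Ret { uses: "eax" },
    ]
}
```

Calling `propagate_backward(&example_block(), false)` produces a `Def` that lists eax twice, matching the shape of the second dump, whereas passing `true` leaves the block unchanged and the two outputs stay in distinct registers.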
