Skip to content

i32 load + zext is merged into 64-bit load for __ptr32 on amd64 #66873

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
cac03 opened this issue Sep 20, 2023 · 1 comment · Fixed by #67168
Closed

i32 load + zext is merged into 64-bit load for __ptr32 on amd64 #66873

cac03 opened this issue Sep 20, 2023 · 1 comment · Fixed by #67168

Comments

@cac03
Copy link

cac03 commented Sep 20, 2023

Reproduction

https://godbolt.org/z/PGK3PEf64

__attribute((noinline))
uint64_t LoadAsLong(uint32_t* __ptr32 foo) {
	return static_cast<uint64_t>(*foo);
}

I expect that the LoadAsLong will return 42 for the test case in the godbolt listing.
But it does not.

If do not use __ptr32, then the LoadAsLong returns 42

Investigation

Clang produces the following assembly for the LoadAsLong function:

        movslq  %edi, %rax
        movq    (%rax), %rax
        retq

the assembly loads 64 bit value but had to load 32 bits.

As far as I see llvm IR for the function is correct before isel:

define dso_local noundef i64 @LoadAsLong(unsigned int ptr32_sptr*)(ptr addrspace(270) nocapture noundef readonly %foo) local_unnamed_addr {
entry:
  %0 = load i32, ptr addrspace(270) %foo, align 4
  %conv = zext i32 %0 to i64
  ret i64 %conv
}

but after isel:

# Machine code for function LoadAsLong(unsigned int ptr32_sptr*): IsSSA, TracksLiveness
Function Live Ins: $rdi in %0

bb.0.entry:
  liveins: $rdi
  %0:gr64 = COPY $rdi
  %1:gr32 = COPY %0.sub_32bit:gr64
  %2:gr64 = MOVSX64rr32 killed %1:gr32
  %3:gr64 = MOV64rm killed %2:gr64, 1, $noreg, 0, $noreg :: (load (s64) from %ir.foo, align 4, addrspace 270)
  $rax = COPY %3:gr64
  RET 0, $rax

# End machine code for function LoadAsLong(unsigned int ptr32_sptr*).

there is load (s64) from %ir.foo, align 4, addrspace 270

@llvmbot
Copy link
Member

llvmbot commented Sep 20, 2023

@llvm/issue-subscribers-backend-x86

## Reproduction

https://godbolt.org/z/PGK3PEf64

__attribute((noinline))
uint64_t LoadAsLong(uint32_t* __ptr32 foo) {
	return static_cast&lt;uint64_t&gt;(*foo);
}

I expect that the LoadAsLong will return 42 for the test case in the godbolt listing.
But it does not.

If do not use __ptr32, then the LoadAsLong returns 42

Investigation

Clang produces the following assembly for the LoadAsLong function:

        movslq  %edi, %rax
        movq    (%rax), %rax
        retq

the assembly loads 64 bit value but had to load 32 bits.

As far as I see llvm IR for the function is correct before isel:

define dso_local noundef i64 @<!-- -->LoadAsLong(unsigned int ptr32_sptr*)(ptr addrspace(270) nocapture noundef readonly %foo) local_unnamed_addr {
entry:
  %0 = load i32, ptr addrspace(270) %foo, align 4
  %conv = zext i32 %0 to i64
  ret i64 %conv
}

but after isel:

# Machine code for function LoadAsLong(unsigned int ptr32_sptr*): IsSSA, TracksLiveness
Function Live Ins: $rdi in %0

bb.0.entry:
  liveins: $rdi
  %0:gr64 = COPY $rdi
  %1:gr32 = COPY %0.sub_32bit:gr64
  %2:gr64 = MOVSX64rr32 killed %1:gr32
  %3:gr64 = MOV64rm killed %2:gr64, 1, $noreg, 0, $noreg :: (load (s64) from %ir.foo, align 4, addrspace 270)
  $rax = COPY %3:gr64
  RET 0, $rax

# End machine code for function LoadAsLong(unsigned int ptr32_sptr*).

there is load (s64) from %ir.foo, align 4, addrspace 270

e-kud added a commit that referenced this issue Oct 4, 2023
e-kud added a commit to e-kud/llvm-project that referenced this issue Oct 4, 2023
Type extension and truncation is missed when combining loads and stores
for ptr32 and ptr64.

Closes llvm#66873
e-kud added a commit that referenced this issue Oct 11, 2023
…r64 pointers (#67168)

The value extension and truncation were missed during casting __ptr32/__ptr64
pointers to the default address space.

Closes #66873
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants