Extra null check in slice iterator's .next().is_none()
#37945
Comments
An assume shouldn't even be needed, should it? Creating a shared reference should be enough of a non-null assertion. |
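For context on why `.is_none()` turns into a null comparison at all: `Option<&T>` has no separate discriminant, and `None` is stored as the null pointer value. A minimal sketch of that layout guarantee follows; the function names are made up for illustration and are not from this thread.

```rust
use std::mem::size_of;

/// Returns true when `Option<&f32>` occupies exactly one pointer, i.e. `None`
/// is stored in the otherwise-unused null value rather than in a separate tag.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&f32>>() == size_of::<*const f32>()
}

/// Reinterprets an all-zero, pointer-sized value as `Option<&f32>`.
pub fn zero_bits_as_option() -> Option<&'static f32> {
    // SAFETY: the standard library documents that all-zero bits are a valid
    // `None` for `Option<&T>` (the "null pointer optimization").
    unsafe { std::mem::transmute::<usize, Option<&'static f32>>(0) }
}
```

So once the optimizer forgets that the reference inside the `Some` is non-null, the only way it can evaluate `is_none()` is to compare the pointer against null.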
It's SROA this time, eating
define zeroext i1 @sroa_fail(float**) {
entry-block:
%buf0 = alloca float*
%buf1 = alloca i8*
%load0 = load float*, float** %0, align 8
store float* %load0, float** %buf0
%load1 = load float*, float** %buf0, align 8, !nonnull !0
%bc = bitcast float* %load1 to i8*
store i8* %bc, i8** %buf1, align 8
%load2 = load i8*, i8** %buf1, align 8
%ret = icmp eq i8* %load2, null
ret i1 %ret
}
!0 = !{}
The first load/store pair comes from a lowered |
I hacked together an LLVM patch that preserves the nonnull metadata on loads in SROA/mem2reg: luqmana/llvm@da58c74
With that, the extraneous null check is gone:
define zeroext i1 @_ZN9sroa_fail10is_empty_117hfeb3b6a38ecff321E(%"core::slice::Iter<f32>"* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 {
entry-block:
%xs.sroa.0.0..sroa_idx = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 0
%xs.sroa.0.0.copyload = load float*, float** %xs.sroa.0.0..sroa_idx, align 8, !nonnull !0
%xs.sroa.4.0..sroa_idx12 = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 1
%xs.sroa.4.0.copyload = load float*, float** %xs.sroa.4.0..sroa_idx12, align 8, !nonnull !0
%1 = icmp eq float* %xs.sroa.0.0.copyload, %xs.sroa.4.0.copyload
ret i1 %1
} |
Where does that particular nonnull come from? The assume? |
My question asked differently: will this also have an extra null check?
pub struct MockIter {
start: *const f32,
end: *const f32,
}
impl MockIter {
unsafe fn next<'a>(&mut self) -> Option<&'a f32> {
if self.start != self.end {
let ptr = self.start;
self.start = self.start.offset(1);
Some(&*ptr)
} else {
None
}
}
}
pub fn is_empty_3(xs: MockIter) -> bool {
unsafe {
{xs}.next().is_none()
}
}
I am keen on having stable rust (no |
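To poke at the `MockIter` question on a stable compiler, here is a self-contained sketch that makes the snippet drivable from a real slice. The `from_slice` constructor is hypothetical (not part of the thread), `offset(1)` is written as `add(1)`, and the `{xs}` move is replaced by a `mut` binding. It only demonstrates the behaviour; whether the null check appears still has to be read off the emitted IR or assembly in release mode.

```rust
pub struct MockIter {
    start: *const f32,
    end: *const f32,
}

impl MockIter {
    /// Hypothetical helper: builds the pointer pair from a slice.
    pub fn from_slice(xs: &[f32]) -> MockIter {
        let range = xs.as_ptr_range();
        MockIter { start: range.start, end: range.end }
    }

    /// Safety: `start..end` must delimit a live `f32` slice that stays
    /// valid for the whole lifetime `'a`.
    pub unsafe fn next<'a>(&mut self) -> Option<&'a f32> {
        if self.start != self.end {
            let ptr = self.start;
            // Advance by one element; stays within or one past the slice.
            self.start = unsafe { self.start.add(1) };
            Some(unsafe { &*ptr })
        } else {
            None
        }
    }
}

/// Same check as `is_empty_3` in the comment above.
pub fn is_empty_3(mut xs: MockIter) -> bool {
    unsafe { xs.next().is_none() }
}
```

Calling `is_empty_3(MockIter::from_slice(&data))` should be `false` for any non-empty `data` and `true` for an empty slice.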
@bluss The |
Even a simple identity function |
Can I help with getting that patch upstreamed somehow? |
@bluss I've submitted it upstream ( https://reviews.llvm.org/D27114 ) |
Are all three parts of the patch needed, given that one was apparently not a valid optimization? |
https://reviews.llvm.org/rL294897 was merged upstream; not sure how much or how little that solves. |
Ok, I finally committed https://reviews.llvm.org/D27114 upstream. So we can either cherry-pick the commit or just wait until the next LLVM update. |
Whew! Congratulations and thanks for working on it. |
Using the testcase above:
#![crate_type="lib"]
use std::slice::Iter;
// weird codegen (aliasing / None vs null bug?)
pub fn is_empty_1(xs: Iter<f32>) -> bool {
{xs}.next().is_none()
}
// good codegen
pub fn is_empty_2(xs: Iter<f32>) -> bool {
xs.map(|&x| x).next().is_none()
}
With the LLVM patch, both functions generate the same IR:
%"core::slice::Iter<f32>" = type { float*, float*, %"core::marker::PhantomData<&f32>" }
%"core::marker::PhantomData<&f32>" = type {}
%"unwind::libunwind::_Unwind_Exception" = type { i64, void (i32, %"unwind::libunwind::_Unwind_Exception"*)*, [6 x i64] }
%"unwind::libunwind::_Unwind_Context" = type {}
; Function Attrs: nounwind uwtable
define zeroext i1 @_ZN3bad10is_empty_117h1c4be4edff235dbfE(%"core::slice::Iter<f32>"* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 {
entry-block:
%xs.sroa.0.0..sroa_idx = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 0
%xs.sroa.0.0.copyload = load float*, float** %xs.sroa.0.0..sroa_idx, align 8, !nonnull !0
%xs.sroa.4.0..sroa_idx13 = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 1
%xs.sroa.4.0.copyload = load float*, float** %xs.sroa.4.0..sroa_idx13, align 8, !nonnull !0
%1 = icmp eq float* %xs.sroa.0.0.copyload, %xs.sroa.4.0.copyload
ret i1 %1
}
; Function Attrs: nounwind uwtable
define zeroext i1 @_ZN3bad10is_empty_217h72beabc012005049E(%"core::slice::Iter<f32>"* noalias nocapture readonly dereferenceable(16)) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
entry-block:
%xs.sroa.0.0..sroa_idx = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 0
%xs.sroa.0.0.copyload = load float*, float** %xs.sroa.0.0..sroa_idx, align 8, !nonnull !0
%xs.sroa.4.0..sroa_idx14 = getelementptr inbounds %"core::slice::Iter<f32>", %"core::slice::Iter<f32>"* %0, i64 0, i32 1
%xs.sroa.4.0.copyload = load float*, float** %xs.sroa.4.0..sroa_idx14, align 8, !nonnull !0
%1 = icmp eq float* %xs.sroa.0.0.copyload, %xs.sroa.4.0.copyload
ret i1 %1
}
; Function Attrs: nounwind
declare i32 @rust_eh_personality(i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*) unnamed_addr #1
attributes #0 = { nounwind uwtable }
attributes #1 = { nounwind }
!0 = !{}
I now realize the patch is a bit more annoying to backport: it uses a relatively new method (copyMetadata on Instructions) that's not in our fork, so it requires cherry-picking two more commits. |
Could you open a PR against rust-lang/llvm ? |
@luqmana: Looks like this patch causes an LLVM assertion: https://bugs.llvm.org/show_bug.cgi?id=32902. Commenting out your changes in SROA.cpp makes the assertion go away. |
Confirming fixed with #40914. With LLVM 5.0 there's also https://bugs.llvm.org/show_bug.cgi?id=33470, but that does not affect Rust's LLVM 4.0. |
This still does not work on 32-bit archs because of an LLVM limitation, but this is only an optimization, so let's push it on 64-bit only for now. Fixes rust-lang#37945
[LLVM] Avoid losing the !nonnull attribute in SROA
Fixes #37945. r? @alexcrichton
This is an interesting codegen bug. Note that it reproduces only in a special-case setting; in many regular contexts the problem does not exist.
Expected behavior:
This code should just be a pointer comparison (ptr == end leads to None being returned).
Actual behavior:
It compiles to the equivalent of ptr == end || ptr.is_null().
Playground link. Use release mode.
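As a rough illustration of the expected behavior, the sketch below puts the iterator-based check next to a plain pointer comparison. The function names are placeholders, and `as_ptr_range` is used here only as a stand-in for the iterator's internal `ptr`/`end` fields. Semantically the two always agree; the bug is purely about the extra comparison in the generated code.

```rust
use std::slice::Iter;

/// The pattern from this issue: take the iterator by value and ask
/// whether the first `next()` yields nothing.
pub fn is_empty_via_next(xs: Iter<'_, f32>) -> bool {
    let mut xs = xs;
    xs.next().is_none()
}

/// What the optimized code should boil down to: start pointer == end pointer.
pub fn is_empty_via_ptr_eq(xs: &[f32]) -> bool {
    let range = xs.as_ptr_range();
    range.start == range.end
}
```

Comparing the release-mode output of these two functions is one way to see whether the redundant null check is still being emitted on a given toolchain.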
version: rustc 1.15.0-nightly (0bd2ce6 2016-11-19)
Additional notes:
This version has the expected codegen, without the null check.