Open
Description
I tried code very similar to the code below. The expected produced code would perform one bounds check at the start of the function and none inside the loop. offsets
is first asserted to be valid for the initial index. This should be enough to also find that it is valid for all smaller indices. No loop iteration can increase the index so the index correctness is a loop invariant.
fn problematic(buf: &mut [u8], offsets: &[u8], mut idx: usize) {
let offsets = &offsets[..=idx];
for b in buf {
*b = idx as u8;
idx = idx.saturating_sub(usize::from(offsets[idx]));
}
}
Instead we get this:
playground::problematic: push rax cmp r8, -1 je .LBB0_3 mov r9, rsi lea rsi, [r8 + 1] cmp r8, rcx jae .LBB0_2 test r9, r9 je .LBB0_8 xor r10d, r10d mov rcx, r8 .LBB0_6: mov byte ptr [rdi + r10], cl ;; The critical jump and cmp we don't want to have, costs ~5-10% loop perf cmp rcx, r8 ja .LBB0_9 movzx eax, byte ptr [rdx + rcx] sub rcx, rax mov eax, 0 cmovb rcx, rax add r10, 1 cmp r9, r10 jne .LBB0_6 .LBB0_8: pop rax ret .LBB0_9: lea rdx, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.2] mov rdi, rcx call qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL] ud2 .LBB0_3: lea rdi, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.1] call qword ptr [rip + core::slice::slice_index_overflow_fail@GOTPCREL] ud2 .LBB0_2: lea rdx, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.1] mov rdi, rsi mov rsi, rcx call qword ptr [rip + core::slice::slice_index_len_fail@GOTPCREL] ud2 .Lanon.9b054490e6ade753ffe504f874525a87.0: .ascii "src/lib.rs" .Lanon.9b054490e6ade753ffe504f874525a87.1: .quad .Lanon.9b054490e6ade753ffe504f874525a87.0 .asciz "\n\000\000\000\000\000\000\000\002\000\000\000\024\000\000" .Lanon.9b054490e6ade753ffe504f874525a87.2: .quad .Lanon.9b054490e6ade753ffe504f874525a87.0 .asciz "\n\000\000\000\000\000\000\000\005\000\000\000.\000\000"
Metadata
Metadata
Assignees
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Category: An issue proposing an enhancement or a PR with one.Category: An issue highlighting optimization opportunities or PRs implementing suchIssue: Problems and improvements with respect to performance of generated code.Relevant to the compiler team, which will review and decide on the PR/issue.