Missed optimization: Loop with decreasing index does not elide bounds check #74186

Open
@197g

I tried code very similar to the code below. I expected the generated code to perform one bounds check at the start of the function and none inside the loop. Slicing with ..=idx first asserts that offsets is valid for the initial index, and the resulting slice has length idx + 1. This should be enough to conclude that it is also valid for all smaller indices. No loop iteration can increase the index, since saturating_sub only decreases it or leaves it unchanged, so the validity of the index is a loop invariant.

fn problematic(buf: &mut [u8], offsets: &[u8], mut idx: usize) {
    let offsets = &offsets[..=idx];
    for b in buf {
        *b = idx as u8;
        idx = idx.saturating_sub(usize::from(offsets[idx]));
    }
}

Instead we get this:

playground::problematic:
	push	rax
	cmp	r8, -1
	je	.LBB0_3
	mov	r9, rsi
	lea	rsi, [r8 + 1]
	cmp	r8, rcx
	jae	.LBB0_2
	test	r9, r9
	je	.LBB0_8
	xor	r10d, r10d
	mov	rcx, r8

.LBB0_6:
	mov	byte ptr [rdi + r10], cl
;; The critical cmp and jump that should be elided; they cost ~5-10% of loop performance
	cmp	rcx, r8
	ja	.LBB0_9
	movzx	eax, byte ptr [rdx + rcx]
	sub	rcx, rax
	mov	eax, 0
	cmovb	rcx, rax
	add	r10, 1
	cmp	r9, r10
	jne	.LBB0_6

.LBB0_8:
	pop	rax
	ret

.LBB0_9:
	lea	rdx, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.2]
	mov	rdi, rcx
	call	qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
	ud2

.LBB0_3:
	lea	rdi, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.1]
	call	qword ptr [rip + core::slice::slice_index_overflow_fail@GOTPCREL]
	ud2

.LBB0_2:
	lea	rdx, [rip + .Lanon.9b054490e6ade753ffe504f874525a87.1]
	mov	rdi, rsi
	mov	rsi, rcx
	call	qword ptr [rip + core::slice::slice_index_len_fail@GOTPCREL]
	ud2

.Lanon.9b054490e6ade753ffe504f874525a87.0:
	.ascii	"src/lib.rs"

.Lanon.9b054490e6ade753ffe504f874525a87.1:
	.quad	.Lanon.9b054490e6ade753ffe504f874525a87.0
	.asciz	"\n\000\000\000\000\000\000\000\002\000\000\000\024\000\000"

.Lanon.9b054490e6ade753ffe504f874525a87.2:
	.quad	.Lanon.9b054490e6ade753ffe504f874525a87.0
	.asciz	"\n\000\000\000\000\000\000\000\005\000\000\000.\000\000"
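
For comparison, here is a minimal sketch of what the expected codegen corresponds to at the source level: one bounds check when slicing, then unchecked indexing inside the loop. The function name manual_elision is hypothetical and not part of the report; the unsafe access relies only on the loop invariant argued above. It illustrates the intended result and is not a recommended fix.

```rust
/// Original function from the report, kept for comparison.
pub fn problematic(buf: &mut [u8], offsets: &[u8], mut idx: usize) {
    let offsets = &offsets[..=idx];
    for b in buf {
        *b = idx as u8;
        idx = idx.saturating_sub(usize::from(offsets[idx]));
    }
}

/// Hypothetical hand-written variant (name is an assumption): a single
/// bounds check up front and none inside the loop.
pub fn manual_elision(buf: &mut [u8], offsets: &[u8], mut idx: usize) {
    // The only check: panics here if `idx` is out of range for `offsets`.
    let offsets = &offsets[..=idx];
    for b in buf {
        *b = idx as u8;
        // SAFETY: `offsets.len()` equals the initial `idx + 1`, and
        // `saturating_sub` never increases `idx`, so `idx < offsets.len()`
        // holds on every iteration.
        let step = unsafe { *offsets.get_unchecked(idx) };
        idx = idx.saturating_sub(usize::from(step));
    }
}
```

Called with offsets = [0, 1, 1, 2, 3, 1, 2, 4], a six-byte buffer and idx = 7, both functions fill the buffer with [7, 3, 1, 0, 0, 0]. Whether the unchecked variant actually recovers the 5-10% mentioned above would have to be measured.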

Metadata

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    C-enhancement: Category: An issue proposing an enhancement or a PR with one.
    C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such.
    I-slow: Issue: Problems and improvements with respect to performance of generated code.
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.