Skip to content

RISCV vector zvl256b miscompile at -O2 #82430

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
patrick-rivos opened this issue Feb 20, 2024 · 8 comments · Fixed by #82506
Closed

RISCV vector zvl256b miscompile at -O2 #82430

patrick-rivos opened this issue Feb 20, 2024 · 8 comments · Fixed by #82506

Comments

@patrick-rivos
Copy link
Contributor

Testcase:

int printf(const char *, ...);
int c;
int d[10][10];
char e, f;
int main() {
  for (f = 0; f < 8; f += 1) {
    --c >> d[0][4] && (e = 0);
    d[0][4] = d[5][0];
  }
  printf("%X\n", d[0][4]);
}

Commands:

> /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
AB2B8CB8
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out -fuse-ld=lld
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
555599E0
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang red.c -o red.out
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
0

Bugpoint, ASM, red.c: c-asm-bc.zip

--opt-bisect-limit points to SLPVectorizerPass

Discovered/tested using 7f3980a (not bisected)
Found using fuzzer

@llvmbot
Copy link
Member

llvmbot commented Feb 20, 2024

@llvm/issue-subscribers-backend-risc-v

Author: Patrick O'Neill (patrick-rivos)

Testcase: ```c int printf(const char *, ...); int c; int d[10][10]; char e, f; int main() { for (f = 0; f < 8; f += 1) { --c >> d[0][4] && (e = 0); d[0][4] = d[5][0]; } printf("%X\n", d[0][4]); } ```

Commands:

&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out
&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
AB2B8CB8
&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out -fuse-ld=lld
&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
555599E0
&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang red.c -o red.out
&gt; /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
0

Bugpoint, ASM, red.c: c-asm-bc.zip

--opt-bisect-limit points to SLPVectorizerPass

Discovered/tested using 7f3980a (not bisected)
Found using fuzzer

@topperc
Copy link
Collaborator

topperc commented Feb 20, 2024

This could mean either the SLP vectorizer generated bad code. Or the SLP IR is good but SelectionDAG miscompiled it.

@wangpc-pp
Copy link
Contributor

This part is really weird, the use of different linkers results in different outputs:

> /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
AB2B8CB8
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/clang -march=rv64gcv_zvl256b -O2 red.c -o red.out -fuse-ld=lld
> /scratch/tc-testing/tc-feb-20-llvm/build/bin/qemu-riscv64 red.out
555599E0

I don't see any assemblies diff: https://godbolt.org/z/8Thfqq9M7.
So this may be a bug of linker?

@topperc
Copy link
Collaborator

topperc commented Feb 21, 2024

This address doesn't make sense to me

        auipc   a2, %pcrel_hi(d)
        addi    a2, a2, %pcrel_lo(.Lpcrel_hi1)
        addi    a3, a2, -56
        li      a4, 72
        vsetivli        zero, 2, e32, mf2, ta, ma
        vlse32.v        v8, (a3), a4

That should be loading d[5][0] into element 0 and d[0][4] into element 1, but it starts 56 bytes before d. Unless I'm misreading.

@topperc
Copy link
Collaborator

topperc commented Feb 21, 2024

This definitely looks suspicious

Combining: t57: v2i32,ch = masked_gather<(load unknown-size, align 4, !tbaa !37), unsigned unscaled offset> t0, undef:v2i32, t7, GlobalAddress:i64<ptr @d> 0, t56, TargetConstant:i64<1>
Creating constant: t58: i64 = Constant<-56>                                      
Creating new node: t59: i64 = add GlobalAddress:i64<ptr @d> 0, Constant:i64<-56> 
Creating constant: t60: i64 = TargetConstant<9900>                               
Creating constant: t61: i64 = Constant<72>                                       
Creating new node: t62: v2i32,ch = llvm.riscv.masked.strided.load<(load unknown-size, align 4, !tbaa !37)> t0, TargetConstant:i64<9900>, undef:v2i32, t59, Constant:i64<72>, t7

@topperc
Copy link
Collaborator

topperc commented Feb 21, 2024

-56 in i8 is also 200. That's the correct offset. We failed to zero extend it when we converted it from i8 to i64. So we got i64 -56 instead of i64 200. The stride is also wrong. It shoud be 16-200, -184 instead of 72.

This all stems from using isSimpleVIDSequence to convert a build_vector index into a stride. We need to zero extend all the indices to XLen before we can use that I think.

lukel97 added a commit that referenced this issue Feb 21, 2024
…e. NFC

This shows the issue in #82430, but triggers it via the widening SEW combine
rather than a GEP that RISCVGatherScatterLowering doesn't detect.
lukel97 added a commit to lukel97/llvm-project that referenced this issue Feb 21, 2024
…g indices

This fixes the miscompile reported in llvm#82430 by telling isSimpleVIDSequence to
sign extending to XLen instead of the type of the indices, since the "sequence"
of indices generated by a strided load will be at XLen.

This was the simplest way I could think of of getting isSimpleVIDSequence to
treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant integers"
part from isSimpleVIDSequence and handle them as APInts so we can separately
zero extend it.
lukel97 added a commit to lukel97/llvm-project that referenced this issue Feb 22, 2024
…g indices

This fixes the miscompile reported in llvm#82430 by telling isSimpleVIDSequence to
sign extending to XLen instead of the type of the indices, since the "sequence"
of indices generated by a strided load will be at XLen.

This was the simplest way I could think of of getting isSimpleVIDSequence to
treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant integers"
part from isSimpleVIDSequence and handle them as APInts so we can separately
zero extend it.
lukel97 added a commit that referenced this issue Feb 22, 2024
…g indices (#82506)

This fixes the miscompile reported in #82430 by telling
isSimpleVIDSequence to sign extend to XLen instead of the width of the
indices, since the "sequence" of indices generated by a strided load
will be at XLen.

This was the simplest way I could think of getting isSimpleVIDSequence
to treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant
integers" part from isSimpleVIDSequence and handle them as APInts so we
can separately zero extend it.

Fixes #82430
@lukel97 lukel97 added this to the LLVM 18.X Release milestone Feb 22, 2024
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Feb 22, 2024
@lukel97
Copy link
Contributor

lukel97 commented Feb 22, 2024

/cherry-pick 2cd59bd 11d115d 815644b

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Feb 22, 2024
…e. NFC

This shows the issue in llvm#82430, but triggers it via the widening SEW combine
rather than a GEP that RISCVGatherScatterLowering doesn't detect.

(cherry picked from commit 2cd59bd)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Feb 22, 2024
…g indices (llvm#82506)

This fixes the miscompile reported in llvm#82430 by telling
isSimpleVIDSequence to sign extend to XLen instead of the width of the
indices, since the "sequence" of indices generated by a strided load
will be at XLen.

This was the simplest way I could think of getting isSimpleVIDSequence
to treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant
integers" part from isSimpleVIDSequence and handle them as APInts so we
can separately zero extend it.

Fixes llvm#82430

(cherry picked from commit 815644b)
@llvmbot
Copy link
Member

llvmbot commented Feb 22, 2024

/pull-request #82572

@tstellar tstellar moved this from Needs Triage to Done in LLVM Release Status Feb 23, 2024
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 19, 2024
…e. NFC

This shows the issue in llvm#82430, but triggers it via the widening SEW combine
rather than a GEP that RISCVGatherScatterLowering doesn't detect.

(cherry picked from commit 2cd59bd)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 19, 2024
…g indices (llvm#82506)

This fixes the miscompile reported in llvm#82430 by telling
isSimpleVIDSequence to sign extend to XLen instead of the width of the
indices, since the "sequence" of indices generated by a strided load
will be at XLen.

This was the simplest way I could think of getting isSimpleVIDSequence
to treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant
integers" part from isSimpleVIDSequence and handle them as APInts so we
can separately zero extend it.

Fixes llvm#82430

(cherry picked from commit 815644b)
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
Development

Successfully merging a pull request may close this issue.

6 participants