
-msimd128 seems not work on wasi-sdk-11 #160

Closed
sophy228 opened this issue Oct 13, 2020 · 5 comments

Comments

@sophy228

I used the C code below (simd.c):
`#include <stdio.h>
void multiply_arrays(int* out, int* in_a, int* in_b, int size) {
    for (int i = 0; i < size; i++) {
        out[i] = in_a[i] * in_b[i];
    }
}

int main(int argc, char * argv[])
{
    int out[10];
    int in_a[10];
    int in_b[10];

    multiply_arrays(out, in_a, in_b, 10);

    for (int i = 0; i < 10; i++) {
        printf("%d ", out[i]);
    }
    printf("\n");

}`

and compiled it with:
/opt/wasi-sdk/bin/clang --sysroot=/opt/wasi-sdk/share/wasi-sysroot -O0 -msimd128 -o simd.wasm simd.c

But the generated multiply_arrays looks like this:

(func $multiply_arrays (type 8) (param i32 i32 i32 i32)
(local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
global.get 0
local.set 4
i32.const 32
local.set 5
local.get 4
local.get 5
i32.sub
local.set 6
i32.const 0
local.set 7
local.get 6
local.get 0
i32.store offset=28
local.get 6
local.get 1
i32.store offset=24
local.get 6
local.get 2
i32.store offset=20
local.get 6
local.get 3
i32.store offset=16
local.get 6
local.get 7
i32.store offset=12
block ;; label = @1
loop ;; label = @2
local.get 6
i32.load offset=12
local.set 8
local.get 6
i32.load offset=16
local.set 9
local.get 8
local.set 10
local.get 9
local.set 11
local.get 10
local.get 11
i32.lt_s
local.set 12
i32.const 1
local.set 13
local.get 12
local.get 13
i32.and
local.set 14
local.get 14
i32.eqz
br_if 1 (;@1;)
local.get 6
i32.load offset=24
local.set 15
local.get 6
i32.load offset=12
local.set 16
i32.const 2
local.set 17
local.get 16
local.get 17
i32.shl
local.set 18
local.get 15
local.get 18
i32.add
local.set 19
local.get 19
i32.load
local.set 20
local.get 6
i32.load offset=20
local.set 21
local.get 6
i32.load offset=12
local.set 22
i32.const 2
local.set 23
local.get 22
local.get 23
i32.shl
local.set 24
local.get 21
local.get 24
i32.add
local.set 25
local.get 25
i32.load
local.set 26
local.get 20
local.get 26
i32.mul
local.set 27
local.get 6
i32.load offset=28
local.set 28
local.get 6
i32.load offset=12
local.set 29
i32.const 2
local.set 30
local.get 29
local.get 30
i32.shl
local.set 31
local.get 28
local.get 31
i32.add
local.set 32
local.get 32
local.get 27
i32.store
local.get 6
i32.load offset=12
local.set 33
i32.const 1
local.set 34
local.get 33
local.get 34
i32.add
local.set 35
local.get 6
local.get 35
i32.store offset=12
br 0 (;@2;)
end
end
return)

What I want is something like:
(loop (v128.store align=4 … get address in out … (i32x4.mul (v128.load align=4 … get address in in_a …) (v128.load align=4 … get address in in_b …) … ) )

@sbc100
Member

sbc100 commented Oct 13, 2020

@tlively ?

Most likely this is because wasi-sdk-11 is based on a fairly old version of LLVM (LLVM 10, I think). Support for LLVM 11 just landed, so you might want to try one of the recent CI artifacts, which will contain the LLVM 11 version of the SDK.

@sunfishcode
Member

Can you try with -O2 rather than -O0? -O0 disables LLVM's optimizer, including loop auto-vectorization.
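
Because auto-vectorization is an optimizer pass, whether the loop becomes SIMD depends on the optimization level. One alternative that relies less on that pass is to write the multiply with the GCC/Clang vector extension. The sketch below is only an illustration: the name `multiply_arrays_vec` and the type name `v4i32` are hypothetical, and whether the multiply lowers to `i32x4.mul` still depends on the compiler version and on passing `-msimd128`.

```c
#include <string.h>

/* Four 32-bit ints packed into one 128-bit vector (GCC/Clang extension). */
typedef int v4i32 __attribute__((vector_size(16)));

/* Hypothetical explicit-vector variant of multiply_arrays from the report. */
void multiply_arrays_vec(int *out, const int *in_a, const int *in_b, int size) {
    int i = 0;

    /* Main loop: four elements per iteration. */
    for (; i + 4 <= size; i += 4) {
        v4i32 a, b, c;
        memcpy(&a, in_a + i, sizeof a);   /* load without alignment assumptions */
        memcpy(&b, in_b + i, sizeof b);
        c = a * b;                        /* element-wise multiply */
        memcpy(out + i, &c, sizeof c);    /* store without alignment assumptions */
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < size; i++) {
        out[i] = in_a[i] * in_b[i];
    }
}
```

The `memcpy` calls are used so the sketch makes no alignment assumptions about the `int` arrays; optimizing compilers typically turn them into plain vector loads and stores.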

@sophy228
Author

@sunfishcode Yes, I tried -O2 and -O3, but multiply_arrays gets inlined into main, and the final wasm is:

(func $main (type 3) (param i32 i32) (result i32)
(local i32)
global.get 0
i32.const 160
i32.sub
local.tee 2
global.set 0
i32.const 1024
local.get 2
i32.const 144
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 128
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 112
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 96
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 80
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 64
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 48
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 32
i32.add
call $printf
drop
i32.const 1024
local.get 2
i32.const 16
i32.add
call $printf
drop
i32.const 1024
local.get 2
call $printf
drop
i32.const 10
call $putchar
drop
local.get 2
i32.const 160
i32.add
global.set 0
i32.const 0)
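
The listing above contains no multiply at all. A likely reason is that `in_a` and `in_b` are never initialized in simd.c, so once the function is inlined the optimizer is free to treat the products as undefined and drop the loop. A minimal sketch of a variant that avoids both effects follows, assuming the GCC/Clang `noinline` attribute is acceptable; `fill_inputs` is a hypothetical helper that is not part of the original report.

```c
/* noinline (a GCC/Clang attribute) keeps the loop in its own function
   so it can be inspected separately from main. */
__attribute__((noinline))
void multiply_arrays(int *out, const int *in_a, const int *in_b, int size) {
    for (int i = 0; i < size; i++) {
        out[i] = in_a[i] * in_b[i];
    }
}

/* Hypothetical helper: gives both inputs defined values so the
   products are well-defined and cannot be discarded as undefined. */
void fill_inputs(int *in_a, int *in_b, int size) {
    for (int i = 0; i < size; i++) {
        in_a[i] = i + 1;
        in_b[i] = 2 * i + 1;
    }
}
```

In `main`, calling `fill_inputs(in_a, in_b, 10)` before `multiply_arrays(out, in_a, in_b, 10)` and then printing `out` as before keeps the rest of the reproduction unchanged.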

@sophy228
Author

After exporting multiply_arrays:

/mnt/d/repos/dist-ubuntu-xenial/wasi-sdk-11.5gaea2680940cb/bin/clang --sysroot=/mnt/d/repos/dist-ubuntu-xenial/wasi-sysroot -g -O3 -msimd128 -Wl,--export=multiply_arrays -o simd.wasm simd.c

(func $multiply_arrays (type 8) (param i32 i32 i32 i32)
(local i32 i32 i32 i32 i32)
block ;; label = @1
local.get 3
i32.const 1
i32.lt_s
br_if 0 (;@1;)
i32.const 0
local.set 4
block ;; label = @2
local.get 3
i32.const 3
i32.le_u
br_if 0 (;@2;)
local.get 1
local.get 3
i32.const 2
i32.shl
local.tee 5
i32.add
local.get 0
i32.gt_u
local.get 0
local.get 5
i32.add
local.tee 6
local.get 1
i32.gt_u
i32.and
br_if 0 (;@2;)
local.get 2
local.get 5
i32.add
local.get 0
i32.gt_u
local.get 6
local.get 2
i32.gt_u
i32.and
br_if 0 (;@2;)
local.get 3
i32.const -4
i32.and
local.tee 4
local.set 7
local.get 0
local.set 5
local.get 2
local.set 6
local.get 1
local.set 8
loop ;; label = @3
local.get 5
local.get 6
v128.load align=4
local.get 8
v128.load align=4
i32x4.mul
v128.store align=4
local.get 5
i32.const 16
i32.add
local.set 5
local.get 6
i32.const 16
i32.add
local.set 6
local.get 8
i32.const 16
i32.add
local.set 8
local.get 7
i32.const -4
i32.add
local.tee 7
br_if 0 (;@3;)
end
local.get 4
local.get 3
i32.eq
br_if 1 (;@1;)
end
local.get 3
local.get 4
i32.sub
local.set 7
local.get 1
local.get 4
i32.const 2
i32.shl
local.tee 8
i32.add
local.set 5
local.get 2
local.get 8
i32.add
local.set 6
local.get 0
local.get 8
i32.add
local.set 8
loop ;; label = @2
local.get 8
local.get 6
i32.load
local.get 5
i32.load
i32.mul
i32.store
local.get 5
i32.const 4
i32.add
local.set 5
local.get 6
i32.const 4
i32.add
local.set 6
local.get 8
i32.const 4
i32.add
local.set 8
local.get 7
i32.const -1
i32.add
local.tee 7
br_if 0 (;@2;)
end
end)
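
Read as plain C, the listing above has three parts: a runtime check that `out` does not overlap either input, a vector loop that handles `size & -4` elements four at a time, and a scalar loop for the remainder. The sketch below models that control flow for illustration only. The names `multiply_arrays_model` and `ranges_overlap` are hypothetical, and the inner lane loop stands in for the single `i32x4.mul`.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if byte ranges [p, p+len) and [q, q+len) overlap. */
static int ranges_overlap(const void *p, const void *q, size_t len) {
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    return a + len > b && b + len > a;
}

/* Hypothetical C model of the control flow in the -O3 listing. */
void multiply_arrays_model(int *out, const int *in_a, const int *in_b, int size) {
    if (size < 1) return;                       /* outer block @1 */

    size_t bytes = (size_t)size * sizeof(int);
    int done = 0;                               /* elements already handled */

    /* Vector path: needs at least 4 elements and no overlap with out. */
    if ((unsigned)size > 3 &&
        !ranges_overlap(out, in_a, bytes) &&
        !ranges_overlap(out, in_b, bytes)) {
        done = size & -4;                       /* round down to a multiple of 4 */
        for (int i = 0; i < done; i += 4) {
            /* One v128.load pair, i32x4.mul, and v128.store per iteration. */
            for (int lane = 0; lane < 4; lane++) {
                out[i + lane] = in_a[i + lane] * in_b[i + lane];
            }
        }
    }

    /* Scalar loop: the remaining elements, or everything if the
       vector path was skipped. */
    for (int i = done; i < size; i++) {
        out[i] = in_a[i] * in_b[i];
    }
}
```

The overlap checks exist because the original signature does not promise that `out` is distinct from `in_a` and `in_b`, so the vectorizer has to guard the vector path at run time.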
