Skip to content

Conversation

KseniyaTikhomirova
Copy link
Contributor

No description provided.

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
@KseniyaTikhomirova KseniyaTikhomirova requested a review from a team as a code owner July 2, 2024 15:29
@KseniyaTikhomirova
Copy link
Contributor Author

follow up for #14385

@KseniyaTikhomirova
Copy link
Contributor Author

@intel/llvm-gatekeepers hi, this PR is ready for merge, thanks

@martygrant martygrant merged commit 9637803 into intel:sycl Jul 5, 2024
KseniyaTikhomirova added a commit to KseniyaTikhomirova/llvm that referenced this pull request Aug 21, 2024
iclsrc pushed a commit that referenced this pull request Jul 1, 2025
Combine sequences such as:
```llvm
  %pn1 = phi [init1, %BB1], [%op1, %BB2]
  %pn2 = phi [init2, %BB1], [%op2, %BB2]
  %op1 = binop %pn1, constant1
  %op2 = binop %pn2, constant2
  %rdx = binop %op1, %op2
```
Into:
```llvm
  %phi_combined = phi [init_combined, %BB1], [%op_combined, %BB2]
  %rdx_combined = binop %phi_combined, constant_combined
```

This allows us to simplify interleaved reductions, for example as
introduced by the loop vectorizer.

The anecdotal example for this is the loop below:
```c
float foo() {
  float q = 1.f;
  for (int i = 0; i < 1000; ++i)
    q *= .99f;
  return q;
}
```
Which currently gets lowered explicitly such as (on AArch64,
interleaved by four):
```gas
.LBB0_1:
  fmul    v0.4s, v0.4s, v1.4s
  fmul    v2.4s, v2.4s, v1.4s
  fmul    v3.4s, v3.4s, v1.4s
  fmul    v4.4s, v4.4s, v1.4s
  subs    w8, w8, #32
  b.ne    .LBB0_1
```
But with this patch lowers trivially:
```gas
foo:
  mov     w8, #5028
  movk    w8, #14389, lsl #16
  fmov    s0, w8
  ret
```
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants