[XLA:GPU] Use DeviceDescription instead of hard-coding warp size as 32 #2938
base: r2.18-rocm-enhanced
Conversation
tensorflow/tf-build-actions@600513b [ROCm] Fix flaky gpu compiler test when building with rocm
tensorflow/tf-build-actions@a35cf48 [XLA:GPU] Use DeviceDescription instead of hard-coding warp size as 32
xla@e849446 [ROCm] Pass correct warp size to Triton pipeline
xla@3e7b0fe cherry-picked warp size passing to triton calls, and globally enabled warpsize=64
xla@750ad89 Fixes.
Compare: c76562c to 5e84717
This PR is very large; it changes 90 files.
This is the original commit tensorflow@a35cf48, which has 75 files modified. Yeah, this is definitely quite a large one for us already. And since the latest XLA is quite different from the one in tensorflow r2.18, the number of modified files increased during backporting. Maybe I can split it into the pure original commit (with conflicts) and the modifications made during backporting (to resolve conflicts and make it compile), but that would still leave us with a commit that modifies at least 75 files, plus a relatively small patch. Do you think that would help in this case?
Probably not too much; I will try to review it as it is.
That would be tough work, sorry.
Is it possible to get that second commit (the one fixing errors) somewhere?
Just looked through my local branches but had no luck. No worries, though; I believe the rework won't take too long. Let me do it, otherwise it might not be well organized for you to review.
Done. Please review the reorganized PR: #2962
We should query the hardware to discover its warp size.
PiperOrigin-RevId: 700787004
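For context, here is a minimal sketch of the change's core idea; it is not taken from this PR's diff. Instead of a hard-coded constant 32, warp-size helpers take the stream_executor DeviceDescription and read the warp width from it. The WarpSize helper shown here is illustrative, and threads_per_warp() is assumed to be the relevant DeviceDescription accessor.

```cpp
// Hedged sketch (assumptions noted): query the warp size from the device
// description rather than baking in NVIDIA's warp width of 32, which is
// wrong on most AMD GPUs, where the wavefront size is 64.
#include <cstdint>

#include "xla/stream_executor/device_description.h"

namespace xla::gpu {

// Before (roughly): `inline constexpr int64_t WarpSize() { return 32; }`
// After: read the value recorded for the actual device.
inline int64_t WarpSize(
    const stream_executor::DeviceDescription& device_description) {
  return device_description.threads_per_warp();
}

}  // namespace xla::gpu
```

Call sites that previously used a zero-argument constant then have to thread the DeviceDescription through, which is one reason a conceptually small change like this touches so many files.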