-
Notifications
You must be signed in to change notification settings - Fork 125
[OpenCL][USM] urEnqueueUSMFill incorrectly assumes destination memory alignment and fails #2440
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[OpenCL][USM] urEnqueueUSMFill incorrectly assumes destination memory alignment and fails #2440
Comments
I'm reverting the test addition in intel/llvm#16465, please ensure it's properly re-committed before closing this issue. |
The test was added back with intel/llvm#16544 now properly marked as |
It's been shown to fail also on the |
This is not fixed, the reproducer still shows the issue with intel/llvm
$ clang++ --version
clang version 21.0.0git (https://github.com/intel/llvm 1e5d85da1c4e5a7bc62948a99f13885b565b0c14)
$ clang++ -fsycl -fsycl-targets=spir64_x86_64 -o mini minimal.cpp
$ ONEAPI_DEVICE_SELECTOR=opencl:cpu ./mini
terminate called after throwing an instance of 'sycl::_V1::exception'
what(): Enqueue process failed.
Aborted (core dumped) It looks like the CI configuration has changed and the affected test is no longer being run on the affected device. It was originally failing in Post Commit CI on OpenCL CPU and FPGA devices, but the Post Commit looks to run on GPU only now: |
@rafbiels I ran the actual failing e2e test and it passed on my machine with the same version of OpenCL as in your comment, is that something you can check locally too? I didn't realise the CI configuration has changed to not run on CPU I will investigate this. The reproducer you gave does look like it's failing though, it seems like there is a difference between that and the e2e test? |
Hi @martygrant, |
I have reverted my PR to re-enable the |
TLDR: OpenCL adapter implementation of
urEnqueueUSMFill
callsclEnqueueMemFillINTEL
for power-of-2 pattern size without checking destination memory alignment required byclEnqueueMemFillINTEL
.Full Story:
I ran into this issue when trying to add
sycl::queue::fill
test in intel/llvm#15991 (specific CI run failure: https://github.com/intel/llvm/actions/runs/11971631535/job/33695502538?pr=15991)I will disable the OpenCL CPU backend in the e2e test, linking this issue in a comment, so it can be re-enabled when the problem is solved.
The minimal reproducer for the issue is:
compiled and ran with:
Debugging further with this UR change:
confirmed that it's the
clEnqueueMemFillINTEL
call which sometimes returns -30 (CL_INVALID_VALUE
).I noticed that in my build of UR / DPC++ this happens when the binary name is shorter than 24 characters, but stops happening when it is longer. Simply renaming the file changes the behaviour. In another build I got the opposite behaviour where short-named binary succeeds but long-named fails. I assume what happens is that long file name causes heap allocation for
argv[0]
and shifts the memory layout, and thus alignment of the device allocation.I note that
clEnqueueMemFillINTEL
as described here:https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_unified_shared_memory.html
may return:
so I assume this is what happens as I checked the other conditions for returning
CL_INVALID_VALUE
are not met.IIUC neither SYCL API nor UR API make any requirements about the destination memory alignment for their USM fill functions, therefore it is incorrect for the implementation to assume alignment. I think the solution here could be to check the alignment and take the other (slower) path which doesn't call
clEnqueueMemFillINTEL
when the alignment requirement is not met.Side note: something seems to be lost in error handling here as the user is informed neither about the error code (INVALID_VALUE) nor its origin (USM fill). There is only a generic exception thrown by the SYCL runtime.
The text was updated successfully, but these errors were encountered: