Skip to content

[SYCL][UR] Update Unified Runtime tag to support UR_DEVICE_INFO_IP_VERSION #9873

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 19 commits into from
Jun 21, 2023

Conversation

dm-vodopyanov
Copy link
Contributor

@dm-vodopyanov dm-vodopyanov commented Jun 14, 2023

This should have been an obvious update of Unified Runtime tag to support UR_DEVICE_INFO_IP_VERSION, required in
#9843 (just tag update, nothing else), but it also brought many API breaks caused by this patch: oneapi-src/unified-runtime#536. So the current PR updates our codebase in accordance with changed UR API.

…RSION

This should have been an obvious update of Unified Runtime sources to support
UR_DEVICE_INFO_IP_VERSION, required in
intel#9843, but this update brought
many API breaks mostly caused by this patch:
oneapi-src/unified-runtime#536
@dm-vodopyanov dm-vodopyanov requested a review from a team as a code owner June 14, 2023 12:48
Copy link
Contributor Author

@dm-vodopyanov dm-vodopyanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@intel/dpcpp-l0-pi-reviewers, I don't have much expertise in this code, so any feedback will be appreciated.

@dm-vodopyanov
Copy link
Contributor Author

clang-format failed somehow unexpectedly:

Run ./src/devops/actions/clang-format
Run git config --global --add safe.directory /__w/llvm/llvm
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
warning: Not a git repository. Use --no-index to compare two paths outside a working tree
usage: git diff --no-index [<options>] <path> <path>

@jandres742
Copy link
Contributor

clang-format failed somehow unexpectedly:

Run ./src/devops/actions/clang-format
Run git config --global --add safe.directory /__w/llvm/llvm
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
warning: Not a git repository. Use --no-index to compare two paths outside a working tree
usage: git diff --no-index [<options>] <path> <path>

this seems like glitch on infra?

@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 15, 2023 13:11 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov requested a review from a team as a code owner June 15, 2023 14:55
@dm-vodopyanov dm-vodopyanov requested a review from jchlanda June 15, 2023 14:55
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 15, 2023 15:29 — with GitHub Actions Inactive
@jandres742
Copy link
Contributor

jandres742 commented Jun 16, 2023

We cannot merge this update to the UR loader. I compiled @dm-vodopyanov branch with gcc-7.5 and I'm getting this error below. It's not seen with GCC 10.

@igchor , @kbenzie , @pbalcer : error seems to come from disjoint_pool lib, but couldn't find why.

[3867/4003] Linking CXX shared library lib/libur_adapter_null.so.0.6.0
FAILED: lib/libur_adapter_null.so.0.6.0
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -Wextra -Wno-deprecated-declarations -Wno-error -fvisibility=hidden -fvisibility-inlines-hidden -fno-strict-aliasing -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete -shared -Wl,-soname,libur_adapter_null.so.0 -o lib/libur_adapter_null.so.0.6.0 _deps/unified-runtime-build/source/adapters/null/CMakeFiles/ur_adapter_null.dir/ur_null.cpp.o _deps/unified-runtime-build/source/adapters/null/CMakeFiles/ur_adapter_null.dir/ur_nullddi.cpp.o _deps/unified-runtime-build/source/adapters/null/CMakeFiles/ur_adapter_null.dir/__/__/__/__/unified-runtime-src/source/common/linux/ur_lib_loader.cpp.o  -Wl,-rpath,/home/jarteaga/llvm/build/lib  -lstdc++fs  lib/libdisjoint_pool.a  lib/libunified_memory_allocation.a  -lstdc++fs  -ldl && :
/usr/bin/ld: lib/libdisjoint_pool.a(disjoint_pool.cpp.o): in function `std::__shared_mutex_pthread::unlock()':
disjoint_pool.cpp:(.text._ZNSt22__shared_mutex_pthread6unlockEv[_ZNSt22__shared_mutex_pthread6unlockEv]+0x5): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: lib/libdisjoint_pool.a(disjoint_pool.cpp.o): in function `usm::Slab::unregSlabByAddr(void*, usm::Slab&) [clone .localalias.28]':
disjoint_pool.cpp:(.text._ZN3usm4Slab15unregSlabByAddrEPvRS0_+0x24): undefined reference to `pthread_rwlock_wrlock'
/usr/bin/ld: disjoint_pool.cpp:(.text._ZN3usm4Slab15unregSlabByAddrEPvRS0_+0x15f): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: lib/libdisjoint_pool.a(disjoint_pool.cpp.o): in function `usm::DisjointPool::AllocImpl::deallocate(void*, bool&) [clone .localalias.1]':
disjoint_pool.cpp:(.text._ZN3usm12DisjointPool9AllocImpl10deallocateEPvRb+0x34): undefined reference to `pthread_rwlock_rdlock'
/usr/bin/ld: disjoint_pool.cpp:(.text._ZN3usm12DisjointPool9AllocImpl10deallocateEPvRb+0x11c): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: disjoint_pool.cpp:(.text._ZN3usm12DisjointPool9AllocImpl10deallocateEPvRb+0x154): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: disjoint_pool.cpp:(.text._ZN3usm12DisjointPool9AllocImpl10deallocateEPvRb+0x274): undefined reference to `pthread_rwlock_unlock'
/usr/bin/ld: lib/libdisjoint_pool.a(disjoint_pool.cpp.o): in function `usm::Slab::regSlabByAddr(void*, usm::Slab&) [clone .localalias.29]':
disjoint_pool.cpp:(.text._ZN3usm4Slab13regSlabByAddrEPvRS0_+0x27): undefined reference to `pthread_rwlock_wrlock'
/usr/bin/ld: disjoint_pool.cpp:(.text._ZN3usm4Slab13regSlabByAddrEPvRS0_+0xe3): undefined reference to `pthread_rwlock_unlock'
collect2: error: ld returned 1 exit status
[3890/4003] Building CXX object tools/llvm-spirv/lib/SPIRV/CMakeFiles/LLVMSPIRVLib.dir/SPIRVReader.cpp.o
ninja: build stopped: subcommand failed.

@dm-vodopyanov
Copy link
Contributor Author

@jandres742 @smaslov-intel @callumfare there is a new failure, because #9512 was merged [1].

Also, we still have pi_device_partition_property in piDevicePartition, I guess a new data structure should be added: pi_device_partition_properties

I would appreciate if you submit the fixes directly to this PR.

[1]:

FAILED: lib/libpi_cuda.so 
: && /usr/bin/g++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -Wextra -Wno-deprecated-declarations -Werror -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete -shared -Wl,-soname,libpi_cuda.so -o lib/libpi_cuda.so tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/pi2ur.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/ur.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/usm_allocator.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/common.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/context.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/device.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/enqueue.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/event.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/kernel.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/memory.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/platform.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/program.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/queue.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/sampler.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/tracing.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/ur_interface_loader.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/usm.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/pi_cuda.cpp.o tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/xpti/src/xpti_proxy.cpp.o  -Wl,-rpath,/usr/local/cuda/lib64:  /usr/local/cuda/lib64/stubs/libcuda.so  -ldl  /usr/local/cuda/lib64/libcupti.so  -Wl,--version-script=/__w/llvm/llvm/src/sycl/plugins/ld-version-script.txt  -lgcc_s  -lgcc && :
/usr/bin/ld: tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/pi_cuda.cpp.o: in function `pi2ur::piDevicePartition(_pi_device*, long const*, unsigned int, _pi_device**, unsigned int*)':
pi_cuda.cpp:(.text._ZN5pi2ur17piDevicePartitionEP10_pi_devicePKljPS1_Pj[_ZN5pi2ur17piDevicePartitionEP10_pi_devicePKljPS1_Pj]+0xa4): undefined reference to `urDevicePartition'
/usr/bin/ld: tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/__w/llvm/llvm/src/sycl/plugins/unified_runtime/ur/adapters/cuda/ur_interface_loader.cpp.o: in function `urGetDeviceProcAddrTable':
ur_interface_loader.cpp:(.text.urGetDeviceProcAddrTable+0x18): undefined reference to `urDevicePartition'
collect2: error: ld returned 1 exit status

@kbenzie
Copy link
Contributor

kbenzie commented Jun 16, 2023

@dm-vodopyanov thanks for the heads up, @callumfare is looking into this now.

@callumfare
Copy link
Contributor

@dm-vodopyanov I've opened a PR against your fork with the fix for the cuda adapter, let me know if there are any problems

Fix cuda adapter after UR bump
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 16, 2023 15:41 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 16, 2023 16:21 — with GitHub Actions Inactive
@jandres742
Copy link
Contributor

@dm-vodopyanov : I have opened a PR against your branch, to absorb the changes needed to be able to compile with gcc-7.

dm-vodopyanov#2

Jaime Arteaga and others added 3 commits June 16, 2023 12:56
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
@dm-vodopyanov
Copy link
Contributor Author

what about doing this:

      } PartitionProperties = {
          {Partitions..., ur_device_partition_t(0)}};

could you try to see if that makes the error go away?

That's exactly what I did in this PR. see the very first commit 193ed13#diff-d9b48cce2bc16acc175c9bfbf26bf8e4a9936efdb39b8c1b4c54d29c94181509R302:

  1. } PartitionProperties = {
    {Partitions..., ur_device_partition_property_t(0)}};
  2. } PartitionProperties = {
    {Partitions..., ur_device_partition_t(0)}};
  3. } PartitionProperties = {Partitions...};

So, the problem is in Partitions... - probably it should be casted to some type. @veselypeta could you please help?

@kbenzie
Copy link
Contributor

kbenzie commented Jun 20, 2023

@dm-vodopyanov I'll look at this as @veselypeta has a day off today. Let me investigate and get back to you.

@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 20, 2023 11:50 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 20, 2023 12:31 — with GitHub Actions Inactive
@dm-vodopyanov
Copy link
Contributor Author

@jchlanda @intel/llvm-reviewers-cuda, friendly ping

Copy link
Contributor

@jchlanda jchlanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I don't have in depth view of this part of the codebase, @npmiller?

Copy link
Contributor

@npmiller npmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CUDA adapter changes LGTM

@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 21, 2023 16:30 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 21, 2023 16:34 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 21, 2023 17:32 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov temporarily deployed to aws June 21, 2023 17:48 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov merged commit f69dd09 into intel:sycl Jun 21, 2023
@sarnex
Copy link
Contributor

sarnex commented Jun 21, 2023

@dm-vodopyanov if you didn't already notice, build is failing

@dm-vodopyanov
Copy link
Contributor Author

@dm-vodopyanov if you didn't already notice, build is failing

I noticed it.

@dm-vodopyanov
Copy link
Contributor Author

Failure in post-commit, fix: #10018

dm-vodopyanov added a commit that referenced this pull request Jun 30, 2023
…l Zero and OpenCL (#9843)

This patch introduces new host API for
sycl_ext_oneapi_device_architecture extension and implements it,
currently only for Level Zero and OpenCL

Depends on 

- oneapi-src/unified-runtime#573
- #9873
- #9979
- #10054
veselypeta pushed a commit to veselypeta/llvm that referenced this pull request Sep 21, 2023
…RSION (intel#9873)

This should have been an obvious update of Unified Runtime tag to
support UR_DEVICE_INFO_IP_VERSION, required in
intel#9843 (just tag update, nothing else),
but it also brought many API breaks caused by this patch:
oneapi-src/unified-runtime#536. So the current
PR updates our codebase in accordance with changed UR API.

---------

Signed-off-by: Dmitry Vodopyanov <dmitry.vodopyanov@intel.com>
Co-authored-by: Callum Fare <callum@codeplay.com>
Co-authored-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
veselypeta pushed a commit to veselypeta/llvm that referenced this pull request Sep 21, 2023
veselypeta pushed a commit to veselypeta/llvm that referenced this pull request Sep 21, 2023
…l Zero and OpenCL (intel#9843)

This patch introduces new host API for
sycl_ext_oneapi_device_architecture extension and implements it,
currently only for Level Zero and OpenCL

Depends on 

- oneapi-src/unified-runtime#573
- intel#9873
- intel#9979
- intel#10054
fabiomestre pushed a commit to fabiomestre/llvm that referenced this pull request Sep 26, 2023
…RSION (intel#9873)

This should have been an obvious update of Unified Runtime tag to
support UR_DEVICE_INFO_IP_VERSION, required in
intel#9843 (just tag update, nothing else),
but it also brought many API breaks caused by this patch:
oneapi-src/unified-runtime#536. So the current
PR updates our codebase in accordance with changed UR API.

---------

Signed-off-by: Dmitry Vodopyanov <dmitry.vodopyanov@intel.com>
Co-authored-by: Callum Fare <callum@codeplay.com>
Co-authored-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
fabiomestre pushed a commit to fabiomestre/llvm that referenced this pull request Sep 26, 2023
omarahmed1111 pushed a commit to omarahmed1111/unified-runtime that referenced this pull request Oct 23, 2023
…l Zero and OpenCL (#9843)

This patch introduces new host API for
sycl_ext_oneapi_device_architecture extension and implements it,
currently only for Level Zero and OpenCL

Depends on 

- oneapi-src#573
- intel/llvm#9873
- intel/llvm#9979
- intel/llvm#10054
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants