-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][CUDA] Port CUDA plugin to Unified Runtime #9512
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
cbfad32
to
f1bba52
Compare
|
||
#include <sstream> | ||
|
||
ur_result_t map_error_ur(CUresult result) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think going forward we agreed that this mapping from cuResult
to urResult
should be removed as explained here: oneapi-src/unified-runtime#500 (comment)
I understand maybe you don't want to do it at this point. However it actually may be easier to do it at this point and does allow the removal of a lot of redundant code so FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're trying to keep the scope of this PR and the porting to effort to just a straight port of the existing code to avoid any changes to the behavior of the plugin/adapter, so I don't think it makes sense to do it at this stage. Plus the size of the PR means that reviewing any actual functional changes at the same time would be tricky.
It also requires a resolution to oneapi-src/unified-runtime#500
} | ||
} | ||
|
||
ur_result_t check_error_ur(CUresult result, const char *function, int line, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
int getAttribute(ur_device_handle_t device, CUdevice_attribute attribute) { | ||
int value; | ||
sycl::detail::ur::assertion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are also cases like these, where assertion
is used instead of check_error
. This will also lead to lost native error information (I think, although I haven't easily found the definition of ur::assertion
). All such cases (all calls to cu* functions) should be setting the last message and reporting a plugin specific error as described here: oneapi-src/unified-runtime#500 (comment) when the result in not CUDA_SUCCESS
I've added some comments that are basically criticisms of PI and changes we already agreed I think regarding error handling. Could make sense to use this as a good opportunity to make these error handling changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think this is an excellent port, and within the constraints of working to an existing API spec it's very worthy. I'm no expert on PI and runtimes in general so most of my comments are on a function / documentation level rather than an architectural level
However, I'd really like to see some of the decisions around code style aligned more closely with upstream. I understand some of this is in progress (e.g. naming conventions as discussed with @jchlanda), but I'd like to reiterate how important ergonomics of programming are. A couple of comments I've made are of high priority to me (hidden control flow in macros that buy you nothing) because making the code readable and clear at first glance is critical to understanding
sycl/plugins/unified_runtime/ur/adapters/cuda/ur_interface_loader.cpp
Outdated
Show resolved
Hide resolved
sycl/plugins/unified_runtime/ur/adapters/cuda/ur_interface_loader.cpp
Outdated
Show resolved
Hide resolved
case PI_EXT_CODEPLAY_DEVICE_INFO_MAX_REGISTERS_PER_WORK_GROUP: { | ||
InfoType = UR_EXT_DEVICE_INFO_MAX_REGISTERS_PER_WORK_GROUP; | ||
break; | ||
} | ||
default: | ||
return PI_ERROR_UNKNOWN; | ||
}; | ||
|
||
PI_ASSERT(Device, PI_ERROR_INVALID_DEVICE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I'm a bit late to the party as this is already in the codebase, but it really troubles me that we have macros that hide flow control. It's one extra line to expand the macro, and makes the control flow much more obvious to anyone reading. Additionally, it's not an assertion of any kind (which is about ensuring invariants about the design of the system are true); it's a simple parameter check for user input.
Same goes for HANDLE_ERRORS
.
There's zero ergonomic benefit as whenever you wrap an expression in HANDLE_ERRORS
you make the line length longer, clang-format splits it across lines, and it absolutely confounds stepping in a debugger.
Please use this as an opportunity to not reinforce this broken idiom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think this is an excellent port, and within the constraints of working to an existing API spec it's very worthy. I'm no expert on PI and runtimes in general so most of my comments are on a function / documentation level rather than an architectural level
However, I'd really like to see some of the decisions around code style aligned more closely with upstream. I understand some of this is in progress (e.g. naming conventions as discussed with @jchlanda), but I'd like to reiterate how important ergonomics of programming are. A couple of comments I've made are of high priority to me (hidden control flow in macros that buy you nothing) because making the code readable and clear at first glance is critical to understanding
…nding further investigation
@intel/llvm-gatekeepers Please merge this when possible |
We are working on a fix to this issue in the post merge actions. |
Resolves the warnings as errors reported in [post merge](https://github.com/intel/llvm/actions/runs/5266121277/jobs/9519634360) as a result of merging intel#9512. Additionally move pre-processor guards to resolve unused global variables which would also fail in this build configuration (clang & SYCL_ENABLE_WERROR=ON).
Fixed in #9872 |
Resolves the warnings as errors reported in [post merge](https://github.com/intel/llvm/actions/runs/5266121277/jobs/9519634360) as a result of merging #9512. Additionally move pre-processor guards to resolve unused global variables which would also fail in this build configuration (clang & SYCL_ENABLE_WERROR=ON).
This moves the CUDA plugin implementation to Unified Runtime; and changes the pi_cuda plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the Level Zero adapter (intel#8744) so will only be ready when that is merged. --------- Co-authored-by: Petr Vesely <petr.vesely@codeplay.com> Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Martin Morrison-Grant <martin.morrisongrant@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
Resolves the warnings as errors reported in [post merge](https://github.com/intel/llvm/actions/runs/5266121277/jobs/9519634360) as a result of merging intel#9512. Additionally move pre-processor guards to resolve unused global variables which would also fail in this build configuration (clang & SYCL_ENABLE_WERROR=ON).
This moves the HIP plugin implementation to Unified Runtime; and changes the pi_hip plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the CUDA adapter (#9512) so will only be ready when that is merged. --------- Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Petr Vesely <veselypeta@gmail.com> Co-authored-by: Callum Fare <callum@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
This moves the CUDA plugin implementation to Unified Runtime; and changes the pi_cuda plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the Level Zero adapter (intel#8744) so will only be ready when that is merged. --------- Co-authored-by: Petr Vesely <petr.vesely@codeplay.com> Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Martin Morrison-Grant <martin.morrisongrant@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
Resolves the warnings as errors reported in [post merge](https://github.com/intel/llvm/actions/runs/5266121277/jobs/9519634360) as a result of merging intel#9512. Additionally move pre-processor guards to resolve unused global variables which would also fail in this build configuration (clang & SYCL_ENABLE_WERROR=ON).
This moves the HIP plugin implementation to Unified Runtime; and changes the pi_hip plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the CUDA adapter (intel#9512) so will only be ready when that is merged. --------- Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Petr Vesely <veselypeta@gmail.com> Co-authored-by: Callum Fare <callum@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
This moves the HIP plugin implementation to Unified Runtime; and changes the pi_hip plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the CUDA adapter (intel/llvm#9512) so will only be ready when that is merged. --------- Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Petr Vesely <veselypeta@gmail.com> Co-authored-by: Callum Fare <callum@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
This moves the HIP plugin implementation to Unified Runtime; and changes the pi_hip plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR. This PR is based on top of the CUDA adapter (intel/llvm#9512) so will only be ready when that is merged. --------- Co-authored-by: Omar Ahmed <omar.ahmed@codeplay.com> Co-authored-by: Petr Vesely <veselypeta@gmail.com> Co-authored-by: Callum Fare <callum@codeplay.com> Co-authored-by: Aaron Greig <aaron.greig@codeplay.com>
This moves the CUDA plugin implementation to Unified Runtime; and changes the pi_cuda plugin to use pi2ur to implement PI. The changes to the implementation have been kept to a minimum and should be functionally the same. Documentation and comments have been moved verbatim, other than changing PI references to UR.
This PR is based on top of the Level Zero adapter (#8744) so will only be ready when that is merged.