-
Notifications
You must be signed in to change notification settings - Fork 770
[SYCL][PI][L0] Force reset of memcpy command-list. #4001
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
This change set forces a host syncronization on a mempcy's command list fence, and the reset of that commandlist. Resetting the command list has the effect of causing Level Zero to free any resources the command list may be using, and by doing this as early as possible helps remove any bookkeeping that level zero may be doing on memory regions.
// NOTE: it is valid to wait for the fence here as long as we aren't | ||
// doing batching on the involved command-list. Today memcpy goes by | ||
// itself in a command list. | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this NOTE: portion of the comment is accurate. I think batching can still be done on memcpy, but if we do batching on memcpy, then first memcpy event in the command list that finishes will force a fence synchronization for the whole command list. I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.
That's what I called no batching of memory copies, i.e. where same command-list would contain multiple memory copies. Yes, memcpy must be the last command in the batch, i.e. close the batch, in order for this fence wait logic here to be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@kbsmith-intel, pre-commit validation has failed. Please, take a look. |
* upstream/sycl: (649 commits) [SYCL][Driver][NFC] Update integration footer test for 32-bit host (intel#4039) [SYCL][L0] Initialize descriptor .stype and .pNext (intel#4032) [SYCL] Add sycl::kernel::get_kernel_bundle method (intel#3855) [SYCL] Add support for device UUID as a SYCL extension. (intel#3696) [SYCL][Matrix] Add spec document for the matrix extension interface and its first implementation for AMX (intel#3551) Fix debug build mangler test after PR#3992 (8f38045). (intel#4033) [Driver][SYCL] Restrict user -include file in final integration footer step (intel#4036) [SYCL] [Tests] Do not copy device binary image mocks (intel#4023) [SYCL][Doc] Update docs to reflect new compiler features (intel#4030) [SYCL][CUDA] cl_khr_fp16 extension connected to cuda PI. (intel#4029) [SYCL][NFC] Refactor RT unit tests (intel#4021) [SYCL] Switch to using integration footer by default (intel#3777) [SYCL][CUDA] Add the Use Default Stream property (intel#4004) Uplift GPU RT version for Linux to 21.24.20098 (intel#4003) [SYCL][CUDA] atomic_ref.fetch_add used for fp64 reduction if device.has(atomic64) (intel#3950) [Driver][SYCL] Differentiate host dependency link from regular host link (intel#4002) [SYCL][ESIMD] Support device half type in intrinsics. (intel#4024) [SYCL] Allow fpga_reg only for PODs and Trivially-copyable structs (intel#3643) [SYCL][FPGA] Restore legacy debug info version for the hardware (intel#3991) [SYCL][PI][L0] Force reset of memcpy command-list. (intel#4001) ...
This change set forces a host syncronization on a mempcy's
command list fence, and the reset of that commandlist. Resetting
the command list has the effect of causing Level Zero to
free any resources the command list may be using, and by doing
this as early as possible helps remove any bookkeeping that level
zero may be doing on memory regions.