Skip to content

[SYCL][PI][L0] Force reset of memcpy command-list. #4001

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 2 commits into from
Jun 29, 2021

Conversation

kbsmith-intel
Copy link
Contributor

This change set forces a host syncronization on a mempcy's
command list fence, and the reset of that commandlist. Resetting
the command list has the effect of causing Level Zero to
free any resources the command list may be using, and by doing
this as early as possible helps remove any bookkeeping that level
zero may be doing on memory regions.

This change set forces a host syncronization on a mempcy's
command list fence, and the reset of that commandlist. Resetting
the command list has the effect of causing Level Zero to
free any resources the command list may be using, and by doing
this as early as possible helps remove any bookkeeping that level
zero may be doing on memory regions.
// NOTE: it is valid to wait for the fence here as long as we aren't
// doing batching on the involved command-list. Today memcpy goes by
// itself in a command list.
//
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this NOTE: portion of the comment is accurate. I think batching can still be done on memcpy, but if we do batching on memcpy, then first memcpy event in the command list that finishes will force a fence synchronization for the whole command list. I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.

That's what I called no batching of memory copies, i.e. where same command-list would contain multiple memory copies. Yes, memcpy must be the last command in the batch, i.e. close the batch, in order for this fence wait logic here to be correct.

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kbsmith-intel kbsmith-intel changed the title [SYCL][PI][L0] Force reset of memcpy command-list. [SYCL][PI][L0] Force reset/sync of memcpy command-list. Jun 28, 2021
@kbsmith-intel kbsmith-intel changed the title [SYCL][PI][L0] Force reset/sync of memcpy command-list. [SYCL][PI][L0] Force reset of memcpy command-list. Jun 28, 2021
@bader
Copy link
Contributor

bader commented Jun 28, 2021

@kbsmith-intel, pre-commit validation has failed. Please, take a look.

@bader bader merged commit c56a270 into intel:sycl Jun 29, 2021
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Jul 2, 2021
* upstream/sycl: (649 commits)
  [SYCL][Driver][NFC] Update integration footer test for 32-bit host (intel#4039)
  [SYCL][L0] Initialize descriptor .stype and .pNext (intel#4032)
  [SYCL] Add sycl::kernel::get_kernel_bundle method (intel#3855)
  [SYCL] Add support for device UUID as a SYCL extension. (intel#3696)
  [SYCL][Matrix] Add spec document for the matrix extension interface and its first implementation for AMX (intel#3551)
  Fix debug build mangler test after PR#3992 (8f38045). (intel#4033)
  [Driver][SYCL] Restrict user -include file in final integration footer step (intel#4036)
  [SYCL] [Tests] Do not copy device binary image mocks (intel#4023)
  [SYCL][Doc] Update docs to reflect new compiler features (intel#4030)
  [SYCL][CUDA] cl_khr_fp16 extension connected to cuda PI. (intel#4029)
  [SYCL][NFC] Refactor RT unit tests (intel#4021)
  [SYCL] Switch to using integration footer by default (intel#3777)
  [SYCL][CUDA] Add the Use Default Stream property (intel#4004)
  Uplift GPU RT version for Linux to 21.24.20098 (intel#4003)
  [SYCL][CUDA] atomic_ref.fetch_add used for fp64 reduction if device.has(atomic64) (intel#3950)
  [Driver][SYCL] Differentiate host dependency link from regular host link (intel#4002)
  [SYCL][ESIMD] Support device half type in intrinsics. (intel#4024)
  [SYCL] Allow fpga_reg only for PODs and Trivially-copyable structs (intel#3643)
  [SYCL][FPGA] Restore legacy debug info version for the hardware (intel#3991)
  [SYCL][PI][L0] Force reset of memcpy command-list. (intel#4001)
  ...
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants