Skip to content

[SYCL][PI][L0] Force reset of memcpy command-list. #4001

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 2 commits into from
Jun 29, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 20 additions & 7 deletions sycl/plugins/level_zero/pi_level_zero.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -185,10 +185,9 @@ template <> ze_result_t zeHostSynchronize(ze_event_handle_t Handle) {
template <> ze_result_t zeHostSynchronize(ze_command_queue_handle_t Handle) {
return zeHostSynchronizeImpl(zeCommandQueueSynchronize, Handle);
}
// template <>
// ze_result_t zeHostSynchronize(ze_fence_handle_t Handle) {
// return zeHostSynchronizeImpl(zeFenceHostSynchronize, Handle);
// }
template <> ze_result_t zeHostSynchronize(ze_fence_handle_t Handle) {
return zeHostSynchronizeImpl(zeFenceHostSynchronize, Handle);
}

template <typename T, typename Assign>
pi_result getInfoImpl(size_t param_value_size, void *param_value,
Expand Down Expand Up @@ -4231,11 +4230,25 @@ static pi_result cleanupAfterEvent(pi_event Event) {
}

// It is possible that the fence was already noted as signalled and
// reset. In that case the InUse flag will be false, and there is
// no need to query the fence's status or try to reset it.
// reset. In that case the InUse flag will be false, and
// we shouldn't query it, synchronize on it, or try to reset it.
if (it->second.InUse) {
// Workaround for VM_BIND mode.
// Make sure that the command-list doing memcpy is reset before
// non-USM host memory potentially involved in the memcpy is freed.
//
// NOTE: it is valid to wait for the fence here as long as we aren't
// doing batching on the involved command-list. Today memcpy goes by
// itself in a command list.
//
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this NOTE: portion of the comment is accurate. I think batching can still be done on memcpy, but if we do batching on memcpy, then first memcpy event in the command list that finishes will force a fence synchronization for the whole command list. I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it would be OK to batch, and just force a memcpy to be the last command put into a batch.

That's what I called no batching of memory copies, i.e. where same command-list would contain multiple memory copies. Yes, memcpy must be the last command in the batch, i.e. close the batch, in order for this fence wait logic here to be correct.

// TODO: this will unnecessarily(?) wait for non-USM memory buffers
// too, so we might need to add a new command type to differentiate.
//
ze_result_t ZeResult =
ZE_CALL_NOCHECK(zeFenceQueryStatus, (it->second.ZeFence));
(Event->CommandType == PI_COMMAND_TYPE_MEM_BUFFER_COPY)
? ZE_CALL_NOCHECK(zeHostSynchronize, (it->second.ZeFence))
: ZE_CALL_NOCHECK(zeFenceQueryStatus, (it->second.ZeFence));

if (ZeResult == ZE_RESULT_SUCCESS) {
Queue->resetCommandListFenceEntry(*it, true);
Event->ZeCommandList = nullptr;
Expand Down