Layers missing call to cudaStreamSynchronize? #391
Comments
I think it's also missing from these kernel calls in
Hi @seanbell, thank you. It's actually needed everywhere to make sure that compute on that stream is done, because when you specify a non-zero stream for a kernel, that kernel runs asynchronously. There are exclusions though:
So those spots were missed and I'll fix them ASAP. But the majority of them are used in chains of other calls, so they are properly synchronized (in other words, it's not that bad :)
Fixed.
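For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a kernel launched on a non-default stream and then synchronized with cudaStreamSynchronize. The kernel `scale_kernel` and the helper `scale_on_stream` are hypothetical names invented for illustration; this is not the code from the Caffe layers, only an example of the general technique under the assumption of a single device and pageable host memory.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical element-wise kernel (illustrative only, not Caffe's code).
__global__ void scale_kernel(int n, float alpha, const float* in, float* out) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = alpha * in[i];
}

// Scales `in` by `alpha` on a non-default stream.
// Returns false if any CUDA call reports an error.
bool scale_on_stream(const std::vector<float>& in, float alpha,
                     std::vector<float>* out) {
  const int n = static_cast<int>(in.size());
  const size_t bytes = n * sizeof(float);
  out->assign(n, 0.f);
  if (n == 0) return true;

  float* d_in = nullptr;
  float* d_out = nullptr;
  cudaStream_t stream = nullptr;
  bool ok = cudaStreamCreate(&stream) == cudaSuccess &&
            cudaMalloc(&d_in, bytes) == cudaSuccess &&
            cudaMalloc(&d_out, bytes) == cudaSuccess &&
            cudaMemcpyAsync(d_in, in.data(), bytes,
                            cudaMemcpyHostToDevice, stream) == cudaSuccess;
  if (ok) {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    // The launch only queues work on `stream`; the host continues immediately.
    scale_kernel<<<blocks, threads, 0, stream>>>(n, alpha, d_in, d_out);
    ok = cudaGetLastError() == cudaSuccess &&
         cudaMemcpyAsync(out->data(), d_out, bytes,
                         cudaMemcpyDeviceToHost, stream) == cudaSuccess &&
         // Block the host until everything queued on `stream` has finished.
         cudaStreamSynchronize(stream) == cudaSuccess;
  }
  cudaFree(d_in);   // cudaFree(nullptr) is a harmless no-op
  cudaFree(d_out);
  if (stream) cudaStreamDestroy(stream);
  return ok;
}
```

Calling `scale_on_stream({1.f, 2.f, 3.f}, 2.f, &out)` should leave `out` holding the doubled values once the function returns. Operations queued on the same stream run in order, which is why a kernel followed by further calls on that stream does not necessarily need its own host-side synchronization.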
Hi, I am trying to port my code to version 0.16, and trying to understand cudaStreamSynchronize. When is it needed? The different layers seem to use it inconsistently. For example, it is included here:
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/relu_layer.cu#L54
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/bias_layer.cu#L27
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/bnll_layer.cu#L26
...
But not here:
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/scale_layer.cu#L47
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/slice_layer.cu#L40
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/threshold_layer.cu#L22
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/tile_layer.cu#L28
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/batch_reindex_layer.cu#L30
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/tanh_layer.cu#L24
https://github.com/NVIDIA/caffe/blob/caffe-0.16/src/caffe/layers/crop_layer.cu#L70
...
(Some are potentially also missing the kernel check.)
These lists are incomplete; more files may be missing it.
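As an illustration of what a post-launch kernel check can look like, the sketch below queries the CUDA runtime for a launch error right after the kernel call. It assumes the "kernel check" mentioned above means a check of this kind; `noop_kernel` and `launch_and_check` are hypothetical names and are not taken from the Caffe sources.

```cuda
#include <cuda_runtime.h>

// Hypothetical empty kernel, used only to demonstrate the launch check.
__global__ void noop_kernel() {}

// Launches one block with the requested number of threads and returns
// the launch status reported by the runtime.
cudaError_t launch_and_check(int threads_per_block) {
  noop_kernel<<<1, threads_per_block>>>();
  // cudaGetLastError returns the most recent runtime error and resets it;
  // an invalid launch configuration is reported here, not by the launch itself.
  return cudaGetLastError();
}
```

With a valid block size such as 32, this should return cudaSuccess. With a block size beyond the device limit (for example 1 << 20 threads), the launch is rejected and an error code is returned instead. Without such a check, a failed launch goes unnoticed until some later CUDA call happens to report it.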