Layers missing call to cudaStreamSynchronize? #391

Closed
seanbell opened this issue Aug 7, 2017 · 3 comments

@seanbell seanbell changed the title Scale layer missing call to cudaStreamSynchronize? Layers missing call to cudaStreamSynchronize? Aug 7, 2017
@seanbell

seanbell commented Aug 7, 2017

I think it's also missing from these kernel calls in math_functions: https://github.com/NVIDIA/caffe/blob/caffe-0.16/include/caffe/util/math_functions.hpp#L333-L387 (unless I'm misunderstanding something)

@drnikolaev

drnikolaev commented Aug 8, 2017

Hi @seanbell, thank you. It's actually needed everywhere to make sure that compute on that stream is done, because when you specify a non-zero stream for a kernel, that kernel runs asynchronously. There are exceptions, though:

  • When you have a chain of calls on the same stream, they do not overlap. Just put cudaStreamSynchronize (cSS) after the last one.
  • You run 2 kernels on 2 separate streams. If the GPU can handle this load, it runs them simultaneously. Then you call cSS once per stream to make sure that compute is done.

So, those spots were missed and I'll fix them asap. But the majority of them are used in chains of other calls, so they are properly synchronized (in other words, it's not that bad :)
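
The two cases above can be sketched in a small standalone CUDA program. This is only an illustration under stated assumptions, not code from NVIDIA Caffe: `scale_kernel`, `check`, and `run_demo` are hypothetical names, and the streams are created as non-blocking so that nothing synchronizes them implicitly with the default stream.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: multiplies every element by alpha.
__global__ void scale_kernel(float* data, float alpha, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= alpha;
}

// Hypothetical helper: abort on any CUDA runtime error.
static void check(cudaError_t err) {
  if (err != cudaSuccess) {
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    std::exit(1);
  }
}

// Runs both synchronization patterns; returns true if the results are as expected.
bool run_demo() {
  const int n = 1 << 20;
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  const size_t bytes = n * sizeof(float);

  std::vector<float> host_a(n, 1.0f), host_b(n, 1.0f);
  float* dev_a = nullptr;
  float* dev_b = nullptr;
  check(cudaMalloc(&dev_a, bytes));
  check(cudaMalloc(&dev_b, bytes));

  // Non-blocking streams: no implicit ordering with the default stream.
  cudaStream_t s1, s2;
  check(cudaStreamCreateWithFlags(&s1, cudaStreamNonBlocking));
  check(cudaStreamCreateWithFlags(&s2, cudaStreamNonBlocking));

  // Case 1: a chain of calls on one stream. Work on s1 executes in issue
  // order, so a single cudaStreamSynchronize after the last call is enough.
  check(cudaMemcpyAsync(dev_a, host_a.data(), bytes, cudaMemcpyHostToDevice, s1));
  scale_kernel<<<blocks, threads, 0, s1>>>(dev_a, 2.0f, n);
  scale_kernel<<<blocks, threads, 0, s1>>>(dev_a, 3.0f, n);
  check(cudaGetLastError());
  check(cudaMemcpyAsync(host_a.data(), dev_a, bytes, cudaMemcpyDeviceToHost, s1));
  check(cudaStreamSynchronize(s1));  // host_a is safe to read only after this
  bool ok = (host_a[0] == 6.0f) && (host_a[n - 1] == 6.0f);

  // Case 2: independent work on two streams. The kernels may overlap on the
  // GPU, so each stream needs its own synchronize before the host reads.
  check(cudaMemcpyAsync(dev_b, host_b.data(), bytes, cudaMemcpyHostToDevice, s2));
  scale_kernel<<<blocks, threads, 0, s1>>>(dev_a, 0.5f, n);
  scale_kernel<<<blocks, threads, 0, s2>>>(dev_b, 4.0f, n);
  check(cudaGetLastError());
  check(cudaMemcpyAsync(host_a.data(), dev_a, bytes, cudaMemcpyDeviceToHost, s1));
  check(cudaMemcpyAsync(host_b.data(), dev_b, bytes, cudaMemcpyDeviceToHost, s2));
  check(cudaStreamSynchronize(s1));
  check(cudaStreamSynchronize(s2));
  ok = ok && (host_a[0] == 3.0f) && (host_b[0] == 4.0f);

  check(cudaStreamDestroy(s1));
  check(cudaStreamDestroy(s2));
  check(cudaFree(dev_a));
  check(cudaFree(dev_b));
  return ok;
}

int main() {
  std::printf("%s\n", run_demo() ? "results match" : "results differ");
  return 0;
}
```

If either `cudaStreamSynchronize` call in the second case were dropped, the host could read the corresponding buffer before that stream's work had finished, which is presumably the kind of gap the issue is about.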

@drnikolaev

Fixed.
