
Deadlock when training on multiple GPUs and restoring state from previous training #255

Closed
CFAndy opened this issue Oct 18, 2016 · 3 comments

Comments


CFAndy commented Oct 18, 2016

nvidia-smi -l shows the root solver dropping to 0% GPU utilization while the slave solvers stay at 100%. BVLC Caffe has no such issue.
Fixed by #254.
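
A minimal diagnostic sketch (not part of the original report): poll nvidia-smi for per-GPU utilization and flag the pattern described above, where the root solver's GPU sits near 0% while the other solver GPUs stay near 100%. The assumption that the root solver runs on GPU 0 is illustrative only.

```python
# Sketch only: watch for the reported stall pattern (root GPU idle, worker
# GPUs pegged) by polling nvidia-smi. Assumes the root solver uses GPU 0.
import subprocess
import time

def gpu_utilization():
    """Return per-GPU utilization percentages as reported by nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"])
    return [int(x) for x in out.decode().split()]

while True:
    util = gpu_utilization()
    if len(util) > 1 and util[0] == 0 and all(u >= 90 for u in util[1:]):
        print("Possible deadlock: root GPU idle, workers busy:", util)
    else:
        print("GPU utilization:", util)
    time.sleep(5)
```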


CFAndy commented Jun 6, 2017

The bug is still there. Using 0.16 to train MobileNet on 4 TITAN X (Pascal) GPUs, Caffe randomly enters a deadlock after a test run, so I have to disable the test phase.
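
The workaround mentioned above, disabling the test phase, amounts to editing the solver prototxt. Below is a minimal sketch using Caffe's protobuf bindings; the file names and the specific fields touched (test_initialization, test_interval) are illustrative choices, not taken from this thread.

```python
# Sketch of the "disable the test" workaround: skip the initial test pass and
# push the next scheduled test past max_iter so it never runs in this job.
from caffe.proto import caffe_pb2
from google.protobuf import text_format

solver = caffe_pb2.SolverParameter()
with open("solver.prototxt") as f:          # assumed input file name
    text_format.Merge(f.read(), solver)

solver.test_initialization = False          # no test pass before training starts
if solver.max_iter > 0:
    solver.test_interval = solver.max_iter + 1  # next test falls after the last iteration

with open("solver_no_test.prototxt", "w") as f:  # assumed output file name
    f.write(text_format.MessageToString(solver))
```

The same effect can be had by editing those two fields in the solver prototxt by hand.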

drnikolaev commented

Thanks, I'm looking into it...

drnikolaev commented

Fixed in 0.16.2
