Algorithm selection returns wrong result: needs 140632G working space. #353

Closed
CFAndy opened this issue Jun 12, 2017 · 2 comments

Comments

@CFAndy

CFAndy commented Jun 12, 2017

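Caffe 0.16, running ResNet with FP16 math + FP16 data.
Algorithm selection reports a required workspace of 140632G on every device except [0], and training then aborts with CUDNN_STATUS_BAD_PARAM in the backward pass: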
I0611 08:14:37.694576 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_2_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.696094 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.96G, req 0.06G)
I0611 08:14:37.749905 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.760154 80385 cudnn_conv_layer.cpp:834] [5] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.787997 80382 cudnn_conv_layer.cpp:834] [2] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.788635 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.96G, req 0.06G)
I0611 08:14:37.795770 80383 cudnn_conv_layer.cpp:834] [3] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.803072 80386 cudnn_conv_layer.cpp:834] [6] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1 1 1 (limit 2.96G, req 140632G)
I0611 08:14:37.823436 80381 cudnn_conv_layer.cpp:834] [1] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.835868 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.841663 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.854826 80385 cudnn_conv_layer.cpp:834] [5] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.881242 80382 cudnn_conv_layer.cpp:834] [2] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.886878 80383 cudnn_conv_layer.cpp:834] [3] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.894661 80386 cudnn_conv_layer.cpp:834] [6] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.96G, req 140632G)
I0611 08:14:37.916431 80381 cudnn_conv_layer.cpp:834] [1] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.929443 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv3' with space 5.28G/1 1 1 1 (limit 2.96G, req 0.06G)
I0611 08:14:37.929920 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
F0611 08:14:37.944598 80380 cudnn_conv_layer.cu:129] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
@ 0x7fe7720a3daa (unknown)
@ 0x7fe7720a3ce4 (unknown)
@ 0x7fe7720a36e6 (unknown)
@ 0x7fe7720a6687 (unknown)
@ 0x7fe772c8f2ad caffe::CuDNNConvolutionLayer<>::Backward_gpu()
@ 0x7fe7728d525d caffe::Layer<>::Backward()
@ 0x7fe772b65279 caffe::Net::BackwardFromToAu()
@ 0x7fe772b65585 caffe::Net::Backward()
@ 0x7fe772b65801 caffe::Net::ForwardBackward()
@ 0x7fe772b94278 caffe::Solver::Step()
@ 0x7fe772b94fb0 caffe::Solver::Solve()
@ 0x7fe772bdb71b caffe::P2PSync::InternalThreadEntry()
@ 0x7fe772b9b472 caffe::InternalThread::entry()
@ 0x7fe772b9c0d4 boost::detail::thread_data<>::run()
@ 0x7fe7688eea4a (unknown)
I0611 08:14:37.980826 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)
@ 0x7fe7604e8184 start_thread
@ 0x7fe77117a37d (unknown)
@ (nil) (unknown)
Aborted
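
To spot the affected devices and layers in a long log, something like the following sketch may help. It pulls the device index, layer name, `limit` and `req` values out of each "Conv Algos" line and lists the entries where `req` exceeds `limit`. The regular expression is inferred only from the lines pasted above, and the function name `find_suspect_requests` is made up for this example; the actual format string in `cudnn_conv_layer.cpp` may differ between Caffe versions.

```python
import re

# Pattern inferred from the log lines in this report (an assumption, not
# taken from the Caffe source). It uses search(), so any prefix before the
# "[device]" field, such as the glog header, is ignored.
LINE_RE = re.compile(
    r"\[(?P<device>\d+)\] Conv Algos \(F,BD,BF\): '(?P<layer>[^']+)'"
    r".*?\(limit (?P<limit>[\d.]+)G, req (?P<req>[\d.]+)G\)"
)


def find_suspect_requests(lines):
    """Return (device, layer, limit_gb, req_gb) for lines where req > limit."""
    suspects = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a "Conv Algos" line, e.g. a stack-trace frame
        limit_gb = float(match.group("limit"))
        req_gb = float(match.group("req"))
        if req_gb > limit_gb:
            suspects.append(
                (int(match.group("device")), match.group("layer"), limit_gb, req_gb)
            )
    return suspects


# Two lines copied from the log above: device 4 (bad) and device 0 (sane).
sample = [
    "I0611 08:14:37.694576 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): "
    "'layer_512_2_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)",
    "I0611 08:14:37.696094 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): "
    "'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.96G, req 0.06G)",
]

for device, layer, limit_gb, req_gb in find_suspect_requests(sample):
    print(f"device {device}: {layer} requests {req_gb}G, limit {limit_gb}G")
```

To scan a whole log, pass the open file object instead of `sample`, since the function only needs an iterable of lines.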

@drnikolaev

Hi @ChenFengAndy, this has already been fixed, but we need some time to deliver a new release. Thank you for reporting this.

@drnikolaev

Fixed in v0.16.2
