Why is MKL much slower for the backward pass than for the forward pass? #2
Comments
Intel is still working on optimizing some kernels for the backward pass. These are kernels from our speech recognition pipeline, and Intel hasn't optimized them yet. I'll ping someone from Intel to provide more details.
Thank you for your reply. I hope we can get that soon.
I chatted with Intel regarding this. These optimizations will be available as part of a new release of MKL 2017, which will be out in a few weeks.
Closed
mshiryaev added a commit to mshiryaev/DeepBench that referenced this issue (Nov 16, 2017)
mshiryaev added a commit to mshiryaev/DeepBench that referenced this issue (Nov 23, 2017)
sharannarang pushed a commit to sharannarang/DeepBench that referenced this issue (Nov 30, 2017)
Deepbench updates for volta.
sharannarang pushed a commit that referenced this issue (May 2, 2018)
According to the benchmark results, MKL's deep learning convolution kernels (not GEMM) are much slower in the backward pass than in the forward pass.
For example, for W=341, H=79, C=32, N=4, K=32, R=5, S=10 on the KNL7250 platform, forward takes 0.91 ms, backward with respect to the input takes 68.79 ms, and backward with respect to the weights takes 74.98 ms! So backward is roughly 76 to 82 times slower than forward.
As a comparison, on TitanX, forward takes 0.74 ms, backward with respect to the input takes 3.09 ms, and backward with respect to the weights takes 0.76 ms. For forward, KNL7250 is only a little slower than TitanX, but for backward, KNL7250 is far slower. The pattern is similar for other W, H, C configurations.
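To make the comparison concrete, here is a minimal sketch that recomputes the backward-to-forward ratios from the timings quoted above. The dictionary keys and the helper name `slowdown_vs_forward` are just illustrative labels, not anything from DeepBench or MKL.

```python
# Timings in milliseconds, copied from the numbers quoted in this issue.
# The key names are illustrative labels only.
timings_ms = {
    "KNL7250": {"forward": 0.91, "backward_input": 68.79, "backward_weight": 74.98},
    "TitanX": {"forward": 0.74, "backward_input": 3.09, "backward_weight": 0.76},
}


def slowdown_vs_forward(times):
    """Divide each backward time by the forward time of the same platform."""
    forward = times["forward"]
    return {name: t / forward for name, t in times.items() if name != "forward"}


ratios = {platform: slowdown_vs_forward(t) for platform, t in timings_ms.items()}

for platform, per_kernel in ratios.items():
    for kernel, ratio in per_kernel.items():
        print(f"{platform:8s} {kernel:16s} {ratio:5.1f}x forward time")
```

On these numbers the backward kernels on KNL7250 come out at roughly 76x and 82x the forward time, against roughly 4x and 1x on TitanX.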
Can anyone explain the reason? Is it because MKL's backward pass has not been optimized much yet?