Very slow inference in tensorflow #6
That slowdown seems quite drastic. Do you have, e.g., many categories or images to evaluate in one step? I suspect that a dedicated CUDA kernel would speed up the implementation a lot (better than the long succession of masking/selecting/sorting operations). However, I don't plan to tackle this in the near future; contributions in this direction are welcome. |
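For readers unfamiliar with where the sort and cumulative-sum steps come from, here is a minimal plain-Python sketch of the core of a Lovász-style loss on a flat list of errors. It is an illustration only, not the repository's TensorFlow code: the function names `lovasz_grad` and `lovasz_loss_flat` and the list-based inputs are assumptions made for this example.

```python
from itertools import accumulate


def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss.

    gt_sorted: 0/1 ground-truth labels, already ordered by decreasing error.
    """
    if not gt_sorted:
        return []
    total = sum(gt_sorted)
    # Running counts of foreground and background labels seen so far.
    cum_fg = list(accumulate(gt_sorted))
    cum_bg = list(accumulate(1 - g for g in gt_sorted))
    # Jaccard loss after including the first k+1 sorted elements.
    jaccard = [1.0 - (total - f) / (total + b) for f, b in zip(cum_fg, cum_bg)]
    # First differences turn the running Jaccard loss into per-element weights.
    return [jaccard[0]] + [jaccard[i] - jaccard[i - 1] for i in range(1, len(jaccard))]


def lovasz_loss_flat(errors, labels):
    """Sort errors in decreasing order, then weight them by lovasz_grad."""
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    errors_sorted = [errors[i] for i in order]
    gt_sorted = [labels[i] for i in order]
    return sum(e * g for e, g in zip(errors_sorted, lovasz_grad(gt_sorted)))
```

Each call needs one sort and two cumulative sums over every pixel of a class, which is why the speed of those two primitives dominates the cost on large images.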
I just use it on the Cityscapes dataset, with distributed computing. The model is DeepLab v3+. |
I did some profiling. It seems that in TensorFlow the [...] In PyTorch, as expected, the [...] I will investigate a bit more; it might warrant an issue report for TensorFlow. |
This python notebook summarizes the problem: tensorflow/profile_ops.ipynb |
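As a rough idea of how individual operations can be compared, the sketch below times a callable with the standard library only. It is not taken from the notebook; the helper name `time_op` and its parameters are hypothetical. For GPU frameworks the callable would also have to block until the result is actually computed, otherwise only the launch overhead is measured.

```python
import statistics
import time


def time_op(fn, repeats=5, warmup=1):
    """Return the median wall-clock time in seconds of calling fn()."""
    # Warm-up calls absorb one-off costs such as caching or lazy initialization.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Example: time a descending sort of 100,000 integers.
sort_seconds = time_op(lambda: sorted(range(100_000), reverse=True))
```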
After looking more into it, it seems the easiest way is to create a custom TensorFlow op using cub exclusive sum instead of the native [...] I will not implement this for now as I'm mainly using PyTorch. I might do it one day, but in the meantime I'll tag this as [...] |
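To make the "exclusive sum" terminology concrete: an exclusive prefix sum leaves the current element out of each partial sum, while an ordinary cumulative sum includes it. The plain-Python sketch below only illustrates the two definitions and how one is recovered from the other; it is not a GPU implementation, and the function names are chosen for this example.

```python
from itertools import accumulate


def inclusive_sum(xs):
    """Ordinary cumulative sum: out[i] = xs[0] + ... + xs[i]."""
    return list(accumulate(xs))


def exclusive_sum(xs):
    """Exclusive prefix sum: out[i] = xs[0] + ... + xs[i-1], with out[0] = 0."""
    out, running = [], 0
    for x in xs:
        out.append(running)
        running += x
    return out


def inclusive_from_exclusive(xs):
    """Recover the cumulative sum by adding each input back to the exclusive sum."""
    return [e + x for e, x in zip(exclusive_sum(xs), xs)]
```

A custom op built on an exclusive scan would therefore need one extra elementwise addition to produce the same values as a cumulative sum.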
The speed of cumsum has been improved significantly; I'm going to close this. Feel free to re-open if you feel it still isn't fast enough. |
@ekelsen Which version of TensorFlow has these improvements? |
Currently just HEAD: tensorflow/tensorflow@73e3215 |
Thanks for the pointer @ekelsen. Closing this issue |
@ekelsen Hello, I'm using Keras (backend: TensorFlow 1.12) and CUDA 9.0, but training is still slow with this loss function. Can you give me any advice? My GPU is a GTX 1080 Ti. |
@stillwaterman I expect the build of TensorFlow you are using was made before the changes to cumsum were implemented. Building TensorFlow from source might be a reasonable option to speed up training. |
@jianlong-yuan Hi, I want to use the Lovász-Softmax loss in DeepLab v3+ but failed. Could you give me some references or demos? Thanks. |
@Z-Ianthe I put my solution to the above problems here: https://github.com/jianlong-yuan/LovaszSoftmax_tf/tree/master |
I don't have time to investigate TensorFlow issues for now, but I am at least reopening the issue. |
Before I use your loss function: 2.5 sec/step
After I use your loss function: 32.0 sec/step
I use TensorFlow 1.6.0