Closed
Description
Describe the feature and the current behavior/state.
For custom ops, we have two implementations: C++ and CUDA.
I propose that for simple ops, meaning those that take fewer than 10 lines of Python, we also implement a pure Python version. I see several use cases:
- We can more easily test the correctness of the C++ and CUDA implementations. In the test suite, we generate random tensors and check that the results are the same for all three implementations. Since Python code is easy to read, we get a readable reference implementation that's easier to check.
- It helps users who want to understand the source code.
- It should also help all users who have issues with custom ops. It's better for those users to have a slow implementation than an implementation that doesn't run at all. It also helps people who want to run those ops on specialized hardware (e.g. ROCm). We've had a number of issues from users having to deal with that: "BeamSearchDecoder segmentation fault (on GPU)" #1109, "2nd order gradients for activations" #1099, "Enable custom-ops for tensorflow-cpu" #990, "Custom Op Linux ABI Incompatibility: Undefined Symbol" #987, .... We should provide a flag that they can activate to use the pure Python implementation. Users of TensorFlow Addons on Windows and macOS will then be able to use their GPU.
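To make the testing idea from the first point concrete, here is a minimal sketch of what such a cross-implementation check could look like. It is only an illustration under assumptions: it uses plain Python lists instead of tensors so it runs without TensorFlow, `hardshrink_standin_kernel` is a hypothetical stand-in for a compiled C++/CUDA kernel, and all function names are made up for this example.

```python
import math
import random


def hardshrink_reference(values, lower=-0.5, upper=0.5):
    """Readable reference: keep x outside [lower, upper], zero it otherwise."""
    return [x if (x < lower or x > upper) else 0.0 for x in values]


def hardshrink_standin_kernel(values, lower=-0.5, upper=0.5):
    """Hypothetical stand-in for a compiled kernel, written differently on purpose."""
    out = []
    for x in values:
        inside = lower <= x <= upper
        out.append(x * (0.0 if inside else 1.0))
    return out


def implementations_agree(implementations, n_trials=100, size=32, seed=0):
    """Feed the same random inputs to every implementation and compare outputs.

    The first implementation in the list is treated as the reference.
    """
    rng = random.Random(seed)
    for _ in range(n_trials):
        values = [rng.uniform(-2.0, 2.0) for _ in range(size)]
        expected = implementations[0](values)
        for impl in implementations[1:]:
            actual = impl(values)
            if len(actual) != len(expected):
                return False
            if not all(
                math.isclose(a, e, rel_tol=1e-6, abs_tol=1e-6)
                for a, e in zip(actual, expected)
            ):
                return False
    return True
```

In the real test suite the list would contain the Python, C++, and CUDA versions, and the inputs would be random tensors rather than lists. The point is that the reference is short enough to verify by reading it.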
Relevant information
- Are you willing to contribute it (yes/no): yes
- Are you willing to maintain it going forward? (yes/no): yes
- Is there a relevant academic paper? (if so, where): no
- Is there already an implementation in another framework? (if so, where): no
- Was it part of tf.contrib? (if so, where): no
Which API type would this fall under (layer, metric, optimizer, etc.)
Activation, image, layers, seq2seq, text
Who will benefit with this feature?
People having problems with custom ops, and the maintainers, because the test suite will be more robust.
Any other info.
We should make a pull request for a simple op to see how to implement it, especially how to access the different implementations from the user side and the test side. Good candidates would be activation functions.
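As a starting point for that discussion, one possible way to expose the choice is sketched below. Everything here is an assumption for illustration: the environment variable name `EXAMPLE_USE_PY_OPS`, the helper names, and the simulated loading failure are hypothetical and not an existing TensorFlow Addons API. The sketch again uses plain lists so it stays self-contained.

```python
import os

# Hypothetical flag name, chosen only for this sketch.
USE_PY_OPS_FLAG = "EXAMPLE_USE_PY_OPS"


def _load_compiled_hardshrink():
    """Stand-in for loading the compiled custom op.

    Here it always fails, to simulate a platform where the shared
    library cannot be loaded.
    """
    raise RuntimeError("compiled custom op is not available on this platform")


def _hardshrink_py(values, lower=-0.5, upper=0.5):
    """Pure Python implementation; tests can import and call it directly."""
    return [x if (x < lower or x > upper) else 0.0 for x in values]


def use_python_ops():
    """Return True when the user opted into the pure Python implementations."""
    return os.environ.get(USE_PY_OPS_FLAG, "0") == "1"


def hardshrink(values, lower=-0.5, upper=0.5):
    """User-facing entry point that dispatches on the flag."""
    if use_python_ops():
        return _hardshrink_py(values, lower, upper)
    kernel = _load_compiled_hardshrink()
    return kernel(values, lower, upper)
```

With this layout, users only ever call `hardshrink` and flip the flag when the compiled op fails for them, while the test suite can reach `_hardshrink_py` and the compiled kernel separately to compare them. Whether the flag should be an environment variable, a module-level option, or a function argument is exactly the kind of question the first pull request would need to settle.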