
tf.function bug #807


Closed
fsx950223 opened this issue Dec 23, 2019 · 5 comments · Fixed by #823

Comments

@fsx950223
Member

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow version and how it was installed (source or binary):
  • TensorFlow-Addons version and how it was installed (source or binary):
  • Python version:
  • Is GPU used? (yes/no):

Describe the bug
Calling a @tf.function-decorated function with Python values (rather than tensors) as arguments triggers retracing for every new value. Repeated retracing is expensive and leaks memory, since each trace adds another concrete function to the cache.

Invalid usage of tf.function leads to a memory leak.
Code to reproduce the issue

def softshrink(x, lower=-0.5, upper=0.5):
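For context, in addons this function is decorated with @tf.function and simply forwards to a custom op. A self-contained sketch that reproduces the retracing (the pure-TF body and the `0.1 * i` bounds below are illustrative stand-ins, not the actual addons source):

```python
import tensorflow as tf


# Pure-TF stand-in for the addons custom op, kept only so the snippet runs
# anywhere; the retracing behaviour depends only on the @tf.function wrapper.
@tf.function
def softshrink(x, lower=-0.5, upper=0.5):
    return tf.where(x < lower, x - lower,
                    tf.where(x > upper, x - upper, tf.zeros_like(x)))


x = tf.random.uniform([8], minval=-2.0, maxval=2.0)

# Every distinct pair of Python floats is a new input signature, so each
# iteration traces a fresh concrete function; after a handful of calls
# TensorFlow logs the "triggered tf.function retracing" warning.
for i in range(30):
    softshrink(x, lower=-0.1 * i, upper=0.1 * i)
```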

Running the softshrink test suite also reproduces the warning:
bazel test -c opt -k --test_timeout 300,450,1200,3600 --crosstool_top=//build_deps/toolchains/gcc7_manylinux2010-nvcc-cuda10.1:toolchain --test_output=all --jobs=1 //tensorflow_addons/activations:softshrink_test

Other info / logs
WARNING:tensorflow:9 out of the last 27 calls to <function softshrink at 0x7f37439a28c8> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
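The warning points at the usual remedy of passing tensors instead of Python objects; a minimal sketch of that, using the same pure-TF stand-in as above:

```python
import tensorflow as tf


@tf.function
def softshrink(x, lower=-0.5, upper=0.5):  # same pure-TF stand-in as above
    return tf.where(x < lower, x - lower,
                    tf.where(x > upper, x - upper, tf.zeros_like(x)))


x = tf.random.uniform([8], minval=-2.0, maxval=2.0)

# Wrapping the bounds in tensors keys the trace on dtype/shape rather than on
# the Python value, so all 30 calls reuse a single concrete function and no
# retracing warning is logged.
for i in range(30):
    softshrink(x, lower=tf.constant(-0.1 * i), upper=tf.constant(0.1 * i))
```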


@Squadrick
Member

@fsx950223 Have you found the same issue with any other function apart from softshrink?

@fsx950223
Member Author

fsx950223 commented Dec 26, 2019

> @fsx950223 Have you found the same issue with any other function apart from softshrink?

Any function that takes Python arguments has the same issue, e.g. rrelu and sparsemax.

@seanpmorgan
Member

@fsx950223 Thanks for the report!

So in practice these activation functions will likely not see frequent changes to their python args. The multiple retracing is happening here because each test case is modifying the data type or arguments.

AFAIK there is no real benefit to having these activations decorated with @tf.function, though, since they are simply calls to a custom op. We could remove the decoration to clean up the test logs a little. @WindQAQ thoughts?
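A sketch of the proposal (again with an illustrative pure-TF body rather than the actual custom-op call): drop the decoration from the thin wrapper and let callers that want graph execution trace at their own level.

```python
import tensorflow as tf


# Undecorated wrapper: Python-scalar defaults no longer cause per-value
# retracing, because nothing is traced at this level.
def softshrink(x, lower=-0.5, upper=0.5):
    # In addons this line would dispatch to the compiled kernel.
    return tf.where(x < lower, x - lower,
                    tf.where(x > upper, x - upper, tf.zeros_like(x)))


# Callers still get a graph by tracing at the call site; the Python bounds
# are simply baked into that single trace as constants.
@tf.function
def model_step(x):
    return softshrink(x, lower=-0.3, upper=0.3)


print(model_step(tf.random.uniform([4], minval=-1.0, maxval=1.0)))
```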

@seanpmorgan
Member

A similar issue occurs for the CorrelationCost layer in typical usage. See this notebook from our gitter:
https://colab.research.google.com/drive/11zDwwgRyM5VvSgjDzvu4mbERiQCEC0Ih#scrollTo=afXILgNysdyZ

IMO we should remove the @tf.function decoration from custom-op wrappers, as I don't believe there is any noticeable speed-up from it.

@WindQAQ
Member

WindQAQ commented Jan 2, 2020

> @fsx950223 Thanks for the report!
>
> So in practice these activation functions will likely not see frequent changes to their python args. The multiple retracing is happening here because each test case is modifying the data type or arguments.
>
> AFAIK there is no real benefit to having these activations decorated with @tf.function, though, since they are simply calls to a custom op. We could remove the decoration to clean up the test logs a little. @WindQAQ thoughts?

Thanks for pinging me. I am fine with removing tf.function. At least for a single activation in addons, removing it won't lead to performance degradation.
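For what it's worth, a rough timing sketch to sanity-check that claim (pure-TF stand-in again, so absolute numbers won't match the custom op):

```python
import time

import tensorflow as tf


def softshrink_eager(x, lower=-0.5, upper=0.5):  # pure-TF stand-in
    return tf.where(x < lower, x - lower,
                    tf.where(x > upper, x - upper, tf.zeros_like(x)))


softshrink_traced = tf.function(softshrink_eager)

x = tf.random.uniform([1024, 1024])
for name, fn in [("eager", softshrink_eager), ("tf.function", softshrink_traced)]:
    fn(x)  # warm-up; for the traced version this includes the one-time trace
    start = time.perf_counter()
    for _ in range(100):
        fn(x)
    print(name, time.perf_counter() - start)
```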
