Enable custom-ops for tensorflow-cpu #990

Closed
seanpmorgan opened this issue Jan 31, 2020 · 7 comments
Labels
bug Something isn't working build help wanted Needs help as a contribution

Comments

@seanpmorgan
Member

Currently tensorflow-cpu fails when trying to load custom ops, with `undefined symbol: __cudaPushCallConfiguration`:

from tensorflow_addons.activations.gelu import gelu
File "/usr/local/lib/python3.7/site-packages/tensorflow_addons/activations/gelu.py", line 24, in <module>
get_path_to_datafile("custom_ops/activations/_activation_ops.so"))
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/load_library.py", line 57, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python3.7/site-packages/tensorflow_addons/custom_ops/activations/_activation_ops.so: undefined symbol: __cudaPushCallConfiguration

I'm not quite sure what's causing this without doing a deep dive, but I'm linking a possibly related PR, since it was a departure from standard TF linking:
#539

@seanpmorgan seanpmorgan added bug Something isn't working help wanted Needs help as a contribution build labels Jan 31, 2020
@fsx950223
Member

fsx950223 commented Feb 6, 2020

Custom ops should be split into CPU shared files and GPU shared files.
Then load the appropriate shared file depending on tf.test.is_built_with_gpu_support().
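A minimal sketch of that proposal (the file names and the `select_shared_object` helper are hypothetical; in TFA the flag would come from `tf.test.is_built_with_gpu_support()`):

```python
def select_shared_object(built_with_gpu: bool) -> str:
    # Ship two shared objects and pick one at import time, instead of a
    # single .so that references CUDA stub symbols unconditionally.
    if built_with_gpu:
        return "custom_ops/activations/_activation_ops_gpu.so"
    return "custom_ops/activations/_activation_ops_cpu.so"
```

The chosen path would then be passed to `tf.load_op_library` as today.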

@seanpmorgan
Member Author

> Custom ops should be split into CPU shared files and GPU shared files.
> Then load the appropriate shared file depending on tf.test.is_built_with_gpu_support().

Given the dlopen() dynamic-kernel strategy that TF uses, this shouldn't be required:
https://github.com/tensorflow/community/blob/master/rfcs/20180604-dynamic-kernels.md

However, I agree that your suggestion is a possible solution if for some reason we're unable to get this fixed.
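The failure mode in the traceback above (dlopen of a shared object whose symbols can't be resolved) can be reproduced in miniature with `ctypes`; the path below is deliberately bogus:

```python
import ctypes

def can_dlopen(path: str) -> bool:
    # ctypes.CDLL calls dlopen() under the hood; a missing file or a
    # library with unresolvable symbols raises OSError, much like
    # tf.load_op_library surfaces NotFoundError for _activation_ops.so.
    try:
        ctypes.CDLL(path)
        return True
    except OSError:
        return False
```

On a CUDA-less install, `can_dlopen` would return False for the GPU-linked `_activation_ops.so` for exactly the reason this issue describes.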

@failure-to-thrive
Contributor

The problem is that libtensorflow_framework.so.2 exports CUDA stubs which it uses to dynamically load the CUDA runtime. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/stream_executor/cuda/cudart_stub.cc
However, tensorflow-cpu doesn't ship these stubs!
A simple reordering of TFA linking so that the CUDA libraries come first seems to solve the problem.
Let me explain.
Here is the import table of _activation_ops.so:

root@cff0ec50c2b5:~/addons# objdump -T bazel-bin/tensorflow_addons/custom_ops/activations/_activation_ops.so | grep cuda
0000000000000000      DF *UND*  0000000000000000              __cudaPushCallConfiguration
0000000000000000      DF *UND*  0000000000000000              __cudaUnregisterFatBinary
0000000000000000      DF *UND*  0000000000000000              __cudaRegisterFatBinary
0000000000000000      DF *UND*  0000000000000000              __cudaRegisterFatBinaryEnd
0000000000000000      DF *UND*  0000000000000000              __cudaPopCallConfiguration
0000000000000000      DF *UND*  0000000000000000              cudaLaunchKernel
0000000000000000      DF *UND*  0000000000000000              __cudaRegisterFunction

Exports of libtensorflow_framework.so.2:

root@cff0ec50c2b5:~# objdump -T /tensorflow-2.1.0/python3.6/tensorflow_core/libtensorflow_framework.so.2 | grep __cuda
00000000014e79e0 g    DF .text  0000000000000143  Base        __cudaRegisterFunction
00000000014e78a0 g    DF .text  000000000000013b  Base        __cudaRegisterVar
00000000014e7390 g    DF .text  0000000000000091  Base        __cudaUnregisterFatBinary
00000000014e7610 g    DF .text  000000000000013f  Base        __cudaPopCallConfiguration
00000000014e7430 g    DF .text  0000000000000091  Base        __cudaRegisterFatBinaryEnd
00000000014e7750 g    DF .text  000000000000014f  Base        __cudaRegisterFatBinary
00000000014e74d0 g    DF .text  0000000000000134  Base        __cudaPushCallConfiguration

After a simple modification of https://github.com/tensorflow/addons/blob/master/tensorflow_addons/tensorflow_addons.bzl

root@cff0ec50c2b5:~/addons# objdump -T bazel-bin/tensorflow_addons/custom_ops/activations/_activation_ops.so | grep cuda

now prints nothing, and _activation_ops.so grows in size (the stubs end up linked in statically).

Looks great but I've not tested how it works yet. 😆
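A quick way to script the check above is to parse `objdump -T` output for undefined CUDA symbols (pure text parsing; this sketch doesn't invoke objdump itself, and the helper name is hypothetical):

```python
def undefined_cuda_symbols(objdump_output: str) -> list[str]:
    # `objdump -T` marks unresolved dynamic symbols with *UND*;
    # keep only the CUDA-related ones (the last column is the name).
    symbols = []
    for line in objdump_output.splitlines():
        if "*UND*" in line and "cuda" in line.lower():
            symbols.append(line.split()[-1])
    return symbols
```

An empty result after the linking change corresponds to the empty `grep cuda` output shown above.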

@fsx950223
Member

> The problem is - libtensorflow_framework.so.2 exports CUDA stubs they use to dynamically load CUDA runtime. […]

Awesome, @failure-to-thrive you are the expert on magical bugs. 😂

@tanguycdls

Hi, I'm running into the same issue: I'm trying to use TensorFlow CPU to reduce my Docker image size, and now the import yields the same error:

activations/_activation_ops.so: undefined symbol: __cudaPushCallConfiguration

My only usage of Addons is AdamW during training and the gelu activation. Since gelu has been moved to TF core, I'll try the nightly version and stop using the custom ops from Addons.
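For reference, the gelu that moved into TF core computes x·Φ(x), where Φ is the standard normal CDF; a dependency-free sketch of the exact (non-tanh-approximation) form, not TF's actual implementation:

```python
import math

def gelu(x: float) -> float:
    # GELU(x) = x * Phi(x); Phi expressed via the error function:
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For large positive x this approaches x, and for large negative x it approaches 0, matching the activation's gating behavior.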

Thanks,

@MrGeva

MrGeva commented Oct 5, 2020

Building the addons from source fixed it for me on TF 2.2 (installed with pip).
I followed the CPU custom-ops instructions at:
https://github.com/tensorflow/addons/tree/master#cpu-custom-ops

@seanpmorgan
Member Author

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with charters similar to TFA's:
Keras
Keras-CV
Keras-NLP
