This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

Adding automatic mixed precision support #1637

Merged: 6 commits, Jul 30, 2019


@vinhngx (Contributor) commented Jul 23, 2019

Automatic Mixed Precision for TensorFlow was recently introduced:
https://medium.com/tensorflow/automatic-mixed-precision-in-tensorflow-for-faster-ai-training-on-nvidia-gpus-6033234b2540

This PR adds automatic mixed precision training to all tensor2tensor models. It is enabled by setting a single environment variable:

export TF_ENABLE_AUTO_MIXED_PRECISION=1
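For jobs launched from a Python script rather than a shell, the same flag can be set through the process environment. This is a minimal sketch, assuming the variable has to be in place before TensorFlow builds the training graph; the helper names are hypothetical and not part of this PR:

```python
import os

AMP_ENV_VAR = "TF_ENABLE_AUTO_MIXED_PRECISION"  # flag name taken from this PR


def enable_auto_mixed_precision_env(environ=os.environ):
    """Set the flag to "1" in the given environment mapping (hypothetical helper)."""
    environ[AMP_ENV_VAR] = "1"
    return environ[AMP_ENV_VAR]


def auto_mixed_precision_requested(environ=os.environ):
    """Return True when the flag is set to "1" in the given mapping."""
    return environ.get(AMP_ENV_VAR, "0") == "1"


# Assumption: call this before importing TensorFlow or building the model.
enable_auto_mixed_precision_env()
```

Setting the variable in the launching shell, as shown above, has the same effect and avoids any ordering concerns inside the script.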

Alternatively, it can be enabled by calling the optimize() function directly with the appropriate parameter:

optimize(loss, learning_rate, hparams, use_tpu=False, variables=None, gpu_auto_mixed_precision=True)

Automatic (GPU) mixed precision should not be combined with TPU training or with manual mixed precision. The PR checks for both cases.
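The two exclusions can be pictured as a guard that runs before mixed precision is switched on. This is only a sketch of that logic, not the code merged in this PR: the function name, the `manual_mixed_precision` argument, and the error messages are hypothetical.

```python
def check_gpu_auto_mixed_precision(gpu_auto_mixed_precision,
                                   use_tpu=False,
                                   manual_mixed_precision=False):
    """Return True if automatic mixed precision may be enabled.

    Raises ValueError for the two combinations the PR description rules out.
    """
    if not gpu_auto_mixed_precision:
        return False  # feature not requested, so there is nothing to check
    if use_tpu:
        raise ValueError(
            "GPU automatic mixed precision cannot be used with TPU")
    if manual_mixed_precision:
        raise ValueError(
            "GPU automatic mixed precision cannot be combined with "
            "manual mixed precision")
    return True
```

Failing loudly, rather than silently ignoring the flag, makes a conflicting configuration visible at startup instead of after a long training run.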

We measured the speedup on several models on a single V100 16GB GPU:
Image classification - ResNet-50: 1.9x
Translate - Transformer-big: 1.9x
Sentiment analysis - Transformer-big: 1.4x
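To put these factors in terms of run time, the snippet below converts each one into the fraction of wall-clock time saved. It assumes the quoted numbers are throughput ratios against the same run without mixed precision, which the description does not state explicitly; under that assumption 1.9x corresponds to roughly 47% less time and 1.4x to roughly 29%.

```python
# Speedup factors quoted above (1x V100 16GB).
speedups = {
    "ResNet-50 (image classification)": 1.9,
    "Transformer-big (translate)": 1.9,
    "Transformer-big (sentiment analysis)": 1.4,
}


def time_saved_fraction(speedup):
    """Fraction of wall-clock time saved when throughput rises by `speedup`."""
    return 1.0 - 1.0 / speedup


for name, factor in speedups.items():
    print(f"{name}: {factor}x -> {time_saved_fraction(factor):.0%} less time")
```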

@googlebot googlebot added the cla: yes PR author has signed CLA label Jul 23, 2019
@afrozenator (Contributor) commented

This is so awesome, thanks a lot @vinhngx!!

@afrozenator afrozenator merged commit bba231f into tensorflow:master Jul 30, 2019
tensorflow-copybara pushed a commit that referenced this pull request Jul 30, 2019
PiperOrigin-RevId: 260754631
@liutengbo commented

Do you have results on a single TITAN V for the translate task (transformer_base)?

@nhatuan84

Does it support the TensorFlow C++ API?
Thanks.
