QAT : TRT 8 compatible workflow #804

SrivastavaKshitij · 2022-09-13T13:21:17Z

This PR introduces a new QAT workflow that is compatible with TensorRT 8.

TensorRT 8 introduced the IQuantize and IDequantize layers, which have to be placed manually in the network following the guidelines in Q/DQ placement.
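For readers new to Q/DQ nodes, here is a minimal, library-free sketch of what a quantize/dequantize pair does numerically. It assumes symmetric INT8 quantization with a zero point of 0. The names `quantize` and `dequantize` are illustrative only and are not part of torch2trt or TensorRT.

```python
def quantize(x, scale, qmin=-128, qmax=127):
    """Map a float to an INT8 code: round(x / scale), clamped to [qmin, qmax]."""
    q = round(x / scale)  # Python's round() rounds ties to the nearest even integer
    return max(qmin, min(qmax, q))


def dequantize(q, scale):
    """Map an INT8 code back to a float on the quantization grid."""
    return q * scale


scale = 0.5  # assumed scale, chosen only to keep the arithmetic easy to follow
values = [0.26, -1.2, 100.0]

codes = [quantize(v, scale) for v in values]
restored = [dequantize(q, scale) for q in codes]

print(codes)     # [1, -2, 127]  (100.0 saturates at the INT8 maximum)
print(restored)  # [0.5, -1.0, 63.5]
```

The round trip shows the two sources of error a Q/DQ pair introduces: rounding to the grid (0.26 becomes 0.5) and saturation outside the representable range (100.0 becomes 63.5).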

I have added support for quantizing nn.Conv2d, nn.MaxPool2d and nn.AdaptiveAvgPool2d, the layers needed to quantize ResNet(s). I have also added a QuantGenericTensor, which can be used to add a QDQ layer anywhere in the model based on NVIDIA's guidelines.

This PR also introduces the option to choose between per-tensor and per-channel quantization. All quant layers are scriptable with torch.jit.script.
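The difference between the two modes can be illustrated with a small sketch in plain Python. It is not the code from this PR: it assumes symmetric quantization to the range [-127, 127] with scales derived from the maximum absolute value, and the helper names (`amax_scale`, `fake_quant`, `max_abs_error`) and the weight values are hypothetical.

```python
def amax_scale(values, qmax=127):
    """Symmetric scale: the largest magnitude maps to qmax."""
    return max(abs(v) for v in values) / qmax


def fake_quant(values, scale, qmax=127):
    """Quantize then dequantize, so values stay float but lie on the INT8 grid."""
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def max_abs_error(original, quantized):
    return max(abs(a - b) for a, b in zip(original, quantized))


# Hypothetical weights: two output channels with very different ranges.
weights = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]

# Per-tensor: a single scale shared by every channel.
per_tensor_scale = amax_scale([v for channel in weights for v in channel])
per_tensor = [fake_quant(channel, per_tensor_scale) for channel in weights]

# Per-channel: one scale per output channel.
per_channel_scales = [amax_scale(channel) for channel in weights]
per_channel = [fake_quant(c, s) for c, s in zip(weights, per_channel_scales)]

# Compare the error on the small-range channel under both schemes.
err_per_tensor = max_abs_error(weights[0], per_tensor[0])
err_per_channel = max_abs_error(weights[0], per_channel[0])
print(err_per_tensor, err_per_channel)
```

With a shared scale, the small-range channel is represented by only one or two integer codes, so its error is much larger than with its own per-channel scale. That is the usual motivation for quantizing convolution weights per channel.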

Most of the files I have changed are under the contrib folders, so the changes don't affect the main torch2trt library.

I will continue to add support for more layers, but I believe this PR is big enough to land as is; I can then put up smaller PRs to add more functionality.

The entire workflow was tested with the PyTorch NGC container 22.04-py3.

Thanks.

praveen-aurora · 2022-09-23T02:20:22Z

torch2trt/contrib/qat/converters/QuantAdaptiveAvgPool2d.py

+            input = input_quantizer.get_output(0),
+            scale = scale_trt.get_output(0))
+
+    if hasattr(module._input_quantizer,'quant_axis'):


This looks like it could be simplified by reusing the result from the if block at line 25.

SrivastavaKshitij added 24 commits August 24, 2022 10:20

made qat layer scriptable

e66f210

WIP - refactor qat library

38342b2

fixed mapping function

9922d1c

WIP

44d0750

WIP

73e523b

added converter for quantconv

5b1d6b9

working quant conv converter

f4b52ee

changing workflow

39244d7

removed redundant data

39b8f9f

working maxpool layer

fa6712c

added working quantmaxpool2d laayer

3d260ac

added support for AdaptiveAvgPool2d

7419d70

fixed trt and non trt mode

c9c3f0f

added converter for quant adaptive avgpool 2d

4687282

fixed adaptiveavgpool2d converter import

c4e13ef

fixed pytorch fake quant ops

37f89fb

added generic converter

6e63c4f

fixed nn.conv2d

179a9d9

removed graphviz import

f31f67c

fixed documentation

60cb7e6

fixed build script

c0521f1

fixed parameter type for torch.fake_quantize per channel

3f91fae

Fixed accuracy metrics

1068c64

added new patch file

fe5a74d

SrivastavaKshitij changed the title ~~[WIP] QAT : TRT 8 compatible workflow~~ QAT : TRT 8 compatible workflow Sep 16, 2022

SrivastavaKshitij added 2 commits September 19, 2022 10:39

fixed import warning

5e3e65b

fixed precision of zero point

0ef2df0

praveen-aurora reviewed Sep 23, 2022


SrivastavaKshitij added 2 commits September 23, 2022 10:14

fixed logging issue for conv2d while scripting

85cbc66

stripped off tensor quantizer depedency, hopefully

5f3de32

SrivastavaKshitij added 5 commits September 23, 2022 11:21

added min reproducible file

e0742c1

fixed converters

3a3c4bd

fixed loading

e0c7e62

fixed quant axis initial value

f994bad

WIP

8317ffa

SrivastavaKshitij closed this Oct 6, 2022

SrivastavaKshitij reopened this Oct 6, 2022

SrivastavaKshitij marked this pull request as draft October 17, 2022 21:29
