PyTorch-QAT

An example of quantization-aware training (QAT) with the PyTorch framework.

NOTE: You must apply QAT and the post-training quantization (PTQ) convert step on CPU. After you get the quantized int8 model, you can do inference on a CUDA device.

Usage: python qat.py
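The CPU-side part of the workflow in the note above can be sketched roughly as follows. This is a minimal illustration, not the contents of qat.py: it assumes PyTorch's eager-mode quantization API (torch.ao.quantization), and the TinyNet model, the run_qat helper, the random training data, and the hyperparameters are all hypothetical. It stays on CPU throughout and does not cover the CUDA inference step.

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq


class TinyNet(nn.Module):
    """Hypothetical toy model; not the network used in qat.py."""

    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # float -> quantized boundary
        self.fc1 = nn.Linear(8, 16)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(16, 4)
        self.dequant = tq.DeQuantStub()  # quantized -> float boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.fc2(self.relu(self.fc1(x)))
        return self.dequant(x)


def run_qat(steps=20):
    """Prepare, fine-tune, and convert a model on CPU (assumed workflow)."""
    torch.manual_seed(0)
    model = TinyNet().to("cpu")

    # Pick a quantized backend that this build of PyTorch supports.
    engines = torch.backends.quantized.supported_engines
    engine = "fbgemm" if "fbgemm" in engines else "qnnpack"
    torch.backends.quantized.engine = engine
    model.qconfig = tq.get_default_qat_qconfig(engine)

    # Insert fake-quantization modules; the model must be in train mode.
    model.train()
    tq.prepare_qat(model, inplace=True)

    # Short fine-tuning loop on random data, only to exercise the observers.
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(steps):
        x = torch.randn(32, 8)
        y = torch.randn(32, 4)
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Convert to a real int8 model; this step also runs on CPU.
    model.eval()
    return tq.convert(model)


if __name__ == "__main__":
    qmodel = run_qat()
    print(qmodel)
```

The backend is chosen from torch.backends.quantized.supported_engines because "fbgemm" is not available on every platform. A real training script would replace the random tensors with its own data loader and loss.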