Commit

[Docs] TensorRT docs (open-mmlab#136)
* Update TensorRT docs

* add tensorrt ops document

* reply comment

* update

* update document

* update
grimoire authored Oct 26, 2021
1 parent b5797c9 commit aa9c770
Showing 2 changed files with 404 additions and 2 deletions.
docs/backends/tensorrt.md (87 additions, 2 deletions)
Please install TensorRT 8 following the [install-guide](https://docs.nvidia.com/deeplea

#### Build custom ops

Some custom ops are created to support models in OpenMMLab, and they can be built as follows:

```bash
cd ${MMDEPLOY_DIR}
cmake -DBUILD_TENSORRT_OPS=ON ..
make -j$(nproc)
```

If TensorRT is not installed in the default path, please add the `-DTENSORRT_DIR` flag in CMake:

```bash
cmake -DBUILD_TENSORRT_OPS=ON -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc)
```

### Convert model

Please follow the tutorial in [How to convert model](../tutorials/how_to_convert_model.md). **Note** that the device must be `cuda` device.

#### Int8 Support

TensorRT supports INT8 mode, so a custom dataset config can be given to calibrate the model. The following is an example for MMDetection:

```python
# calibration_dataset.py

# dataset settings, same format as the codebase in OpenMMLab
dataset_type = 'CalibrationDataset'
data_root = 'calibration/dataset/root'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'train_annotations.json',
        # only test_pipeline is defined in this config; reuse it for calibration
        pipeline=test_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'val_annotations.json',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'test_annotations.json',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')
```

Convert your model with this calibration dataset:

```bash
python tools/deploy.py \
    ...
    --calib-dataset-cfg calibration_dataset.py
```

If no calibration dataset is given, the model will be calibrated with the dataset defined in the model config.
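For intuition, INT8 calibration runs sample data through the model and derives a scale that maps observed activation magnitudes onto the int8 range. The sketch below shows only the idea behind simple max calibration; it is not mmdeploy's or TensorRT's actual implementation, and all names are hypothetical:

```python
def compute_int8_scale(batches):
    """Max calibration: map the largest observed magnitude to 127."""
    amax = max(abs(v) for batch in batches for v in batch)
    return amax / 127.0

def quantize(values, scale):
    """Quantize float values into the int8 range with the calibrated scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]

# Toy "calibration dataset": two batches of observed activations.
batches = [[0.5, -1.27, 0.1], [1.0, -0.3]]
scale = compute_int8_scale(batches)        # 1.27 / 127 = 0.01
quantized = quantize([0.5, -1.27], scale)  # [50, -127]
```

TensorRT's entropy calibrator is more sophisticated (it picks the scale that minimizes information loss over the activation histogram), but the role of the calibration dataset is the same: supply representative inputs from which the scales are derived.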

### FAQs

- Error `error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]`

There is an input shape limit in deployment config:

```python
backend_config = dict(
    # other configs
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 320, 320],
                    opt_shape=[1, 3, 800, 1344],
                    max_shape=[1, 3, 1344, 1344])))
    ])
# other configs
```

Every dimension of the tensor `input` must lie between `input_shapes["input"]["min_shape"]` and `input_shapes["input"]["max_shape"]`.
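The per-dimension check TensorRT performs against its optimization profile can be mimicked in plain Python to validate an input before deployment. This is a minimal sketch (the helper name is hypothetical; the bounds are the ones from the config above):

```python
def shape_in_profile(shape, min_shape, max_shape):
    """Return True if every dimension of `shape` lies within the profile bounds."""
    return all(lo <= d <= hi for d, lo, hi in zip(shape, min_shape, max_shape))

# Bounds taken from the deployment config above.
min_shape = [1, 3, 320, 320]
max_shape = [1, 3, 1344, 1344]

ok = shape_in_profile([1, 3, 800, 1344], min_shape, max_shape)  # True
bad = shape_in_profile([1, 3, 256, 256], min_shape, max_shape)  # False: 256 < 320
```

If an input fails this check, either resize the input or widen `min_shape`/`max_shape` in the deployment config and rebuild the engine.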

- Error `error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS`

TRT 7.2.1 switches to use cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA-10.2 Patch 1 (Released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.

Read [this](https://forums.developer.nvidia.com/t/matrixmultiply-failed-on-tensorrt-7-2-1/158187/4) for details.
