Commit

[Docs] TensorRT docs (open-mmlab#136)
* Update TensorRT docs

* add tensorrt ops document

* reply comment

* update

* update document

* update
grimoire authored Oct 26, 2021
1 parent b5797c9 commit aa9c770
Showing 2 changed files with 404 additions and 2 deletions.
docs/backends/tensorrt.md (87 additions, 2 deletions)
Please install TensorRT 8 following the [install-guide](https://docs.nvidia.com/deeplea

#### Build custom ops

Some custom ops are created to support models in OpenMMLab, and they can be built as follows:

```bash
cd ${MMDEPLOY_DIR}
cmake -DBUILD_TENSORRT_OPS=ON ..
make -j$(nproc)
```

If TensorRT is not installed in the default path, please add the `-DTENSORRT_DIR` flag in CMake:

```bash
cmake -DBUILD_TENSORRT_OPS=ON -DTENSORRT_DIR=${TENSORRT_DIR} ..
make -j$(nproc)
```

### Convert model

Please follow the tutorial in [How to convert model](../tutorials/how_to_convert_model.md). **Note** that the device must be `cuda` device.

#### Int8 Support

TensorRT supports INT8 mode, so a custom dataset config can be given to calibrate the model. The following is an example for MMDetection:

```python
# calibration_dataset.py

# dataset settings, same format as the codebase in OpenMMLab
dataset_type = 'CalibrationDataset'
data_root = 'calibration/dataset/root'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'train_annotations.json',
        # only test_pipeline is defined in this config; reuse it for calibration
        pipeline=test_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'val_annotations.json',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'test_annotations.json',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')
```

Convert your model with this calibration dataset:

```bash
python tools/deploy.py \
    ...
    --calib-dataset-cfg calibration_dataset.py
```

If no calibration dataset is given, the model will be calibrated with the dataset defined in the model config.
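For intuition, INT8 calibration runs sample data through the model and derives a scale that maps observed activation magnitudes onto the int8 range. The sketch below shows only the idea behind simple max calibration; it is not mmdeploy's or TensorRT's actual implementation, and all names are hypothetical:

```python
def compute_int8_scale(batches):
    """Max calibration: map the largest observed magnitude to 127."""
    amax = max(abs(v) for batch in batches for v in batch)
    return amax / 127.0

def quantize(values, scale):
    """Quantize float values into the int8 range with the calibrated scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]

# Toy "calibration dataset": two batches of observed activations.
batches = [[0.5, -1.27, 0.1], [1.0, -0.3]]
scale = compute_int8_scale(batches)        # 1.27 / 127 = 0.01
quantized = quantize([0.5, -1.27], scale)  # [50, -127]
```

TensorRT's entropy calibrator is more sophisticated (it picks the scale that minimizes information loss over the activation histogram), but the role of the calibration dataset is the same: supply representative inputs from which the scales are derived.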

### FAQs

- Error `error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]`

There is an input shape limit in deployment config:

```python
backend_config = dict(
    # other configs
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 320, 320],
                    opt_shape=[1, 3, 800, 1344],
                    max_shape=[1, 3, 1344, 1344])))
    ])
# other configs
```

Every dimension of the tensor `input` must lie between `input_shapes["input"]["min_shape"]` and `input_shapes["input"]["max_shape"]`.
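The per-dimension check TensorRT performs against its optimization profile can be mimicked in plain Python to validate an input before deployment. This is a minimal sketch (the helper name is hypothetical; the bounds are the ones from the config above):

```python
def shape_in_profile(shape, min_shape, max_shape):
    """Return True if every dimension of `shape` lies within the profile bounds."""
    return all(lo <= d <= hi for d, lo, hi in zip(shape, min_shape, max_shape))

# Bounds taken from the deployment config above.
min_shape = [1, 3, 320, 320]
max_shape = [1, 3, 1344, 1344]

ok = shape_in_profile([1, 3, 800, 1344], min_shape, max_shape)  # True
bad = shape_in_profile([1, 3, 256, 256], min_shape, max_shape)  # False: 256 < 320
```

If an input fails this check, either resize the input or widen `min_shape`/`max_shape` in the deployment config and rebuild the engine.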

- Error `error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS`

TRT 7.2.1 switches to use cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA-10.2 Patch 1 (Released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.

Read [this](https://forums.developer.nvidia.com/t/matrixmultiply-failed-on-tensorrt-7-2-1/158187/4) for details.
