[Improve] Speed up data preprocessor. #1064
Conversation
Codecov Report — Base: 0.02% // Head: 91.32% // Increases project coverage by +91.29%.

```
@@            Coverage Diff             @@
##           dev-1.x    #1064     +/-  ##
============================================
+ Coverage     0.02%   91.32%   +91.29%
============================================
  Files          121      128        +7
  Lines         8217     9509     +1292
  Branches      1368     1498      +130
============================================
+ Hits             2     8684     +8682
+ Misses        8215      639     -7576
- Partials         0      186      +186
```

Flags with carried forward coverage won't be shown.
This line can be removed to keep it consistent with the other config files.
```python
data['inputs'] = inputs
return data
data_samples = data.get('data_samples', None)
```
To help users understand our codebase, maybe we need to update the documentation by adding a description of the `data_samples` used in MMClassification (with text or a link to the related MMEngine document); this could be added to the documentation TODO list.
We can create an issue and add it to Oct. TODO list
Commits updated from 25b549d to 08dd17c.
LGTM
* [Improve] Speed up data preprocessor.
* Add `ClsDataSample` serialization override functions.
* Add unit tests.
* Modify configs to fit new mixup args.
* Fix `num_classes` of the ImageNet-21k config.
* Update docs.
Motivation
The original data preprocessor consumes much time because it casts the label tensors of the samples to the device one by one.
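As a sketch of why per-sample casting is slow (using NumPy stand-ins here, since the real preprocessor moves torch tensors to CUDA): converting each label individually triggers one operation per sample, while stacking first needs only a single conversion. The sample structure below is hypothetical.

```python
import numpy as np

# Hypothetical samples; in MMClassification each would carry a ClsDataSample.
samples = [{"gt_label": [i]} for i in range(4)]

def cast_per_sample(samples):
    # Original approach: one conversion (one device transfer) per sample.
    return [np.asarray(s["gt_label"], dtype=np.int64) for s in samples]

def cast_batched(samples):
    # Faster approach: stack the labels first, then do a single
    # conversion (a single .cuda() call in the real preprocessor).
    return np.stack([s["gt_label"] for s in samples]).astype(np.int64)

print(cast_batched(samples).shape)  # (4, 1)
```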
Modification

* In `ClsDataPreprocessor`, only cast the `gt_label` of data samples to CUDA instead of the whole data sample, and stack/concat the tensors before casting.
* `Mixup` now directly processes the batch inputs and batch scores (one-hot format labels), and won't accept normal labels or data samples.
* Via `ForkingPickler`, convert all tensors to NumPy arrays during serialization and convert them back during deserialization. This decreases the consumption of file descriptors in the dataloader.

Here is the speed comparison on MobileNetV2 (batch size 64):
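A minimal sketch of the serialization trick (with a stand-in class, since registering `torch.Tensor` itself would require torch): `ForkingPickler` lets you register a custom reduce function, so objects travel between dataloader workers as plain NumPy arrays instead of shared-memory handles that keep file descriptors open.

```python
import pickle
import numpy as np
from multiprocessing.reduction import ForkingPickler

class FakeTensor:
    """Stand-in for torch.Tensor in this sketch."""
    def __init__(self, data):
        self.data = np.asarray(data)

def _rebuild(arr):
    # Deserialization: turn the NumPy array back into a "tensor".
    return FakeTensor(arr)

def _reduce(t):
    # Serialization: ship a plain NumPy array instead of a
    # shared-memory handle, so no extra file descriptor is opened.
    return (_rebuild, (t.data,))

ForkingPickler.register(FakeTensor, _reduce)

blob = ForkingPickler.dumps(FakeTensor([1, 2, 3]))
restored = pickle.loads(blob)
print(restored.data.tolist())  # [1, 2, 3]
```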
BC-breaking (Optional)

The `Mixup`, `CutMix`, and `ResizeMix` classes won't accept the `num_classes` argument anymore; it becomes an argument of `ClsDataPreprocessor`. In the config files, the changes are as below:
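The original config diff is not preserved here; below is a hedged sketch of the kind of change described (field names follow common MMClassification config conventions, and the `alpha`/`num_classes` values are hypothetical):

```python
# Before: Mixup carried its own num_classes argument.
old_style = dict(
    train_cfg=dict(augments=dict(type='Mixup', alpha=0.2, num_classes=1000)),
)

# After: num_classes moves to ClsDataPreprocessor; Mixup drops it.
new_style = dict(
    data_preprocessor=dict(type='ClsDataPreprocessor', num_classes=1000),
    train_cfg=dict(augments=dict(type='Mixup', alpha=0.2)),
)
```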
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here and update the documentation.
Checklist
Before PR:
After PR: