[MMSIG] [Doc] Update data_preprocessor.md (#2055)
1 parent cd183d9 · commit 1f4d243

Showing 2 changed files with 88 additions and 4 deletions.
@@ -1,5 +1,45 @@
# Data pre-processor

## The position of the data preprocessor in the training pipeline

During the model training process, image data undergoes data augmentation using the transforms provided by mmcv and is then loaded into a dataloader. A data preprocessor subsequently moves the data from the CPU to CUDA (GPU), pads it, and normalizes it.
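In a config file, this component is usually declared as the `data_preprocessor` field of the model. The snippet below is a minimal sketch rather than the verbatim mmagic config: the model type and the mean/std values are placeholders chosen only to illustrate how pixels are mapped into a normalized range.

```python
# A minimal sketch (not the verbatim mmagic config): the data preprocessor
# is declared inside the model config. The mean/std values are assumptions
# chosen to map uint8 pixels into [-1, 1]; check the real config for the
# values your model actually uses.
model = dict(
    type='CycleGAN',  # model type is illustrative
    data_preprocessor=dict(
        type='DataPreprocessor',
        mean=[127.5, 127.5, 127.5],
        std=[127.5, 127.5, 127.5]),
    # generator / discriminator / loss settings omitted
)
```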
Below is the `train_pipeline` from the complete configuration file `configs/_base_/datasets/unpaired_imgs_256x256.py`. It defines the sequence of transformations applied to the training images using the mmcv library; keeping these transforms in mmcv prevents them from being duplicated across the downstream algorithm libraries.
```python
...
train_pipeline = [
    dict(color_type='color', key='img_A', type='LoadImageFromFile'),
    dict(color_type='color', key='img_B', type='LoadImageFromFile'),
    dict(auto_remap=True,
        mapping=dict(img=['img_A', 'img_B']),
        share_random_params=True,
        transforms=[
            dict(interpolation='bicubic', scale=(286, 286), type='Resize'),
            dict(crop_size=(256, 256), keys=['img'], random_crop=True, type='Crop'),
        ],
        type='TransformBroadcaster'),
    dict(direction='horizontal', keys=['img_A'], type='Flip'),
    dict(direction='horizontal', keys=['img_B'], type='Flip'),
    dict(mapping=dict(img_mask='img_B', img_photo='img_A'),
        remapping=dict(img_mask='img_mask', img_photo='img_photo'),
        type='KeyMapper'),
    dict(data_keys=['img_photo', 'img_mask'],
        keys=['img_photo', 'img_mask'],
        type='PackInputs'),
]
...
```
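To show where this pipeline ends up, the sketch below wires it into a dataloader config. The dataset type, data root, and sampler settings are assumptions made for illustration; the authoritative version is the dataset config file referenced above.

```python
# A hedged sketch of how train_pipeline is consumed by the dataloader
# config. The dataset type, data_root and sampler are illustrative
# assumptions, not a copy of the real config.
train_dataloader = dict(
    batch_size=1,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(
        type='UnpairedImageDataset',  # assumed dataset class for unpaired images
        data_root='./data/unpaired',  # illustrative path
        pipeline=train_pipeline,
        test_mode=False))
```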
In the `train_step` function of `mmagic/models/editors/cyclegan/cyclegan.py`, the data preprocessor moves, concatenates, and normalizes the transformed data before it is fed into the neural network. The relevant code logic looks like this:
```python
...
message_hub = MessageHub.get_current_instance()
curr_iter = message_hub.get_info('iter')
# Move the batch to the target device, stack/pad it and normalize it.
data = self.data_preprocessor(data, True)
disc_optimizer_wrapper = optim_wrapper['discriminators']

inputs_dict = data['inputs']
outputs, log_vars = dict(), dict()
...
```
In mmagic, the data preprocessor is implemented in `mmagic/models/data_preprocessors/data_preprocessor.py`. Its data processing workflow is shown below:

![image](https://github.com/jinxianwei/CloudImg/assets/81373517/f52a92ab-f86d-486d-86ac-a2f388a83ced)
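To get a feel for that workflow, the snippet below is a minimal sketch that calls the data preprocessor directly on a fake batch. The constructor arguments, the accepted input format, and the expected output shape are assumptions based on the `ImgDataPreprocessor`-style interface; verify them against the implementation file above.

```python
# A minimal sketch, not an exact reproduction of the mmagic API: push a
# fake packed batch through the data preprocessor to observe the device
# handling, stacking and normalization described above.
import torch

from mmagic.models.data_preprocessors import DataPreprocessor

preprocessor = DataPreprocessor(mean=[127.5] * 3, std=[127.5] * 3)

# A fake batch in the packed format a dataloader would hand over:
# one uint8 CHW image per sample (keys and format are assumptions).
fake_batch = {
    'inputs': [
        torch.randint(0, 256, (3, 256, 256), dtype=torch.uint8)
        for _ in range(2)
    ],
    'data_samples': None,
}

out = preprocessor(fake_batch, training=True)
inputs = out['inputs']
# Expected (an assumption, verify against the implementation): a batched
# float tensor of shape (2, 3, 256, 256) with values roughly in [-1, 1].
print(inputs.shape, inputs.dtype, inputs.device)
```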
@@ -1 +1,45 @@
# Data pre-processor

## The position of the data preprocessor in the training pipeline
During model training, image data is first augmented by the transforms in mmcv and loaded into a dataloader; the preprocessor then moves the data from the CPU to CUDA, pads it, and normalizes it.
The transforms in mmcv were migrated from the individual downstream algorithm libraries, which prevents each library from keeping redundant copies of the same transforms. Taking `configs/_base_/datasets/unpaired_imgs_256x256.py` as an example, the `train_pipeline` in its complete config is as follows:
```python
...
train_pipeline = [
    dict(color_type='color', key='img_A', type='LoadImageFromFile'),
    dict(color_type='color', key='img_B', type='LoadImageFromFile'),
    dict(auto_remap=True,
        mapping=dict(img=['img_A', 'img_B']),
        share_random_params=True,
        transforms=[
            dict(interpolation='bicubic', scale=(286, 286), type='Resize'),
            dict(crop_size=(256, 256), keys=['img'], random_crop=True, type='Crop'),
        ],
        type='TransformBroadcaster'),
    dict(direction='horizontal', keys=['img_A'], type='Flip'),
    dict(direction='horizontal', keys=['img_B'], type='Flip'),
    dict(mapping=dict(img_mask='img_B', img_photo='img_A'),
        remapping=dict(img_mask='img_mask', img_photo='img_photo'),
        type='KeyMapper'),
    dict(data_keys=['img_photo', 'img_mask'],
        keys=['img_photo', 'img_mask'],
        type='PackInputs'),
]
...
```
The data_preprocessor moves, concatenates, and normalizes the transformed data before it is fed into the network. Taking the `train_step` function in `mmagic/models/editors/cyclegan/cyclegan.py` as an example, the relevant logic in the code is as follows:
```python
...
message_hub = MessageHub.get_current_instance()
curr_iter = message_hub.get_info('iter')
data = self.data_preprocessor(data, True)
disc_optimizer_wrapper = optim_wrapper['discriminators']

inputs_dict = data['inputs']
outputs, log_vars = dict(), dict()
...
```
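To make the normalization step concrete, the short sketch below reproduces the arithmetic applied when the mean and std are both 127.5 per channel, which is an assumed (though common) setting; uint8 pixels in [0, 255] are mapped to floats in roughly [-1, 1] before entering the generators and discriminators.

```python
import torch

# Normalization as performed inside the data preprocessor, assuming
# mean = std = 127.5 per channel (an assumption; check the
# data_preprocessor entry of your config for the real values).
img = torch.randint(0, 256, (3, 256, 256), dtype=torch.uint8)
normalized = (img.float() - 127.5) / 127.5
print(normalized.min().item(), normalized.max().item())  # close to -1.0 and 1.0
```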
In mmagic, the data preprocessor is implemented in `mmagic/models/data_preprocessors/data_preprocessor.py`; its data processing workflow is shown in the figure below.
![image](https://github.com/jinxianwei/CloudImg/assets/81373517/f52a92ab-f86d-486d-86ac-a2f388a83ced) |