🧑‍🏫 Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection 🧑‍🏫
This repository contains the official implementation of our CVPR 2023 paper.
✨ We can now train a detector on 10% labeled MS-COCO to 40 mAP ✨
Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection
[arxiv] [code] [project page]
Xinjiang Wang*, Xingyi Yang*, Shilong Zhang, Yijiang Li, Litong Feng, Shijie Fang, Chengqi Lyu, Kai Chen, Wayne Zhang
(*: Co-first Author)
- Selected as a Highlight for CVPR 2023 🔥 (235/2360, top 10% of accepted papers)
In this paper, we systematically investigate the inconsistency problems in semi-supervised object detection, where the pseudo boxes may be highly inaccurate and vary greatly at different stages of training. To alleviate the aforementioned problem, we present a holistic semi-supervised object detector termed Consistent-Teacher. Consistent-Teacher achieves compelling improvement on a wide range of evaluations and serves as a new solid baseline for SSOD.
All results, logs, configs and checkpoints are listed here. Enjoy!
MS-COCO 1%/2%/5%/10% Labeled Data
Method | Data | mAP | config | Links | Google Drive | Baidu Drive |
---|---|---|---|---|---|---|
ConsistentTeacher | MS-COCO 1% | 25.50 | config | log/ckpt | log/ckpt | log/ckpt |
ConsistentTeacher | MS-COCO 2% | 30.70 | config | log/ckpt | log/ckpt | log/ckpt |
ConsistentTeacher | MS-COCO 5% | 36.60 | config | log/ckpt | log/ckpt | log/ckpt |
ConsistentTeacher | MS-COCO 10% | 40.20 | config | log/ckpt | log/ckpt | log/ckpt |
ConsistentTeacher 2x8 | MS-COCO 10% | 38.00 | config | log/ckpt | log/ckpt | log/ckpt |
ConsistentTeacher 2x8 (FP16) | MS-COCO 10% | 37.90 | config | log/ckpt | log/ckpt | log/ckpt |
MS-COCO 100% Labeled + Unlabeled Data
Method | Data | mAP | config | Links | Google Drive | Baidu Drive |
---|---|---|---|---|---|---|
ConsistentTeacher 5x8 | MS-COCO 100% + unlabeled | 48.20 | config | log/ckpt | log/ckpt | log/ckpt |
PASCAL VOC07 Label + VOC12 Unlabel
Method | Data | mAP | AP50 | config | Links |
---|---|---|---|---|---|
ConsistentTeacher | PASCAL VOC07 Label + VOC12 Unlabel | 59.00 | 81.00 | config | log/ckpt |
- By default, all models are trained on 8 V100 GPUs with 5 images per GPU.
- Additionally, we support the `2x8` and `fp16` training settings so that everyone can run the code, even with only 12 GB graphics cards.
- With `8x2+fp16`, the total training time for MS-COCO is less than 1 day.
- We carefully tuned the hyper-parameters after submitting the paper, which is why the results in the repository are slightly higher than those reported in the paper.
├── configs
    ├── baseline
    │   |-- mean_teacher_retinanet_r50_fpn_coco_180k_10p.py
    │       # Mean Teacher COCO 10% config
    │   |-- mean_teacher_retinanet_r50_fpn_voc0712_72k.py
    │       # Mean Teacher VOC0712 config
    ├── consistent-teacher
    │   |-- consistent_teacher_r50_fpn_coco_360k_fulldata.py
    │       # Consistent Teacher COCO label+unlabel config
    │
    │   |-- consistent_teacher_r50_fpn_coco_180k_1/2/5/10p.py
    │       # Consistent Teacher COCO 1%/2%/5%/10% config
    │   |-- consistent_teacher_r50_fpn_coco_180k_10p_2x8.py
    │       # Consistent Teacher COCO 10% config with 8x2 GPU
    │   |-- consistent_teacher_r50_fpn_voc0712_72k.py
    │       # Consistent Teacher VOC0712 config
├── ssod
    |-- models/mean_teacher.py
    |   # Mean Teacher Class file
    |-- models/consistent_teacher.py
    |   # Consistent Teacher Class file
    |-- models/dense_heads/fam3d.py
    |   # FAM-3D Class file
    |-- models/dense_heads/improved_retinanet.py
    |   # ImprovedRetinaNet baseline file
    |-- core/bbox/assigners/dynamic_assigner.py
    |   # Adaptive Sample Assignment Class file
├── tools
    |-- dataset/semi_coco.py
    |   # COCO data preprocessing
    |-- train.py/test.py
    |   # Main files for training and evaluating the models
Pytorch=1.9.0
mmdetection=2.25.0
mmcv=1.3.9
wandb=0.10.31
or
mmdetection=2.28.1
mmcv=1.7.1
- We use wandb for visualization. If you don't want to use it, comment out lines `328-339` in `configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py`.
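As an alternative to commenting lines out by hand, the wandb logger can be filtered out of the logging configuration programmatically. The snippet below is only a minimal sketch: it assumes the config defines an mmcv-style `log_config` dict whose `hooks` list contains a `WandbLoggerHook` entry, and the example dict and the helper name `drop_wandb_hooks` are hypothetical, not taken from this repository.

```python
import copy


def drop_wandb_hooks(log_config):
    """Return a copy of an mmcv-style log_config without any Wandb logger hooks."""
    cleaned = copy.deepcopy(log_config)
    cleaned["hooks"] = [
        hook for hook in cleaned.get("hooks", [])
        if "Wandb" not in hook.get("type", "")
    ]
    return cleaned


# Hypothetical log_config, shaped like what an mmdetection 2.x config may contain.
example_log_config = dict(
    interval=50,
    hooks=[
        dict(type="TextLoggerHook", by_epoch=False),
        dict(type="WandbLoggerHook", init_kwargs=dict(project="demo"), by_epoch=False),
    ],
)

cleaned_log_config = drop_wandb_hooks(example_log_config)
print([hook["type"] for hook in cleaned_log_config["hooks"]])
```

The original dict is left untouched, so the same helper can be applied to a loaded config before training without editing the config file itself.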
Install all the requirements listed in INSTALL, then clone the mmdetection repo and ConsistentTeacher into the same folder:
git clone https://github.com/open-mmlab/mmdetection.git
git clone https://github.com/Adamdad/ConsistentTeacher.git
cd ConsistentTeacher/
make install
- Download the COCO dataset
- Execute the following command to generate data set splits:
# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#   coco_semi/
#     instances_train2017.${fold}@${percent}.json
#   coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct
For concrete instructions on what should be downloaded, please refer to lines 11-24 of `tools/dataset/prepare_coco_data.sh`.
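To make the idea of a `${fold}@${percent}` split concrete, the sketch below samples a fixed percentage of images from a COCO-style annotation dict and treats the remainder as unlabeled. It is an illustration under stated assumptions, not the logic of `prepare_coco_data.sh`: the function name `sample_labeled_split`, the use of the fold index as a random seed, and the toy data are all hypothetical.

```python
import random


def sample_labeled_split(coco, percent, fold):
    """Split a COCO-style dict into (labeled, unlabeled) parts by image.

    Assumption: the fold index is used as the random seed so that each
    fold yields a different but reproducible subset.
    """
    image_ids = sorted(img["id"] for img in coco["images"])
    rng = random.Random(fold)
    num_labeled = max(1, round(len(image_ids) * percent / 100))
    labeled_ids = set(rng.sample(image_ids, num_labeled))

    def subset(keep_labeled):
        return {
            "images": [im for im in coco["images"]
                       if (im["id"] in labeled_ids) == keep_labeled],
            "annotations": [ann for ann in coco["annotations"]
                            if (ann["image_id"] in labeled_ids) == keep_labeled],
            "categories": coco["categories"],
        }

    return subset(True), subset(False)


# Toy annotation dict with 100 images and one box per image.
toy_coco = {
    "images": [{"id": i, "file_name": f"{i:012d}.jpg"} for i in range(100)],
    "annotations": [
        {"id": i, "image_id": i, "category_id": 1, "bbox": [0, 0, 10, 10]}
        for i in range(100)
    ],
    "categories": [{"id": 1, "name": "person"}],
}

labeled, unlabeled = sample_labeled_split(toy_coco, percent=10, fold=1)
print(len(labeled["images"]), len(unlabeled["images"]))  # 10 90
```

Splitting by image rather than by annotation keeps all boxes of an image on the same side of the split, which is what a labeled/unlabeled image partition requires.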
- Download the JSON files for the unlabeled images of the PASCAL VOC data in COCO format:
cd ${DATAROOT}
wget https://storage.cloud.google.com/gresearch/ssl_detection/STAC_JSON.tar
tar -xf STAC_JSON.tar
# voc/VOCdevkit/VOC2007/instances_test.json
# voc/VOCdevkit/VOC2007/instances_trainval.json
# voc/VOCdevkit/VOC2012/instances_trainval.json
- To train a model in the partially labeled or fully labeled data setting:
# CONFIG_FILE_PATH: the config file for the experiment.
# NUM_GPUS: number of GPUs to run the job on.
bash tools/dist_train.sh <CONFIG_FILE_PATH> <NUM_GPUS>
For example, to train our `R50` model with 8 GPUs:
bash tools/dist_train.sh configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py 8
- To train a model on a new dataset:
The core idea is to convert the new dataset to COCO format. Details can be found in the adding new dataset guide.
- To run inference with the pretrained models on images and videos and plot the bounding boxes, we provide two scripts:
  - `tools/inference.py` for image inference
  - `tools/inference_vido.py` for video inference
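For the new-dataset case mentioned above, converting to COCO format mostly means building a dict with `images`, `annotations`, and `categories` lists. The following is a minimal sketch assuming a simple hypothetical input layout (one record per image with corner-format boxes); the function name `to_coco` and the record schema are illustrative and not part of this repository.

```python
import json


def to_coco(records, class_names):
    """Convert simple per-image records into a COCO-style detection dict.

    Each record is assumed (hypothetically) to look like:
      {"file_name": str, "width": int, "height": int,
       "boxes": [(class_name, x_min, y_min, x_max, y_max), ...]}
    """
    cat_ids = {name: idx + 1 for idx, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
    }
    ann_id = 1
    for img_id, rec in enumerate(records, start=1):
        coco["images"].append({
            "id": img_id,
            "file_name": rec["file_name"],
            "width": rec["width"],
            "height": rec["height"],
        })
        for name, x_min, y_min, x_max, y_max in rec["boxes"]:
            width, height = x_max - x_min, y_max - y_min
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat_ids[name],
                "bbox": [x_min, y_min, width, height],  # COCO uses [x, y, w, h]
                "area": width * height,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


example_records = [
    {"file_name": "a.jpg", "width": 640, "height": 480,
     "boxes": [("cat", 10, 20, 40, 60), ("dog", 100, 100, 200, 150)]},
    {"file_name": "b.jpg", "width": 320, "height": 240, "boxes": []},
]
coco_dict = to_coco(example_records, class_names=["cat", "dog"])
print(json.dumps(coco_dict["annotations"][0]))
```

Images without boxes are kept in `images`, which matters in the semi-supervised setting where unlabeled images carry no annotations.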
This project is released under the Apache 2.0 license.
@article{wang2023consistent,
author = {Xinjiang Wang and Xingyi Yang and Shilong Zhang and Yijiang Li and Litong Feng and Shijie Fang and Chengqi Lyu and Kai Chen and Wayne Zhang},
title = {Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection},
journal = {The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR)},
year = {2023},
}
- This code pattern was inspired by SoftTeacher and mmdet.