Self-Supervised Category-Level 6D Object Pose Estimation with Deep Implicit Shape Representation (AAAI 2022)
This is the implementation of Self-Supervised Category-Level 6D Object Pose Estimation with Deep Implicit Shape Representation, published at AAAI 2022.
- Python 3.8
- PyTorch 1.7.0
- cudatoolkit 10.2
- PyTorch3D 0.3.0
Dependencies can also be installed easily using our off-the-shelf Anaconda environment.
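As a quick sanity check that your environment matches these versions, you can run a short snippet like the one below (a minimal sketch; it only prints the installed versions):

```python
# Minimal sanity check that the installed versions match the requirements above.
import torch
import pytorch3d

print("PyTorch:", torch.__version__)          # expected: 1.7.0
print("CUDA toolkit:", torch.version.cuda)    # expected: 10.2
print("PyTorch3D:", pytorch3d.__version__)    # expected: 0.3.0
print("CUDA available:", torch.cuda.is_available())
```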
- Download camera_train, camera_val, real_train, real_test, and the mesh models provided by NOCS.
- Download the Mask R-CNN segmentation results and the NOCS predictions provided by Object-DeformNet from here.
- Arrange the directories according to the following folder structure (a minimal layout check is sketched after the tree):
data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── results
│   ├── mrcnn_results
│   ├── real
│   └── nocs_results
└── obj_models
    ├── train
    ├── val
    ├── real_train
    └── real_test
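If you want to double-check the layout before preprocessing, a minimal sketch like the following can help (the root directory name data and the subdirectory list are taken from the tree above; adjust the root path to your setup):

```python
# Minimal layout check for the data/ directory described above.
import os

expected = [
    "CAMERA/train", "CAMERA/val",
    "Real/train", "Real/test",
    "results/mrcnn_results", "results/real", "results/nocs_results",
    "obj_models/train", "obj_models/val",
    "obj_models/real_train", "obj_models/real_test",
]

root = "data"  # adjust if your data root lives elsewhere
missing = [d for d in expected if not os.path.isdir(os.path.join(root, d))]
if missing:
    print("Missing directories:", missing)
else:
    print("All expected directories are present.")
```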
- Run the following command to preprocess the data (this can take a long time):
sh ./preprocess/preprocess.sh
We provide pretrained models for all categories. The prediction network is trained separately for each category so that the estimated poses and shapes of different classes do not interfere with each other. For the reliability of the experimental results, the numbers reported in our paper are the average of 6 repeated experiments. You can download the DeepSDF decoder models from here and all the pose and shape encoder models from here.
For reproducibility, you should evaluate all 36 models (6 runs for each of the 6 categories) and average the results.
To evaluate one category, e.g., laptop, run the following command:
python eval/eval_6D_ICP.py \
--category laptop \
--estimator_model models/trained_models/laptop/a_v1.1/pose_model_13_dis_real_0.028434450540386583.pth \
--model_name a_v1.1
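If you collect and average the per-run numbers yourself, a small helper along these lines may be convenient (a minimal sketch; the metric names and per-run dictionaries are illustrative assumptions, not the actual output format of eval/eval_6D_ICP.py):

```python
# Sketch of averaging metrics over repeated runs of one category.
# The metric names below are placeholders; record whatever eval/eval_6D_ICP.py reports per run.
import numpy as np

runs = [
    {"IoU75": 0.81, "5deg_5cm": 0.62, "10deg_5cm": 0.88},
    {"IoU75": 0.79, "5deg_5cm": 0.60, "10deg_5cm": 0.86},
    # ... remaining runs for this category
]

averaged = {k: float(np.mean([r[k] for r in runs])) for k in runs[0]}
print(averaged)
```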
- Train the DeepSDF decoder using the CAD models. We provide pre-trained decoder models here. Note that we remove beforehand the models whose shapes are too strange to exist in the real world; the removed models are listed in lib/data/removed_models/ (a sketch of this filtering step is given after this list).
- Pre-train the model using only synthetic data. Set the corresponding parameters in ./script/pre_train.sh and simply run:
sh ./script/pre_train.sh
- Train the model on the real dataset in a self-supervised way. Set the corresponding parameters in ./script/train.sh and simply run:
sh ./script/train.sh
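For reference, excluding the removed CAD models before DeepSDF training could look roughly like this (a minimal sketch; the assumption that lib/data/removed_models/ contains plain-text lists with one model ID per line is ours, so adapt it to the actual file format):

```python
# Sketch of filtering out the removed CAD models before training the DeepSDF decoder.
import os

def load_removed_ids(removed_dir="lib/data/removed_models"):
    # Collect the IDs of models flagged as unrealistic (assumed: one ID per line per file).
    removed = set()
    for fname in os.listdir(removed_dir):
        with open(os.path.join(removed_dir, fname)) as f:
            removed.update(line.strip() for line in f if line.strip())
    return removed

def filter_models(model_ids, removed_ids):
    # Keep only CAD models that were not flagged for removal.
    return [m for m in model_ids if m not in removed_ids]
```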
You can download the pretrained and trained models from here.
Note that there is a small mistake in the original NOCS evaluation code, so we revised it and recalculated all the metrics of the other methods. The revised evaluation code is included in this release; the revision itself is here.
This repo is built on top of Object-DeformNet, DenseFusion, DeepSDF, and DIST.
If you find this work useful in your research, please consider citing our paper:
@inproceedings{peng2022self,
title={Self-Supervised Category-Level 6D Object Pose Estimation with Deep Implicit Shape Representation},
author={Peng, Wanli and Yan, Jianhang and Wen, Hongtao and Sun, Yi},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={36},
number={2},
pages={2082--2090},
year={2022}
}
Our code is released under the MIT license.