[ECCV 2022] Ghost-free High Dynamic Range Imaging with Context-aware Transformer

By Zhen Liu¹, Yinglong Wang², Bing Zeng³ and Shuaicheng Liu³,¹*

¹Megvii Technology, ²Noah’s Ark Lab, Huawei Technologies, ³University of Electronic Science and Technology of China

This is the official MegEngine implementation of our ECCV 2022 paper: Ghost-free High Dynamic Range Imaging with Context-aware Transformer (HDR-Transformer). The PyTorch version is available at HDR-Transformer-PyTorch.

News

  • 2022.08.26 The PyTorch implementation is now available.
  • 2022.08.11 The arXiv version of our paper is now available.
  • 2022.07.19 The source code is now available.
  • 2022.07.04 Our paper has been accepted by ECCV 2022.

Abstract

High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of the receptive field, existing CNN-based methods are typically prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture, which can jointly capture both global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations to solve ghosting. For the local branch, we design a local context extractor (LCE) to capture short-range image features and use the channel attention mechanism to select informative local details across the extracted features to complement the global branch. By incorporating the CA-ViT as basic components, we further build the HDR-Transformer, a hierarchical network to reconstruct high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods qualitatively and quantitatively with considerably reduced computational budgets.

Pipeline

Illustration of the proposed CA-ViT. As shown in Fig. (a), the CA-ViT is designed as a dual-branch architecture where the global branch models long-range dependency among image contexts through a multi-head Transformer encoder, and the local branch explores both intra-frame local details and inner-frame feature relationship through a local context extractor. Fig. (b) depicts the key insight of our HDR deghosting approach with CA-ViT. To remove the residual ghosting artifacts caused by large motions of the hand (marked with blue), long-range contexts (marked with red), which are required to hallucinate reasonable content in the ghosting area, are modeled by the self-attention in the global branch. Meanwhile, the well-exposed non-occluded local regions (marked with green) can be effectively extracted with convolutional layers and fused by the channel attention in the local branch.
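
The channel attention used in the local branch can be pictured with a small sketch. This is not the repository's implementation: it assumes a squeeze-and-excitation style gate (global average pooling followed by a sigmoid), leaves out the learned layers entirely, and works on plain Python lists rather than MegEngine tensors. The function name `channel_attention` is hypothetical.

```python
import math


def channel_attention(features):
    """Re-weight each channel by a gate computed from its own mean.

    `features` is a list of C channels, each an H x W list of lists.
    Assumption: the gate is sigmoid(mean); a real model would pass the
    pooled vector through learned layers first.
    """
    weights = []
    for channel in features:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)               # squeeze: global average pool
        weights.append(1.0 / (1.0 + math.exp(-mean)))  # excite: sigmoid gate in (0, 1)
    scaled = [
        [[v * w for v in row] for row in channel]
        for channel, w in zip(features, weights)
    ]
    return scaled, weights
```

Channels with a larger average response receive a gate closer to 1 and are passed on almost unchanged, while weaker channels are attenuated, which is the sense in which informative local features are "selected".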

Usage

Requirements

  • Python 3.7.0
  • MegEngine 1.8.3+
  • CUDA 10.0 on Ubuntu 18.04

Install the required dependencies:

conda create -n hdr_transformer python=3.7
conda activate hdr_transformer
pip install -r requirements.txt
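
After installation, a quick sanity check of the environment may save time later. The sketch below is an optional helper, not part of the repository: `parse_version` and `environment_report` are hypothetical names, and the only MegEngine attribute it relies on is `megengine.__version__`.

```python
import sys

MIN_MEGENGINE = (1, 8, 3)  # minimum version listed under Requirements


def parse_version(text):
    """Turn '1.8.3' or '1.9.0+cu101' into a tuple of leading integers."""
    numbers = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def environment_report():
    """Report the Python version and whether MegEngine meets the minimum."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    try:
        import megengine
    except ImportError:
        report["megengine"] = "not installed"
        return report
    version = megengine.__version__
    ok = parse_version(version) >= MIN_MEGENGINE
    report["megengine"] = f"{version} ({'ok' if ok else 'older than 1.8.3'})"
    return report


if __name__ == "__main__":
    for name, status in environment_report().items():
        print(f"{name}: {status}")
```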

Dataset

  1. Download the dataset (including the training set and test set) from Kalantari17's dataset.
  2. Move the dataset to ./data and reorganize the directories as follows:
./data/Training
|--001
|  |--262A0898.tif
|  |--262A0899.tif
|  |--262A0900.tif
|  |--exposure.txt
|  |--HDRImg.hdr
|--002
...
./data/Test (includes 15 scenes from `EXTRA` and `PAPER`)
|--001
|  |--262A2615.tif
|  |--262A2616.tif
|  |--262A2617.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
|--BarbequeDay
|  |--262A2943.tif
|  |--262A2944.tif
|  |--262A2945.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
  3. Prepare the cropped training set by running:
cd ./dataset
python gen_crop_data.py
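
Before cropping, it can help to confirm that the directories match the layout above. The following is a minimal sketch using only the standard library; it is not shipped with the repository, the names `check_scene` and `check_split` are hypothetical, and it assumes every scene holds exactly three `.tif` exposures plus `exposure.txt` and `HDRImg.hdr`, as in the listing.

```python
from pathlib import Path

REQUIRED_FILES = ("exposure.txt", "HDRImg.hdr")
NUM_LDR_FRAMES = 3  # assumption: three .tif exposures per scene, as listed above


def check_scene(scene_dir):
    """Return a list of problems found in one scene directory (empty if none)."""
    scene_dir = Path(scene_dir)
    problems = []
    tifs = sorted(p.name for p in scene_dir.glob("*.tif"))
    if len(tifs) != NUM_LDR_FRAMES:
        problems.append(f"expected {NUM_LDR_FRAMES} .tif files, found {len(tifs)}")
    for name in REQUIRED_FILES:
        if not (scene_dir / name).is_file():
            problems.append(f"missing {name}")
    return problems


def check_split(split_dir):
    """Map scene name -> problems, for every scene that has at least one."""
    report = {}
    for scene in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        problems = check_scene(scene)
        if problems:
            report[scene.name] = problems
    return report


if __name__ == "__main__":
    for split in ("./data/Training", "./data/Test"):
        if not Path(split).is_dir():
            print(f"{split}: directory not found")
            continue
        issues = check_split(split)
        print(f"{split}: {'OK' if not issues else issues}")
```

Run it from the repository root; an empty report for both splits means each scene folder has the expected files.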

Training & Evaluation

cd HDR-Transformer

To train the model, run:

python train.py --model_dir experiments

To evaluate, run:

python evaluate.py --model_dir experiments --restore_file experiments/val_model_best.pth
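
This README does not describe which metrics the evaluation reports. HDR deghosting results are commonly compared with PSNR computed after μ-law tonemapping (often written PSNR-μ), so the sketch below shows that computation on flat lists of pixel values in [0, 1]. It is an illustration under that assumption, not a copy of `evaluate.py`; the function names and the value μ = 5000 are assumptions.

```python
import math

MU = 5000.0  # assumed compression parameter; widely used in HDR evaluation


def mu_tonemap(x, mu=MU):
    """Compress a linear HDR value in [0, 1] with the mu-law curve."""
    return math.log(1.0 + mu * x) / math.log(1.0 + mu)


def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio between two equal-length value lists."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_mu(pred, target):
    """PSNR measured in the tonemapped domain."""
    return psnr([mu_tonemap(p) for p in pred], [mu_tonemap(t) for t in target])
```

Because the μ-law curve expands dark values, PSNR-μ penalizes errors in shadows more heavily than PSNR measured on linear values.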

Results

results

Acknowledgement

The MegEngine version of the Swin-Transformer is based on Swin-Transformer-MegEngine. Our work is inspired by the following works and uses parts of their official implementations:

We thank the respective authors for open sourcing their methods.

Citation

@inproceedings{liu2022ghost,
  title={Ghost-free High Dynamic Range Imaging with Context-aware Transformer},
  author={Liu, Zhen and Wang, Yinglong and Zeng, Bing and Liu, Shuaicheng},
  booktitle={European Conference on Computer Vision},
  pages={344--360},
  year={2022},
  organization={Springer}
}

Contact

If you have any questions, feel free to contact Zhen Liu at liuzhen03@megvii.com.