This project hosts the code for our CVPR 2017 paper.
RGP stands for Recurrent Gaze Prediction. The RGP model learns from human gaze data to predict which parts of a scene people focus on.
VAS is a newly collected dataset consisting of movie clips and multiple corresponding descriptive sentences, along with human gaze tracking data. You can download the dataset from here. Please read the README.md file in VAS.zip.
If you use this code as part of any published research, please acknowledge the following paper.
@inproceedings{yu:2017:CVPR,
author = {Youngjae Yu and Jongwook Choi and Yeonhwa Kim and Kyung Yoo and Sang-Hun Lee and Gunhee Kim},
title = "{Supervising Neural Attention Models for Video Captioning by Human Gaze Data}",
booktitle = {CVPR},
year = 2017
}
git clone https://github.com/snuvl/RGP.git
Thanks to the Vision and Learning Lab researchers.
Youngjae Yu, Jongwook Choi, Yeonhwa Kim, Kyung Yoo, Sang-Hun Lee, and Gunhee Kim (Seoul National University)
MIT license