Data and supplementary materials for the Web Conference 2021 paper "Generating Accurate Caption Units For Figure Captioning".
While the paper will appear in the ACM Digital Library (dl.acm.org), the final camera-ready version is currently available here.
In our view, figure captioning is a visionary problem. Our work along this line is a proof of concept of ML capability, enabled by the availability of synthetic figure question-answering data.
- Data
- Model Hyperparameters and Other Design Choices
- Aggregated Perfect Accuracy
- Intellectual Property Note
- Contacts
This directory contains the following materials.
- Dataset (`DVQA-cap` and `FigureQA-cap`): ground-truth captions for modeling, converted from the `DVQA` and `FigureQA` datasets. Includes the full splits `train`, `val`, `test_easy`, and `test_hard`. Due to size limits, we provide a Google Drive link for the `train` split. All splits follow the same schema below.
- `quality-validation.xlsx`: a spreadsheet of quality-validation results. Two co-authors validated a sample of captions from the `test_hard` split of the `captions.json` files along two dimensions: accuracy and grammar. The sample covers 20 random captions for each caption type in each dataset.
- `user-study-12-figures.html` (along with the `user-study-png-output` directory): the 12 figures used in the Google Forms user study.
- `aggregated-perfect-accuracy`: calculation of perfect-accuracy scores, as additional results for Tables 3 and 5.
In each split subdirectory, the file `captions.json` contains ground-truth captions and figure metadata that follow our problem formulation, for modeling.
Please download the figure images from the original repositories of the DVQA and FigureQA datasets to keep figures consistent.
The code below reads the JSON objects in a `captions.json` file and illustrates their schema.
import json

# Load one split of the converted captions, e.g. the FigureQA validation split.
with open("FigureQA/captions.json", "r") as f:
    jobject = json.load(f)

print("Top-level keys in this JSON object:", list(jobject.keys()))
print()
print("Total caption count in this split:", len(jobject['captions']))
print()

# Unique fine-grained caption types. The naming differs slightly from the
# Table 1 definitions, e.g. "horizontal-vertical" refers to the figure type.
caption_types = {item['caption_template_fine_grained'] for item in jobject['captions']}
print("Unique caption types:", caption_types)
print()

# Metadata for one figure: a dynamic dictionary that includes bounding-box positions.
first_key = next(iter(jobject['metadata']))
print("Metadata for one figure:", jobject['metadata'][first_key])
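As a quick follow-on, captions can be grouped by their fine-grained type once loaded. The sketch below uses a tiny in-memory object that mimics the `captions` list schema from the snippet above (the caption-type strings here are invented placeholders, not the actual type names in the dataset):

```python
from collections import Counter

# Toy object mimicking the captions.json schema; the type values are
# hypothetical placeholders, not the real caption-type names.
jobject = {
    "captions": [
        {"caption_template_fine_grained": "horizontal-vertical"},
        {"caption_template_fine_grained": "horizontal-vertical"},
        {"caption_template_fine_grained": "max-value"},
    ],
}

# Count how many captions fall under each fine-grained caption type.
type_counts = Counter(
    item["caption_template_fine_grained"] for item in jobject["captions"]
)
print(type_counts)
```

Replacing the toy `jobject` with one loaded from a real `captions.json` split gives the per-type caption distribution for that split.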
Thank you to readers interested in the source implementation. Unfortunately, we cannot share it here due to company policy.
Please email questions to Xin Qian (xinq@umd.edu). Thank you!