MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi), Hebrew (iw), Romanian (ro), Thai (th), and Chinese (zh).

google-research-datasets/maxm


MaXM

We introduce MaXM, a test-only multilingual visual question answering benchmark in 7 diverse languages: English (en), French (fr), Hindi (hi), Hebrew (iw), Romanian (ro), Thai (th), and Chinese (zh). The datasets are built on the images and captions from the Crossmodal-3600 dataset (XM3600). See our paper for further details.

Our approach to data generation is similar to VQ^2A used to generate MAVERICS.

Download

MaXM v1 (157KB, released on Feb 18, 2023)

Format (.json)

dataset                 str: dataset name
version                 str: dataset version
split                   str: language ID
annotations             List of image-question-answers triplets, each of which is
-- image_id             str: image ID
-- image_url            str: image URL
-- qa_pairs             List of question-answer pairs, each of which is
---- question_id        str: question ID
---- question           str: raw question
---- answers            List of str: ground-truth answers
---- processed_answers  List of str: processed ground-truth answers. 16 tokenized answers.
---- is_collection      bool: "true" if the question is of the "Collection" type; "false" otherwise.
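
As a rough illustration of the format above, the following Python sketch walks the nested structure and yields one question-answer pair at a time. The helper name `iter_qa_pairs` and every value in `sample` are hypothetical and are not taken from the released files. It also assumes `is_collection` is stored as a JSON boolean. To read a real split, replace the inline string with `json.load` on the downloaded file.

```python
import json
from typing import Any, Dict, Iterator, Tuple


def iter_qa_pairs(data: Dict[str, Any]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (image_id, qa_pair) for every question in one language split."""
    for annotation in data["annotations"]:
        for qa in annotation["qa_pairs"]:
            yield annotation["image_id"], qa


# Hypothetical record shaped like the schema above; all values are made up.
sample = json.loads("""
{
  "dataset": "maxm",
  "version": "v1",
  "split": "en",
  "annotations": [
    {
      "image_id": "img_0",
      "image_url": "https://example.com/img_0.jpg",
      "qa_pairs": [
        {
          "question_id": "q_0",
          "question": "What color is the car?",
          "answers": ["Red"],
          "processed_answers": ["red"],
          "is_collection": false
        }
      ]
    }
  ]
}
""")

pairs = list(iter_qa_pairs(sample))
for image_id, qa in pairs:
    # Print the fields most evaluation loops would need.
    print(sample["split"], image_id, qa["question_id"], qa["question"], qa["answers"])
```

Flattening to `(image_id, qa_pair)` tuples keeps the image ID available for each question without copying the whole annotation.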

Cite

If you use this dataset in your research, please cite the original Crossmodal-3600 dataset and our paper:

Soravit Changpinyo, Linting Xue, Michal Yarom, Ashish V. Thapliyal, Idan Szpektor, Julien Amelot, Xi Chen, Radu Soricut. MaXM: Towards Multilingual Visual Question Answering. Findings of the Association for Computational Linguistics: EMNLP, 2023.

@inproceedings{changpinyo2023maxm,
  title = {{MaXM}: Towards Multilingual Visual Question Answering},
  author = {Changpinyo, Soravit and Xue, Linting and Yarom, Michal and Thapliyal, Ashish V. and Szpektor, Idan and Amelot, Julien and Chen, Xi and Soricut, Radu},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP},
  year = {2023},
}

Contact Us

Please create an issue in this repository. If you would like to share feedback or report concerns, please email schangpi@google.com.
