Add Tatoeba #802

Open: wants to merge 32 commits into base: eval-hackathon

Muennighoff commented Jul 19, 2022

See: https://huggingface.co/datasets/Helsinki-NLP/tatoeba_mt/
I left the Python script I used inside for others to use. I can remove it if it's a problem :)

cc @haileyschoelkopf @thomasw21 @lintangsutawika

Tests are failing due to problems with downloading the dataset. I don't have any problems downloading it myself.

Script I used is here

Note: xCopa and xWinograd should be merged first.

2 participants