Unable to search or download #13

Open
sinaahmadi opened this issue May 16, 2024 · 2 comments

Comments

@sinaahmadi

Hi,

The new website is sleek! However, it seems to have some glitches when it comes to searching or downloading. I have noticed this particularly for languages whose codes include the script name, such as "Central Kurdish" or "Kurdish (Arabic)".

When I try to download NLLB for that language (here: https://opus.nlpl.eu/NLLB/en&ku-Arab/v1/NLLB), the search doesn't return anything. If I first run a search on NLLB that does work, such as Tamil-English (ta-eng), I can then search for the other language code, but the download links still point to the previous language pair. Ultimately, I get this error: "We're sorry, no samples for Kurdish (Arabic) (ku-Arab) - in the [NLLB](https://opus.nlpl.eu/NLLB/ku-Arab&/v1/NLLB) dataset, version v1 were found." at https://opus.nlpl.eu/sample/ku-Arab&/NLLB&v1/sample.

Thanks for your help.

@jorgtied
Member

jorgtied commented Jul 19, 2024

We are looking into this. It seems to be a problem with the OPUS-API: the language pair does not show up for some reason. The issue might be related to the way the pair is specified in the metadata (it says ku_Arab-en instead of en-ku_Arab; in OPUS the language pair is typically specified by alphabetically sorted language IDs).

In the meantime, you could download the data from the links on the legacy NLLB OPUS site: https://opus.nlpl.eu/legacy/NLLB.php
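
To illustrate the ordering convention mentioned above, here is a minimal sketch of how a pair such as ku_Arab-en could be normalized into alphabetically sorted form. This is not OPUS or OPUS-API code: the function name `canonical_pair` and the `sep` parameter are hypothetical, and plain string sorting is an assumption about what "alphabetically sorted" means here.

```python
def canonical_pair(pair: str, sep: str = "-") -> str:
    """Return a language pair with its two IDs in sorted order.

    Assumption: the pair is exactly two language IDs joined by `sep`,
    and ordinary string comparison is the intended ordering.
    """
    ids = pair.split(sep)
    if len(ids) != 2 or not all(ids):
        raise ValueError(f"expected two language IDs joined by {sep!r}, got {pair!r}")
    return sep.join(sorted(ids))


# The metadata spelling and the expected spelling normalize to the same pair.
print(canonical_pair("ku_Arab-en"))  # en-ku_Arab
print(canonical_pair("en-ku_Arab"))  # en-ku_Arab
```

Note that this sketch only handles IDs written with an underscore before the script name (ku_Arab). An ID written with a hyphen (ku-Arab, as in the website URLs) would be split into three parts and rejected, so the separator would need to be handled differently in that case.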

@sinaahmadi
Author

Thanks.
I have also contacted you many times about adding a few parallel corpora for Kurdish. Would you be able to add this one to OPUS, please? https://github.com/KurdishBLARK/InterdialectCorpus/tree/master

Thanks.
