Finer selection #263

Closed
wants to merge 4 commits into from

Conversation

@LysandreJik
Member

LysandreJik commented Aug 5, 2021

WIP PR, do not merge.

This PR offers a more powerful way to filter models by tags. This breaks the limitations imposed by selecting a model by a list of tags.

It introduces five enumerations: Language, Task, Dataset, License and Library. These five enumerations allow filtering of models (and eventually datasets when #194 is merged). Alongside them are three operators that can be combined to build more powerful filters: & (AND), | (OR) and ~ (NOT).

See below for some examples of common use-cases:

Example of single categories using languages

from huggingface_hub.hf_api import HfApi, Language

api = HfApi()

# List all English models
english_models = api.list_models_by_tag(Language.en)
len(english_models)  # 3061

# List all English or French models
en_or_fr_models = api.list_models_by_tag(Language.en | Language.fr)
len(en_or_fr_models)  # 3318

# List all models that are both English and French
en_and_fr_models = api.list_models_by_tag(Language.en & Language.fr)
len(en_and_fr_models)  # 27

# List all English models that are not French models
en_not_fr_models = api.list_models_by_tag(Language.en & ~Language.fr)
len(en_not_fr_models)  # 3034

More advanced typical use-cases.

I'm looking for a model that is both English and French, specialized in translation

from huggingface_hub.hf_api import HfApi, Language, Task

api = HfApi()

# List all translation models that are both English and French
translation = api.list_models_by_tag(Task.translation & (Language.en & Language.fr))
len(translation)  # 19

I'm looking for a model that is both English and French, specialized in translation. However, I don't want a model tagged as multilingual, as those usually cover a lot of other languages too. French & English only!

from huggingface_hub.hf_api import HfApi, Language, Task

api = HfApi()

# List all English and French translation models that are not multilingual
translation = api.list_models_by_tag(Task.translation & (Language.en & Language.fr & ~Language.mt))
len(translation)  # 17
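
For readers who want to see how the &, | and ~ semantics used above can compose, here is a minimal standalone sketch. It is not the code from this PR: the `Expr`, `Tag`, `And`, `Or`, `Not` classes, the `filter_models` helper and the sample `models` dictionary are all hypothetical names invented for illustration, and matching is done locally against in-memory tag sets rather than against the Hub.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


class Expr:
    """Hypothetical base class: gives every expression the &, | and ~ operators."""

    def __and__(self, other: "Expr") -> "Expr":
        return And(self, other)

    def __or__(self, other: "Expr") -> "Expr":
        return Or(self, other)

    def __invert__(self) -> "Expr":
        return Not(self)

    def matches(self, tags: Set[str]) -> bool:
        raise NotImplementedError


@dataclass(frozen=True)
class Tag(Expr):
    name: str

    def matches(self, tags: Set[str]) -> bool:
        # A leaf matches when the model carries this tag.
        return self.name in tags


@dataclass(frozen=True)
class And(Expr):
    left: Expr
    right: Expr

    def matches(self, tags: Set[str]) -> bool:
        return self.left.matches(tags) and self.right.matches(tags)


@dataclass(frozen=True)
class Or(Expr):
    left: Expr
    right: Expr

    def matches(self, tags: Set[str]) -> bool:
        return self.left.matches(tags) or self.right.matches(tags)


@dataclass(frozen=True)
class Not(Expr):
    inner: Expr

    def matches(self, tags: Set[str]) -> bool:
        return not self.inner.matches(tags)


def filter_models(expr: Expr, models: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted names of models whose tag set satisfies expr."""
    return sorted(name for name, tags in models.items() if expr.matches(tags))


# Made-up sample data, for illustration only.
models = {
    "model-a": {"en", "translation"},
    "model-b": {"en", "fr", "translation"},
    "model-c": {"fr"},
}

en, fr, translation = Tag("en"), Tag("fr"), Tag("translation")

english_not_french = filter_models(en & ~fr, models)
en_fr_translation = filter_models(translation & (en & fr), models)
```

Because each operator returns a new expression object, parenthesised combinations such as `translation & (en & fr)` build a small tree that is only evaluated when `matches` is called on a model's tags.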

@adrinjalali
Contributor

You could probably use Flag instead? (https://docs.python.org/3/library/enum.html#flag)
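
As a rough illustration of this suggestion, the standard library's `enum.Flag` already supports `&`, `|` and `~` on its members. The three-member `Language` class below is a hypothetical stand-in, not the enumeration from this PR.

```python
from enum import Flag, auto


class Language(Flag):
    # Hypothetical members; each one gets its own bit.
    en = auto()
    fr = auto()
    de = auto()


# | combines members into a single composite flag value.
en_or_fr = Language.en | Language.fr

# ~ selects every defined member except the given one; & intersects.
only_en = en_or_fr & ~Language.fr
```

One caveat with this approach: a combined `Flag` value is just a set of bits, so it does not by itself record whether a filter meant "tagged with both" or "tagged with either". That distinction would have to be made by whatever code matches the flag against a model's tags.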

@osanseviero
Contributor

This is probably obsolete with the new advanced search introduced by @muellerzr

@muellerzr
Contributor

Correct. @LysandreJik are we good to close this you think?

@LysandreJik
Member Author

Indeed, obsolete and should be closed.


4 participants