Releases · huggingface/huggingface_hub

07 Apr 19:09

v0.5.0

2f65b70

v0.5.0: Reference documentation, Keras improvements, stabilizing the API

Documentation

Version v0.5.0 is the first version which features an API reference. It is still a work in progress with features lacking, some images not rendering, and a documentation reorg coming up, but should already provide significantly simpler access to the huggingface_hub API.

The documentation is visible here.

API reference documentation by @LysandreJik in #782
[API Reference docs] Remove git references from GitHub Action templates by @LysandreJik in #813
DOC API docstring improvements by @adrinjalali in #731

Model & datasets list improvements

The list_models and list_datasets methods have been improved in several ways.

List private models

These two methods now accept the token keyword to specify your token. Specifying the token will include your private models and datasets in the returned list.

Support list_models and list_datasets with token arg by @muellerzr in #638

Modelcard metadata

These two methods now accept the cardData boolean argument. If set to True, the modelcard metadata will also be returned when using these two methods.

Include cardData in list_models and list_datasets by @muellerzr in #639

Filtering by carbon emissions

The list_models method now also accepts an emissions_thresholds parameter to filter by carbon emissions.

Enable filtering by carbon emission by @muellerzr in #668
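As a rough illustration of what a threshold filter does, the sketch below keeps only the entries whose reported emissions fall within a (minimum, maximum) range. It works on plain dictionaries and never calls the Hub. The filter_by_emissions helper and the co2_eq_emissions key are assumptions made for this example, not the library's implementation.

```python
from typing import Any, Dict, List, Optional, Tuple


def filter_by_emissions(
    models: List[Dict[str, Any]],
    thresholds: Tuple[Optional[float], Optional[float]],
) -> List[Dict[str, Any]]:
    """Keep models whose emissions lie within (minimum, maximum); None means unbounded."""
    minimum, maximum = thresholds
    kept = []
    for model in models:
        value = model.get("co2_eq_emissions")  # hypothetical metadata key
        if isinstance(value, dict):
            # Assumed layout: the number may be nested under an "emissions" key.
            value = value.get("emissions")
        if not isinstance(value, (int, float)):
            continue  # no usable emissions figure, so the model is skipped
        if minimum is not None and value < minimum:
            continue
        if maximum is not None and value > maximum:
            continue
        kept.append(model)
    return kept


models = [
    {"id": "a", "co2_eq_emissions": 12},
    {"id": "b", "co2_eq_emissions": {"emissions": 250.5}},
    {"id": "c"},
]
low_emission = filter_by_emissions(models, (None, 100))
print([m["id"] for m in low_emission])
```

Passing None for either bound leaves that side of the range open, so (None, 100) reads as "at most 100".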

Keras improvements

The Keras serialization and upload methods have been worked on to provide better support for models:

All parameters are now included in the saved model when using push_to_hub_keras
log_dir parameter for TensorBoard logs, which will automatically spawn a TensorBoard instance on the Hub.
Automatic model card

Introduce include_optimizer parameter to push_to_hub_keras() by @merveenoyan in #616
Add TensorBoard for Keras models by @merveenoyan in #651
Create Automatic Keras model card by @merveenoyan in #679
Allow TensorBoard Override for same Repository by @merveenoyan in #709
Add tempfile for tensorboard logs in tensorboard tests in test_keras_integration.py by @merveenoyan in #761

Contributing guide

A contributing guide is now available for the huggingface_hub repository. For any and all information related to contributing to the repository, please check it out!

Read more about it here: CONTRIBUTING.md.

Pre-commit hooks

The huggingface_hub GitHub repository has several checks to ensure that the code respects code quality standards. Opt-in pre-commit hooks have been added in order to make it simpler for contributors to leverage them.

Read more about it in the aforementioned CONTRIBUTING guide.

MNT Add pre-commit hooks by @adrinjalali in #807

Renaming and transferring repositories

Repositories can now be renamed and transferred programmatically using move_repo.

Allow renaming and transferring repos programmatically by @osanseviero in #704

Breaking changes & deprecation

⛔ The following methods have now been removed following a deprecation cycle

`list_repos_objs`

The list_repos_objs and the accompanying CLI utility huggingface-cli repo ls-files have been removed.
The same can be done using the model_info and dataset_info methods.

Remove deprecated list_repos_objs and huggingface-cli repo ls-files by @julien-c in #702

Python 3.6

Python 3.6 has reached end of life and is no longer supported. Installing huggingface_hub on Python 3.6 will result in version v0.4.0 being installed.

CI support python 3.7-3.10 - remove 3.6 support by @adrinjalali in #790

⚠️ Items below are now deprecated and will be removed in a future version

API deprecate positional args in file_download and hf_api by @adrinjalali in #745
MNT deprecate name and organization in favor of repo_id by @adrinjalali in #733
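To make the name/organization deprecation concrete, here is a minimal sketch of how a function might accept both the old and the new arguments during a deprecation cycle. The resolve_repo_id helper is hypothetical and is not part of huggingface_hub; it only illustrates the pattern of preferring repo_id and warning on the legacy arguments.

```python
import warnings
from typing import Optional


def resolve_repo_id(
    repo_id: Optional[str] = None,
    name: Optional[str] = None,
    organization: Optional[str] = None,
) -> str:
    """Return a repo_id, accepting the legacy name/organization pair with a warning."""
    if repo_id is not None:
        if name is not None or organization is not None:
            raise ValueError("Pass either repo_id or name/organization, not both.")
        return repo_id
    if name is None:
        raise ValueError("repo_id is required.")
    # Legacy call style: still works, but tells the caller to migrate.
    warnings.warn(
        "name and organization are deprecated; use repo_id instead.",
        FutureWarning,
    )
    return f"{organization}/{name}" if organization else name


print(resolve_repo_id(repo_id="my-org/my-model"))
```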

What's Changed

Include "model" in repo_type to keep consistency by @muellerzr in #620
Hotfix for repo_type by @muellerzr in #623
fix: typo in docstring by @ariG23498 in #647
{upload|delete}_file: Remove client-side filename validation by @SBrandeis in #669
Ensure post_method is only executed once by @sgugger in #676
Remove paying subscription mention from docstring by @cakiki in #653
Improve tests and logging by @muellerzr in #682
docs(links): Update settings/token to settings/tokens by @ronvoluted in #699
Add support for private hub by @juliensimon in #703
Add retry_endpoint for test stability by @osanseviero in #719
FIX fix a bug in _filter_emissions to accept numbers w/o decimal and dict emissions by @adrinjalali in #753
Logging fix for hf_api, logging documentation by @LysandreJik in #748
Contributing guide & code of conduct by @LysandreJik in #692
Fix pytorch and tensorflow python matrix by @osanseviero in #760
MNT add links to related projects and the forum on issue template by @adrinjalali in #773
Note on the README by @LysandreJik in #772
Remove autoreviewers by @muellerzr in #793
CI Error on FutureWarning by @adrinjalali in #787
MNT more informative message on error in Hf.Api.delete_repo by @adrinjalali in #783
Add security status by @McPatate in #654
Remove redundant part of security test by @osanseviero in #802
Changed test repository names to fix tests by @merveenoyan in #803
TST calling delete_repo under tempfile for fixing the test by @merveenoyan in #804
Disable logging in with organization token by @merveenoyan in #780
MNT change dev version to 0.5, 0.4 is already released by @adrinjalali in #810
👨‍💻 Configure HF Hub URL with environment variable by @SBrandeis in #815
MNT support older requests versions by @adrinjalali in #817
Rename the env variable HF_ENDPOINT. by @Narsil in #819
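The HF_ENDPOINT variable mentioned in the last two entries can be pictured with the sketch below, which reads the Hub URL from the environment and falls back to the public Hub. The get_endpoint helper and the example URL are assumptions for illustration; how the library reads the variable internally is not described in these notes.

```python
import os
from typing import Mapping

DEFAULT_ENDPOINT = "https://huggingface.co"


def get_endpoint(environ: Mapping[str, str] = os.environ) -> str:
    """Return the Hub URL, preferring the HF_ENDPOINT environment variable."""
    # A trailing slash is stripped so that paths can be appended uniformly.
    return environ.get("HF_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")


print(get_endpoint({"HF_ENDPOINT": "https://hub.example.com/"}))  # placeholder URL
print(get_endpoint({}))
```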

New Contributors

@McPatate made their first contribution in #583
@FremyCompany made their first contribution in #606
@simoninithomas made their first contribution in #633
@mlonaws made their first contribution in #630
@ariG23498 made their first contribution in #647
@J-Petiot made their first contribution in #660
@ronvoluted made their first contribution in #699
@juliensimon made their first contribution in #703
@allendorf made their first contribution in #742
@frgfm made their first contribution in #747
@hbredin made their first contribution in #688

Full Changelog: v0.4.0...v0.5.0

Contributors

Narsil, julien-c, and 19 other contributors


26 Jan 18:30

LysandreJik

v0.4.0

735b82e

v0.4.0: Tag listing, Namespace Objects, Model Filter

Tag listing

Introduce Tag Listing by @muellerzr in #537

This PR introduces the ability to fetch all available tags for models or datasets and returns them as a nested namespace object, for example:

>>> from huggingface_hub import HfApi

>>> api = HfApi() 
>>> tags = api.get_model_tags()
>>> print(tags)
Available Attributes:
 * benchmark
 * language_creators
 * languages
 * licenses
 * multilinguality
 * size_categories
 * task_categories
 * task_ids

>>> print(tags.benchmark)
Available Attributes:
 * raft
 * superb
 * test

Namespace objects

Namespace Objects for Search Parameters by @muellerzr in #556

With a goal of adding more tab-completion to the library, this PR introduces two objects:

DatasetSearchArguments
ModelSearchArguments

These two AttributeDictionary objects contain all the valid information we can extract from a model as tab-complete parameters. We also include the author_or_organization and the dataset (or model) _name, obtained through careful string splitting.
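The idea of a tab-completable attribute dictionary can be sketched in a few lines. The AttributeDict class and split_repo_id helper below are hypothetical stand-ins written for this example, and the stored tag value is invented; they show the general technique rather than the library's own classes.

```python
class AttributeDict(dict):
    """A dict whose keys can also be read as attributes (sketch only)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Expose keys so interactive shells offer them for tab completion.
        return list(super().__dir__()) + list(self.keys())


def split_repo_id(repo_id: str) -> AttributeDict:
    """Split "author/name" into its two parts; the author may be absent."""
    author, _, name = repo_id.rpartition("/")
    return AttributeDict(author_or_organization=author or None, model_name=name)


tags = AttributeDict(benchmark=AttributeDict(raft="benchmark:raft"))
print(tags.benchmark.raft)

info = split_repo_id("microsoft/wavlm-base-sd")
print(info.author_or_organization, info.model_name)
```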

Model Filter

Implement a Model Filter class by @muellerzr in #553

This PR introduces a new way to search the hub: the ModelFilter class.

To the user it looks like a simple Enum at first, letting them specify what they want to search for, such as:

f = ModelFilter(author="microsoft", model_name="wavlm-base-sd", framework="pytorch")

From there, they can pass this filter to the list_models method of HfApi to search the Hub:

models = api.list_models(filter=f)

The API may then be used for complex queries:

args = ModelSearchArguments()
f = ModelFilter(framework=[args.library.pytorch, args.library.TensorFlow], model_name="bert", tasks=[args.pipeline_tag.Summarization, args.pipeline_tag.TokenClassification])

api.list_models(filter=f)

Ignoring filenames in snapshot_download

This PR introduces a way to limit the files fetched by snapshot_download. This is useful when you want to download and cache an entire repository without using git while skipping some files based on their filenames.

[Snapshot download] allow some filenames to be ignored by @patrickvonplaten in #566
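The filtering step can be pictured as follows. This is a sketch under assumptions: these notes do not name the parameter or say whether patterns are regular expressions or globs, so the select_files helper and its ignore_patterns argument are hypothetical, and regular expressions are used only as one plausible choice.

```python
import re
from typing import Iterable, List


def select_files(filenames: Iterable[str], ignore_patterns: List[str]) -> List[str]:
    """Drop every filename that matches at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in ignore_patterns]
    return [
        name
        for name in filenames
        if not any(regex.search(name) for regex in compiled)
    ]


repo_files = ["config.json", "pytorch_model.bin", "flax_model.msgpack", "README.md"]
print(select_files(repo_files, [r"\.msgpack$", r"\.bin$"]))
```

Only the files that survive the filter would then be downloaded and cached.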

What's Changed

[Hotfix][API] card_data => cardData on /api/datasets by @julien-c in #530
Fix the progress bars when cloning a repository by @LysandreJik in #517
Update Hugging Face Hub documentation README and Endpoints by @muellerzr in #527
Convert string functions to f-string by @muellerzr in #536
Fixing FS for espnet. by @Narsil in #542
[snapshot_download] upgrade to canonical separator by @julien-c in #545
Add test directions by @muellerzr in #547
[HOTFIX] Change test for missing_input to reflect back-end redirect changes by @muellerzr in #552
Bring consistency to download and upload APIs by @muellerzr in #574
Search by authors and string by @FrancescoSaverioZuppichini in #531
Quick typo by @muellerzr in #575

New Contributors

@kahne made their first contribution in #569
@FrancescoSaverioZuppichini made their first contribution in #531

Full Changelog: v0.2.1...v0.4.0

Contributors

Narsil, julien-c, and 5 other contributors


26 Jan 18:18

LysandreJik

v0.2.1

d4b2da8

v0.2.1: Patch release

This is a patch release fixing an issue with the notebook login.

5e2da9b#diff-fb1696cbcf008dd89dde5e8c1da9d4be5a8f7d809bc32f07d4453caba40df15f


26 Jan 18:17

LysandreJik

v0.2.0

c1ccbee

v0.2.0: Access tokens, skip large files, local files only

Access tokens

Version v0.2.0 introduces access token compatibility with the Hub. Access tokens are now the main login handler; it is still possible to log in with a username and password by pressing [Ctrl/CMD]+C at the login prompt.

The notebook login is adapted to work with the access tokens.

Skipping large files

The Repository class now has an additional parameter, skip_lfs_files, which allows cloning the repository while skipping the large file download.

#472

Local files only for `snapshot_download`

The snapshot_download method can now take local_files_only as a parameter to enable leveraging previously downloaded files.

#505
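The behavior of a local-files-only lookup can be sketched without any network access. The fetch_file and fake_download helpers below are hypothetical and greatly simplified compared to a real cache; they only show the control flow in which the flag forbids downloading and requires the file to be cached already.

```python
import os
import tempfile
from typing import Callable, Optional


def fetch_file(
    cache_dir: str,
    filename: str,
    local_files_only: bool = False,
    download: Optional[Callable[[str], None]] = None,
) -> str:
    """Return the path of a cached file, downloading it only when allowed."""
    path = os.path.join(cache_dir, filename)
    if local_files_only:
        # Never touch the network: the file must already be in the cache.
        if not os.path.isfile(path):
            raise FileNotFoundError(f"{filename} is not in the local cache")
        return path
    if download is None:
        raise ValueError("a download callable is required when local_files_only is False")
    download(path)  # stand-in for the real network call
    return path


def fake_download(path: str) -> None:
    """Pretend to download by writing a small placeholder file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("{}")


with tempfile.TemporaryDirectory() as cache:
    fetch_file(cache, "config.json", download=fake_download)  # populates the cache
    cached = fetch_file(cache, "config.json", local_files_only=True)
    print(os.path.basename(cached))
```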


09 Nov 17:46

LysandreJik

v0.1.2

f31030e

v0.1.2: Patch release

What's Changed

clean_ok should be True by default by @LysandreJik in #462

Full Changelog: v0.1.1...v0.1.2

Contributors

LysandreJik


05 Nov 18:39

LysandreJik

v0.1.1

5eb5bfc

v0.1.1: Patch release

What's Changed

Fix typing-extensions minimum version by @lhoestq in #453
Fix argument order in create_repo for Repository.clone_from by @sgugger in #459

Full Changelog: v0.1.0...v0.1.1

Contributors

sgugger and lhoestq


02 Nov 22:42

LysandreJik

v0.1.0

162eeca

v0.1.0: Optional token, `HfApi` begone, git prune

What's Changed

Version v0.1.0 is the first minor release of the huggingface_hub package, which promises better stability for upcoming versions. This update comes with big quality of life improvements.

Make token optional in all HfApi methods. by @sgugger in #379

Previously, most methods of the HfApi class required the token to be explicitly passed. This is changed in this version, where it defaults to the token stored in the cache. This results in a re-ordering of arguments, but backward compatibility is preserved in most cases. Where it is not preserved, an explicit error is thrown.
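The fallback described here follows a common pattern: use the explicit argument when it is given, otherwise read the saved value. The sketch below illustrates that pattern with a hypothetical resolve_token helper and a temporary file standing in for the token cache; it is not the library's code, and the token string is a placeholder.

```python
import os
import tempfile
from typing import Optional


def resolve_token(token: Optional[str] = None, token_path: str = "") -> Optional[str]:
    """Return the explicit token if given, else the one saved on disk, else None."""
    if token is not None:
        return token
    if os.path.isfile(token_path):
        with open(token_path, encoding="utf-8") as handle:
            return handle.read().strip() or None
    return None


with tempfile.TemporaryDirectory() as folder:
    saved = os.path.join(folder, "token")
    with open(saved, "w", encoding="utf-8") as handle:
        handle.write("hf_example_token\n")  # placeholder value
    print(resolve_token(token_path=saved))              # falls back to the saved token
    print(resolve_token("explicit", token_path=saved))  # the explicit argument wins
```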

Root methods instead of `HfApi` by @LysandreJik in #388

The HfApi class now exposes its methods through the hf_api file, reducing the friction to access these helpers. See the example below:

# Previously
from huggingface_hub import HfApi

api = HfApi()
user = api.whoami()

# Now
from huggingface_hub.hf_api import whoami

user = whoami()

The HfApi can still be imported and works as before for backward compatibility.

Add `list_repo_files` util by @sgugger in #395

Offers a list_repo_files helper to ... list the repo files! Supports both model and dataset repositories.

Add helper to generate an eval result `model-index`, with proper typing by @julien-c in #382

Offers a metadata_eval_result in order to generate a YAML block to put in model cards according to evaluation results.

Add metrics to API by @mariosasko in #429

Adds a list_metrics method to HfApi!

Git prune by @LysandreJik in #450

Adds a git_prune method to the Repository class. This prunes local files that are no longer needed because they have already been pushed to a remote.
It also adds the auto_lfs_prune argument to git_push and to the commit context manager for simpler handling.

Bug fixes

Fix HfApi.create_repo when repo_type is 'space' by @nateraw in #394
Last fixes for datasets' push_to_hub method by @LysandreJik in #415

Full Changelog: v0.0.19...v0.1.0

Contributors

julien-c, LysandreJik, and 3 other contributors


04 Oct 21:10

LysandreJik

v0.0.18

30dbd31

v0.0.18: Repo metadata, git tags, Keras mixin

Repository metadata (@julien-c)

The version v0.0.18 of the huggingface_hub includes tools to manage repository metadata. The following example reads metadata from a repository:

from huggingface_hub import Repository

repo = Repository("xxx", clone_from="yyy")
data = repo.repocard_metadata_load()

The following example completes that metadata before writing it to the repository locally.

data["license"] = "apache-2.0"
repo.repocard_metadata_save(data)

Repo metadata load and save #339 (@julien-c)

Git tags (@AngledLuffa)

Tag management is now available! Add, check, delete tags locally or remotely directly from the Repository utility.

Tags #323 (@AngledLuffa)

Revisited Keras support (@nateraw)

The Keras mixin has been revisited:

It now saves models as SavedModel objects rather than .h5 files.
It now offers methods that can be leveraged simply as a functional API, instead of having to use the Mixin as an actual mixin.

Improvements and bug fixes

Better error message for bad token. #362 (@sgugger)
Add utility to get repo name #364 (@sgugger)
Improve save and load repocard metadata #355 (@elishowk)
Update Keras Mixin #284 (@nateraw)
Add timeout to dataset_info #373 (@lhoestq)

Contributors

elishowk, julien-c, and 4 other contributors


04 Oct 21:00

LysandreJik

v0.0.17

259a4ce

v0.0.17: Non-blocking git push, notebook login

Non-blocking git-push

The pushing methods now have access to a blocking boolean parameter to indicate whether the push should happen
asynchronously.

In order to see if the push has finished or its status code (to spot a failure), one should use the command_queue
property on the Repository object.

For example:

from huggingface_hub import Repository

repo = Repository("<local_folder>", clone_from="<user>/<model_name>")

with repo.commit("Commit message", blocking=False):
    # Save data
    ...

last_command = repo.command_queue[-1]

# Status of the push command
last_command.status  
# Will return the status code
#     -> -1 will indicate the push is still ongoing
#     -> 0 will indicate the push has completed successfully
#     -> non-zero code indicates the error code if there was an error

# if there was an error, the stderr may be inspected
last_command.stderr

# Whether the command finished or if it is still ongoing
last_command.is_done

# Whether the command errored-out.
last_command.failed

When using blocking=False, the commands will be tracked and your script will exit only when all pushes are done, even
if other errors happen in your script (a failed push counts as done).

Non blocking git push #315 (@LysandreJik)
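The status convention above (-1 while running, 0 on success, another code on error) can be reproduced with a small wrapper around a background process. The TrackedCommand class below is a hypothetical sketch that uses a short Python subprocess as a stand-in for git push; it is not the object stored in command_queue.

```python
import subprocess
import sys


class TrackedCommand:
    """Wraps a background process and reports its state without blocking."""

    def __init__(self, args):
        self.stderr = ""
        self._process = subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )

    @property
    def is_done(self) -> bool:
        return self._process.poll() is not None

    @property
    def status(self) -> int:
        # -1 means "still running"; otherwise the process exit code.
        code = self._process.poll()
        return -1 if code is None else code

    @property
    def failed(self) -> bool:
        return self.is_done and self.status != 0

    def wait(self) -> int:
        """Block until the process ends, capture stderr, and return the status."""
        _, self.stderr = self._process.communicate()
        return self.status


ok = TrackedCommand([sys.executable, "-c", "pass"])
bad = TrackedCommand([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok.wait(), ok.failed)
print(bad.wait(), bad.failed)
```

A script can keep working after starting such a command and check is_done or failed later instead of waiting for the push to finish.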

Notebook login (@sgugger)

The huggingface_hub library now has a notebook_login method which can be used to log in from notebooks that have no access to the shell. In a notebook, log in with the following:

from huggingface_hub import notebook_login

notebook_login()

Add a widget to login in notebook #329 (@sgugger)

Improvements and bugfixes

added option to create private repo #319 (@philschmid)
display git push warnings #326 (@elishowk)
Allow specifying data with the Inference API wrapper #271 (@osanseviero)
Add auth to snapshot download #340 (@lewtun)

Contributors

elishowk, osanseviero, and 4 other contributors


27 Aug 13:03

LysandreJik

v0.0.16

6402d10

v0.0.16: Progress bars, git credentials

The huggingface_hub version v0.0.16 introduces several quality of life improvements.

Progress bars in `Repository`

Progress bars are now visible with many git operations, such as pulling, cloning and pushing:

>>> from huggingface_hub import Repository
>>> repo = Repository("local_folder", clone_from="huggingface/CodeBERTa-small-v1")

Cloning https://huggingface.co/huggingface/CodeBERTa-small-v1 into local empty directory.
Download file pytorch_model.bin:  45%|████████████████████████████▋                                   | 144M/321M [00:13<00:12, 14.7MB/s]
Download file flax_model.msgpack:  42%|██████████████████████████▌                                    | 134M/319M [00:13<00:13, 14.4MB/s]

Branching support

Repository now supports branching. The example below clones the xxx repository and checks out the new-branch revision. If new-branch is an existing branch on the remote, that branch is checked out. If it is another kind of revision, such as a commit or a tag, that revision is checked out instead.

If the revision does not exist, it will create a branch from the latest commit on the main branch.

>>> from huggingface_hub import Repository
>>> repo = Repository("local", clone_from="xxx", revision="new-branch")

Once the repository is instantiated, it is possible to manually checkout revisions using the git_checkout method. If the revision already exists:

>>> repo.git_checkout("main")

If a branch should be created from the current head in the case that it does not exist:

>>> repo.git_checkout("brand-new-branch", create_branch_ok=True)

Revision `brand-new-branch` does not exist. Created and checked out branch `brand-new-branch`

Finally, the commit context manager has a new branch parameter to specify to which branch the utility should push:

>>> with repo.commit("New commit on branch brand-new-branch", branch="brand-new-branch"):
...     # Save any file or model here, it will be committed to that branch.
...     torch.save(model.state_dict(), "pytorch_model.bin")

Git credentials

The login system has been redesigned to leverage git-credential instead of a token-based authentication system. It leverages the git-credential store helper. If you're unaware of what this is, you may see the following when logging in with huggingface_hub:

        _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
        _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
        _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
        _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
        _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

        
Username: 
Password: 
Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-crendential store but this isn't the helper defined on your machine.
You will have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal to set it as the default

git config --global credential.helper store

Running the command git config --global credential.helper store will set this as the default way to handle credentials for git authentication. All repositories instantiated with the Repository utility will have this helper set by default, so no action is required on your part when leveraging it.

Improved logging

The logging system is now similar to the existing logging system in transformers and datasets, based on a logging module that controls the entire library's logging level:

>>> from huggingface_hub import logging
>>> logging.set_verbosity_error()
>>> logging.set_verbosity_info()
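A library-wide verbosity switch of this kind is usually built on the standard logging module by setting the level of the package's root logger, which every submodule logger inherits. The sketch below shows that mechanism with a hypothetical package name, my_library; the helper names mirror the ones above but this is not the library's implementation.

```python
import logging

LIBRARY_LOGGER_NAME = "my_library"  # hypothetical package name


def set_verbosity(level: int) -> None:
    """Set the level of the package's root logger; child loggers inherit it."""
    logging.getLogger(LIBRARY_LOGGER_NAME).setLevel(level)


def set_verbosity_error() -> None:
    set_verbosity(logging.ERROR)


def set_verbosity_info() -> None:
    set_verbosity(logging.INFO)


# A logger that a submodule of the package would typically create.
child = logging.getLogger("my_library.hf_api")

set_verbosity_error()
print(child.isEnabledFor(logging.WARNING))

set_verbosity_info()
print(child.isEnabledFor(logging.INFO))
```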

Bug fixes and improvements

Add documentation to GitHub and the Hub docs about the Inference client wrapper #253 (@osanseviero)
Have large files enabled by default when using Repository #219 (@LysandreJik)
Clarify/specify/document model card metadata, model-index, and pipeline/task types #265 (@julien-c)
[model_card][metadata] Actually, lets make dataset.name required #267 (@julien-c)
Progress bars #261 (@LysandreJik)
Add keras mixin #230 (@nateraw)
Open source code related to the repo type (tag icon, display order, snippets) #273 (@osanseviero)
Branch push to hub #276 (@LysandreJik)
Git credentials #277 (@LysandreJik)
Push to hub/commit with branches #282 (@LysandreJik)
Better logging #262 (@LysandreJik)
Remove custom language pack behavior #291 (@LysandreJik)
Update Hub and huggingface_hub docs #293 (@osanseviero)
Adding a handler #292 (@LysandreJik)

Contributors

julien-c, osanseviero, and 2 other contributors

