Releases · HumanCompatibleAI/imitation
v1.0.0 -- first stable release
We're pleased to announce the first stable release of imitation. Key improvements include:
- Gymnasium compatibility: Gymnasium, which has superseded Gym, is now the supported environment interface.
- Tuned hyperparameters and benchmark results for common algorithm-environment pairs (see the release artifact attached).
- New algorithm (beta): SQIL (see the sketch below).
For more information, see the changelog below.
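For a quick taste of the Gymnasium-based API and the beta SQIL implementation, here is a minimal sketch that trains SQIL on CartPole. The environment, the throwaway PPO "expert", and all hyperparameters are illustrative placeholders rather than tuned settings; see the tutorials for the recommended workflow.

```python
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

from imitation.algorithms import sqil
from imitation.data import rollout
from imitation.data.wrappers import RolloutInfoWrapper

rng = np.random.default_rng(0)

# Gymnasium (not Gym) environments are now the supported interface.
venv = DummyVecEnv([lambda: RolloutInfoWrapper(gym.make("CartPole-v1"))])

# Stand-in "expert": a briefly trained PPO agent. A real workflow would train
# longer or download a pretrained expert from the Hugging Face Hub.
expert = PPO("MlpPolicy", venv, verbose=0).learn(5_000)

# Collect demonstration trajectories from the expert.
demonstrations = rollout.rollout(
    expert,
    venv,
    rollout.make_sample_until(min_timesteps=None, min_episodes=20),
    rng=rng,
)

# SQIL (beta): labels demonstration transitions with reward 1 and the agent's
# own transitions with reward 0, then runs off-policy RL (DQN by default).
sqil_trainer = sqil.SQIL(
    venv=venv,
    demonstrations=demonstrations,
    policy="MlpPolicy",
)
sqil_trainer.train(total_timesteps=10_000)
```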
What's Changed
- Updated Installation Instructions by @ernestum in #760
- Download experts from hf inside tutorials and docs by @jas-ho in #766 (see the expert-loading sketch after this list)
- Implementation of the SQIL algorithm by @RedTachyon in #744
- Additional examples of CLI usage by @EdoardoPona in #761
- Dependency fixes by @ernestum in #775
- Tune hyperparameters for kernel density estimation tutorial by @michalzajac-ml in #774
- Tune hyperparameters in tutorials for GAIL and AIRL by @michalzajac-ml in #772
- Introduce interactive policies to gather data from a user by @michalzajac-ml in #776
- Add an option to run SQIL with various off-policy algorithms by @michalzajac-ml in #778
- Complete PR #771 (Tune preference comparison example hyperparameters) by @lukasberglund in #782
- Add CLI for SQIL by @lukasberglund in #784
- Gymnasium Compatibility by @ernestum in #735
- Ensure MyST-NB raises an error when rendering a notebook fails. by @ernestum in #803
- Add a test timeout by @ernestum in #779
- Fix MacOS Pipeline: Include tests not in subdirectories by @AdamGleave in #797
- Remove MuJoCo dependency from SQIL notebook by @AdamGleave in #800
- Add partial support for dictionary observation spaces (bc, density) by @NixGD in #785
- Update gymnasium dependency and render_mode in gym.make by @taufeeque9 in #806
- Upgrade pytype by @ZiyueWang25 in #801
- Reduce training time and improve expert loading code in the tutorials by @ernestum in #810
- Add scripts and configs for hyperparameter tuning by @taufeeque9 in #675
- SQIL and PC performance check fixes by @ernestum in #811
- Running benchmarks by @ernestum in #812
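Several of the tutorial changes above (#766, #810) replace inline expert training with experts downloaded from the Hugging Face Hub. The sketch below follows the pattern used in the tutorials at the time of this release; the loader name `ppo-huggingface`, the `HumanCompatibleAI` organization, and the `seals` environment names are taken from those tutorials and should be treated as assumptions that may differ in your setup.

```python
import numpy as np

from imitation.data import rollout
from imitation.data.wrappers import RolloutInfoWrapper
from imitation.policies.serialize import load_policy
from imitation.util.util import make_vec_env

rng = np.random.default_rng(0)

# Requires the `seals` package for the environment used here.
venv = make_vec_env(
    "seals:seals/CartPole-v0",
    rng=rng,
    n_envs=8,
    post_wrappers=[lambda env, _: RolloutInfoWrapper(env)],
)

# Download a pretrained expert from the Hugging Face Hub.
expert = load_policy(
    "ppo-huggingface",
    organization="HumanCompatibleAI",
    env_name="seals-CartPole-v0",
    venv=venv,
)

# Roll the expert out to obtain demonstration trajectories.
rollouts = rollout.rollout(
    expert,
    venv,
    rollout.make_sample_until(min_timesteps=None, min_episodes=50),
    rng=rng,
)
```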
New Contributors
- @jas-ho made their first contribution in #766
- @EdoardoPona made their first contribution in #761
- @michalzajac-ml made their first contribution in #774
- @lukasberglund made their first contribution in #782
- @NixGD made their first contribution in #785
- @ZiyueWang25 made their first contribution in #801
Full Changelog: v0.4.0...v1.0.0
v0.4.0
What's Changed
- Continuous Integration: Add support for Mac OS; remove dependency on MuJoCo
- Preference comparison: improved logging, support for active learning based on variance of ensemble.
- HuggingFace integration for model and dataset loading (see the sketch after this list).
- Benchmarking: add results and example configs.
- Documentation: add notebook tutorials; other general improvements.
- General changes: migrate to pathlib; add more type hints to enable mypy as well as pytype.
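The Hugging Face integration builds on the `huggingface_sb3` package, which downloads Stable Baselines3 checkpoints from the Hub. A minimal sketch of that lower-level route is below; the repository and file names follow the usual `huggingface_sb3` naming convention and are assumptions, and the higher-level `load_policy` helper shown in the v1.0.0 notes above is the recommended entry point.

```python
from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO

# Repository and file names are assumptions; browse the HumanCompatibleAI
# organization on the Hugging Face Hub for the exact identifiers.
checkpoint_path = load_from_hub(
    repo_id="HumanCompatibleAI/ppo-seals-CartPole-v0",
    filename="ppo-seals-CartPole-v0.zip",
)
expert = PPO.load(checkpoint_path)
```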
Full Changelog: v0.3.1...v0.4.0
v0.3.1
What's Changed
Main changes:
- Added reward ensembles and conservative reward functions by @levmckinney in #460
- Dropping support for python 3.7 by @levmckinney in #505
Minor changes:
- Docstring and other fixes after #472 by @Rocamonde in #497
- Improve Windows CI by @AdamGleave in #495
Full Changelog: v0.3.0...v0.3.1
v0.3.0 -- Major improvements
New features:
- New algorithm: Deep RL from Human Preferences (thanks to @ejnnr, @norabelrose, et al.)
- Notebooks with examples (thanks to @ernestum)
- Serialized trajectories using NumPy arrays rather than pickles, ensuring stability across versions and saving space on disk (thanks to @norabelrose)
- Weights and Biases logging support (thanks to @yawen-d)
Improvements:
- Port MCE IRL from JAX to PyTorch, eliminating the JAX dependency (thanks to @qxcv).
- Refactor RewardNet code to be independent of AIRL and shared across algorithms (thanks to @ejnnr; see the sketch after this list).
- Add Windows support, including continuous integration (thanks to @taufeeque9).
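To illustrate the RewardNet refactor, the sketch below constructs a reward network directly from an environment's observation and action spaces and queries it on a batch of transitions; the same object can then be handed to GAIL or AIRL. It is written against the current (post-Gymnasium) API, so the imports and constructor arguments are assumptions for readers on the v0.3.0-era code.

```python
import gymnasium as gym
import numpy as np
from stable_baselines3.common.vec_env import DummyVecEnv

from imitation.rewards.reward_nets import BasicRewardNet
from imitation.util.networks import RunningNorm

venv = DummyVecEnv([lambda: gym.make("CartPole-v1")] * 4)

# The reward network is built from the environment's spaces, independent of
# any particular algorithm.
reward_net = BasicRewardNet(
    venv.observation_space,
    venv.action_space,
    normalize_input_layer=RunningNorm,
)

# It can be queried directly on batches of transitions ...
obs = venv.reset()
acts = np.array([venv.action_space.sample() for _ in range(venv.num_envs)])
next_obs, _, dones, _ = venv.step(acts)
rewards = reward_net.predict(obs, acts, next_obs, dones)  # shape: (4,)

# ... or handed to an adversarial trainer via its `reward_net` argument,
# e.g. GAIL(..., reward_net=reward_net) or AIRL(..., reward_net=reward_net).
```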
First PyTorch release
compute_train_stats: fix logits being passed in as probabilities (#273), which previously caused an error during training.
v0.1.1 -- Final TF1 release
Initial release
Prototype versions of AIRL, GAIL, BC, and DAgger.