Home
Welcome to the Intrepid: Interactive Representation Discovery wiki!
This wiki contains detailed instructions on how to use this repository. We begin with a quick summary. If you have any questions, please raise an issue and tag it as [Question].
Intrepid is a repository that contains a collection of decision-making algorithms (which include bandits and reinforcement learning as special cases). A decision-making algorithm helps an agent choose which actions to take.
A core focus of Intrepid is decision making that requires learning a latent state/representation of the world. For example, consider an agent navigating an image-based environment: the observation is the image generated by the agent's camera, while a good latent state could be the agent's position in the world, along with the positions of any dynamic obstacles.
Intrepid consists of the following components:
- Core learning algorithms. These are mostly located in `./src/learning/core_learner`, with algorithm-specific utility functionality in `./src/learning/learning_utils`. E.g., the Homer algorithm is implemented in `./src/learning/core_learner/homer.py`. See the algorithm page for a full list and description of these algorithms. The learning utils, for example, contain a generic learner class in `./src/learning/learning_utils/generic_learner.py` and routines that perform independence tests; a hypothetical sketch of such a learner interface appears after this list.
- Useful decision-making tools. This includes a variety of packages that are routinely used across algorithms:
  - methods for generating episodes (`./src/learning/core_learner/policy_roll_in`)
  - methods for policy search given either offline data or a set of exploration policies (`./src/learning/core_learner/policy_search`)
  - a variety of self-supervised learning objectives for learning latent states (`./src/learning/core_learner/state_abstraction`). These include autoencoders, inverse dynamics, temporal contrastive learning, and multi-step inverse dynamics; a hedged sketch of the multi-step inverse dynamics objective appears after this list. For legacy reasons, inverse dynamics is at times referred to as inverse kinematics in the code.
- A large collection of models, including various encoders, inverse dynamics models, and generic classifiers (`./src/model`); a hedged signature sketch for the forward dynamics model type appears after this list:
  - policies that map an observation (or a history, including time) to an action (`./src/model/policy`)
  - encoders that map an observation to a latent state representation, either discrete or continuous (`./src/model/encoders`)
  - decoders that map a latent state representation back to an observation (`./src/model/decoders`)
  - classifiers (`./src/model/classifiers`)
  - forward dynamics models that map a given observation and action to the next observation (`./src/model/forward_model`)
  - inverse dynamics models that map an ordered pair of observations to the action that takes the agent from the former observation to the latter (`./src/model/inverse_dynamics`)
- A set of environments and environment wrappers for popular existing domains (to be installed separately):
  - challenging exploration problems with relatively simple observation spaces for quick proof-of-concept studies, where the focus is not on realistic observational noise but on exploration and planning (`./src/environments/rl_acid_env`)
  - several grid-world instances built on top of the Minigrid environment (`./src/environments/minigrid`). You will have to install Minigrid using the requirements file or on your own; a minimal usage sketch of the underlying Minigrid package appears after this list.
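To make the components above more concrete, a few illustrative sketches follow. First, a hypothetical sketch of the kind of interface a core learner might follow. This is not the actual class in `./src/learning/learning_utils/generic_learner.py`; the class and method names (`GenericLearner`, `train`, `act`) are assumptions made only for illustration.

```python
# Hypothetical sketch only: the real generic_learner.py in Intrepid may expose a
# different interface. GenericLearner, train, and act are illustrative names.
import random
from abc import ABC, abstractmethod


class GenericLearner(ABC):
    """Minimal shape of a core decision-making learner: collect data, fit, act."""

    @abstractmethod
    def train(self, env, num_episodes: int):
        """Interact with the environment and update internal models/policies."""

    @abstractmethod
    def act(self, observation):
        """Return an action for the given observation (or history)."""


class RandomLearner(GenericLearner):
    """Trivial baseline that skips learning and acts uniformly at random."""

    def __init__(self, num_actions: int, seed: int = 0):
        self._rng = random.Random(seed)
        self._num_actions = num_actions

    def train(self, env, num_episodes: int):
        pass  # no learning in this baseline

    def act(self, observation):
        return self._rng.randrange(self._num_actions)
```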
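Next, a hedged sketch of the multi-step inverse dynamics idea behind the state abstraction utilities: an encoder maps observations to latent states, and a classifier predicts the action taken at time t from the latents of the observation at time t and of an observation k steps later. The module names, shapes, and k-step conditioning scheme below are illustrative assumptions in PyTorch, not the repository's implementation.

```python
# Hedged sketch of a multi-step inverse dynamics objective; shapes and module
# names are illustrative assumptions, not Intrepid's state_abstraction code.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Maps a flat observation to a continuous latent state."""

    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))

    def forward(self, obs):
        return self.net(obs)


class MultiStepInverseDynamics(nn.Module):
    """Predicts the action taken at time t from latents of obs_t and obs_{t+k}."""

    def __init__(self, obs_dim: int, latent_dim: int, num_actions: int, max_k: int):
        super().__init__()
        self.encoder = Encoder(obs_dim, latent_dim)
        # Condition on the gap k via an embedding so one head handles all horizons.
        self.k_embed = nn.Embedding(max_k + 1, latent_dim)
        self.head = nn.Linear(3 * latent_dim, num_actions)

    def loss(self, obs_t, obs_tk, action_t, k):
        z_t, z_tk = self.encoder(obs_t), self.encoder(obs_tk)
        logits = self.head(torch.cat([z_t, z_tk, self.k_embed(k)], dim=-1))
        return nn.functional.cross_entropy(logits, action_t)


# Usage with random tensors standing in for a batch of (obs_t, obs_{t+k}, a_t, k) tuples.
model = MultiStepInverseDynamics(obs_dim=64, latent_dim=32, num_actions=4, max_k=10)
obs_t, obs_tk = torch.randn(16, 64), torch.randn(16, 64)
action_t, k = torch.randint(0, 4, (16,)), torch.randint(1, 11, (16,))
print(model.loss(obs_t, obs_tk, action_t, k).item())
```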
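The following is a hedged signature sketch of the forward dynamics model type listed above: given an observation and an action, predict the next observation. The layer sizes and one-hot action encoding are illustrative assumptions, not the models in `./src/model/forward_model`.

```python
# Hedged sketch of a forward dynamics model: (observation, action) -> next observation.
import torch
import torch.nn as nn


class ForwardDynamicsModel(nn.Module):
    """Maps a given observation and discrete action to a predicted next observation."""

    def __init__(self, obs_dim: int, num_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.num_actions = num_actions
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_actions, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, obs_dim),
        )

    def forward(self, obs, action):
        # One-hot encode the discrete action and concatenate it with the observation.
        action_onehot = nn.functional.one_hot(action, self.num_actions).float()
        return self.net(torch.cat([obs, action_onehot], dim=-1))


model = ForwardDynamicsModel(obs_dim=64, num_actions=4)
obs, action = torch.randn(8, 64), torch.randint(0, 4, (8,))
next_obs_pred = model(obs, action)  # shape: (8, 64)
loss = nn.functional.mse_loss(next_obs_pred, torch.randn(8, 64))  # e.g. a regression loss
```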
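Finally, a minimal sanity check that the separately installed Minigrid dependency works. This uses Minigrid's own Gymnasium API directly rather than Intrepid's wrappers in `./src/environments/minigrid`, and assumes a recent `minigrid` package built on Gymnasium; the version pinned in the requirements file may differ.

```python
# Minimal Minigrid smoke test (requires the separate `minigrid` package,
# e.g. `pip install minigrid`); this is not Intrepid's environment wrapper.
import gymnasium as gym
import minigrid  # noqa: F401  (importing registers the MiniGrid-* environments)

env = gym.make("MiniGrid-Empty-5x5-v0")
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
print(type(obs), reward, terminated, truncated)
```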