COCOA

Code accompanying the NeurIPS 2023 spotlight paper "Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis".

Dependencies

Please install jax, as well as the packages in requirements.txt.

Experiments

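To reproduce the results of the paper, run the wandb sweeps listed below, then run the corresponding scripts to create the plots. Each wandb sweep command only registers the sweep and prints its ID; start one or more agents with wandb agent to execute the runs, and pass the sweep ID to the figure scripts in place of <WANDB_SWEEP_ID>.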
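The sweep-then-plot workflow can be scripted roughly as follows. This is a minimal sketch, not part of the repository: `sweep_id_from_log` is a hypothetical helper, and it assumes that `wandb sweep` reports a line of the form `Created sweep with ID: <id>`, which may differ between wandb versions. Whether the figure scripts expect the bare ID or the full `<entity>/<project>/<id>` path is not stated here, so check the scripts before relying on this.

```shell
# Hypothetical helper (not part of this repository): read the output of
# `wandb sweep` on stdin and print the sweep ID.
# Assumption: the CLI prints a line like
#   wandb: Created sweep with ID: <id>
sweep_id_from_log() {
  esc=$(printf '\033')
  # Strip ANSI colour codes first, then keep the token that follows the marker.
  sed "s/${esc}\[[0-9;]*m//g" \
    | sed -n 's/.*Created sweep with ID: *\([^[:space:]]*\).*/\1/p' \
    | tail -n 1
}

# Example usage, commented out because it needs a wandb login and starts training:
#   SWEEP_ID=$(wandb sweep sweeps/reward_switch.yaml 2>&1 | sweep_id_from_log)
#   wandb agent "$SWEEP_ID"   # may require the full <entity>/<project>/<id> path
#   python3 figures/reward-switch_performance_time_learned.py "$SWEEP_ID"
```

Copying the ID by hand from the `wandb sweep` output works just as well; the helper only saves a step when several sweeps are launched from one script.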

Linear key-to-door, performance, learnt models

wandb sweep sweeps/performance_asymptotic_learnt.yaml

The relevant figures can be generated using the following scripts:

python3 figures/performance_asymptotic_length_learned.py <WANDB_SWEEP_ID>
python3 figures/performance_time_envlen103_learned.py <WANDB_SWEEP_ID>

Linear key-to-door, performance, ground-truth models

wandb sweep sweeps/performance_asymptotic_gt.yaml

python3 figures/performance_time_envlen103_gt.py <WANDB_SWEEP_ID>

Linear key-to-door, shadow training

wandb sweep sweeps/shadow_asymptotic_learnt.yaml

python3 figures/bias-variance-snr_asymptotic_length_learned.py <WANDB_SWEEP_ID>
python3 figures/bias-variance_aggregate_env103_learned.py <WANDB_SWEEP_ID>
python3 figures/snr_aggregate_env103_learned.py <WANDB_SWEEP_ID>

Reward switching

wandb sweep sweeps/reward_switch.yaml

python3 figures/reward-switch_performance_time_learned.py <WANDB_SWEEP_ID>

Reward aliasing

wandb sweep sweeps/aliasing_exp.yaml

python3 figures/performance_time_envlen103_reward-aliasing.py <WANDB_SWEEP_ID>

Tree environment

wandb sweep sweeps/tree_env.yaml

python3 figures/var_asymptotic_state-overlap_gt.py <WANDB_SWEEP_ID>

Interleaving tasks

wandb sweep task_interleaving_stochastic.yaml

wandb sweep task_interleaving_stochastic_gt.yaml

Acknowledgements

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
