CS294-112 Final Project

Bridging Distribution Mismatch: Better Bound on Composable Deep Reinforcement Learning

Dependencies:

Python 3.5
Numpy version 1.14.5
TensorFlow version 1.10.5
OpenAI Gym version 0.10.5
seaborn
Box2D==2.3.2
OpenCV
ffmpeg
Vizdoom
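
The pinned versions above are easy to get wrong on a fresh machine, so a quick check before training can save time. The following is a minimal sketch, not part of the repository: the helper names (`installed_version`, `find_mismatches`) are hypothetical, and the pip distribution names in `PINNED` are assumptions based on the list above.

```python
# Pinned versions copied from the dependency list above.
# The distribution names used as keys are assumptions.
PINNED = {
    "numpy": "1.14.5",
    "tensorflow": "1.10.5",
    "gym": "0.10.5",
    "Box2D": "2.3.2",
}


def installed_version(name):
    """Return the installed version string of a package, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for older interpreters such as 3.5
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (wanted, found) tuple; found is None when not installed."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dict when every pinned package matches. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.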

Before running anything, replace gym/envs/box2d/lunar_lander.py in your Gym installation with the lunar_lander.py file provided in this repository.
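
The replacement step can be scripted. This is a minimal sketch under the assumption that Gym is installed as a regular package directory; `locate_gym_root` and `replace_lunar_lander` are hypothetical helpers, not part of this repository. It keeps a `.bak` copy of the stock file so the change can be undone.

```python
import importlib.util
import os
import shutil


def locate_gym_root():
    """Return the directory of the installed gym package, or None if absent."""
    spec = importlib.util.find_spec("gym")
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


def replace_lunar_lander(gym_root, provided_file):
    """Copy provided_file over gym_root/envs/box2d/lunar_lander.py.

    The stock file is saved once as lunar_lander.py.bak before overwriting.
    Returns the path of the file that was replaced.
    """
    target = os.path.join(gym_root, "envs", "box2d", "lunar_lander.py")
    if not os.path.isfile(target):
        raise FileNotFoundError("no lunar_lander.py found at " + target)
    backup = target + ".bak"
    if not os.path.exists(backup):
        shutil.copy2(target, backup)  # keep the original for later restore
    shutil.copy2(provided_file, target)
    return target
```

From the repository root, `replace_lunar_lander(locate_gym_root(), "lunar_lander.py")` performs the swap. Check that `locate_gym_root()` did not return `None` first.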

The code is based on the Q-learning implementation from HW3.

It supports both Atari games and ViZDoom environments.

Instructions for Composable Soft Q Learning

python run_dqn_vizdoom.py --vizdoom --explore e-greedy --subgame shoot_monster
python run_dqn_vizdoom.py --vizdoom --explore soft_q --subgame avoid_shooters
python run_dqn_vizdoom.py --vizdoom --explore soft_q --ex2 --coef 1e-4 --subgame avoid_shooters
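
The `--explore` flag in the commands above switches between two action-selection rules. The sketch below illustrates the textbook versions of both and is not taken from this repository's code: it assumes `e-greedy` means epsilon-greedy over Q-values and `soft_q` means sampling from a Boltzmann (softmax) distribution over Q-values. The function names and the `temperature` parameter are hypothetical.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action,
    otherwise pick the action with the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def soft_q_probs(q_values, temperature=1.0):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / temperature).

    The maximum is subtracted before exponentiating for numerical stability.
    """
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def soft_q_sample(q_values, temperature=1.0, rng=random):
    """Sample an action index from the Boltzmann policy."""
    probs = soft_q_probs(q_values, temperature)
    r = rng.random()
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return action
    return len(probs) - 1  # guard against floating-point round-off
```

Epsilon-greedy explores uniformly regardless of the Q-values, while the Boltzmann rule explores in proportion to how promising each action looks. A lower `temperature` moves it closer to greedy selection.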

Cheers!

buaazhangfan/Composable-RL-with-EX2