767-final-IR

COMP 767 final project

Alex Hoffman and Nikhil Podila

McGill University

We created a Python implementation of the importance resampling algorithm from the paper "Importance Resampling for Off-Policy Prediction".
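For readers new to the idea, here is a minimal sketch of importance resampling, not the code in this repository. Instead of weighting each update by the importance ratio rho = pi(a|s) / mu(a|s), transitions are drawn from the buffer with probability proportional to rho and then used in ordinary, unweighted updates. The function names, the toy buffer, and the tabular TD(0) update are assumptions made for illustration.

```python
import numpy as np

def importance_resample(rho, batch_size, rng):
    """Draw buffer indices with probability proportional to the importance ratios."""
    rho = np.asarray(rho, dtype=float)
    total = rho.sum()
    if total <= 0.0:
        raise ValueError("at least one importance ratio must be positive")
    return rng.choice(len(rho), size=batch_size, replace=True, p=rho / total)

def td0_update(values, batch, lr, gamma):
    """Apply unweighted tabular TD(0) updates for (state, reward, next_state, done) tuples."""
    for s, r, s_next, done in batch:
        target = r + (0.0 if done else gamma * values[s_next])
        values[s] += lr * (target - values[s])
    return values

rng = np.random.default_rng(0)

# Hypothetical buffer of transitions collected under the behaviour policy mu.
buffer = [(0, 0.0, 1, False), (1, 0.0, 2, False), (2, 1.0, 3, True), (1, 0.0, 0, False)]
# Hypothetical importance ratios pi(a|s) / mu(a|s), one per stored transition.
rho = [1.8, 1.8, 1.8, 0.2]

idx = importance_resample(rho, batch_size=8, rng=rng)
values = td0_update(np.zeros(4), [buffer[i] for i in idx], lr=0.1, gamma=0.9)
```

Because the ratios enter only through the sampling probabilities, the update itself carries no importance weight, which is the source of the variance reduction the paper reports relative to importance-weighted updates.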

We also experimented with adding prioritized experience replay to the resampling algorithm.
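This README does not describe how prioritization is combined with resampling, so the sketch below shows only standard proportional prioritized experience replay on its own: sampling probabilities proportional to |TD error|^alpha, with importance-sampling weights (N * P(i))^(-beta) normalized by the largest weight in the batch. The function names and the values of alpha, beta, and eps are assumptions, not settings taken from this repository.

```python
import numpy as np

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities proportional to (|TD error| + eps) ** alpha."""
    priorities = (np.abs(np.asarray(td_errors, dtype=float)) + eps) ** alpha
    return priorities / priorities.sum()

def per_weights(probs, idx, beta=0.4):
    """Importance-sampling weights for the sampled indices, scaled so the largest is 1."""
    weights = (len(probs) * probs[idx]) ** (-beta)
    return weights / weights.max()

rng = np.random.default_rng(0)
td_errors = [0.1, 2.0, 0.5, 0.0]          # hypothetical TD errors, one per stored transition
probs = per_probabilities(td_errors)
batch_idx = rng.choice(len(probs), size=3, p=probs)
batch_weights = per_weights(probs, batch_idx)
```

Transitions with larger TD errors are sampled more often, and the weights shrink their updates to compensate for the skewed sampling.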

The code requires the following packages: numpy, gym, tensorflow, and matplotlib. These can be installed with pip install, or with conda install if you use Anaconda. Running the file "OffPolicyAgent_testing.py" produces plots; which plots appear depends on which function calls are commented out at the bottom of the file. Hyperparameters are set in the body of the file. Experiment settings (learning rates for the learning-rate sweep, number of updates, steps per update, and batch size) are set in the test functions. Feel free to raise an issue if you have trouble navigating the code!

Files in the repository:

.3mean_lrncurve_4rooms.csv
.3std_lrncurve_4rooms.csv
DP_walk.py
IRAgent.py
IRAgent_FourRooms.py
IS_testing_working_april15.py
OffPolicyAgent.py
OffPolicyAgent_FourRooms.py
OffPolicyAgent_FourRooms_testing.py
OffPolicyAgent_testing.py
PERAgent.py
PER_testing.py
README.md
SumTree.py
TransitionData.py
WISAgent.py
WISBufferAgent_FourRooms.py
WISMinibatchAgent_FourRooms.py
four_rooms_env.py
prioritized_memory.py
random_walk_env.py
