awesome deep learning papers for reinforcement learning
-
[1. Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.]
-
[2. Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.]
-
[1. Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.]
-
[2. Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.]
-
[3. Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.]
-
[4. Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.]
-
[5. Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.]
-
[6. Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.]
-
[7. How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.]
-
[8. Learning functions across many orders of magnitudes, H. van Hasselt, A. Guez, M. Hessel, D. Silver]
-
[9. Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.]
-
[10. State of the Art Control of Atari Games Using Shallow Reinforcement Learning]
-
[11. Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening]
-
[12. Deep Reinforcement Learning with Averaged Target DQN]
-
[13. Safe and Efficient Off-Policy Reinforcement Learning]
-
[14. The Predictron: End-To-End Learning and Planning]
-
[1. Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.]
-
[2. Deep Attention Recurrent Q-Network]
-
[3. Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.]
-
[4. Progressive Neural Networks]
-
[5. Language Understanding for Text-based Games Using Deep Reinforcement Learning]
-
[6. Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks]
-
[7. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation]
-
[8. Recurrent Reinforcement Learning: A Hybrid Approach]
-
[9. Value Iteration Networks, NIPS, 2016]
-
[10. MazeBase: A sandbox for learning from games]
-
[11. Strategic Attentive Writer for Learning Macro-Actions]
-
[1. End-to-End Training of Deep Visuomotor Policies]
-
[2. Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search]
-
[3. Trust Region Policy Optimization]
-
[1. Deterministic Policy Gradient Algorithms]
-
[2. Continuous control with deep reinforcement learning]
-
[3. High-Dimensional Continuous Control Using Generalized Advantage Estimation]
-
[4. Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies]
-
[5. Deep Reinforcement Learning in Parameterized Action Space]
-
[6. Memory-based control with recurrent neural networks]
-
[7. Terrain-adaptive locomotion skills using deep reinforcement learning]
-
[8. Sample Efficient Actor-Critic with Experience Replay]
-
[1. End-to-End Training of Deep Visuomotor Policies]
-
[2. Interactive Control of Diverse Complex Characters with Neural Networks]
- [1. Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks]
-
[1. Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic]
-
[2. PGQ: Combining Policy Gradient and Q-learning]
-
[1. Gradient Estimation Using Stochastic Computation Graphs]
-
[2. Continuous Deep Q-Learning with Model-based Acceleration]
-
[3. Benchmarking Deep Reinforcement Learning for Continuous Control]
-
[4. Learning Continuous Control Policies by Stochastic Value Gradients]
-
[5. Generalizing Skills with Semi-Supervised Reinforcement Learning]
-
[1. Deep Successor Reinforcement Learning]
-
[2. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation]
-
[3. Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks]
-
[4. Stochastic Neural Networks for Hierarchical Reinforcement Learning, C. Florensa, Y. Duan, P. Abbeel]
-
[1. ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources]
-
[2. A Deep Hierarchical Approach to Lifelong Learning in Minecraft]
-
[3. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning]
-
[4. Policy Distillation]
-
[5. Progressive Neural Networks]
-
[6. Universal Value Function Approximators]
-
[7. Multi-task learning with deep model based reinforcement learning]
-
[8. Modular Multitask Reinforcement Learning with Policy Sketches]
-
[1. Control of Memory, Active Perception, and Action in Minecraft]
-
[2. Model-Free Episodic Control]
-
[1. Action-Conditional Video Prediction using Deep Networks in Atari Games]
-
[2. Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks]
-
[3. Deep Exploration via Bootstrapped DQN]
-
[4. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation]
-
[5. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models]
-
[6. Unifying Count-Based Exploration and Intrinsic Motivation]
-
[7. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning]
-
[8. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning]
-
[9. VIME: Variational Information Maximizing Exploration]
-
[1. Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks]
-
[2. Multiagent Cooperation and Competition with Deep Reinforcement Learning]
-
[1. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization]
-
[2. Maximum Entropy Deep Inverse Reinforcement Learning]
-
[3. Generalizing Skills with Semi-Supervised Reinforcement Learning]
-
[1. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning]
-
[2. Better Computer Go Player with Neural Network and Long-term Prediction]
-
[3. Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.]
-
[1. Asynchronous Methods for Deep Reinforcement Learning]
-
[2. Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU]
-
[1. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.]
-
[2. Strategic Attentive Writer for Learning Macro-Actions]
-
[3. Unifying Count-Based Exploration and Intrinsic Motivation]
-
[1. Policy Distillation]
-
[2. Universal Value Function Approximators]
-
[3. Learning values across many orders of magnitude]
-
[1. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games]
-
[2. Fictitious Self-Play in Extensive-Form Games]
-
[3. Smooth UCT search in computer poker]
-
[1. ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning]
-
[2. Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning]
-
[3. Playing FPS Games with Deep Reinforcement Learning]
-
[4. Learning to Act by Predicting the Future]
-
[5. Deep Reinforcement Learning From Raw Pixels in Doom]
- [1. Deep Reinforcement Learning in Large Discrete Action Spaces]
- [1. Deep Reinforcement Learning in Parameterized Action Space]
-
[1. Learning Visual Predictive Models of Physics for Playing Billiards]
-
[2. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, J. Schmidhuber, arXiv, 2015.]
-
[3. Learning Continuous Control Policies by Stochastic Value Gradients]
-
[4. Data-Efficient Learning of Feedback Policies from Image Pixels using Deep Dynamical Models]
-
[5. Action-Conditional Video Prediction using Deep Networks in Atari Games]
-
[6. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models]
-
[1. Trust Region Policy Optimization]
-
[2. Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control]
-
[3. Path Integral Guided Policy Search]
-
[4. Memory-based control with recurrent neural networks]
-
[5. Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection]
-
[6. Learning Deep Neural Network Policies with Continuous Memory States]
-
[7. High-Dimensional Continuous Control Using Generalized Advantage Estimation]
-
[8. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization]
-
[9. End-to-End Training of Deep Visuomotor Policies]
-
[10. DeepMPC: Learning Deep Latent Features for Model Predictive Control]
-
[11. Deep Visual Foresight for Planning Robot Motion]
-
[12. Deep Reinforcement Learning for Robotic Manipulation]
-
[13. Continuous Deep Q-Learning with Model-based Acceleration]
-
[14. Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search]
-
[15. Asynchronous Methods for Deep Reinforcement Learning]
-
[16. Learning Continuous Control Policies by Stochastic Value Gradients]
- [1. Simultaneous Machine Translation using Deep Reinforcement Learning]
- [1. Active Object Localization with Deep Reinforcement Learning]
- [1. Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning]
- [1. Using Deep Q-Learning to Control Optimization Hyperparameters]
-
[1. Deep Reinforcement Learning for Dialogue Generation]
-
[2. SimpleDS: A Simple Deep Reinforcement Learning Dialogue System]
-
[3. Strategic Dialogue Management via Deep Reinforcement Learning]
-
[4. Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning]
- [1. Action-Conditional Video Prediction using Deep Networks in Atari Games]
- [1. WaveNet: A Generative Model for Raw Audio]
- [1. Generating Text with Deep Reinforcement Learning]
- [1. Language Understanding for Text-based Games Using Deep Reinforcement Learning]
- [1. Deep Reinforcement Learning for Accelerating the Convergence Rate]
-
[1. Designing Neural Network Architectures using Reinforcement Learning]
-
[2. Tuning Recurrent Neural Networks with Reinforcement Learning]
-
[3. Neural Architecture Search with Reinforcement Learning]
- [1. Using a Deep Reinforcement Learning Agent for Traffic Signal Control]
-
[1. CARMA: A Deep Reinforcement Learning Approach to Autonomous Driving]
-
[2. Deep Reinforcement Learning for Simulated Autonomous Vehicle Control]
-
[3. Deep Reinforcement Learning framework for Autonomous Driving]
- [1. Combating Deep Reinforcement Learning’s Sisyphean Curse with Intrinsic Fear]
- [1. On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning]