Ahmed Hussein
Deep reward shaping from demonstrations.
Hussein, Ahmed; Elyan, Eyad; Gaber, Mohamed Medhat; Jayne, Chrisina
Abstract
Deep reinforcement learning is rapidly gaining attention due to recent successes in a variety of problems. The combination of deep learning and reinforcement learning allows for a generic learning process that does not rely on task-specific knowledge. However, learning from scratch becomes more difficult when tasks involve long trajectories with delayed rewards. The chances of finding the rewards through trial and error are much smaller than in tasks where the agent continuously interacts with the environment. This is the case in many real-life applications, which limits current methods. In this paper, we propose a novel method for combining learning from demonstrations and experience to expedite and improve deep reinforcement learning. Demonstrations from a teacher are used to shape a potential reward function by training a deep supervised convolutional neural network. The shaped function is added to the reward function used in deep Q-learning (DQN) to perform off-policy training through trial and error. The proposed method is demonstrated on navigation tasks that are learned from raw pixels without utilizing any knowledge of the problem. Navigation tasks represent a typical AI problem that is relevant to many real applications and where only delayed rewards (usually terminal) are available to the agent. The results show that using the proposed shaped rewards significantly improves the performance of the agent over standard DQN. The sparser the rewards, the more pronounced the improvement.
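To illustrate the idea of adding a shaped term to the environment reward, here is a minimal sketch of classic potential-based reward shaping, r' = r + γΦ(s') − Φ(s). The abstract does not spell out the exact formula used in the paper, so this form is an assumption. In the paper the potential is produced by a convolutional network trained on demonstrations; below it is replaced by a hand-written stand-in, and the names `shaped_reward`, `corridor_potential` and `GOAL` are hypothetical.

```python
from typing import Callable

# Hypothetical toy task: a 1-D corridor with states 0..4 and the goal at state 4.
GOAL = 4


def corridor_potential(state: int) -> float:
    """Stand-in potential: negative distance to the goal.

    Assumption: in the paper this role is played by a network trained
    on teacher demonstrations, not by a hand-written function.
    """
    return -float(GOAL - state)


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    potential: Callable[[int], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return reward + gamma * phi(next_state) - phi(state).

    The potential of a terminal state is treated as zero, a common
    convention in potential-based shaping.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


if __name__ == "__main__":
    # A step toward the goal earns a positive shaped reward even though
    # the environment reward is still zero.
    print(shaped_reward(0.0, 2, 3, corridor_potential, gamma=1.0))
    # A step away from the goal is penalised.
    print(shaped_reward(0.0, 2, 1, corridor_potential, gamma=1.0))
```

In a DQN training loop, the shaped value would take the place of the raw reward when a transition is stored or when the temporal-difference target is computed, so the agent receives a learning signal before it ever reaches the sparse terminal reward.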
Citation
HUSSEIN, A., ELYAN, E., GABER, M.M. and JAYNE, C. 2017. Deep reward shaping from demonstrations. In Proceedings of the 2017 International joint conference on neural networks (IJCNN 2017), 14-19 May 2017, Anchorage, USA. Piscataway: IEEE [online], article number 7965896, pages 510-517. Available from: https://doi.org/10.1109/IJCNN.2017.7965896
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2017 International joint conference on neural networks (IJCNN 2017) |
Start Date | May 14, 2017 |
End Date | May 19, 2017 |
Acceptance Date | Feb 3, 2017 |
Online Publication Date | May 14, 2017 |
Publication Date | Jul 3, 2017 |
Deposit Date | Mar 9, 2017 |
Publicly Available Date | Mar 9, 2017 |
Print ISSN | 2161-4393 |
Electronic ISSN | 2161-4407 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Article Number | 7965896 |
Pages | 510-517 |
Series ISSN | 2161-4407 |
ISBN | 9781509061815 |
DOI | https://doi.org/10.1109/IJCNN.2017.7965896 |
Keywords | Deep reinforcement; Reinforcement learning; Generic learning process; Deep Q-learning (DQN) |
Public URL | http://hdl.handle.net/10059/2196 |
Contract Date | Mar 9, 2017 |
Files
HUSSEIN 2016 Deep reward shaping
(957 Kb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc/4.0/