learning_from_demonstration [2017/04/15 16:26] (current)
 +https://arxiv.org/pdf/1704.03732.pdf Learning from Demonstrations for Real World Reinforcement Learning
  
 +In this paper we
 +study a setting where the agent may access data
 +from previous control of the system. We present
 +an algorithm, Deep Q-learning from Demonstrations
 +(DQfD), that leverages this data to massively
 +accelerate the learning process even from
 +relatively small amounts of demonstration data.
 +DQfD works by combining temporal difference
 +updates with large-margin classification of the
 +demonstrator’s actions. We show that DQfD has
 +better initial performance than Deep Q-Networks
 +(DQN) on 40 of 42 Atari games, and that it achieves
 +higher average reward than DQN on 27 of 42
 +Atari games. We also demonstrate that DQfD
 +learns faster than DQN even when given poor
 +demonstration data.
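
The abstract names two ingredients: temporal difference updates and large-margin classification of the demonstrator's actions. The following is a minimal sketch of how those two terms might be combined for a single transition. It is not the paper's full loss or reference implementation. The function names, the margin value `0.8`, the discount `gamma`, and the weight `lambda_margin` are illustrative assumptions, and Q-values are passed in as plain Python lists rather than produced by a network.

```python
def large_margin_loss(q_values, expert_action, margin=0.8):
    """Large-margin classification loss for one demonstration state.

    q_values: list of Q(s, a), one entry per action a.
    expert_action: index of the demonstrator's action.
    margin: penalty added to every non-expert action (assumed value).
    """
    # Add the margin to all actions except the demonstrator's.
    augmented = [
        q + (0.0 if a == expert_action else margin)
        for a, q in enumerate(q_values)
    ]
    # Zero only when the expert action beats every other action by >= margin.
    return max(augmented) - q_values[expert_action]


def one_step_td_loss(q_sa, reward, next_q_values, gamma=0.99, done=False):
    """Squared one-step temporal difference error for a single transition."""
    target = reward if done else reward + gamma * max(next_q_values)
    return (target - q_sa) ** 2


def combined_loss(q_values, action, reward, next_q_values,
                  is_demo, gamma=0.99, lambda_margin=1.0, done=False):
    """TD loss plus the margin term, applied only to demonstration data."""
    loss = one_step_td_loss(q_values[action], reward, next_q_values, gamma, done)
    if is_demo:
        # On demonstration transitions the taken action is the expert's action.
        loss += lambda_margin * large_margin_loss(q_values, action)
    return loss
```

For example, `large_margin_loss([1.0, 2.0, 0.5], expert_action=1)` is zero because the demonstrator's action already leads every other action by at least the margin, while choosing `expert_action=0` gives a positive loss that pushes the Q-value of the demonstrated action up relative to the others.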