This paper is a proof of concept that illustrates the potential for deep reinforcement learning to enable flexible and practical assistive systems.
https://arxiv.org/abs/1703.06207v5 Cooperating with Machines

In contrast, less attention has been given to developing autonomous machines that establish mutually cooperative relationships with people who may not share the machine's preferences. A main challenge has been that human cooperation does not require sheer computational power, but rather relies on intuition [11], cultural norms [12], emotions and signals [13, 14, 15, 16], and pre-evolved dispositions toward cooperation [17], common-sense mechanisms that are difficult to encode in machines for arbitrary contexts. Here, we combine a state-of-the-art machine-learning algorithm with novel mechanisms for generating and acting on signals to produce a new learning algorithm that cooperates with people and other machines at levels that rival human cooperation in a variety of two-player repeated stochastic games. This is the first general-purpose algorithm that is capable, given a description of a previously unseen game environment, of learning to cooperate with people within short timescales in scenarios previously unanticipated by algorithm designers. This is achieved without complex opponent modeling or higher-order theories of mind, thus showing that flexible, fast, and general human-machine cooperation is computationally achievable using a non-trivial, but ultimately simple, set of algorithmic mechanisms.
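
The abstract describes a learning algorithm for two-player repeated stochastic games. One classic mechanism in this space is selecting among fixed "expert" strategies with multiplicative weights (Hedge). The sketch below illustrates that general idea on an iterated prisoner's dilemma; it is not the paper's algorithm, and every strategy, payoff, and parameter here is an assumption made purely for illustration.

<code python>
import random

# Illustrative sketch only: Hedge-style expert selection in an iterated
# prisoner's dilemma. NOT the algorithm from the paper above.

COOPERATE, DEFECT = "C", "D"

# Row player's payoff for (my_move, their_move); standard PD values (assumed).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# "Expert" strategies: each maps the opponent's previous move to an action.
def always_cooperate(prev):
    return COOPERATE

def always_defect(prev):
    return DEFECT

def tit_for_tat(prev):
    return prev if prev is not None else COOPERATE

EXPERTS = [always_cooperate, always_defect, tit_for_tat]

def hedge_play(opponent, rounds=200, eta=0.5, seed=0):
    """Play `rounds` of the repeated game, each round sampling an expert
    with probability proportional to its multiplicative weight."""
    rng = random.Random(seed)
    weights = [1.0] * len(EXPERTS)
    my_prev = their_prev = None
    total = 0
    for _ in range(rounds):
        # Sample an expert in proportion to its current weight.
        pick = rng.random() * sum(weights)
        for i, w in enumerate(weights):
            pick -= w
            if pick <= 0:
                break
        my_move = EXPERTS[i](their_prev)
        their_move = opponent(my_prev)
        total += PAYOFF[(my_move, their_move)]
        # Reward each expert by the payoff it would have earned this round,
        # normalized to [0, 1] so the multiplicative update stays bounded.
        for j, expert in enumerate(EXPERTS):
            reward = PAYOFF[(expert(their_prev), their_move)] / 5.0
            weights[j] *= (1.0 + eta) ** reward
        my_prev, their_prev = my_move, their_move
    return total, weights

if __name__ == "__main__":
    total, weights = hedge_play(opponent=tit_for_tat)
    print("total payoff vs tit-for-tat:", total)
    print("final expert weights:", [round(w, 1) for w in weights])
</code>

Against a tit-for-tat opponent the weights drift toward the cooperative experts, since defection earns 5 once but 1 thereafter. What this toy version lacks, and what the paper's contribution is about, is the signaling layer and the fast within-game adaptation that make such learners cooperate with people on short timescales.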