Extracting Hierarchy from Demonstration with Ordered Neuron

Abstract

Intelligent agents can learn to observe and decompose an expert's behavior into a set of useful skills, even without any direct reward. In this project, we explore the application of Ordered Neurons (ON-LSTM) to extract the underlying hierarchical information from agent demonstrations without any structural human annotations. We define a hierarchy as a tree structure in which each node can be viewed as a sub-goal that can be decomposed into further sub-goals, down to the atomic actions. Specifically, we draw an analogy between hierarchy extraction from expert demonstrations and unsupervised sentence parsing. Our preliminary experiments on BabyAI show that the ON-LSTM can learn to parse a trajectory without structural annotation, improving the F1 score over trivial parsing baselines when compared against the ground-truth sub-goal hierarchy. Compared with a standard LSTM, the ON-LSTM also generalizes more systematically as the room size grows and re-uses existing skills. Our results suggest that introducing the right inductive bias into the architectural design is a useful way to inject prior knowledge and structure into RL. Future work could focus on how to use ON-LSTM for better control.
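
To make the parsing analogy concrete, here is a minimal sketch, assuming PyTorch; the names `cumax`, `greedy_parse`, and the toy action labels are illustrative and not the exact procedure used in our experiments. It shows the cumulative-softmax activation behind ON-LSTM's master gates and how per-step hierarchy "distances" derived from such gates can be turned into a binary sub-goal tree by greedy top-down splitting.

```python
import torch
import torch.nn.functional as F

def cumax(logits, dim=-1):
    # Cumulative softmax: a monotonically non-decreasing gate in [0, 1],
    # the activation ON-LSTM uses for its master forget/input gates.
    return torch.cumsum(F.softmax(logits, dim=dim), dim=dim)

def greedy_parse(distances, tokens):
    # Top-down greedy binary parse: split the span where the estimated
    # hierarchy distance (e.g., how many levels the master forget gate
    # closes at that step) is largest, then recurse on both halves.
    if len(tokens) <= 1:
        return tokens[0] if tokens else None
    split = max(range(1, len(tokens)), key=lambda i: distances[i])
    return (greedy_parse(distances[:split], tokens[:split]),
            greedy_parse(distances[split:], tokens[split:]))

# Toy example: a 4-step trajectory with hypothetical per-step distances.
actions = ["goto_door", "open_door", "goto_key", "pickup_key"]
dists = [0.0, 0.3, 0.9, 0.2]
print(greedy_parse(dists, actions))
# -> (('goto_door', 'open_door'), ('goto_key', 'pickup_key'))
```

In this sketch, a large distance at a step marks the boundary between two sub-goals, so the recovered tree groups consecutive atomic actions under shared sub-goal nodes, which is the structure we compare against the ground-truth sub-goal hierarchy.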

Date
Jul 19, 2019 2:00 PM