M. Rungger, H. Ding, and O. Stursberg, “Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning,” Anticipatory Behavior in Adaptive Learning Systems, vol. 5499, pp. 301–320, 2009.

Abstract

To establish autonomous behavior for technical systems, the well-known trade-off between reactive control and deliberative planning has to be considered. In this paper, we combine both principles by proposing a two-level hierarchical reinforcement learning scheme that enables the system to autonomously determine suitable solutions to new tasks. The approach is based on a behavior representation specified by hybrid automata, which combine continuous and discrete behavior, to predict (anticipate) the outcome of a sequence of actions. On the higher layer of the hierarchical scheme, the behavior is abstracted in the form of finite state automata, on which value function iteration is performed to obtain a goal-leading sequence of subtasks. This sequence is realized on the lower layer by applying policy gradient-based reinforcement learning to the hybrid automaton model. The iteration between both layers leads to a consistent and goal-attaining behavior, as shown for a simple robot grasping task.
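
The higher-layer step described in the abstract, value iteration on a finite state automaton to obtain a sequence of subtasks, can be illustrated with a minimal sketch. This is not the authors' implementation: the automaton, the state and subtask names, the unit step cost, and the discount factor are all hypothetical and chosen only to resemble a simple grasping task. Transitions are assumed deterministic, and every non-goal state is assumed to have at least one outgoing subtask. The lower layer (policy gradient learning on the hybrid automaton) is not shown.

```python
from typing import Dict, List

# Hypothetical abstract automaton: state -> {subtask: successor state}.
Automaton = Dict[str, Dict[str, str]]

TRANSITIONS: Automaton = {
    "far": {"approach": "near"},
    "near": {"align": "aligned", "retreat": "far"},
    "aligned": {"grasp": "holding", "retreat": "near"},
    "holding": {},  # goal state, treated as absorbing
}
GOAL = "holding"


def value_iteration(transitions: Automaton, goal: str,
                    step_cost: float = 1.0, gamma: float = 0.95,
                    tol: float = 1e-9) -> Dict[str, float]:
    """Compute state values for reaching `goal` with a cost per subtask."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, subtasks in transitions.items():
            if state == goal:
                continue  # the goal keeps value 0
            # Bellman backup over the deterministic successors.
            best = max(-step_cost + gamma * values[nxt]
                       for nxt in subtasks.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values


def greedy_subtasks(transitions: Automaton, values: Dict[str, float],
                    start: str, goal: str,
                    step_cost: float = 1.0, gamma: float = 0.95) -> List[str]:
    """Follow the greedy policy from `start` and collect the subtasks."""
    plan: List[str] = []
    state = start
    # The length bound guards against cycling if the values are not converged.
    while state != goal and len(plan) < len(transitions):
        subtasks = transitions[state]
        choice = max(subtasks,
                     key=lambda a: -step_cost + gamma * values[subtasks[a]])
        plan.append(choice)
        state = subtasks[choice]
    return plan


if __name__ == "__main__":
    v = value_iteration(TRANSITIONS, GOAL)
    print(greedy_subtasks(TRANSITIONS, v, "far", GOAL))
```

In the scheme the abstract describes, each subtask in such a sequence would then be handed to the lower layer for realization, and the two layers would be iterated until the behavior is consistent.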

BibTeX

@INPROCEEDINGS{RDS09,
  author = {M. Rungger and H. Ding and O. Stursberg},
  title = {{Multiscale Anticipatory Behavior by Hierarchical Reinforcement Learning}},
  booktitle = {Anticipatory Behavior in Adaptive Learning Systems},
  year = {2009},
  volume = {5499},
  series = {LNCS},
  pages = {301--320},
  comment = {ISBN 978-3-642-02564-8, 49 standard pages}
}

URL

https://www.researchgate.net/publication/220714609_Multiscale_Anticipatory_Behavior_by_Hierarchical_Reinforcement_Learning