Learning Sequences of Policies by using an Intrinsically Motivated Learner and a Task Hierarchy

Nicolas Duminy 1, 2 Alexandre Manoury 3, 4 Sao Mai Nguyen 3, 4 Cédric Buche 5 Dominique Duhaut 6, 2
1 Lab-STICC_UBS_CID_IHSEV
3 Lab-STICC_IMTA_CID_IHSEV
5 Lab-STICC_ENIB_CID_IHSEV
6 Lab-STICC_UBS_CID_IHSEV
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract : Our goal is to propose an algorithm that lets robots learn sequences of actions, also called policies, in order to achieve complex tasks. In this paper we consider multiple and hierarchical tasks of varying difficulty. To tackle this high-dimensional learning problem, we propose a new algorithm, named Socially Guided Intrinsic Motivation for Sequence of Actions through Hierarchical Tasks (SGIM-SAHT), which is based on intrinsic motivation and uses different learning strategies. We then present two implementations of this algorithm, designed to address this challenge in different ways: Socially Guided Intrinsic Motivation with Procedure Babbling (SGIM-PB), which relies on a "procedures" framework, and Continual Hierarchical Intrinsically Motivated Exploration (CHIME), which relies on planning and on learning a dynamic representation of the environment. We compare the two implementations and show, through two experiments, how efficiently they learn sequences of actions and dynamically adapt to their environment. We also discuss the benefits of implementing a full, unified version of SGIM-SAHT that combines the features of both implementations.

https://hal.archives-ouvertes.fr/hal-01887073
Contributor : Alexandre Manoury
Submitted on : Wednesday, October 3, 2018 - 3:48:23 PM
Last modification on : Thursday, April 25, 2019 - 10:24:34 AM
Long-term archiving on : Friday, January 4, 2019 - 2:49:07 PM

Identifiers

  • HAL Id : hal-01887073, version 1

Citation

Nicolas Duminy, Alexandre Manoury, Sao Mai Nguyen, Cédric Buche, Dominique Duhaut. Learning Sequences of Policies by using an Intrinsically Motivated Learner and a Task Hierarchy. Workshop on Continual Unsupervised Sensorimotor Learning, ICDL-EpiRob 2018, Sep 2018, Tokyo, Japan. ⟨hal-01887073⟩
