Unsupervised learning of simultaneous motor primitives through imitation

Olivier Mangin, Pierre-Yves Oudeyer


I. INTRODUCTION
We propose to build a system able to learn motor primitives from simultaneous demonstrations of several such primitives. Our approach is based on compact local descriptors of the motor trajectory, similar to those used to learn acoustic words within sentences or objects within visual scenes.
Learning by demonstration aims at enabling robots to learn motor skills from human demonstrations. Learning complex skills in a life-long perspective requires recognizing and re-using the parts that those skills have in common, both to avoid learning them over and over and to allow the complexity of learned skills to increase.
A challenge of learning by demonstration is to make it accessible to a non-expert public, who would not adapt to complex requirements such as decomposing each demonstration into low-level building blocks.

A. Combinations of motor primitives
Motor primitives have been introduced as a form of such preliminary knowledge, and may be used as elementary building blocks for more complex motor control and skills. They can be found in both biological and robotic systems, and can be either innate or acquired [1].
The main function of acquired motor primitives is to make motor knowledge and skills re-usable in a modular way, and to provide high-level motor commands instead of trajectory-level control.
The notion of combination of motor primitives can take different forms, depending both on the kind of combination one wants to achieve and on the definition and representation one has in mind for the concept of motor or sensori-motor primitives. For example, it is possible to combine alternative primitives in a context-dependent manner, to compose them as functions taking inputs and producing outputs, to stack them in successive time sequences, or to treat them as constraints that can be satisfied in a parallel, competing or subordinate manner.

B. Related work achieving combination of motor primitives
Primitives combined in an alternative way, often called experts, are present in Gaussian mixture encoding of movement [2] and, more recently and at a higher level of complexity, in work from Grollman et al. [3]. Calinon and Billard have shown [4] that it is possible to produce actions from simultaneously active motor primitives, represented in a Gaussian mixture framework, for both the competing and the subordinate combinations.
Time sequences of motor primitives are studied in the following, very different approaches: as time signals by Li et al. [5], using dictionary learning techniques; by using the dependency relationships between successive motor primitives [6]; and as an extension of the reinforcement learning framework by Sutton et al. in the options framework [7]. Again, those works do not handle the kind of simultaneous primitives we are interested in.
None of them actually handles the issue introduced above of learning primitives that are simultaneously active in demonstrations.

II. EXPERIMENTAL SETUP
As stated in the previous section, very little work tries to learn motor primitives when they are demonstrated simultaneously. A first step to address this issue is to learn movements that are only active on certain degrees of freedom of the system, for example independent movements on different limbs. We thus propose an approach to tackle this issue.

A. Problem setting
We are interested in learning motor primitives that happen simultaneously in demonstrations and that we can, in a first approximation, consider as independent. Such a situation is present in dancing movements: choreographies are composed of elementary postures and transitions that happen independently on different limbs, in the same way that different instruments simultaneously take part in an orchestra.
We consider a two-arm robot and two sets of movements, associated respectively with the left and right arms. We provide the system with demonstrations each composed of one left arm movement and one right arm movement, executed simultaneously. Movement pairs are chosen randomly.
Our objective is to make the system able to learn motor primitives, and to use them to represent demonstrations as pairs of learned primitives, instead of learning each particular demonstration in a flat manner. For a sufficient number of primitives, the former achieves better compression than the latter.
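The compression argument can be illustrated with a simple count, given here only as a sketch: a flat learner needs one model per distinct (left, right) pair, whereas a factored learner needs one model per single-arm primitive. The function name `flat_vs_factored` is hypothetical, and counting models is an assumed proxy for storage cost.

```python
def flat_vs_factored(n_left, n_right):
    """Count models needed by a flat learner and by a factored learner.

    Assumption: storage cost is proportional to the number of stored models.
    """
    flat = n_left * n_right       # one model per distinct (left, right) pair
    factored = n_left + n_right   # one model per single-arm primitive
    return flat, factored


# Five movements per arm, as in the database described in this paper.
flat, factored = flat_vs_factored(5, 5)
print(flat, factored)
```

With two movements per arm both counts are equal; the factored representation only becomes smaller beyond that point, which is one way to read "a sufficient number of primitives".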

B. Movement representation and features
Movements are represented as time sequences of positions. However, by introducing some time-related features such as velocities along with the position vector, we obtain a sequence of vectors from a new feature space, whose samples may be approximated as independent. Under such an approximation, a movement may be represented by the probability distribution of those samples.
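As a minimal sketch of this augmentation, velocities can be estimated by finite differences and concatenated to the positions. The function name `add_velocity_features` and the use of `numpy.gradient` are assumptions made for illustration; the text does not specify how velocities are computed. The 50 Hz rate in the example is the acquisition frequency reported in Section II-D.

```python
import numpy as np


def add_velocity_features(positions, rate_hz):
    """Append finite-difference velocities to a (T, D) array of joint positions.

    Assumption: samples are evenly spaced at rate_hz; the estimator
    (central differences via numpy.gradient) is an illustrative choice.
    """
    positions = np.asarray(positions, dtype=float)
    dt = 1.0 / rate_hz
    velocities = np.gradient(positions, dt, axis=0)  # derivative along time
    return np.hstack([positions, velocities])        # shape (T, 2 * D)


# Two joints moving at constant speeds of 2 and -1 units per second.
t = np.arange(100) / 50.0
positions = np.column_stack([2.0 * t, -1.0 * t])
features = add_velocity_features(positions, rate_hz=50)
print(features.shape)
```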
However, trying to represent this distribution over the whole feature space leads to high-dimensional descriptors, which are heavy to store and compare. We thus make a further approximation by considering separate distributions on each degree of freedom, and we represent those distributions by simple histograms.
This approximation is not meant to provide a complete representation of a motor skill, but to allow an efficient and cheap discriminative representation of motor skills. Indeed, it is easier to represent a motor skill efficiently when the degrees of freedom relevant to it are known. However, one needs an efficient method to simultaneously discover both the primitives and their relevant degrees of freedom.
In the following we build feature vectors by concatenating position histograms on the different joints or degrees of freedom of our system.
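The construction of such a feature vector might look like the following sketch. The names `histogram_features`, `n_bins` and `joint_ranges` are hypothetical, and normalizing each histogram to sum to one is an assumption: the text only states that per-joint position histograms are concatenated.

```python
import numpy as np


def histogram_features(positions, n_bins, joint_ranges):
    """Concatenate one position histogram per degree of freedom.

    positions: (T, D) array of joint positions for one demonstration.
    joint_ranges: list of D (low, high) pairs (assumed known joint limits).
    Returns a nonnegative vector of length D * n_bins.
    """
    positions = np.asarray(positions, dtype=float)
    parts = []
    for d, (low, high) in enumerate(joint_ranges):
        counts, _ = np.histogram(positions[:, d], bins=n_bins, range=(low, high))
        # Assumed normalization, so that demonstration length does not matter.
        parts.append(counts / max(counts.sum(), 1))
    return np.concatenate(parts)


# Hypothetical demonstration: 200 samples on 3 joints, limits [-1, 1].
rng = np.random.default_rng(0)
demo = rng.uniform(-1.0, 1.0, size=(200, 3))
x = histogram_features(demo, n_bins=10, joint_ranges=[(-1.0, 1.0)] * 3)
print(x.shape)
```

Stacking one such vector per demonstration gives a nonnegative matrix with one row per demonstration and DOF × resolution columns.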

C. Unsupervised learning of motor primitives with nonnegative matrix factorization
The histograms introduced in the previous section, and their combinations in the case of simultaneous primitives, fit well the additive properties required by non-negative matrix factorization (NMF) [8], an efficient technique for discovering nonnegative components of a signal in an unsupervised scenario.
NMF takes as input a data matrix X of dimension n × p, where n is the number of demonstrations and p the dimension of our feature space. The data are assumed to be nonnegative, which is true for histogram features. Here p is the sum of the resolutions of the histograms, which typically is DOF × resolution.
Given a parameter k, NMF then provides two non-negative matrices W and H, of dimensions n × k and k × p respectively, such that X ≈ W · H. The rows of H provide a basis of prototypical elements of the data, that is to say some kind of motor primitives. The coefficients of W are then interpreted as the degree of activity of those primitives in the demonstrations.
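For illustration, the factorization can be sketched with the classical multiplicative update rules for the Frobenius norm. This is one common NMF algorithm and an assumption here: the text does not state which variant, initialization or stopping criterion is used. The function name `nmf` and the parameters `n_iter`, `eps` and `seed` are hypothetical, and the data below are synthetic.

```python
import numpy as np


def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative X (n x p) as W @ H, W (n x k), H (k x p).

    Sketch using multiplicative updates for the squared Frobenius error;
    not necessarily the algorithm used by the authors.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, p)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic data: 4 hypothetical primitives, each demonstration mixes two.
rng = np.random.default_rng(1)
H_true = rng.random((4, 24))
W_true = np.zeros((30, 4))
for i in range(30):
    W_true[i, rng.choice(4, size=2, replace=False)] = 1.0
X = W_true @ H_true

W, H = nmf(X, k=4)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(rel_err)
```

The factors are only defined up to permutation and scaling of the k components, so the learned rows of H have to be matched to known movements after the fact.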

D. Preliminary results
We use a database of movements captured on two robotic arms with six degrees of freedom each. Five different movements are associated with each arm, with 20 demonstrations of pairs of those movements. Data is acquired through joint position sensors integrated into the motors, at a frequency of 50 Hz.
The resulting database of one hundred pairs of examples is then presented to the system, which has to discover elementary histograms corresponding to motor primitives without being given the labels.
In a first experiment we use the known number of primitives as the parameter of the NMF algorithm, in order to compare discovered primitives and real primitives.
Fig. 1. Activity coefficients of the learned motor primitives. Each row corresponds to a primitive, each column to a demonstration. Both graphics represent the same data, with demonstrations ordered differently. The first is ordered according to the left movement label, so columns 0 to 19 correspond to the first movement on the left arm, 20 to 39 to the second movement, etc. On the second graphic one can more easily read the activations of right arm movements. Blocks of dark coefficients corresponding to a given movement in the demonstrations indicate that one learned primitive is well associated with this movement.
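The reordering used for this kind of display, and a numerical summary of the dark blocks, can be sketched as follows. The function names `order_by_label` and `mean_activity_per_label` are hypothetical, and the small matrix is made-up data, not the experimental result. Labels are used only to inspect the learned activities, never for learning.

```python
import numpy as np


def order_by_label(W, labels):
    """Sort demonstrations by label; return a (k, n) matrix as in the figure."""
    order = np.argsort(labels, kind="stable")
    return W[order].T, order  # rows: primitives, columns: demonstrations


def mean_activity_per_label(W, labels):
    """Mean activity of each primitive for each movement label, shape (k, L)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    return np.stack([W[labels == c].mean(axis=0) for c in classes], axis=1)


# Made-up example: 6 demonstrations, 2 primitives, 2 left-arm labels.
labels = np.array([1, 0, 1, 0, 0, 1])
W_demo = np.column_stack([labels == 0, labels == 1]).astype(float)

activity, order = order_by_label(W_demo, labels)
block_means = mean_activity_per_label(W_demo, labels)
print(activity)
print(block_means)
```

A primitive that is well associated with one movement shows a single large entry in its row of `block_means`, which corresponds to one dark block in the ordered display.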