A Combination of Machine Learning and Cerebellar Models for the Motor Control and Learning of a Modular Robot

Introduction
Understanding the human brain is one of the greatest and most appealing challenges facing scientific research. Therefore, initiatives such as the Human Brain Project 1 (HBP) were conceived to encourage the delivery of beneficial breakthroughs for society and industry. The HBP unites the efforts of numerous research centers and universities, involving multiple disciplines and goals, in the form of 12 subprojects. In particular, our group works within Subproject 10, which focuses on neurorobotics. Robots lack the adaptability and precision of human beings in uncertain or unknown environments. In contrast, the brain accomplishes tasks remarkably well, producing smooth movements with low power consumption. As a result, studying how we control our bodies in uncertain or unknown environments, how we coordinate smooth movements, and how the central nervous system (CNS) implements motor control and motor learning has become of interest for the development of bio-inspired autonomous robotic systems.

The cerebellum
Among the distinct parts of the brain, the cerebellum stands out due to its key role in modulating accurate, complex and coordinated movements, acting as a universal learning machine 2. Its contributions include the neural control of bodily functions such as postural positioning, balance and the coordination of movements over time 3. Thus, understanding and mimicking the cerebellar mechanisms through bio-inspired architectures are promising steps toward innovative robotic systems capable of carrying out complex and accurate tasks in varying situations (see Refs. 4-6). Following this motivation, we took inspiration from the cerebellum for the motor control and learning of a real modular robot. For this purpose, we used a bio-inspired control architecture which combines machine learning and a cerebellar microcircuit. The cerebellar models were simplifications of the real biological microcircuit that included only the Purkinje, granule and deep cerebellar nuclei (DCN) cells, and the parallel, mossy and climbing fibers. Moreover, we considered three cerebellar models, including spiking and non-spiking approaches, aiming to clarify when to use each model and to lay the groundwork for our future research. The result was a compliant robot module that, by means of a bio-inspired approach, was able to learn how to trace out a circular trajectory with its end-effector.

Material and Methods
In this section, we describe the modular robot, our modular control approach and the machine learning technique used.

The modular robot: Fable
The target robotic platform is the modular robot called Fable 7. Fable is based on the combination of self-contained modules which can work independently or collaborate in modular configurations. Thanks to a low-latency radio communication link to the modules, the user can program the distributed robot modules at different levels of abstraction as if they were centralized and connected directly to the computer. Communication takes place via a radio dongle that addresses each module with an ID and a radio channel.

Fig. 1. The robot Fable. An example of a modular configuration comprising four actuated modules and a head module endowed with ultrasound sensors.

The modular approach
Studies of the cerebellum such as Refs. 8-9 describe it as a set of adaptive modules, called microcomplexes, which represent the minimal functional unit and show a uniform, almost crystalline microcircuitry 10. Thus, we decided to use its structure to control a robot module in a generic manner. Two microcomplexes were used to command the joints of a 2-DoF module. Each cerebellar output was linked to one joint, as happens in the human body, where one motor cell commands one motor unit.

Fig. 2. Modular scheme of the connections between the computer, the modular robot Fable and the neuromorphic SpiNNaker platform, which was used for the implementation of the spiking cerebellar model.

The bio-inspired modular control architecture
To perform the motor control and learning of a Fable module, we chose the Adaptive Feedback Error Learning (AFEL) architecture 4 shown in Fig. 3. The trajectory generation block generates the desired joint angles and velocities $(Q_d, \dot{Q}_d)$ by inverse kinematics. On the one hand, the AFEL scheme guarantees the stability of the system by means of a control loop in which a Learning Feedback (LF) controller 4 is implemented. The LF controller compensates for the lack of a precise dynamic model of the robot morphology: it ensures stability and adjusts its output torque through a learning rule over consecutive iterations of the same task. Its gains were heuristically tuned to $K_p = 7.5$, $K_v = 6.4$ and $K_i = 1$ for the Fable module. On the other hand, the AFEL architecture is endowed with a universal learning machine (ULM), comprising the LWPR algorithm and a cerebellar circuit. The ULM performs feedforward control of the robot module. The LWPR is in charge of abstracting the internal model of the robot, while the cerebellar microcircuit refines the output by delivering corrective torques. In Case 3 we implemented a simplified spiking cerebellar model on the neuromorphic platform SpiNNaker 13, consisting of Purkinje cells and DCN cells but without recurrent or inhibitory synapses.
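
The way the feedback and feedforward paths combine can be illustrated with a minimal sketch. It assumes a plain PID-style feedback term using the reported gains and a simple sum of the three torque contributions; it does not reproduce the LF learning rule of Ref. 4, and all function and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeedbackGains:
    # Gains reported in the text for the Fable module.
    kp: float = 7.5  # position gain
    kv: float = 6.4  # velocity gain
    ki: float = 1.0  # integral gain


def feedback_torque(gains, q_d, qdot_d, q, qdot, integral_error, dt):
    """PID-style feedback torque for one joint (assumed form, not the LF rule).

    Returns the torque and the updated integral of the position error.
    """
    error = q_d - q
    error_rate = qdot_d - qdot
    integral_error = integral_error + error * dt
    torque = (gains.kp * error
              + gains.kv * error_rate
              + gains.ki * integral_error)
    return torque, integral_error


def total_torque(u_feedback, u_lwpr, u_cerebellum):
    """Assumed additive combination of the feedback torque and the two
    feedforward (ULM) contributions: LWPR output plus cerebellar correction."""
    return u_feedback + u_lwpr + u_cerebellum


if __name__ == "__main__":
    gains = FeedbackGains()
    u_fb, integ = feedback_torque(gains, q_d=1.0, qdot_d=0.0,
                                  q=0.0, qdot=0.0,
                                  integral_error=0.0, dt=0.1)
    print(total_torque(u_fb, u_lwpr=0.2, u_cerebellum=-0.05))
```

In this sketch, as the feedforward terms learn the task, the tracking error and therefore the feedback contribution would be expected to shrink.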

The LWPR algorithm
The LWPR algorithm 14 builds and combines N local linear models, which receive the sensorimotor inputs $(Q_d, \dot{Q}_d, Q, \dot{Q})$ of the robot, that is, the desired and actual joint values. The LWPR then incrementally divides this sensorimotor input space into a set of receptive fields (RFs) to perform an optimal function approximation. Each RF is represented by a Gaussian weighting kernel (Eq. 1), which computes a weight in the k-th local unit for each data point $x_i$ according to its distance from the kernel center $c_k$.
The weight measures how much a data item $x_i$ falls into the validity region of each linear model, which is characterized by a positive definite matrix $D_k$, called the distance matrix.
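
Eq. 1 itself is not reproduced in this copy of the text. For reference, the Gaussian kernel in the standard LWPR formulation is usually written as below, using the symbols defined above; that the authors used exactly this form is an assumption.

```latex
w_k = \exp\!\left(-\tfrac{1}{2}\,(x_i - c_k)^{T} D_k \,(x_i - c_k)\right)
```

The weight equals 1 when $x_i$ coincides with the center $c_k$ and decays toward 0 as the distance, measured through $D_k$, grows.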
The LWPR conveys the weights to the cerebellar circuit and, at the same time, delivers a torque output computed as the weighted mean of the local linear models' contributions.
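
The prediction step described above can be sketched as follows. This is a simplified illustration with fixed, hand-written local models: it omits the incremental training and projection regression of the real LWPR algorithm, and the dictionary layout (`center`, `D`, `beta`, `offset`) is hypothetical.

```python
import math


def rf_weight(x, center, D):
    """Gaussian receptive-field weight: exp(-0.5 * (x - c)^T D (x - c))."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    n = len(diff)
    quad = sum(diff[r] * D[r][c] * diff[c]
               for r in range(n) for c in range(n))
    return math.exp(-0.5 * quad)


def local_prediction(x, center, beta, offset):
    """Output of one local linear model, evaluated around its center."""
    return offset + sum(b * (xi - ci) for b, xi, ci in zip(beta, x, center))


def lwpr_predict(x, models):
    """Return the weighted mean of the local predictions and the weights.

    The weights are returned as well because, in the architecture described
    in the text, they are also passed on to the cerebellar circuit.
    """
    weights = [rf_weight(x, m["center"], m["D"]) for m in models]
    preds = [local_prediction(x, m["center"], m["beta"], m["offset"])
             for m in models]
    total = sum(weights)
    if total == 0.0:
        return 0.0, weights  # no receptive field is active for this input
    output = sum(w * p for w, p in zip(weights, preds)) / total
    return output, weights


if __name__ == "__main__":
    # Two hypothetical 1-D local models with constant outputs 1.0 and 3.0.
    models = [
        {"center": [0.0], "D": [[1.0]], "beta": [0.0], "offset": 1.0},
        {"center": [2.0], "D": [[1.0]], "beta": [0.0], "offset": 3.0},
    ]
    print(lwpr_predict([1.0], models))
```

A query point halfway between the two centers activates both receptive fields equally, so the output is the plain average of the two local predictions.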
We chose the LWPR algorithm for three reasons: it optimizes the input space, which enhances learning speed and accuracy; it can substitute and optimize the role of a certain group of cerebellar cells, the granule cells; and it can learn incrementally on-line.

Results
We tested the control architecture using the three cerebellar model cases described in Section 2.3 by commanding the robot to trace out a circular trajectory with its end-effector. The tests considered the normalized mean square error (nMSE) of the joint positions with respect to the desired positions. First, the performance test consisted of following a circular trajectory with constant amplitude and spin frequency (see Fig. 4). Second, we carried out two robustness tests (see Figs. 5 and 6): one with trajectories that varied their amplitude while keeping the spin frequency constant, and one with trajectories that kept the amplitude constant but varied their spin frequency (0.5-1 Hz).
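
The text does not state how the nMSE is normalized. A minimal sketch, assuming the common convention of dividing the mean square error by the variance of the desired signal, could look like this (the function name is hypothetical):

```python
def nmse(desired, actual):
    """Normalized mean square error of one joint trajectory.

    Assumed normalization: MSE divided by the variance of the desired
    positions, so 0 means perfect tracking and 1 matches the error of
    always outputting the mean of the desired signal.
    """
    n = len(desired)
    if n == 0 or n != len(actual):
        raise ValueError("trajectories must be non-empty and of equal length")
    mse = sum((d - a) ** 2 for d, a in zip(desired, actual)) / n
    mean_d = sum(desired) / n
    var_d = sum((d - mean_d) ** 2 for d in desired) / n
    if var_d == 0.0:
        raise ValueError("desired trajectory has zero variance")
    return mse / var_d


if __name__ == "__main__":
    desired = [0.0, 1.0, 0.0, -1.0]
    print(nmse(desired, desired))               # perfect tracking
    print(nmse(desired, [0.0, 0.0, 0.0, 0.0]))  # joint that never moves
```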

Conclusions
We combined the LWPR algorithm and a cerebellar circuit for the motor control and learning of a real robot module. Furthermore, we implemented three distinct cerebellar models: two non-spiking and one spiking, the latter using the neuromorphic SpiNNaker platform. Compared to Case 1, Case 2 showed better results in the performance test while achieving similar results in both robustness tests. Case 3 did not show improvements with respect to the non-spiking models, but since its circuitry was quite simple, there is considerable room for further research. Future research will exploit the potential of more detailed spiking models that include inhibitory and recurrent synapses, and will explore the control of several robot modules using SpiNNaker.

Fig. 3. The AFEL control architecture. It embeds a ULM, which acts as a feed-forward controller while it abstracts the internal model of the robot module.