Backflipping With Miniature Quadcopters by Gaussian-Process-Based Control and Planning

This article proposes two control methods for performing a backflip maneuver with miniature quadcopters. First, an existing feedforward control approach is improved by finding the optimal sequence of motion primitives via Bayesian optimization, using a surrogate Gaussian process (GP) model. To evaluate the cost function, the flip maneuver is performed repeatedly in a simulation environment. The second method is based on closed-loop control and it consists of two main steps: first, a novel robust, adaptive controller is designed to provide reliable reference tracking even in case of model uncertainties. The controller is constructed by augmenting the nominal model of the drone with a GP that is trained using measurement data. Second, an efficient trajectory planning algorithm is proposed, which designs feasible trajectories for the flip maneuver using only quadratic programming. The two approaches are analyzed in simulations and in real experiments using Bitcraze Crazyflie 2.1 quadcopters.


I. Introduction
The aim of this work is to develop and implement trajectory planning and motion control algorithms that allow quadcopters to perform complex maneuvers at high speed. With the wide spread use of quadcopters, increasing expectations towards these systems such as fast autonomous navigation in a cluttered environment, rapidly changing wind conditions in built environments, security and surveillance tasks, require to perform complex, fast maneuvers that push the drones to their physical limits [1]. In these cases, classical flight controllers designed for a linearized dynamical model of the vehichle are no longer sufficient and more advanced control methods capable to handle the entire operating domain are needed [2]. These algorithms can be developed based on nonlinear control techniques or machine learning approaches.
Execution of a backflip illustrates well such complex maneuvers because it requires careful handling of the full complex nonlinear behaviour of the drone and it is typically a challenging task even for a human pilot. The complexity and speed of the maneuver is characterized by the fact that it takes less than a second to complete during which the vehicle is able to make a full turn around one of the horizontal axes.
Because of its benchmark characteristics, various control strategies has been already proposed to perform the flip maneuver. In [3], energy-based control is applied to overcome the uncontrollability of the quadcopter at singular configurations when following a circular or clothoidal reference trajectory. In [4], a Lyapunov-stability based controller synthesis is used to execute multi-flip maneuvers with quadcopters. Machine learning approaches are utilized in many cases, for example to imitate the maneuver performed by an expert drone pilot with apprenticeship learning [5], or to design time-optimal trajectories with deep reinforcement learning [6] or even learning complex acrobatic maneuvers [7,8].
A simple learning strategy for adaptive feedforward control is proposed in [9], based on the optimization of a parametric motion primitive sequence. As backflipping pushes the actuators of the quadcopter to their physical limits, the application of near-maximal and minimal control inputs are required. This approach builds on the theory of bang-bang control and first-principles motion primitive design to perform and optimize the flip maneuver. The method is easy to implement and it is well suited for generating a feasible motion sequence, however, many trials on the real robot are necessary to optimize the parameters of the motion and the resulting control law is sensitive to parameter uncertainties and external disturbances. These effects have greater influence on the behaviour of miniature quadcopters compared to mediumsized and large drones, hence, the control robustness is even more important.
The robust adaptive control method we propose 1 in this paper is based on geometric control, which is a nonlinear approach for attitude feedback control of rigid bodies in 3D space. In [2], it is theoretically proven that geometric control is able to stabilize the orientation of a quadcopter in the whole operating domain based on differential geometric considerations and Lyapunov stability. In [11], the geometric control is augmented by robust terms to guarantee uniformly ultimately bounded tracking errors in the presence of disturbances in the quadcopter dynamics. However, the control design requires a priori knowledge of the magnitude of external disturbances which can be challenging in unknown situations and environments or can lead to poor performance. An adaptive augmentation of geometric control is proposed in [12], where the adaptive terms compensate the effects of uncertainties in the quadrotor dynamics, while the stability of the closed loop system is proven mathematically. Artificial neural networks are used in [13] to develop an adaptive geometric control law that renders the quadcopter able to perform complex maneuvers in wind fields. Although both adaptive algorithms can be implemented efficiently, they do not provide an estimation of the uncertainty of the adaptive terms. Furthermore, both methods lack a systematic way to determine the parametric structure of the adaptive terms, which can be challenging without expert knowledge of the unmodeled dynamics and external disturbances.
To overcome these challenges, the main contributions of our present work are as follows: C1 We improve the convergence and the training time of the feedforward control design of [9] by applying Bayesian optimization with Gaussian Process surrogate function, making it possible to effectively apply the method with real experiments based tuning.
C2 By Gaussian Process (GP) based augmentation of a nominal quadcopter model, we achieve adaption to unknown model dynamics and external disturbances together with quantification of the remaining model uncertainty. A robust geometric control scheme is designed that exploits the GP model for improved robust performance compared to previous methods. We also show convergence and robustness of the resulting method.
C3 To execute the flip maneuver by the robust geometric approach, we propose an optimization-based trajectory planning method. The algorithm is based on quadratic programming and it is computationally efficient.
C4 We compare the proposed methods in simulation and experimentally in performing the backflip maneuver.
The paper is structured as follows. First, as a short overview of the preliminaries, a brief introduction to Gaussian Process regression is given in Section II, while, in Section III, the dynamic motion model of quadcopters is presented. Then, the Bayesian optimization based improved feedforward control strategy is described in Section IV, corresponding to C1. The GP based robust geometric reference tracking control scheme and the quadratic programming based trajectory planning are presented in Section V, corresponding to C2 and C3. Sections VI and VII give a detailed comparison of the two control approaches: first via simulations and then in real world experiments, providing C4. Finally in Section VIII, conclusions on the proposed approaches are drawn. A video presentation of our results is available at https://youtu.be/-ifV1rozlx8.

II. Gaussian Process Regression
Gaussian Processes (GPs) are universal function approximators [14]. Due to their flexibility, wide representation capability and ability to express uncertainty of the approximation, GPs have become popular in robotics and control engineering [15][16][17]. GP regression is used for estimating an unknown, possibly nonlinear relation f 0 : R n → R between input x ∈ R n and output y ∈ R variables based on noisy observations, in the form of where is an independent noise process with ∼ N (0, σ 2 ). In fact, y and are random variables, but for the sake of simplicity, we will not use a different notation for their sample realization. Consider that a set of (1). The core idea of GP based estimation of f 0 is to consider that candidate estimates f belong to a GP, seen as a prior distribution. Then, using D N and this prior, a predictive GP distribution of f is computed that provides estimate of f 0 in terms of its mean and describes uncertainty of this estimate by its variance.
In terms of definition, a Gaussian Process GP : R d → R assigns to every point x ∈ R n a random variable GP(x) ∈ R such that for any finite set x 1 . . . x N , the joint probability distribution of GP (x 1 ) , . . . , GP (x N ) is Gaussian. GPs are fully determined by their mean m and covariance functions κ, hence if f ∼ GP(m, κ), Both m and κ, where the latter is also called a kernel function, are often parametrized in terms of hyper parameters θ ∈ R n θ . In fact, taking f ∼ GP(m, κ) as the prior distribution in the estimation process defines the prior knowledge about f 0 in terms of the mean function m, while the choice of κ determines the function space in which an estimate of the function is searched for. Parametrization of m and κ in terms of θ allows to adjust the prior, i.e., these choices to the estimation problem of f 0 using D N . For the estimation of a smooth f 0 , a Squared Exponential (SE) kernel for κ is a common choice. The SE kernel is characterized by where hyper parameters are the scaling σ f , used for numerical conditioning, and the symmetric matrix Λ, which determines the smoothness of the candidate function class along each x i .
Based on the given D N and the prior f ∼ GP(m, κ), describes the probability density function of the outputs Y = [ y 1 · · · y N ] seen as random variables conditioned on the observed inputs X = [ x 1 · · · x N ] and hyper parameter values θ. To predict the value of the unknown function f 0 at a test point x * , the following joint distribution holds based on the previous considerations. Hence, the predictive distribution for f (x * ), based on the observed samples The mean (4a) gives an approximation of f 0 (x * ) while the variance (4b) gives measure of the uncertainty of this approximation. Computation of (4) requires only elementary matrix operations, therefore it is computationally efficient.
To tune the hyper parameters θ and σ 2 associated with the prior, a common method is to maximize the likelihood, i.e. probability, of the observations of Y for (3) marginalized w.r.t. θ and σ : where w.l.o.g. the log of the pdf is taken to simplify the optimization problem and For alternative methods, see [14]. Here, we considered scalar valued GPs, however, GP regression can applied under multidimensional outputs, by estimating a predictive distribution of each output dimension independently.

III. Quadcopter Dynamics
The proposed trajectory planning and motion control algorithms exploit a nominal dynamic motion model of the quadcopter. In this section, we introduce the basic principles of quadcopter modeling based on [18] and develop the nominal quadcopter model. As the first step, three frames are introduced: the inertial frame F i in North-East-Down (NED) coordinates, the vehicle frame F v fixed to the vehicle, but aligned with F i , and the body frame F b aligned with the body Figure 1: Inertial F i , vehicle F v , and body F b frames describing the geometric relations of the vehicle and the environment. Thrusts and angular velocities of the rotors are also illustrated.
of the vehicle. The transformation from F i to F v is only a translation, while F v and F b are connected by rotation only [19]. In Fig. 1, the three frames are displayed with the Euler angles characterizing the pose in the body frame (roll: φ, pitch: θ, yaw: ψ) together wit the direction of the rotor thrusts and angular velocities.
Based on these frames, the dynamic motion model of the quadcopter is formulated as where r = [ x y z ] is the position of the quadcopter in the inertial frame F i , e 3 is the unit vector of z axis in F b , m is the mass of the drone, F is the collective thrust of the propellers, and g is the gravitational accel- denotes the three-dimensional special orthogonal group, also called the rotation group. Furthermore, ω b is the angular velocity of the vehicle in the body frame, J b is the inertia matrix of the body of the vehicle, and τ = [ τ x τ y τ z ] is the vector of torques produced by the propellers. The notation· stands for the projection: R 3 → SO(3) ensuring thatxy = x × y for all x, y ∈ R 3 where × corresponds to the vector product. To simplify the notation, the coordinate frames are not indicated in the sequel, i.e. R = R v b , J = J b , ω = ω b . The dynamic model has four inputs, the collective thrust (F ), and the torques around the three axes of the body frame (τ ). Assuming ideal blade dynamics and a perfectly symmetric body, these inputs can be calculated from the individual thrusts T i provided by the motors: where l is the distance of two motors along the x axis, b is the drag constant, and c is the thrust constant. Furthermore, the thrust generated by each motor is considered to be proportional to the square of the corresponding angular velocity: T i = cω 2 i for i ∈ {1, 2, 3, 4}. In terms of input constraints, the individual rotor thrusts are in the range 0 ≤ T i ≤ T max , where T max depends on the specific quadcopter design.

A. Overview
The flip maneuver can be executed as a 360 degree rotation around the y axis of the quadcopter's body frame, displayed in Fig. 1. If the corresponding ideal actuation sequence of the individual motors in terms of T i is computed based on the nominal quadcopter model to execute this maneuver, then the actuation sequence can be implemented on the real quadcopter in terms of feedforward control. However, computation of such an actuation profile is difficult, due to (i) the complexity of the involved optimization problem to find a feasible motion trajectory under the given actuation constraints and (ii) due to unmodeled aerodynamical effects on the quadcopter that can significantly influence the system response.
The feedforward control proposed in [9] solves the problem above by tuning a fixed sequence of parameterized motion primitives via experiments. However, the approximation of the Jacobian matrix in the optimization loop requires many trials on the real drone and careful selection of the measurement data is needed to ensure the numerical convergence. We have modified the original algorithm at several points in order to improve its performance and adapt it to our specific design configuration. First, we tune the parameters of the motion primitives in simulation by using a high fidelity nonlinear model of the drone. Second, the optimization is solved by a Bayesian optimization method, using Gaussian Process surrogate function. The main advantages of the proposed method are that Bayesian optimization is numerically better conditioned and requires significantly less function evaluations than Jacobi approximation, as the evaluation points are systematically selected. This makes the proposed method computationally more favourable.
Unlike in [9], we perform the backflip in a '×' configuration as two rotors can produce a larger torque than one. The desired trajectory of the flip motion is within the x − z plane of the body frame (illustrated in Fig. 1) corresponding to the rotation of the pitch angle θ, therefore the equations of motion (7) can be simplified utilizing that the translation along the y axis and rotation around the x, z axes, i.e. the roll φ and the yaw ψ, are fixed to zero.
The simplified equations of motion are as follows: The direction of the thrusts and pitch are illustrated in Fig. 2.

B. Parametrized Primitives of the Maneuver
Similar to the control strategy laid down in [9], the backflip maneuver is divided to five main sections, that are illustrated in Fig. 3 and defined as follows. 1. Accelerate: Gain elevation and kinetic energy with near-maximal collective acceleration, while rotating slowly to the negative direction.
3. Coast: With low and uniform thrusts hold the angular velocity, wait for the drone to rotate.

4.
Stop Rotation: Use maximal differential thrust to decrease angular velocity and stop the rotation.

5.
Recover: Prevent the drone from falling to the ground by applying near-maximal collective thrust, and try to get back to hover mode.
Each of the five sections has three parameters, the collective acceleration U i , duration t i , and angular accelerationθ i , resulting in 15 parameters altogether. However, based on [9], the number of parameters can be reduced by applying bang-bang type control on a restricted control envelope. This means that the control actions U i andθ i are either zero or near-maximal at all sections. As a result, only the following 5 independent parameters remain: η = U 1 t 1 t 3 U 5 t 5 ∈ R 5 + . These parameters are tuned to minimize the norm of the final state error e ∈ R 5 , which is obtained by applying the actuation profile in Fig. 3 in open loop and taking the difference between the final and the initial state. Formally, the optimization problem can be written as follows: where t 0 and t f are the initial and final time instants of the maneuver, and bounds U min , U max , t min , t max are determined from the physical limitations of the drone.

C. Bayesian Parameter Optimization
In [9], the numerical optimization of η is based on iterative optimization, using the approximate Jacobian matrix of the final state error w.r.t the parameter vector. However, the numerical gradient approximation has vast computational cost, because the whole maneuver needs to be simulated in every approximation step and it suffers from convergence problems. Here, as our first contribution, we apply a Bayesian optimization approach to find the global optimum of (10). This approach does not require the calculation of derivatives and it is suitable for global optimization of cost functions that are expensive to evaluate, see [20,21]. To apply this approach, the optimization problem (10) is written as where η = x ∈ R n is the n = 5 dimensional vector of optimization variables, X is the feasible parameter set (bounded interval (hyper rectangle) of the search space for the parameters η), and f is the objective function, i.e., f (x) = − e(η = x) 2 . The core concept of Bayesian optimization is to evaluate the unknown objective function at limited number of points, giving , fit a surrogate parametric model based on GP regression on the data and optimize this surrogate model of the original objective function [14]. The optimization is performed iteratively, where in each step the next evaluation point is determined by minimizing a so called acquisition function, the objective function is evaluated at this point and the surrogate model is updated. The optimization stops if the minimum is reached with high confidence or the iteration reaches a certain number of evaluations.
The acquisition function blends the approximated objective (the mean of the GP) and the approximation uncertainty (the variance of the GP) in a scalar valued function that can be optimized by standard gradient-based procedure. A common acquisition function is expected improvement, defined as follows. After f is evaluated in N points giving the observation data set D N , a GP pre- N be the value of the best sample so far and . Now we would like to choose the next test point x N +1 , such that the expected improvement predicted by the GP model is the best w.r.t. x + N : where y = max(y, 0). The right hand side of (12) has an analytical form and the next test point is obtained via x N +1 = arg max x∈X EI N (x), the point with highest expected improvement. In the literature, there are other common acquisition functions, e.g. upper confidence bound or knowledge gradient [20]. We summarize the steps of Bayesian optimization using a GP surrogate model in Algorithm 1, based on [20]. With the mathematical model of the quadcopter and a suitable optimization algorithm, it is possible to simulate the maneuver with different parameter sets, optimize the motion, and implement it on the vehicle. For the implementation of the flip maneuver, a stabilizing feedback controller is also required to balance the quadcopter at the beginning and after the end of the maneuver. Set point stabilization is a common task in quadcopter control, hence a classical LTI feedback controller, designed  (4) and (5) for the linearized dynamics around hovering, for example PID or LQR, is suitable for this purpose [22].

V. Geometric Tracking Control with Trajectory Planning
The second considered approach for quadcopter backflipping is based on closed-loop control. For this purpose, as our second contribution, we propose a novel robust adaptive reference tracking controller to provide reliable tracking even in the case of modeling uncertainties and external disturbances. The proposed method is an extension of the geometric control law [11] applicable for trajectory tracking of aggressive maneuvers. Furthermore, we also introduce a novel optimization-based trajectory planning method for this geometric approach, which is essential for finding an efficient motion path for backflipping.

A. Robust Adaptive Geometric Control by Gaussian Processes
The proposed robust nonlinear geometric tracking control is based on the extension of the control law introduced in [11]. The control method is able to track a reference position and a reference attitude R d (t) ∈ SO(3), represented by rotation matrices.
To synthesize the control law, we use (7), describing the quadcopter dynamics, and augment it by an dditive statedependent disturbance: where x = [ rṙ R ω ] is the state vector associated with the quadcopter and ∆ r (x ), ∆ R (x ) ∈ R 3 are unknown disturbances.
For controlling the flight dynamics in (13), we propose to use the control law in [11] augmented by the adaptive state-dependent terms η r , η R to cope with ∆ r and ∆ R : where k r , k v , k R , k ω ∈ R are the controller gains and are error terms with r d , R d and ω d corresponding to the position, orientation and angular velocity references, tr(·) is the trace operator, and the vee operator (·) ∨ : SO(3) → R 3 is the inverse of the hat operator·. The attitude tracking error e R is interpreted as the gradient of the attitude error function characterized by [2] Ψ(R, Furthermore, the angular velocity error term satisfies the equationΨ = e R e ω . We identify the external disturbances from noisy observations using Gaussian Processes, more specifically in the form∆ The mean of the GP-s is then directly used in (14) to compensate the effect of disturbances. The uncertainty of the approximations, characterized by the covariances Σ r , Σ R is handled by introducing the additional terms µ r and µ R that make the controller robust to this uncertainty. To define these terms, we assume that the GP-s are trained until the true disturbances ∆ r (x ), ∆ R (x ) are inside the 95% confidence interval. We can now define µ r and µ R similarly to [11], as where c 1 , c 2 , r , R , τ are positive constants, τ > 2, δ r , δ R are the uncertainty bounds and · is the Euclidean vector norm. However, instead of estimating δ r , δ R as in [11], we utilize the uncertainty of the corresponding trained Gaussian Process. We calculate the standard deviation of a GP at an evaluation point as the radius of the bounding sphere of the ellipsoid defined by the square root of the covariance matrix, formulated as where · 2 is the 2,2 induced norm (also called spectral norm) of a matrix which is equal to its largest singular value. We define the uncertainty bounds using the 95% confidence interval of the normal distribution, namelŷ From the confidence interval, we calculate an ultimate bound for the difference between the disturbances and the adaptive terms over the operating domain, as follows: Now we can show for the considered uncertainty bound (21) that the following stability guarantee holds true: Theorem 1 Consider that the control force F and torque τ defined by (14) are applied on the uncertain system (13). If the initial condition satisfies then, for any ψ 1,max > 0, there exists a controller (in terms of the choice of parameters k r , k v , k R , k ω , c 1 , c 2 , r , R , τ ) such that all error terms in (15) are uniformly ultimately bounded.
Proof : Throughout the proof, we adapt the steps of Proposition 3 in [11] to the GP based uncertainty bounds.First, we derive the dynamics of the rotational and translational tracking error and construct Lyapunov functions for them.
Based on (13), (14), (15), the error dynamics can be expressed in the following form: Similar to [11], consider the following Lyapunov function candidates: First, let us focus on V 2 . For the attitude error function Ψ a lower and upper bound can be given in terms of the attitude error e r as follows: The detailed derivation of these bounds can be found in [23]. Note that, (25) implies that Ψ is positive definite and decrescent for all t ∈ R + . By using (25), the following upper bound can be derived for the time derivative of V 2 : where λ m , λ M denote the smallest and largest eigenvalue of the inertia matrix J. If the controller parameters are chosen such that W 2 is positive definite, then z 2 W 2 z 2 is nonnegative. Now we examine the second term on the r.h.s of (26) and construct an upper bound for this term as well. First, note that ∆ R − η R ≤ δ R holds by construction of δ R , under the assumption that ∆ r , ∆ R are inside the 95% confidence interval of the GP distribution. By using this inequality and the definition of the robust control law (18), the following upper bound can be obtained: where R is chosen to be a sufficiently small positive constant. According to the proof of Proposition 3 in [11], inequality (28) implies that the tracking errors e R and e ω are uniformly ultimately bounded. Consider now Lyapunov function candidate V 1 . Its time derivative can be given as follows: Similarly to the case of V 2 , the first three terms can be made negative by suitably choosing the parameters of the controller. The last term can be divided into two parts: the first one is e B (∆ r − η r + µ r ) and e B X is the second. By using (21) and (18), an upper bound can be derived for the first term: To obtain an upper bound for e B X, first we construct an upper bound for X by using (23b) : where A = −k r e r − k v e v − mge 3 + mr d − η r + µ r and we assume that the reference trajectory r d has been designed such that condition − mge 3 + mr d − η r < B holds for some B > 0. By using the following relation (see [11] for details), the upper bound for X can be expressed as follows: By substituting (30) and (33) into (29), we obtaiṅ According to the proof of Proposition 3 in [11], inequality (34) implies that the tracking errors e r , e v are uniformly ultimately bounded as well. This completes the proof.

B. Trajectory Planning for the Flip Maneuver
In order to use the geometric tracking controller to perform the flip maneuver, a suitable reference trajectory is needed. For this purpose, we introduce an optimization based trajectory design method which is an additional contribution of the paper. Based on the controller structure (14), first an attitude reference trajectory R d is constructed and then it is completed with a position reference r d . Similarly to the feedforward approach, the objective for trajectory planning is that the quadcopter should arrive as close to the starting point as possible, while keeping the control inputs within the allowed range during the maneuver.
The attitude reference is specified in unit quaternions: 3 ] , where q d,0 is the scalar part of the quaternion, and q d,2 corresponds to the pitch angle, as q d,1 = q d,2 = 0, because both the roll and yaw angles are zero during the flip. Utilizing that q d is a unit quaternion, we can express the third element of it as q d,2 = 1 − q 2 d,0 , hence it is sufficient to design a trajectory only for q d,0 . A 360 degree rotation around the y axis means that the scalar part of the attitude quaternion goes from 1 to -1. In the trajectory design it is important to stay within the q d,0 (t) ∈ [−1, 1] range, because only unit quaternions describe rotation. For this purpose, the smooth sigmoid function q d,0 (t) = 2 is chosen to describe the scalar part of the reference attitude, where the parameters are the speed of the maneuver v m and the execution time t m . The attitude quaternion reference trajectory is displayed in Fig. 4. Assuming that φ ≡ ψ ≡ 0 during the flip, the conversion to Euler angles yields θ = 2arccos(q 0,d ), where θ(t) ∈ [−π, π]. Hence the pitch angle goes smoothly from zero to π, jumps to −π, and goes smoothly to zero. Besides of rotation, the maneuver also requires translational motion, because without proper lifting at the beginning of the backflip, the quadcopter would fall to the ground due to gravity. The position reference is designed considering that the rotational and translational equations of the dynamical model are coupled. The translational motion of the flip maneuver is within the x − z plane, therefore y d (t) = 0. The other two equations of the translational dynamics in (7a) are where R i,j denotes the (i, j)-th entry of the rotation matrix R. However, assuming that the attitude tracking converges fast enough to the reference, we can substitute the reference rotation matrix in (36), resulting in the translational state-space representatioṅ where ζ is the reduced state vector capturing the translational motion, R d,i,j are the corresponding elements of the reference rotation matrix R d (converted from the reference quaternion q d ). As the motion equations are decoupled, the effect of gravity can be added to z which is denoted byz (modified state) in the equation. Notice that (37) is a Linear Time-Varying (LTV) state-space representation with the thrust force F = u as the only control input.
By discretizing (37), using complete, zero-order hold discretization with discretisation step size T s > 0: where k ∈ Z denotes the discrete time, i.e. ζ k expresses ζ(kT s ), a quadratic programming problem can be formulated over a finite horizon, similarly to model predictive control, to find a motion trajectory for executing the flip. The input of the model is the collective thrust of the propellers, u k = F k . For a fixed duration of the maneuver with N discrete time steps, the following quadratic optimization problem is formulated: where Q k ∈ R 4×4 and W k ∈ R are weight matrices, ζ 0 is the initial state, and X , U are constraint sets for the states and the control input, respectively. The only objective of the trajectory design is to minimize the final position error of the quadcopter and keep the position within a specified range, therefore the weight matrices are (W k , Q k ) = (0, 0) for k = 1 . . . N − 1, except for the weight of the final state that is Q N = diag(1, 0, 1, 0) while W N = 0. As all the other weights are zero, it is only required to define a final state position reference ζ d,N , the components of which are zero except for the effect of the gravity inz d,N = 0.5T 2 s N 2 g. We specify linear constraints for the states: x ∈ [x − , x + ], z ∈ [z − , z + ] to model the available space for the maneuver avoiding collisions with other objects or walls. We also define linear constraints for the control input, namely where τ k is the vector of the three torques around the three body axes, out of which τ x,k = τ z,k = 0 normally during the flip, l is the distance of the quadcopter center of mass and the propellers projected to the x − z plane, and F max is the maximal collective thrust of the rotors. The torque control input τ k is calculated from the reference attitude R d based on (14b) assuming that the orientation and angular velocity errors are zero. The numerical solution of the optimization problem (39) can be obtained easily by using an off-the-shelf QP solver, e.g. by quadprog in Matlab. Finally, we fit cubic splines on the discrete reference points {ζ k } N k=0 obtained in (39) to get a smooth trajectory.

A. Environment
The simulation study is based on the dynamic model of a Bitcraze Crazyflie 2.1 miniature quadcopter. The same drone is used in real experiments, presented in Section VII. For both simulation and control design, the considered physical parameters of the quadcopter are given in Table 1, which are based on [24]. The simulation have been executed in the OpenAI Gym environment with Py-Bullet physics engine [22]. The implementation of GP regression is based on GPyTorch [25], while the Bayesian Optimization is based on [26]. All of the simulation code used in this work is available at our GitHub 2 , and a video illustrating the simulation results can be found at https://youtu.be/-ifV1rozlx8. In this section and the oncoming sections, we display the measurement results with the z axis pointing upwards (in contrast to the NED convention discussed in Section III), because the backflip maneuver is more illustrative this way.

B. Bayesian Optimized Feedforward Control
The simulation of the flip starts with hovering for about 0.1 s, followed by executing the maneuver, and then switching back to PID control to stabilize the drone and bring it back to the initial setpoint. The motion primitive parameters were calculated as the solution of the optimization program (10), using 250 random initial function evaluations and 1000 iterations. The result of the numerical optimization is as follows: where the collective accelerations U * i are in m/s 2 , and the time is in seconds. Simulation results are displayed in Fig. 5, using the optimal parameter vector and a set of slightly (<10%) perturbed parameters as well to test the robustness of the scheme. On the left plot, the position of the quadcopter during the flip is shown, with snapshots from the simulation. The end of the optimal maneuver is around the coordinate (x, z) = (−0.4, 0) m with near-zero pitch angle, thus the final state error is only significant in the x position. By reaching that point, a PID controller [22] is switched on to stabilize the drone and bring it back to the initial position (origin). On the right, the trajectory of the pitch angle in Euler representation, the angular velocity, and the collective thrust as a control input are shown, where the five sections defined in Section IV can be recognised clearly. The figure illustrates that even small deviations from the optimal parameter set (<10%) result in significant decrease in performance. Figure 5: Backflipping in simulation by feedforward control. The position is displayed on the left, and the pitch angle θ, pitch angular velocityθ, and collective thrust F on the right. Orange lines represent the simulation with optimal parameters in (41), and grey lines represent the result of < 10% random changes in the parameter set. Figure 6: Backflipping in simulation by geometric control. Position and attitude are depicted on the left, the pitch angle θ, pitch angular velocityθ, and collective thrust F on the right.

C. Geometric Control and Trajectory Planning
The second approach to perform a flip maneuver is trajectory planning and reference tracking with geometric control. Based on the results of the flip with feedforward control, the parameters of the reference pitch trajectory are chosen to v m = 20 1/s, t m = 0.9 s, as illustrated in Fig. 4. The quadratic programming problem (39) is solved under the following contraints: with sampling time T s = 2.083 ms. The maximal collective thrust F max = 0.64 N is based on [24], and the position bounds are chosen such that the trajectory is feasible, and the quadcopter does not get too far from the initial point, e.g. to express the available flying space while avoiding collision with walls and obstacles. The duration of the flip is chosen to T = 0.9 s, thus the number of simulation steps is N = T /T s = 432. The quadratic optimization problem is solved by the quadprog function of Matlab within milliseconds of computation time.
The gains of the geometric controller have been determined based on the stability conditions detailed in [2], their numerical values are  Fig. 6. Similarly to the feedforward control, the maneuver starts and ends at hovering, but in this case the stabilizing controller is also the geometric control with position and yaw setpoints. The left plot illustrates the reference and simulated pose of the quadcopter during the backflip maneuver, and the right plot contains the trajectory of the pitch angle, angular velocity and collective thrust control input. The trajectory of both the angular velocity and the thrust input are smooth compared to the discontinuous angular acceleration and thrust of the feedforward control. At the discontinuities of the control input, the unmodeled transient behaviour of the actuator dynamics can be significant, therefore the geometric control approach is more robust to such uncertainties compared to the feedforward method. A possible solution would be to use a different parametrization of the control inputs instead of bang-bang control (e.g. spline parameters), however, such changes would make the feedforward design much more complex. Next, we evaluate the performance of the proposed robust adaptive geometric controller for backflipping. We assume that the most significant modelling errors compared to the real drone arise in the rotational dynamics, due to the inaccuracy of the inertia matrix and center of gravity, and other aerodynamic effects. In simulation, we apply an external disturbance characterized by ∆ R = 0.002 0.002 0 sin  In the example of the backflip maneuver, we use only two scalar adaptive terms: the roll and pitch term of η R , namely η R,x and η R,y , because most of the uncertainties arise in the roll and pitch motions. The adaptive terms are represented by two independent Gaussian Processes with the following four-dimensional input: the x and y element of the attitude quaternions (q 1 , q 2 ), and the angular velocity elements ω x , ω y . The adaptive GP terms introduced in Section V depend on the full state vector, however, in case of the backflip scenario, only these four inputs are relevant, and by reducing the input dimension, the model complexity is decreased radically without losing its expressiveness. For the Gaussian Processes, we use zero mean and squared exponential covariance function, given by (2). The signal variance and lengthscale hyperparameters are trained using maximum likelihood estimation on 125 training points. Due to the relatively small input dimension and number of training points, the training and evaluation of the GPs are fast and efficient.
The backflipping simulations by robust adaptive geometric control are shown in Fig. 7. We compare the results using nominal geometric control (without adaptive and robust terms), geometric control with GP mean (without robust terms), and robust adaptive geometric control given by (14), (15), (18). The most important variable is the attitude tracking error Ψ, described by (16). Our results show that the attitude error of the nominal geometric controller during backflipping is reduced by 40% using the mean of the GPs, and 80% using the proposed robust adaptive geometric control. The position error norm e r is less significant due to the effect of control input saturation, however, the adaptive and robust controllers achieve a better result here, as well.

A. Experimental Setup
The real experiments are performed with a Bitcraze Crazyflie 2.1 drone. Optitrack motion capture system [27] is used to provide high precision position and velocity in- formation. The drone and the positioning system is interconnected via a ground control PC, which runs the high level experiment management and data logging software component as well. The block diagram presenting the interconnection of the components is shown in Fig. 8. The quadrotor is equipped with an IMU containing a 3D accelerometer, gyroscope, magnetometer and barometer, and it has two microcontrollers: a STM32F405 for running the flight controller, and a nRF51822 for radio communication and power management. Additionally, we use an expansion deck for high speed logging of measurement data to a micro SD-card.
The quadcopter runs the original Bitcraze firmware, while on the server the Crazyswarm software platform is used to ease the implementation and configuration of high-level control components [28].

B. Optimized Feedforward Control
Firstly, we evaluate the results of performing the backflip with optimization-based feedforward control. Due to the differences of the simulation model and the real quadcopter dynamics, the parameters of the motion primitive sequence used in simulation were re-tuned so that the flip is executed with minimal final state error. The refined parameter set is The measurement results are displayed in Fig. 9, showing the trajectory of the position and orientation during the maneuver. It is important to note that an additional lift phase is added to the implementation to gain enough vertical velocity and height, because the quadcopter falls a significant distance in the recovery phase. During the additional lift phase, the PID controller 1 is used to achieve exact vertical lifting and horizontal orientation. Fig. 9 shows that the flip is executed with almost zero final error in the pitch, and also quite small position error. Compared to the simulation displayed in Fig 5, the maximal displacement from the origin in x direction is Figure 10: Backflipping measurement results with nominal geometric control. The trajectories show that the maneuver is performed successfully, and the drone gets back to the initial position at the end. around a half, and in z direction around double. The difference is due to the additional refinement of the motion sequence parameters, and the uncertainties of the simulation model. It is important to note that the performance of the feedforward controller is very sensitive to uncertainties in the dynamics and initial conditions. For example, if the flip maneuver begins when the orientation of the quadcopter is not exactly horizontal, the stability can be lost at the recovery phase. Hence the maneuver is only successful in about 6-7 trials out of 10 using the feedforward controller.

C. Trajectory Planning and Geometric Control
The experimental results of backflipping with the nominal geometric control are displayed in Fig. 10. The most important part of reference tracking is the pitch angle θ, because a fast, stable and accurate attitude tracking is required to perform the flip maneuver and recover successfully. As it is shown on the right plot of the measurement results, although the pitch is close to the reference the position converges with a higher delay compared to the simulation results. In spite of the imperfect position tracking, the geometric controller is able to perform the backflip maneuver exactly the same way ten out of ten times, which indicates that even the nominal geometric control algorithm is significantly more robust than the feedforward method.
Next, we demonstrate the experiments using the proposed adaptive geometric controller. First, we collect data points by performing agile maneuvers with the nominal geometric controller and fit a Gaussian Process on the measurement data. The motion control algorithm runs on-board the quadcopter at 500 Hz and the computational capacity of the microcontroller is limited, therefore the real-time evaluation of the GP would not be feasible. To address this problem, we generate a lookup-table by evaluating the trained GP on a grid of the input variables on the desktop PC, and upload it to the on-board computer of the quadcopter. The evaluation of the lookuptable requires only memory operations and linear interpolation, therefore it is computationally cheap. Another possible solution would be the use of sparse GP methods ( [14,29]), however, lookup-tables are easier to implement and provide good results.
The performance of the proposed adaptive geometric  controller is evaluated with external disturbance added to the quadcopter dynamics. We have found that a bolt attached to the drone has significant influence on the attitude dynamics, however, the vehicle is still able to perform the backflip maneuver. The modified configuration is illustrated in Fig. 12, while the measurement results are given in Fig. 11. Similarly to the simulation results, the attitude tracking error is significantly reduced by using the proposed control algorithm. It is also important that at the end of the backflip, the attitude error of the adaptive controller is almost zero, making it easier to recover and get back to hovering.

VIII. Conclusion
In this paper, a Bayesian optimization based feedforward control scheme and robust geometric reference tracking control method have been proposed for quadcopters to reliably perform the backflip maneuver in the presence of modeling uncertainties and external disturbances. While in the former, relatively simple approach, Bayesian optimization can be used to fine tune the feedforward sequence and cope with the uncertainties, the latter method utilizes Gaussian Process based augmented motion mod-(a) Optimization-based feedforward control, the trajectory is displayed in Fig. 9. (b) Robust geometric feedback control, the trajectory is displayed in Fig. 10. Figure 13: Composite images of the measurement results using a Bitcraze Crazyflie 2.1 quadcopter. els and it is shown to have stability guarantees. By comparing the approaches in numerical simulations and real experiments, the robust geometric scheme has been found to outperform the feedforward strategy both in terms of reliability and optimal tracking of the motion profile.
Exploiting the robustness of the control approach, the maneuver has been implemented for simultaneous backflipping with three drones, a video of this experiment is available at https://youtu.be/-ifV1rozlx8. In our oncoming research work, we intend to use learning methods to perform complex maneuvers with less expert knowledge and extend the capabilities of the miniature quadcopters even more.