Simple Model-Free Controller for the Stabilization of Planetary Inverted Pendulum

A simple model-free controller is presented for solving nonlinear dynamic control problems. As an example, a planetary-gear-type inverted pendulum (PIP) is discussed. Controlling this inherently unstable system, which requires real-time control responses, calls for a smart and simple controller. The proposed model-free controller includes a swing-up controller part and a stabilization controller part; neither controller has any information about the PIP. Since the fuzzy controller is highly sensitive to its input/output scaling parameters, we use a genetic algorithm (GA) to obtain the optimal control parameters. The experimental results show the effectiveness and robustness of the present controller.


Introduction
To ensure better performance in diverse operating conditions, more and more modern control systems have emerged. However, these control systems may fall outside the scope of conventional control. Unlike traditional model-based control, modern control depends less on mathematical models. In the model-free concept, the controller does not contain any information about the system to be controlled [1]. Taking advantage of this concept, many highly nonlinear systems have been controlled successfully. Coelho et al. proposed a model-free learning adaptive control (MFLAC) strategy that is based on pseudogradient concepts with compensation using an RBF neural network and DE optimization [2]. An adaptive higher-order differential feedback controller which does not depend on the model of the controlled chaotic system has also been studied [3]. These controllers are inherently robust.
In the field of intelligent control, fuzzy logic control (FLC) has gained popularity as a model-free approach that often outperforms other conventional approaches such as nonlinear adaptive control or PID controls [4]. FLC provides a framework for approximate reasoning and allows expert knowledge to be translated into an executable rule set. It can deal with vague and incomplete information and exhibits robustness to noise and variations in system parameters [5]. Furthermore, when the system is too complex [6] or has a high degree of nonlinearity [4] and the underlying processes are insufficiently understood [7], FLC plays an important role in robust control.
However, fuzzy systems have a well-known problem relating to the determination of their parameters: the membership functions, scaling factors, and rule base. To achieve better performance and improved robustness, neural networks [8], adaptive learning [9,10], and the genetic algorithm (GA) [11,12] are being used in designing such controllers. Many studies have reported that merged techniques provide a more accurate and robust solution than that derived from any single technique. In any practical problem, it is worth considering what should be optimized and how merging these technologies can provide an alternative to a strictly knowledge-driven reasoning system. In [13], a novel approach was proposed to represent continuous-valued input parameters using linguistic terms and then extract fuzzy rules from trained binary single-layer neural networks. A knowledge base was learned from interval and fuzzy data for regression problems by applying the GA [14]. In addition, the GA can be used to determine the membership functions in fuzzy systems [15,16]. Scaling parameters, which describe input normalization and output denormalization, play a role similar to that of gain coefficients in conventional PID controllers [17]. For a successful FLC design, proper selection of the input and output scaling factors is a critical task, which in many cases is performed through trial and error or based on some training data [18]. An interesting fuzzy controller which includes seven rules was proposed for self-tuning its scaling factors [19]. To tune the optimal parameters of the fuzzy controller at some operating points, Serra and Bottura used the GA off-line to get the optimal scaling parameters [20]. Hameed et al. first tuned all these scaling parameters by the GA and then fixed the input scaling factors and tuned the output scaling factor by fuzzy logic [21].
This paper proposes a simple model-free control strategy for the planetary-gear-type inverted pendulum (PIP). The control strategy includes a swing-up controller and a stabilization controller, and neither of them requires a mathematical model of the PIP. For the PIP, the scaling parameters are very important and are tuned by the developed GA. The rest of this paper is organized as follows. Section 2 provides a description of the PIP. Section 3 presents the smart and simple structure of the fuzzy controller. Section 4 presents the simulation and experimental results. Finally, Section 5 presents discussions and concludes the paper.

Planetary-Gear-Type Inverted Pendulum
A PIP consists of a star gear, a planet gear, an encoder, a gear base, a pendulum, and motors, as shown in Figure 1. The angular acceleration of the star gear causes the pendulum, which is hooked to the planet gear, to swing. Unlike prevailing inverted pendulum systems, the PIP does not suffer from the wire-winding problem or the limited length of the platform [22]. The mechanical parameters and their values are listed in Table 1.

Control Strategy
This section presents an alternative strategy for process control using a simple model-free controller in which the input/output scaling parameters are optimized by the GA. The proposed control structure is shown in Figure 2. The strategy consists of the swing-up strategy and the fuzzy controller, neither of which has any information other than the pendulum angle and pendulum velocity.
Remark 1. The basic concept of inverted pendulum control involves two parts: the swing-up strategy and stabilization in the upright position [23,24]. Following this idea, several studies have concentrated on bringing the planetary-gear-type inverted pendulum upright [22,25,26].

Remark 2.
The controller gains are precomputed offline. Such a schedule may not provide any feedback to compensate for incorrect schedules [20]. Nevertheless, the unchanged gain schedule may prevent instability due to frequent and rapid changes in the controller gains. In fact, the inverted pendulum is a classic example of an inherently unstable system, and it requires real-time control responses. To the best of our knowledge, the key parameters involved in genetic optimization in most studies about the inverted pendulum were computed off-line and were kept invariant during the control process.
Journal of Control Science and Engineering

Swing-Up Controller.
In this study, a simple swing-up strategy is proposed. Compared with the energy-based fuzzy swing-up controller [27], the proposed method is comparatively simple and very competitive.

Fuzzy Controller
With the error vector as the input, the output of the fuzzy controller is represented as a function, $y = f(e_1, e_2)$.

Fuzzy Rule Base.
The function $f$ is in general a complex nonlinear relationship between the inputs and the output, and it is expressed as a fuzzy rule base consisting of the following rules: If $e_1$ is $A_i$ and $e_2$ is $B_j$, then $y$ is $G_{ij}$, where $i = 1, 2, \ldots, 7$ and $j = 1, 2, \ldots, 7$.

Membership Function and Defuzzifier.
The membership function is used to describe uncertain and imprecise information. The membership value of input $e_1$ in the $i$th fuzzy set is denoted by $u_{1i}$ and is evaluated by the triangular input membership function depicted in Figure 3.
The input membership function of $e_2$ is defined similarly to that of $e_1$, with parameters $B_i$, $i = 1, 2, \ldots, 7$.
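For readers who want an explicit formula, a triangular membership function is often written in the generic form below. This is a textbook form offered as a sketch only; the center $c_i$ and half-width $w$ are hypothetical symbols introduced here for illustration and are not taken from Figure 3.

```latex
u_{1i}(e_1) = \max\!\left(0,\; 1 - \frac{\lvert e_1 - c_i \rvert}{w}\right),
\qquad i = 1, 2, \ldots, 7 .
```

If the centers are evenly spaced at a distance $w$, neighboring sets overlap so that the membership values of any input inside the covered range sum to one.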
For simplicity, a singleton fuzzification is used, as depicted in Figure 4. The center-of-gravity defuzzification method is adopted [28]. The fuzzy controller function can then be represented as (3), in which $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the parameters that will be optimized by the GA in the subsequent step.
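To make the structure described above concrete, the following minimal sketch evaluates a 7 x 7 rule base with triangular input sets, singleton outputs, and center-of-gravity defuzzification. Several details are assumptions made for illustration and are not taken from the paper: the normalized universe [-1, 1] with evenly spaced sets, the rule table `G`, product inference, input saturation, and the reading of `lam1` and `lam2` as input scalings and `lam3` as the output scaling.

```python
import numpy as np

# Assumption: seven evenly spaced triangular sets on the normalized range [-1, 1].
CENTERS = np.linspace(-1.0, 1.0, 7)
HALF_WIDTH = CENTERS[1] - CENTERS[0]

def memberships(x):
    """Membership degrees of a scaled input in the seven triangular sets."""
    x = float(np.clip(x, -1.0, 1.0))  # saturate inputs outside the universe
    return np.maximum(0.0, 1.0 - np.abs(x - CENTERS) / HALF_WIDTH)

# Assumption: a PD-like table of output singletons G[i, j] that grows with i + j.
# The rule table actually used in the paper is not reproduced here.
_idx = np.arange(7)
G = np.clip((_idx[:, None] + _idx[None, :] - 6) / 3.0, -1.0, 1.0)

def fuzzy_controller(e1, e2, lam1, lam2, lam3):
    """Control output for errors e1, e2 with scaling factors lam1, lam2, lam3."""
    u1 = memberships(lam1 * e1)      # scaled pendulum-angle error
    u2 = memberships(lam2 * e2)      # scaled angular-velocity error
    w = np.outer(u1, u2)             # rule firing strengths (product inference)
    y = np.sum(w * G) / np.sum(w)    # center of gravity over output singletons
    return lam3 * y                  # output scaling
```

With this layout only `lam1`, `lam2`, and `lam3` change the control surface once the sets and the table are fixed, which is why the scaling factors are natural targets for optimization.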

Design of Input/Output Scaling Factors.
As mentioned above, fuzzy system design involves three important components: the membership functions, scaling factors, and rule base. There is no standard and systematic method to adjust the shape and parameters of the membership functions and the rule base to achieve a desired performance. However, modification of the rule base may cause considerable step changes in the shape of the control surface, whereas modification of the membership function shape causes only local changes in the shape of the control surface [28]. Additionally, modifying many parameters at a time easily results in a high computation cost. So, the membership functions and the control rule base are usually designed subjectively [29]. Different experts may have different experiences, and it is difficult to achieve good control performance for a system whose scaling factors are obtained entirely from expert experience. In many cases, the adjustment (heuristic tuning) of scaling factors is done through trial and error or based on some training data. As a global heuristic search method, genetic algorithms have been used to tune scaling factors, which significantly simplifies the choice of the controller scaling factors for the defined control index [28]. Following this idea, the input/output scaling gains are optimized by the GA, which was first proposed by Holland and was inspired by natural population genetics to evolve solutions to problems [30]. It consists of a number of biologically inspired steps, as shown in Figure 5. A number of approaches are available for implementing each of these steps.

Coding Strategy.
The most widely used encoding method for classifiers is standard binary mapping. For the problem under consideration, each individual is represented by an $(n_1 + n_2 + n_3)$-bit-long chromosome, which comprises three decision variables that include the input/output gains. Each design variable is decoded as

$$\mathrm{var}_{\mathrm{dec}}(i) = \frac{\sum_{j=1}^{n_i} 2^{j-1}\,\mathrm{var}_{\mathrm{bin}}(j)}{2^{n_i} - 1}\left(\mathrm{var}^{L_i}_{\mathrm{dec}} - \mathrm{var}^{S_i}_{\mathrm{dec}}\right), \quad i = 1, 2, 3, \tag{4}$$

in which $\mathrm{var}^{L_i}_{\mathrm{dec}}$ and $\mathrm{var}^{S_i}_{\mathrm{dec}}$ are the upper and lower bound decimal values of the design variables, respectively, $\mathrm{var}_{\mathrm{bin}}(j)$ is the $j$th element of the parameter binary vector, and $\mathrm{var}_{\mathrm{dec}}(i)$ is the corresponding decimal value of the design variable. Table 2 lists the parameters of the coding strategy, selected by experience (other selections might be possible).
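The binary decoding described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the first bit is taken as the least significant, the lower bound is added as an offset (the usual convention, so that decoded values lie between the two bounds), and the function names and example bounds are hypothetical.

```python
def decode(bits, lower, upper):
    """Map a list of 0/1 bits to a decimal value between lower and upper."""
    n = len(bits)
    # Assumption: bit j (0-based) carries weight 2**j.
    integer = sum(b << j for j, b in enumerate(bits))
    # Assumption: the lower bound is added as an offset.
    return lower + integer / (2**n - 1) * (upper - lower)

def decode_chromosome(chromosome, lengths, bounds):
    """Split a chromosome into segments of the given lengths and decode each."""
    values, start = [], 0
    for n, (lower, upper) in zip(lengths, bounds):
        values.append(decode(chromosome[start:start + n], lower, upper))
        start += n
    return values
```

For example, `decode_chromosome(bits, [n1, n2, n3], bounds)` would return the three scaling factors encoded by one chromosome, given hypothetical segment lengths and bounds such as those listed in a coding table.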

Fitness Evaluation.
The fitness of a control design problem is a scalar measure of the overall performance of the controller, which indicates the quality of the solution that the chromosome values lead to. Based on these fitness values, the chromosomes that will be used to form the new generation are selected. A properly designed fitness function pulls the current state toward the desired state quickly without requiring too much computation time. The objective is to force the tracking errors to zero. Therefore, the fitness function is chosen as (5), where $e_1(i)$ and $e_2(i)$ denote the errors of the pendulum angle and angular velocity of the pendulum for the $i$th training sample, respectively; $N$ is the total number of training samples; $P$ is the number of calculated training samples, which satisfies $P < N$; $\alpha$ is a weighting factor. Note that at different stages of the GA, the individual fitness function needs to be expanded or reduced by incorporating a nonlinear transformation of the fitness function [31].

Genetic Operators.
To select individuals with high fitness to produce new individuals for the next population, the selection strategy uses the following steps.
Step 1. Select the $s_1$ chromosomes with the highest fitness values.
Step 2. Select $s_2$ chromosomes randomly based on a constructed random table and choose the one with the highest fitness value among the $s_1$ chromosomes.
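One possible reading of the two selection steps is sketched below: the `s1` fittest chromosomes are kept unchanged, and the remaining slots are filled by repeatedly drawing `s2` random candidates and keeping the fittest of them (a tournament). This tournament reading of Step 2, the use of Python's random generator in place of the constructed random table, and the function name are assumptions made for illustration.

```python
import random

def select(population, fitness, s1, s2, rng=random):
    """Return a new population of the same size as the input population."""
    order = sorted(range(len(population)), key=lambda k: fitness[k], reverse=True)
    # Step 1: keep the s1 chromosomes with the highest fitness values.
    chosen = [population[k] for k in order[:s1]]
    # Step 2 (assumed reading): draw s2 random candidates, keep the fittest.
    while len(chosen) < len(population):
        candidates = rng.sample(range(len(population)), s2)
        best = max(candidates, key=lambda k: fitness[k])
        chosen.append(population[best])
    return chosen
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes a run repeatable, which helps when comparing GA settings.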

Crossover and Mutation.
Crossover refers to information exchange between individuals in a population in order to produce new individuals. We adopted the standard single-point crossover method, which takes two input individuals, selects a random point with a probability $p_c$, and exchanges the subindividuals behind the selected point. Mutation is traditionally performed in order to increase the diversity of the genetic information so that local maxima can be avoided. The bitwise mutation method for changing a single element is implemented with a probability $p_m$. Table 3 lists the parameters of the GA over the problem configuration described above. Regarding the choice of these parameters [32-34], it is found that the larger the population size is, the higher the fitness value will be. The higher the value of $p_c$ is, the more quickly new solutions are introduced into the population. As $p_c$ increases, however, solutions can be disrupted faster than selection can exploit them. Typical values of $p_c$ are in the range of 0.5-1.0. For the parameter $p_m$, large values will transform the GA into a purely random search algorithm, whereas small values may cause the premature convergence of the GA to suboptimal solutions. Typically, the value of $p_m$ is chosen to be in the range of 0.005-0.1.
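The crossover and mutation operators described above can be sketched as follows for chromosomes stored as lists of 0/1 bits. The function names and the list representation are assumptions made for illustration; the probabilities `pc` and `pm` correspond to $p_c$ and $p_m$.

```python
import random

def single_point_crossover(a, b, pc, rng=random):
    """With probability pc, swap the tails of a and b behind a random point."""
    if len(a) < 2 or rng.random() >= pc:
        return a[:], b[:]                # no crossover: return copies of the parents
    point = rng.randrange(1, len(a))     # cut point strictly inside the chromosome
    return a[:point] + b[point:], b[:point] + a[point:]

def bitwise_mutation(bits, pm, rng=random):
    """Flip each bit independently with probability pm."""
    return [1 - b if rng.random() < pm else b for b in bits]
```

A large `pm` flips so many bits that the offspring no longer resemble their parents, which is the random-search behavior mentioned above.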

Remark 5.
According to (5), the fitness function is based on the combination of the mean-squared error and the mean-squared derivative error with a weighting factor. The choice of the weighting factor during the experimental design does affect the final decisions. As shown in Figure 6, the fitness value increases with the weighting factor. However, the relationship between the mean-squared errors and the mean-squared derivative errors does not appear to be very regular in Figures 7 and 8. It appears that a weighting factor $\alpha \le 0.05$ offers the smaller compromise between the mean-squared errors and the mean-squared derivative errors, so we choose $\alpha = 0.01$ as the weighting factor value.

Experimental Test
The behavior of the proposed model-free controller was investigated on a physical device. The hardware-in-the-loop controller consists of an RT-DAC/PCI motion card (which is inserted in the PC directly), a PIP terminal card, and VisSim software. Combined with the C function library and the Windows dynamic link library (DLL), the motion controller uses the PC as its host and communicates information over the PCI104 version of the bus. Our experimental steps are as follows.
Step 1. Optimize the input/output gains of the fuzzy controller using Matlab.
Step 2. Use the C programming language and DLL in the Windows environment to generate the corresponding fuzzycontroller.dll file.
Step 3. Input the fuzzycontroller.dll file into the VisSim software to control the PIP plant, as shown in Figure 9. The corresponding block (termed the fuzzycontroller block) is generated, as shown in Figure 10.
In this experimental application, the pendulum angle $\theta$ can be measured by an encoder in the plant, and the pendulum angular velocity $\dot{\theta}$ can be obtained by the VisSim software. Based on the two states, the PIP terminal board finally outputs the command voltage to the dc motor of the plant. The control system interface of the VisSim software is shown in Figure 10.
The integration computation is implemented by the Runge-Kutta method. The scaling factors with the highest fitness value are $\lambda_1 = 44.9939$, $\lambda_2 = 5.7387$, and $\lambda_3 = -70.5739$. The first video demonstrates that the pendulum can remain stationary for as long as possible (http://www.youtube.com/watch?v=v k 43Q9QVY). The second video demonstrates that the pendulum can remain upright after receiving stick punches (http://www.youtube.com/watch?v=HCD pnwR7g4). The third video demonstrates that the inverted pendulum will return to the upright position from arbitrary positions (http://www.youtube.com/watch?v=5roOm DXBS8). All of the videos illustrate the effectiveness of the simple model-free controller; the second and third videos in particular also validate the robustness of the controller.
Remark 6. As mentioned in Remark 1, several efforts have been devoted to the PIP [22,26]. Sliding mode control has been applied [22,25]. The control scheme in [26] consisted of a fuzzy swing-up controller, a fuzzy sliding balance controller, and a fuzzy energy compensation mechanism. However, all of these controllers were devised from preknown model knowledge with Lyapunov stability theory. The model-free controller in this paper does not involve any preknown model knowledge. It regards the nonlinear rotational dynamic behavior with uncertain disturbance as a whole process. Without preknown model knowledge or a complicated design process based on mathematical theory, the simple model-free controller may gain more extensive use. Unfortunately, like other controllers with GA optimization [11-16], it lacks strict mathematical support, especially a proof of convergence and stability. The mathematical theory of controllers based on GA optimization remains open for further exploration.

Conclusion
In this paper, a simple model-free controller for a PIP is designed. It consists of a swing-up controller and a fuzzy controller, neither containing any information about the plant. This is why we termed it a model-free strategy. The input/output scaling parameters of the controller are optimized by using the GA, which significantly simplifies the choice of these parameters for the defined controller.
The experimental results, shown in the figures and videos, demonstrate the robustness and effectiveness of the strategy.