On Multidimensional Linear Modelling Including Real Uncertainty

The theoretical background for the abstract formalization of vague phenomena in complex systems is fuzzy set theory. In this paper, vague data are defined as specialized fuzzy sets (fuzzy numbers), and a fuzzy linear regression model is described as a fuzzy function with fuzzy numbers as vague regression parameters. A genetic algorithm is used to identify the fuzzy coefficients of the model. The linear approximation of the vague function, together with its possibility area, is expressed analytically and graphically. Numerical experiments are performed on two tasks: modelling a two-dimensional fuzzy function and fuzzy regression analysis of a time series.


Introduction
Regression models are often used in engineering practice wherever several independent variables must be taken into account together with the effects of unmeasured disturbances and influences. Classical regression assumes that the relationship between the dependent and independent variables of the model is well defined and sharp. In the real world, however, this relationship is often more or less non-specific and vague. This is particularly true when modelling complex systems that are difficult to define or measure, or that incorporate a human element [8].
A suitable theoretical background for the abstract formalization of vague phenomena in complex systems is fuzzy set theory. In this paper, vague data are defined as specialized fuzzy sets (fuzzy numbers). A fuzzy linear regression model, a fuzzy function with fuzzy numbers as vague parameters, is then identified using genetic algorithms.

Ordinary Linear Model
The ordinary linear regression model of the investigated system [11] is given by a linear combination of the values of its input variables:

$$Y = A_0 + A_1 x_1 + \dots + A_n x_n,$$

where $(x_1, \dots, x_n)$ are input variables and $(A_0, A_1, \dots, A_n)$ are ordinary regression coefficients.
The conventional regression model assumes that the system characteristics are sharp and precise, and that deviations between the observed and estimated values of the dependent variables result from observation errors. In many cases, however, a significant part of these deviations originates in the poorly defined structure of the system itself: the system parameters are not sharp by nature. Such fuzziness must then be reflected in the fuzziness of the corresponding model parameters.

Uncertainty Interval Linear Model
The indeterminate regression model extends the ordinary model by formalizing uncertainty with numerical intervals [3], [9]. The regression coefficients are numeric intervals

$$A_i = [a_i - e_i,\ a_i + e_i],$$

where $a_i$ is the midpoint of the interval and $e_i$ is half of its width. For a one-dimensional function, the interval regression model is depicted in Fig. 1.
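As an illustration of how interval coefficients propagate to the output, the following minimal sketch evaluates a linear model whose coefficients are given by midpoints `a` and half-widths `e`. The function name and argument layout are assumptions made for this example; the paper does not prescribe an implementation.

```python
def interval_output(a, e, x):
    """Return the (lower, upper) bounds of Y for interval coefficients
    [a_i - e_i, a_i + e_i]. `x` holds x_1..x_n; the constant input
    x_0 = 1 is prepended here."""
    inputs = (1.0,) + tuple(x)
    # The midpoint of the output is the ordinary linear combination.
    center = sum(ai * xi for ai, xi in zip(a, inputs))
    # Half-widths add up, weighted by the absolute values of the inputs.
    half_width = sum(ei * abs(xi) for ei, xi in zip(e, inputs))
    return center - half_width, center + half_width


# One-dimensional example: A_0 = [0.5, 1.5], A_1 = [1.75, 2.25], x = -2.
lower, upper = interval_output((1.0, 2.0), (0.5, 0.25), (-2.0,))
```

The absolute value matters for negative inputs: multiplying an interval by a negative number swaps its endpoints but does not shrink its width.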

Interval Model Fuzzification
Regression models that reflect the vagueness of the modelled systems are called fuzzy regression models [10], [1], [8]. The indeterminate nature of a fuzzy regression model is represented by fuzzy output values and fuzzy regression coefficients in the form of specialized fuzzy sets, the fuzzy numbers. The fuzzy linear regression model has the form

$$Y = A_0 x_0 + A_1 x_1 + \dots + A_n x_n, \tag{5}$$

where $x_0 = 1$ and $A_0, A_1, \dots, A_n$ are fuzzy regression coefficients (fuzzy sets).

A fuzzy set $A$ is defined as a mapping that assigns to every element $x$ of the universe $X$ a number $\mu_A(x) \in [0, 1]$, its degree of membership in $A$ [6], [7]:

$$\mu_A : X \to [0, 1].$$

The function $\mu_A(x) = f(x)$, at least piecewise continuous, is called the membership function and defines the fuzzy set $A$ uniquely. In engineering practice the membership function is usually approximated by a piecewise linear function (Fig. 2).
The triangular fuzzy set $A$ then formalizes the uncertain number (fuzzy number) "about $x_2$". The degree of uncertainty of the number $x_2$ is given by the width of the support (carrier) of the fuzzy set $A$, the closed interval $[x_1, x_3]$ (Fig. 2). The parameters of such a fuzzy set form a structured vector of breakpoint values $[x_1, x_2, x_3]$, which is how fuzzy sets are represented in a computer.
In the fuzzy regression model, each fuzzy regression coefficient (fuzzy number) $A_i$ is defined by a triangular membership function $\mu_A(x)$ (Fig. 3), where $\alpha$ is the mean value (core) of the fuzzy number and $c$ is half of the width of its support, $A_i = \{\alpha_i, c_i\}$. The output variable $Y$ of the fuzzy regression model (Eq. (5)) is a fuzzy number defined by a triangular membership function (similar to Fig. 3), where $\beta$ is the mean value (core) of the fuzzy number $Y$ and $d$ is half of the width of its support, $Y = \{\beta, d\}$.
Fuzzy regression modelling (Eq. (5)) requires operations on fuzzy numbers. These operations rely on the relations of fuzzy arithmetic, which are derived from the extension principle.
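The two parameterizations used above, the breakpoint vector $[x_1, x_2, x_3]$ and the pair {core, half-width}, can be sketched in a few lines. The function names are hypothetical, and the sketch assumes $x_1 < x_2 < x_3$.

```python
def triangular_membership(x, x1, x2, x3):
    """Degree of membership of x in the triangular fuzzy number with
    breakpoints x1 < x2 < x3 (support [x1, x3], core x2)."""
    if x <= x1 or x >= x3:
        return 0.0
    if x <= x2:
        return (x - x1) / (x2 - x1)   # rising edge
    return (x3 - x) / (x3 - x2)       # falling edge


def from_core_width(alpha, c):
    """Convert the symmetric form {alpha, c} to breakpoints [x1, x2, x3]."""
    return (alpha - c, alpha, alpha + c)


# "About 2" with half-width 1: membership 1 at the core, 0.5 halfway out.
x1, x2, x3 = from_core_width(2.0, 1.0)
degree = triangular_membership(1.5, x1, x2, x3)
```

The breakpoint form also covers asymmetric triangles, while the {core, half-width} form used for the regression coefficients is always symmetric.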

Fuzzy Arithmetic Application
The extension principle allows operations on ordinary numbers to be transferred to operations on fuzzy numbers. It makes it possible to build a fuzzy arithmetic for computing with imprecise (fuzzy) numbers [2].
Consider universes $U$ and $V$ and a function $f$ that maps $U$ to $V$,

$$f : U \to V,$$

and a fuzzy set $A \subseteq U$. The fuzzy set $A$ then induces a fuzzy set $f(A)$ in $V$ whose membership function is defined by

$$\mu_{f(A)}(v) = \sup_{u \,:\, f(u) = v} \mu_A(u).$$

Using the extension principle, the arithmetic of fuzzy numbers can be defined [6]. Take the sum of two fuzzy numbers $\tilde m$ ("about $m$") and $\tilde n$ ("about $n$"), a relation needed for calculating the output value $Y$ (Eq. (5)):

$$\mu_{\tilde m \oplus \tilde n}(z) = \sup_{x, y \,:\, z = x + y} \min\bigl(\mu_{\tilde m}(x),\ \mu_{\tilde n}(y)\bigr). \tag{10}$$
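For symmetric triangular fuzzy numbers, the sup-min sum of Eq. (10) is again triangular, with the cores and the half-widths added. The sketch below checks this numerically by evaluating the supremum on a grid; the grid resolution and the function names are assumptions of this example, not part of the paper.

```python
def tri(x, center, half_width):
    """Membership of x in the symmetric triangular number {center, half_width}."""
    return max(0.0, 1.0 - abs(x - center) / half_width)


def sup_min_sum(z, m, n, steps=2000):
    """Grid approximation of sup over x + y = z of min(mu_m(x), mu_n(y)).
    Only x inside the support of m can give a non-zero value."""
    (cm, wm), (cn, wn) = m, n
    best = 0.0
    for k in range(steps + 1):
        x = cm - wm + 2.0 * wm * k / steps
        best = max(best, min(tri(x, cm, wm), tri(z - x, cn, wn)))
    return best


def closed_form_sum(m, n):
    """Sum of two symmetric triangular numbers: cores and half-widths add."""
    return (m[0] + n[0], m[1] + n[1])


m, n = (2.0, 1.0), (3.0, 0.5)            # "about 2" and "about 3"
center, half_width = closed_form_sum(m, n)
numeric = sup_min_sum(5.75, m, n)         # grid value of the sum at z = 5.75
analytic = tri(5.75, center, half_width)  # closed-form value at z = 5.75
```

The closed form is what makes the fuzzy model cheap to evaluate: the output of Eq. (5) for crisp inputs needs only sums of cores and of half-widths, not a grid search.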

Identification Method Description
The observed fuzzy number $Y^0_j$ is assumed to be triangular, with its half-width $d_j$ calculated for each observation $j = 1, 2, \dots, m$, where $m$ is the number of observations. Finding the values $\alpha_i$ and $c_i$, the parameters of the fuzzy regression coefficients $A_i$ (Fig. 3), is formulated as an optimization problem.
The fit of the fuzzy linear regression model to the given data is measured by the Bass-Kwakernaak index $H$ (Fig. 4) [4]. The adequacy of the observed and estimated values is conditioned by Eq. (12): the maximum intersection (consistency) of two fuzzy sets, the estimated $Y^*_j$ and the observed $Y^0_j$, must reach at least the set value $H$:

$$\max_y \min\bigl(\mu_{Y^*_j}(y),\ \mu_{Y^0_j}(y)\bigr) \ge H. \tag{12}$$

Only if the condition (Eq. (12)) is fulfilled is the estimate of the observed output value $Y^0_j$ considered good. For the chosen level $H$, the conditions (Eq. (13), Eq. (14)) are expressed through the bounds of the $H$-level intervals of $Y^*_j$ and $Y^0_j$ (Fig. 4), which follow from the triangular membership functions in Fig. 3. The requirement of adequacy between the estimated and observed values is complemented by the requirement of the minimum possible total uncertainty of the identified fuzzy regression function, where $i = 0, 1, \dots, n$ indexes the inputs of the regression function and $j = 1, 2, \dots, m$ indexes the observations.
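For two symmetric triangular fuzzy numbers, the consistency (the height of their intersection) has a simple closed form: the two facing edges cross at height $1 - |\beta_1 - \beta_2| / (d_1 + d_2)$. The following sketch computes it and compares it with the level $H$. It assumes symmetric triangles as in Fig. 3; the function names are hypothetical.

```python
def consistency(beta1, d1, beta2, d2):
    """Height of the intersection of two symmetric triangular fuzzy numbers
    {beta1, d1} and {beta2, d2}; 0 when their supports do not overlap."""
    total = d1 + d2
    if total == 0.0:
        # Two crisp numbers: fully consistent only if they coincide.
        return 1.0 if beta1 == beta2 else 0.0
    return max(0.0, 1.0 - abs(beta1 - beta2) / total)


def is_adequate(estimate, observation, h=0.5):
    """Check the consistency condition for one observation at level h."""
    return consistency(*estimate, *observation) >= h


# Estimated {0, 1} against observed {1, 1}: the edges cross at height 0.5.
ok = is_adequate((0.0, 1.0), (1.0, 1.0), h=0.5)
```

Written as an inequality, the same condition reads $|\beta_1 - \beta_2| \le (1 - H)(d_1 + d_2)$, which is linear in the half-widths.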
Many authors solve this constrained minimization problem by linear programming. In this paper, a genetic algorithm is used instead [4]. The main reason is that the authors focus on unconventional methods of artificial intelligence and aim to demonstrate their quality and efficiency in solving complex tasks. Genetic algorithms are representatives of evolutionary methods; their higher computational cost is nowadays offset by high-performance computing. They are widely used in the search for optimal solutions and are well suited to the identification of fuzzy regression models, where the task is to find the optimal fuzzy regression coefficients as triangular fuzzy numbers. The identification of the fuzzy regression coefficients (fuzzy numbers) $A_0, A_1, \dots, A_n$ was divided into two tasks:
• the identification of the mean value (core) $\alpha_i$ of the fuzzy number $A_i$, and
• the identification of $c_i$, half of the width of the support of $A_i = \{\alpha_i, c_i\}$.
The two tasks are solved in series by genetic algorithms: first $\alpha_i$ is identified, then $c_i$. The optimization of the fuzzy linear regression model is thus a two-step process that uses two genetic algorithms, designated GA1 and GA2. To identify the mean values (cores) $\alpha_i$ of the fuzzy numbers $A_i$, the genetic algorithm GA1 minimizes the fitness function $J_1$. To identify $c_i$, half of the width of the support of $A_i$, the genetic algorithm GA2 minimizes the fitness function $J_2$ subject to two constraints (Eq. (21)). The minimization of $J_2$ builds on the preceding step: it uses the already identified values $\alpha_i$ when determining the support half-widths $c_i$. The value $H = 0.5$, set by expert judgment, is used in the remainder of the paper.
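The exact forms of $J_1$ and $J_2$ are not reproduced in this text, so the following is only a sketch under stated assumptions: $J_1$ is taken to be the sum of squared deviations between the model cores and the observed values, and $J_2$ the total half-width of the model outputs plus a penalty for every observation that violates the consistency condition at level $H$. The penalty approach and all names are assumptions of this example.

```python
def model_output(alpha, c, x):
    """Core and half-width of Y* for crisp inputs x, with x[0] == 1."""
    beta = sum(a * xi for a, xi in zip(alpha, x))
    spread = sum(ci * abs(xi) for ci, xi in zip(c, x))
    return beta, spread


def j1(alpha, xs, ys):
    """Assumed J1: squared error between model cores and observed values."""
    return sum((y - sum(a * xi for a, xi in zip(alpha, x))) ** 2
               for x, y in zip(xs, ys))


def j2(c, alpha, xs, ys, ds, h=0.5, penalty=1e6):
    """Assumed J2: total output half-width, plus a penalty for each
    observation whose consistency with the estimate falls below h."""
    total = 0.0
    for x, y, d in zip(xs, ys, ds):
        beta, spread = model_output(alpha, c, x)
        total += spread
        # Consistency >= h  <=>  |beta - y| <= (1 - h) * (spread + d).
        if abs(beta - y) > (1.0 - h) * (spread + d):
            total += penalty
    return total


# Two observations of y = 1 + x with fuzzified half-widths d_j = 0.1.
xs, ys, ds = [(1.0, 1.0), (1.0, 2.0)], [2.0, 3.0], [0.1, 0.1]
step1 = j1((1.0, 1.0), xs, ys)              # cores fit exactly
step2 = j2((0.1, 0.1), (1.0, 1.0), xs, ys, ds)
```

The cores are fixed before `j2` is evaluated, which mirrors the serial order described above: the second step searches over the half-widths only.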

Genetic Algorithms Utilization
As mentioned above, the classical linear programming method for identifying fuzzy regression coefficients [11] is replaced here by a genetic algorithm (GA) [4]. The GA is an unconventional optimization method that minimizes a target function, the fitness function.
A GA is a search procedure that looks for the best solutions with respect to the fitness function. It is based on processes observed in nature: natural selection and the genetic operations of selection, crossover and mutation. A GA works with a character string, also called a chromosome, in which the parameters of the optimized model are stored. An example of a chromosome composed of three parameters $k_1$, $k_2$ and $k_3$, each expressed as a 5-bit binary word, is shown in Fig. 5.
The individual bits are the genes of the chromosome; at a particular optimization step, their values form the binary codes of the three model parameters. Each chromosome is evaluated by its fitness function value, which measures the distance of the solution represented by that chromosome from the optimal solution.
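The chromosome layout described above can be sketched as follows. The sketch assumes that each 5-bit word is an unsigned integer mapped linearly onto a parameter range `[lo, hi]`; that range and the function name are hypothetical, since the text does not state how the binary codes are scaled.

```python
def decode_chromosome(bits, n_params=3, word_len=5, lo=0.0, hi=31.0):
    """Split a bit string into n_params words of word_len bits and map
    each unsigned code linearly onto [lo, hi]."""
    if len(bits) != n_params * word_len:
        raise ValueError("chromosome length does not match n_params * word_len")
    max_code = 2 ** word_len - 1
    params = []
    for k in range(n_params):
        word = bits[k * word_len:(k + 1) * word_len]
        code = int(word, 2)  # binary word -> integer in 0..max_code
        params.append(lo + (hi - lo) * code / max_code)
    return params


# Three 5-bit words: 00000, 11111 and 10000.
k1, k2, k3 = decode_chromosome("000001111110000")
```

With 5 bits per parameter, each parameter can take only 32 distinct values, so the word length sets the resolution of the search.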
A set of $n$ evaluated chromosomes forms one population. The selection operation picks its best individuals (solutions) for the follow-up population. The selected individuals then undergo crossover, in which two individuals (parents) exchange segments of their gene strings and produce two new chromosomes (offspring) with different combinations of $k_1$, $k_2$ and $k_3$. The offspring form a new population, in which some individuals (solutions) may have better fitness function values than the best individual of the parent population. The procedure is then repeated for each follow-up population (solution step, iteration). Convergence towards an optimal individual (solution) is supported by a further genetic operation, mutation. The properties of selection, crossover and mutation are governed by the internal parameters of the genetic algorithm, which are set so that the convergence of the solution to the optimum is favorable.
The genetic algorithm usually terminates at the solution step (population) in which the fitness function values of the current best individual and of the best individual from the previous step differ by less than a specified limit (stop criterion). The best chromosome of the last population is then taken as the optimal solution, and the parameter values it encodes are used in the optimal model. The main tasks in designing a genetic algorithm are the choice of the method for encoding the optimized parameters into a chromosome string and the definition of the fitness function.
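The loop of selection, crossover, mutation and stop criterion can be sketched as below. This is not the authors' GA1 or GA2: truncation selection, one-point crossover, elitism, the parameter values and the `patience` counter are all assumptions chosen to keep the example short. Setting `patience=1` gives the stop criterion exactly as described in the text.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit strings at a random cut."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if rng.random() < rate else b for b in chromosome)


def run_ga(fitness, n_bits, pop_size=20, rate=0.02, tol=1e-9,
           patience=10, max_steps=200, seed=0):
    """Minimize `fitness` over bit strings of length n_bits."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    stalled = 0
    for _ in range(max_steps):
        # Selection: the better half of the population become parents.
        parents = sorted(population, key=fitness)[:pop_size // 2]
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in one_point_crossover(a, b, rng):
                children.append(mutate(child, rate, rng))
        population = children[:pop_size]
        new_best = min(population, key=fitness)
        improvement = fitness(best) - fitness(new_best)
        best = new_best
        # Stop criterion: best fitness changed by less than tol
        # for `patience` consecutive steps.
        stalled = stalled + 1 if improvement < tol else 0
        if stalled >= patience:
            break
    return best


def count_ones(chromosome):
    """Toy fitness: the number of 1-bits (minimum at the all-zero string)."""
    return chromosome.count("1")


best = run_ga(count_ones, n_bits=15)
```

Because of elitism, the best fitness value never gets worse from one population to the next, so the stop criterion compares a non-increasing sequence.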

Fuzzy Model Verification
To demonstrate the efficiency of the proposed method, a two-dimensional linear function was chosen. A set $Y^0$ with ten members was created using Eq. (19) and Eq. (20). For this set, the values of $x_1$ and $x_2$ were drawn randomly from the standard uniform distribution on the open interval (0, 1) and multiplied by a random integer. The observed values were fuzzified using $a = 0.1$. The result can be seen in Fig. 6 [5].
An application in economics is depicted in Fig. 7. The input data were the unemployment rate in the Czech Republic for the years 2009 to 2011.

Conclusion
Abstract mathematical models of complex systems are often inadequate because they do not accurately reflect the natural uncertainty and vagueness of the real world. A suitable theoretical background for the abstract formalization of vague phenomena in complex systems is fuzzy set theory, which was briefly described. In the paper, vague data are defined as specialized fuzzy sets (fuzzy numbers), and a fuzzy linear regression model is described as a fuzzy function with fuzzy numbers as vague parameters.

Fig. 7: Fuzzy linear regression function for unemployment in the Czech Republic.
Interval and fuzzy regression techniques are discussed, and a linear fuzzy regression model is proposed. An effective genetic algorithm is used instead of the commonly used linear programming method to identify the fuzzy regression coefficients of the model. A two-dimensional numerical example and a practical application in economics are presented, and the possibility area of the vague model is illustrated graphically. Future research will focus on modelling vague non-linear systems.

Faculty of Electrical Engineering and Computer Science, Department of Cybernetics and Biomedical Engineering. Between 1964 and 1971 he worked as a researcher at the Research Institute of Metallurgy VZKG Ostrava-Vitkovice, and between 1972 and 1992 as a researcher and head of development at VUHZ, Research Institute of Metallurgy.