Algorithm for Configuring Fuzzy Inference Systems by Reference Points Based on the Average Value

This article proposes an algorithm for configuring Sugeno-type fuzzy inference systems from statistical data. The algorithm selects an area around each reference point, finds the average value of the data in the selected area, and uses it to configure the fuzzy inference system. The behavior of the algorithm is studied while varying the number of membership functions of the input variables and the number of statistical data points on which the models are configured.


Introduction
Fuzzy inference systems are widely used for solving problems of automatic control and decision support. One of the reasons is that they can approximate a continuous function f: X→Y with a given accuracy if its domain X is a compact set [1,2]. The main limitation of using fuzzy inference systems in practice is that the number of rules required to build a complete fuzzy inference system grows exponentially and is bounded from above by the value $k^n$ (where $n$ is the number of input variables and $k$ is the maximum number of membership functions (MF) of an input variable).
Many authors [2,3] have studied methods and algorithms for configuring fuzzy inference systems based on splitting the domain of the approximated function into n-dimensional hyper-rectangular sections by the membership functions of the input variables. However, the known works focus on the theoretical possibility of such a configuration and pay insufficient attention to the algorithms that implement it. This matters most when the fuzzy inference system contains a large number of inference rules (more than 100), which makes manual configuration by experts difficult.
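The growth of the rule base can be illustrated with a short calculation. The sketch below only evaluates the upper bound $k^n$ stated above; the function name `max_rule_count` is a hypothetical helper introduced for illustration.

```python
def max_rule_count(k: int, n: int) -> int:
    """Upper bound on the number of rules of a complete FIS: k**n,
    with k membership functions per input and n input variables."""
    return k ** n


if __name__ == "__main__":
    # With 5 membership functions per input, the bound grows quickly with n.
    for n in (1, 2, 3, 4):
        print(n, max_rule_count(5, n))
```

With 5 membership functions per input the bound already exceeds 100 rules at three input variables (125 rules).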
The article presents the results of theoretical and experimental studies of configuring Sugeno type fuzzy inference systems by reference points based on statistical data, since Sugeno type fuzzy inference systems (FIS) are widely used in practice [3,4].

Problem statement
The configuration algorithm proposed in the article applies to Sugeno-type fuzzy inference systems in which each membership function has exactly one point where it reaches its maximum value (in the literature this point is called the core of the membership function), and the value at this point equals 1. The domain of each input variable is split into $m_i$ segments of equal length by reference points $x_{ij}$ (where $i$ is the index of the input variable and $j$ is the index of the reference point; the total number of reference points for the $i$-th input variable is $m_i + 1$). Each reference point $x_{ij}$ corresponds to exactly one membership function of the $i$-th input variable, with its core at $x_{ij}$. There are no restrictions on the form of the membership functions, but it is recommended that they meet condition (1), where $\mu_{ij}(x)$ is the $j$-th membership function of the $i$-th input variable.
Under these assumptions, the domain of the function modeled by an FIS with a complete, consistent set of rules is split by the cores of the membership functions into n-dimensional (hyper-)rectangles [3], where n is the number of input variables (with one input variable the domain is split into line segments). The article considers FIS with triangular membership functions (MF) of the input variables, since triangular MF are the simplest, the most frequently used, and satisfy condition (1). An example of splitting a three-dimensional space with two input variables is shown in Fig. 1.
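The construction of reference points and membership functions with cores at those points can be sketched as follows. This is a minimal illustration, assuming triangular membership functions whose half-width equals the spacing between neighbouring reference points; the function names are hypothetical and not taken from the article.

```python
def reference_points(lo: float, hi: float, m: int) -> list[float]:
    """Split [lo, hi] into m equal segments, giving m + 1 reference points."""
    step = (hi - lo) / m
    return [lo + j * step for j in range(m + 1)]


def triangular_mf(x: float, core: float, width: float) -> float:
    """Triangular MF: equals 1 at the core and falls to 0 at core +/- width."""
    return max(0.0, 1.0 - abs(x - core) / width)


def memberships(x: float, points: list[float]) -> list[float]:
    """Membership grades of x for the MFs centred at each reference point.
    Assumes uniformly spaced points, so the spacing is used as the width."""
    width = points[1] - points[0]
    return [triangular_mf(x, core, width) for core in points]


if __name__ == "__main__":
    pts = reference_points(0.0, 10.0, 4)   # five reference points
    print(pts)
    print(memberships(5.0, pts))           # grade 1 only for the MF with core 5.0
    print(memberships(3.0, pts))           # two neighbouring MFs are active
```

At a reference point exactly one membership function has grade 1 and all others have grade 0; between two neighbouring cores only the two adjacent functions are non-zero.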
Fig. 1. Example of splitting a three-dimensional space with two input variables with a uniform distribution of membership functions. $x_1$, $x_2$: domains of the first and second input variables; $x_{11}, \dots, x_{15}$ and $x_{21}, \dots, x_{25}$: reference points; $y$: the output value of the fuzzy inference system (the FIS output is marked at the reference points); $\mu(x_1)$, $\mu(x_2)$: the membership grades of the first and second input variables.
In a zero-order Sugeno-type fuzzy inference system, defuzzification (obtaining the resulting value) is performed using the formula
$$y = \frac{\sum_{i} w_i b_i}{\sum_{i} w_i}, \qquad (2)$$
where $y$ is the output value of the fuzzy inference system, $w_i$ is the degree of fulfillment of the premises of the $i$-th rule, and $b_i$ is the value of the conclusion of the $i$-th rule for a Sugeno-type FIS with a constant function in the conclusion. In this case, the fuzzy inference system can be configured to produce a specified output value $y'$ for specified values of the input variables $[x'_1, x'_2, \dots, x'_n]$ by changing the conclusions of the fuzzy inference rules using formula (3), where $b'_i$ is the corrected value of the conclusion of the $i$-th rule. This formula is used to configure the FIS at the reference points, which correspond to the cores of the membership functions that split the domain into n-dimensional (hyper-)rectangles.
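The zero-order Sugeno output, $y = \sum_i w_i b_i / \sum_i w_i$, can be sketched in a few lines. The code below is an illustration under stated assumptions, not the article's implementation: triangular membership functions on uniformly spaced reference points, one rule per combination of membership functions, and the product as the operator that combines membership grades into the rule weight $w_i$. All names are hypothetical.

```python
from itertools import product


def tri(x: float, core: float, width: float) -> float:
    """Triangular MF with value 1 at the core and 0 at core +/- width."""
    return max(0.0, 1.0 - abs(x - core) / width)


def sugeno_output(x, cores, b):
    """Zero-order Sugeno output y = sum(w_i * b_i) / sum(w_i).

    x     -- input vector, one value per input variable (inside the domain)
    cores -- per-input lists of uniformly spaced reference points
    b     -- dict: tuple of MF indices (one per input) -> rule conclusion b_i
    """
    grades = [[tri(xi, c, pts[1] - pts[0]) for c in pts]
              for xi, pts in zip(x, cores)]
    num = den = 0.0
    for idx in product(*(range(len(pts)) for pts in cores)):
        w = 1.0
        for dim, j in enumerate(idx):
            w *= grades[dim][j]          # product of grades (assumption)
        num += w * b[idx]
        den += w
    return num / den


if __name__ == "__main__":
    cores = [[0.0, 5.0, 10.0], [0.0, 5.0, 10.0]]
    b = {idx: 0.0 for idx in product(range(3), range(3))}
    b[(1, 2)] = 7.0                      # conclusion of the rule at (5, 10)
    print(sugeno_output((5.0, 10.0), cores, b))   # at the reference point
    print(sugeno_output((2.5, 10.0), cores, b))   # halfway between two cores
```

Under these assumptions exactly one rule fires with weight 1 at a reference point, so the output there equals that rule's conclusion. This is why adjusting a single conclusion is enough to set the FIS output at a reference point.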

Algorithm description
The algorithm calculates the average value of the statistical data points in the neighborhood of each reference point and applies the result to configure the fuzzy inference system. A graphical representation of the principle of the algorithm is shown in Fig. 2. The figure shows the reference points, the boundaries of the statistical data selection interval used for adjustment around each reference point, and the statistical data points.
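The averaging principle can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes a rectangular selection interval with half-widths `h` around each reference point, widens an empty interval by the same `h` until it contains data or exceeds the domain of one input, and stores the mean directly as the rule conclusion (valid when exactly one rule fires at a reference point). All names are hypothetical.

```python
from itertools import product
from statistics import mean


def configure_by_reference_points(cores, data, h):
    """Return {index tuple: y'} with the mean of the data near each reference point.

    cores -- per-input lists of reference points
    data  -- list of (x, y) samples, x being a tuple with one value per input
    h     -- initial half-width of the selection interval for each input (> 0)
    """
    if any(hd <= 0 for hd in h):
        raise ValueError("half-widths must be positive")
    spans = [pts[-1] - pts[0] for pts in cores]
    conclusions = {}
    for idx in product(*(range(len(pts)) for pts in cores)):
        ref = [cores[d][j] for d, j in enumerate(idx)]
        half = list(h)
        while True:
            # Select the samples inside the current interval around the point.
            selected = [y for x, y in data
                        if all(abs(xd - rd) <= hd
                               for xd, rd, hd in zip(x, ref, half))]
            too_wide = any(2 * hd > s for hd, s in zip(half, spans))
            if selected or too_wide:
                break
            # Widen every half-width by its initial value (assumed step).
            half = [hd + step for hd, step in zip(half, h)]
        if selected:                      # points without data stay unset
            conclusions[idx] = mean(selected)
    return conclusions


if __name__ == "__main__":
    cores = [[0.0, 5.0, 10.0]]
    data = [((0.2,), 1.0), ((0.4,), 3.0), ((5.1,), 10.0), ((9.0,), 4.0)]
    print(configure_by_reference_points(cores, data, [0.5]))
```

In the usage example the interval around the reference point 10.0 is empty at half-width 0.5 and is widened once before the sample at 9.0 is picked up.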
Statistical data points that fall within the interval boundaries are used to form the values of the modeled function at the reference points, which are then used to configure the fuzzy inference system.
4. If there are no points within the given boundaries, increase the half-width of the boundary interval by $H_i$ (in this article $H_i$ equals the value of $h_i$ at the first step) and repeat steps 3 and 4 until the boundaries contain one or more statistical data points or the interval becomes larger than the domain of one of the input variables.
5. Calculate the mean value $y'$ of the selected statistical data points.
6. Configure the fuzzy inference system: set the FIS output value to $y'$ at the current reference point using formula (3).
7. If all reference points have been processed, stop. Otherwise take the next reference point and go to step 3.
Let us show the convergence of the algorithm by proving the following theorem.
Theorem: Let $f(x)$ be a continuous function, where $x \in X$ and $X$ is a compact set in $R^n$; let $g(x)$ be the approximation of $f(x)$ by an FIS with $K$ membership functions for each of the $n$ input variables; let $H(K, N)$ be the procedure realized by the proposed algorithm for configuring the FIS based on $N$ points that are realizations of a random variable whose mathematical expectation at the point $x$ equals $f(x)$; and let $h$ be the point selection interval used to set up the FIS at the reference points during the operation of the proposed algorithm.
It is required to prove that there exists an FIS that, after performing the procedure $H(K, N)$, approximates the function $f(x)$ with the specified accuracy $\varepsilon$ as $N \to \infty$, $h \to 0$.
Proof: In accordance with the theorem proved by B. Kosko [1], the fuzzy inference system that realizes the function $g(x)$ can approximate any continuous function $f(x)$ with the specified accuracy $\varepsilon$ if $x \in X$ and $X$ is a compact set in $R^n$. Consequently, it is possible to configure the FIS so that the condition $|f(x) - g(x)| < \varepsilon$ holds for an arbitrarily small constant $\varepsilon > 0$. In accordance with this theorem, $\varepsilon \to 0$ as $K \to \infty$. The proposed algorithm uses the formula for precise tuning of the FIS at the reference points. To configure the FIS output value at each reference point $x' \in X$, adjustment points from the interval $x \in [x' - h, x' + h]$ are used, whose values are realizations of the random variable. The cores of the membership functions of the input variables of the FIS are located at the reference points. When the number of membership functions tends to infinity, the number of reference points also tends to infinity. One of the properties of a compact set $X$ is that $X$ is bounded. Therefore, when the number of reference points tends to infinity, the distance between them tends to zero. If the FIS is able to approximate the function with the given accuracy, which is ensured by a sufficient number of reference points, then as the interval decreases, $h \to 0$, and the number of points increases, $N \to \infty$, every reference point has an infinite number of points available for configuring the FIS, and the sample average calculated by the algorithm tends to the true value of the mathematical expectation of the random variable. In this case condition (5) is met, and as a consequence we obtain expression (4). The theorem is proved.
Consequence: the algorithm allows the FIS to be configured to approximate the function with a given accuracy $\varepsilon$ if the structure of the FIS allows it to approximate this function with that accuracy and the number of adjustment points $N$ for each reference point meets the requirement
$$N \ge \left\lceil \left( \frac{6\sigma}{\varepsilon} \right)^{2} \right\rceil, \qquad (6)$$
where $\sigma$ is the standard deviation of the random variable and $\lceil \cdot \rceil$ is the operation of rounding up.
The inequality is obtained in accordance with the central limit theorem. The standard deviation $\sigma'$ of the distribution of the mean of a random sample of $N$ realizations of the random variable equals
$$\sigma' = \frac{\sigma}{\sqrt{N}}. \qquad (7)$$
If we require that
$$6\sigma' \le \varepsilon, \qquad (8)$$
then for 99.99966% of samples the deviation of the sample mean from the mathematical expectation will be less than $\varepsilon$. Substituting equation (7) into (8) yields formula (6).
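The sample-size requirement can be evaluated numerically. The sketch below assumes that the requirement reads $6\sigma' \le \varepsilon$ with $\sigma' = \sigma/\sqrt{N}$, which gives $N \ge \lceil (6\sigma/\varepsilon)^2 \rceil$; the function name and the example values are illustrative only.

```python
import math


def required_points(sigma: float, eps: float) -> int:
    """Smallest N with 6 * sigma / sqrt(N) <= eps, i.e. ceil((6*sigma/eps)**2)."""
    if sigma < 0 or eps <= 0:
        raise ValueError("sigma must be non-negative and eps positive")
    return math.ceil((6.0 * sigma / eps) ** 2)


if __name__ == "__main__":
    # Example values chosen for illustration: sigma = 1.3, eps = 0.5.
    print(required_points(1.3, 0.5))
```

Because the bound scales with $1/\varepsilon^2$, halving the target accuracy $\varepsilon$ roughly quadruples the number of adjustment points needed at each reference point.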
These points were specified by random pairs of input variables $(x_{1i}, x_{2i})$ with $x_1 \in [0, 10]$ and $x_2 \in [0, 10]$, where the index $i$ runs over the generated points. For each pair $(x_{1i}, x_{2i})$ the value $y_i$ was generated from a normal distribution with the mean calculated by formula 4 and a standard deviation of 1.3. Fig. 3 shows the base function and the 3000 points generated with the mean value given by the base function.
Fig. 3. a) Base function graph; b) the 3000 points generated with the mean value given by the base function.
The algorithm was evaluated by comparing an FIS configured by the proposed algorithm on the generated statistical data set (hereafter FIS-a), an FIS configured using formula 3 on the base function values at the reference points (hereafter FIS-b), and an FIS configured using the adaptive neuro-fuzzy inference system (ANFIS) [6] (hereafter FIS-c). The evaluation criteria were the FIS configuration runtime and the mean and maximum errors of the FIS outputs. Graphs of the configuration runtime and of the mean and maximum errors as functions of the number of membership functions of the input variables are shown in Fig. 4, and the calculation results are given in Table 2.
The obtained data allow us to conclude that as the number of membership functions increases, and with it the number of reference points, the average output error of the FIS configured using the proposed algorithm (FIS-a) decreases. The average error of FIS-a is comparable to the error of the FIS configured on the exact values of the modeled function (FIS-b). At the same time, the maximum error of FIS-a can be several times higher than the maximum error of FIS-b, which can be explained by individual "peaks" of maxima or minima corresponding to outliers. Examples of surfaces plotted using the FIS-a and FIS-b outputs with 15 membership functions for each input variable are shown in Fig. 5.
FIS-a was configured using 3000 points to obtain the data plotted in Fig. 5.
Table 3.
… 1.92 1.76 1.78 1.69 1.82 1.68 1.68 1.64 1.65 1.68 1.66 1.67 1.66 1.63
10 5.03 2.17 1.79 1.52 1.50 1.61 1.49 1.50 1.48 1.44 1.47 1.46 1.48 1.46 1.45 1.43
11 4.95 1.98 1.65 1.50 1.41 1.42 1.40 1.36 1.33 1.33 1.34 1.27 1.28 1.26 1.25 1.27
12 4.89 1.90 1.47 1.31 1.31 1.26 1.22 1.29 1.23 1.18 1.08 1.11 1.14 1.12 1.12 1.11
13 4.90 2.07 1.46 1.23 1.26 1.21 1.14 1.11 1.12 1.07 1.07 1.05 1.05 1.01 1.01 0
The results show that the mean error decreases as the number of random points used for configuring the FIS and the number of membership functions increase. It always remains greater than the mean error of the FIS configured on the exact values of the function at the reference points, but approaches it asymptotically. The decrease of the mean error is not linear. For the example considered in the article, increasing the number of membership functions beyond 17 with more than 1600 adjustment points reduces the error by no more than 10% relative to the maximum and minimum error values.
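An experiment of this kind could be set up along the following lines. The sketch is illustrative only: the base function `base` is a placeholder and not the article's function, uniform sampling of the inputs is an assumption, and the mean and maximum absolute errors on a regular grid are assumed forms of the error criteria. All names are hypothetical.

```python
import random


def base(x1: float, x2: float) -> float:
    """Placeholder base function (NOT the one used in the article)."""
    return x1 + x2


def generate_points(f, n_points, sigma=1.3, lo=0.0, hi=10.0, seed=0):
    """Random input pairs in [lo, hi]^2 (uniform sampling is an assumption)
    with outputs drawn from a normal law with mean f(x1, x2) and std sigma."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_points):
        x = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        data.append((x, rng.gauss(f(*x), sigma)))
    return data


def error_stats(model, f, grid_steps=50, lo=0.0, hi=10.0):
    """Mean and maximum absolute error of model against f on a regular grid
    (an assumed form of the evaluation criteria)."""
    errors = []
    for a in range(grid_steps + 1):
        for b in range(grid_steps + 1):
            x1 = lo + (hi - lo) * a / grid_steps
            x2 = lo + (hi - lo) * b / grid_steps
            errors.append(abs(model(x1, x2) - f(x1, x2)))
    return sum(errors) / len(errors), max(errors)


if __name__ == "__main__":
    data = generate_points(base, 3000)
    print(len(data))
    # A model with a constant offset of 1 has mean and max error close to 1.
    print(error_stats(lambda a, b: base(a, b) + 1.0, base))
```

Fixing the random seed keeps the generated data set reproducible, which makes runs with different numbers of membership functions comparable.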

Conclusions
The algorithm considered in the article makes it possible to configure fuzzy inference systems based on statistical data. The advantage of the proposed algorithm is its relatively high speed compared to ANFIS, which can be important when configuring FIS with a large number of input variables and membership functions. As a further development of the topic, it is planned to extend the algorithm by varying the size of the intervals between reference points in order to achieve more accurate adjustment in areas with a large amount of statistical data.

Funding
This work was supported by the grant of the President of the Russian Federation for state support of leading scientific schools of the Russian Federation NSH-2553.2020.8.

Ethics declarations
Conflict of interest: The authors declare that they have no conflict of interest.
Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.