THE EVOLVING ADAPTIVE NEURAL NETWORK FOR DATA PROCESSING WITH MISSING OBSERVATIONS

The problem of synthesizing computational intelligence systems that operate in on-line mode and can process stochastic signals with missing observations in the data is considered. An adaptive approach based on orthogonal polynomials is developed.


INTRODUCTION
Artificial neural networks, neuro-fuzzy systems, and hybrid computational intelligence systems are currently widespread in data processing, Data Mining, prediction, identification, and control of nonlinear stochastic and chaotic objects and systems [1-5].
The most attractive properties of these systems are their universal approximation capability and their learning ability, usually understood as the possibility of tuning the parameters by optimizing some quality criterion (objective function, learning criterion). In a wider sense, the system architecture can be tuned as well. A number of approaches currently exist, but the most widely used is the so-called constructive approach, in which a computational intelligence system starts with the simplest architecture and gradually increases its complexity while tuning its parameters, until the desired quality of the solution is achieved. This approach has formed a new direction in computational intelligence known as evolving systems [6,7]. At the same time, most of these systems process information in batch mode, which makes them difficult to use when the data to be processed arrive on-line in the form of a time series.
In many practical applications involving the processing of real data sequences, some observations of the monitored time series are, for whatever reason, missing (lost). For the normal operation of a neural network or a hybrid system, these missing data must somehow be restored. The problem of restoring missing values has received considerable attention [8-10], and neural networks have proved to be among the most effective tools for it [11-14]. However, the known approaches to restoring missing values in time series are effective only when the whole data set is given a priori, the number of missing values is known, and the time series has a fixed number of observations. Naturally, in problems where data arrive for processing in real time, the number of missing values is unknown beforehand and the sequence is nonstationary, so the known approaches are ineffective.
Therefore, the problem of synthesizing computational intelligence systems that operate in on-line mode and can process stochastic signals with missing observations in the data is both interesting and practically useful.

SEQUENTIAL RESTORATION OF MISSING OBSERVATIONS IN THE TIME SERIES
The proposed approach is based on the use of classical orthogonal polynomials [15], first of all Chebyshev polynomials (T-systems) [16,17], which have a number of useful properties for the approximation problem under a quadratic criterion [16].
The Chebyshev polynomials $T_l(x)$ are defined on the interval $[-1, 1]$ either explicitly, $T_l(x) = \cos(l \arccos x)$, or by the recurrence relation

$T_{l+1}(x) = 2x T_l(x) - T_{l-1}(x)$,  $T_0(x) = 1$,  $T_1(x) = x$.

In some situations it is easier to solve the problem on the interval $x \in [0, 1]$, for which the shifted (biased) Chebyshev polynomials $T_l^*(x) = T_l(2x - 1)$ can be used; shifted polynomials for an arbitrary interval are written just as easily, and the achievable approximation accuracy can be estimated from the residual variance. In this paper we assume that at some moments of discrete time $k$ the measurements either have not been made or have been lost. Let us introduce two subsets of time moments: $\bar K$, containing the moments with available observations, and $\tilde K$, containing the moments with missing observations.
Then the coefficients of the approximating polynomial can be calculated from the available observations only, using the fact that the corresponding polynomials are orthogonal on the set of observed moments. After this it is easy to restore the missing observations as the values of the approximating polynomial at the corresponding moments, so that a complete sequence of all $N$ values $\hat x_1, \hat x_2, \ldots, \hat x_N$ can be formed. Next, let us introduce the vector of basis functions $\varphi(k) = (T_0(k), T_1(k), \ldots, T_h(k))^T$ and the parameter vector $w = (w_0, w_1, \ldots, w_h)^T$, where $w$ is calculated by the standard least-squares method

$w = \Big( \sum_{k=1}^{N} \varphi(k) \varphi^T(k) \Big)^{-1} \sum_{k=1}^{N} \varphi(k) \hat x_k$,   (6)

after which we can finally write

$\hat x_k = w^T \varphi(k)$.   (7)

Using relations (6) and (7), processing is realized in batch mode for a fixed number of points $N$. If the data arrive sequentially, on-line processing has to be organized. In [17] it was proposed to use the recurrent least-squares method for this purpose,

$w(N+1) = w(N) + \dfrac{P(N) \varphi(N+1) \big( x_{N+1} - w^T(N) \varphi(N+1) \big)}{1 + \varphi^T(N+1) P(N) \varphi(N+1)}$,   (8)

but it should be noted that with the arrival of the new $(N+1)$-th observation $x_{N+1}$ the structure of the approximating polynomials essentially changes, since the set on which they must be orthogonal grows. This fact significantly complicates the realization of on-line processing. To keep the structure of the approximating polynomials fixed, data processing can be organized on a sliding window of $s$ observations: at each moment $k$ we use the computer time $1, 2, \ldots, s$, connected with real time so that the calculations involve only the last $s$ observations. The estimate of type (6) and the estimates of the missing observations are then recomputed at each window position. Implementation of the described approach is very convenient within the framework of artificial neural networks, using the ortho-synapse [18] shown in Fig. 1 and trained by the sliding-window algorithm (10). In its structure the ortho-synapse coincides with the nonlinear synapse of the neo-fuzzy neuron [19,20], but instead of membership functions it contains the orthogonal activation functions $T_l$, which makes the learning process easier and quicker. It is also important that, due to the use of a sliding window, these activation functions do not change during learning.
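The restoration step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`chebyshev_basis`, `restore_missing`), the encoding of missing observations as `np.nan`, and the mapping of window positions onto $[-1, 1]$ are assumptions made for the example. The least-squares fit uses only the observed moments of the window, and the fitted Chebyshev expansion supplies the missing values.

```python
import numpy as np

def chebyshev_basis(t, h):
    """Values T_0(t)..T_h(t) via the recurrence T_{l+1} = 2 t T_l - T_{l-1}."""
    T = np.empty((h + 1,) + np.shape(t))
    T[0] = 1.0
    if h >= 1:
        T[1] = t
    for l in range(1, h):
        T[l + 1] = 2.0 * t * T[l] - T[l - 1]
    return T

def restore_missing(x, h=3):
    """Fit a degree-h Chebyshev expansion to the observed points of a
    window (missing points encoded as np.nan) and fill the gaps with
    the fitted values."""
    x = np.asarray(x, dtype=float)
    s = len(x)
    t = np.linspace(-1.0, 1.0, s)          # window positions mapped to [-1, 1]
    observed = ~np.isnan(x)                 # the subset of available moments
    Phi = chebyshev_basis(t, h).T           # (s, h+1) design matrix
    # least squares over the observed moments only
    w, *_ = np.linalg.lstsq(Phi[observed], x[observed], rcond=None)
    x_hat = Phi @ w                         # polynomial values at all moments
    return np.where(observed, x, x_hat), w
```

When the underlying signal within the window is itself a low-degree polynomial, the restored values coincide with the true ones; for real signals they are least-squares approximations.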
The size of the sliding window is selected from empirical considerations as $s \geq h + 1$. If the approximated sequence is nonstationary, the window size should not significantly exceed the number of estimated parameters.

ARCHITECTURE OF ORTHOGONAL NEURAL NETWORK
Using orthogonal polynomials as activation functions has led to a whole group of orthogonal neural networks [21-29] possessing good approximating properties and fast learning of synaptic weights. In [30-33], growing orthogonal networks based on ortho-synapses and ortho-neurons [18] have been proposed. These networks are characterized by the simplicity of learning both the synaptic weights and the architecture. Fig. 2 shows the architecture of an orthogonal neural network for processing data with lost observations. It implements a nonlinear mapping of input vectors $x_k$ that contain an a priori unknown number of missing values; note that the external training signal $y_k$ can contain gaps as well. The learning error $e_k = y_k - \hat y_k$ is used by the learning algorithm to tune both the weights and the architecture. The present architecture contains $2n + 1$ ortho-synapses and $(h + 1)(2n + 1)$ synaptic weights to be estimated; it is very important that the output signal $\hat y_k$ depends linearly on these weights.
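A compact sketch of the forward pass may clarify the structure. It assumes, for illustration only, that each ortho-synapse is a weighted sum of Chebyshev activation functions of a single input and that the network output is the sum of its synapse outputs; the class names `OrthoSynapse` and `OrthoNetwork` are hypothetical, not from the paper.

```python
import numpy as np

def cheb(x, h):
    """Chebyshev values T_0(x)..T_h(x) on [-1, 1] via the recurrence."""
    x = np.asarray(x, dtype=float)
    T = [np.ones_like(x), x]
    for l in range(1, h):
        T.append(2.0 * x * T[l] - T[l - 1])
    return np.stack(T[:h + 1])

class OrthoSynapse:
    """One ortho-synapse: a weighted sum of h+1 orthogonal activation
    functions -- structurally like the nonlinear synapse of a neo-fuzzy
    neuron, but with Chebyshev polynomials instead of membership functions."""
    def __init__(self, h):
        self.h = h
        self.w = np.zeros(h + 1)
    def __call__(self, x):
        return self.w @ cheb(x, self.h)

class OrthoNetwork:
    """Output layer of m ortho-synapses (m = 2n+1 in the paper's
    architecture); the output is linear in all (h+1)*m weights, which is
    what makes least-squares learning straightforward."""
    def __init__(self, m, h):
        self.synapses = [OrthoSynapse(h) for _ in range(m)]
    def __call__(self, x):
        return sum(s(xi) for s, xi in zip(self.synapses, x))
```

Because the output is linear in the weights, training the whole layer reduces to a single linear least-squares problem.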

SYNAPTIC ADAPTATION
Let us introduce the standard quadratic learning criterion

$E_k = \frac{1}{2} e_k^2 = \frac{1}{2} (y_k - \hat y_k)^2 = \frac{1}{2} (y_k - w^T \varphi_k)^2$,

where $\varphi_k$ is the vector of the values of all orthogonal activation functions at moment $k$. To tune the vector of synaptic weights $w$ in sequential mode, the recurrent least-squares method (8) can be used; for processing non-stationary signals, it is preferable to use adaptive procedures with tracking properties, such as the least-squares method on a sliding window, which in this situation can be written as

$w(N) = P^s(N) \sum_{k=N-s+1}^{N} \varphi_k y_k$,  $P^s(N) = \Big( \sum_{k=N-s+1}^{N} \varphi_k \varphi_k^T \Big)^{-1}$.   (14)

Due to the diagonality of the matrix $P^s(N)$, which follows from the orthogonality of the activation functions, the calculation of estimate (14) does not cause difficulty even for large $n$ and $h$.
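The sliding-window estimate of this kind reduces to one small linear solve per window position. A sketch, with variable names assumed for the example:

```python
import numpy as np

def sliding_window_ls(Phi, y, s):
    """Least-squares weights re-estimated on the last s samples.
    Phi: (N, p) matrix of activation-function values, y: (N,) targets.
    When the activation functions are orthogonal on the window, the
    normal matrix Phi_s.T @ Phi_s is diagonal and the solve is trivial."""
    Phi_s, y_s = Phi[-s:], y[-s:]
    P = Phi_s.T @ Phi_s                    # diagonal for an orthogonal basis
    return np.linalg.solve(P, Phi_s.T @ y_s)
```

With a diagonal normal matrix each weight is simply a projection coefficient, so the cost per step stays linear in the number of weights.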

ARCHITECTURAL ADAPTATION
The number of activation functions $h + 1$ in each ortho-synapse $OS_i$ of the output layer is chosen rather arbitrarily, so if it is found that the synthesized neural network does not provide the required quality of information processing, the number of these functions can be increased (or decreased, if necessary) in on-line mode, directly during learning. The network thus acquires evolving properties through adaptation of its structure; due to the orthogonality of the activation functions, the correction of the synaptic weights after a structural change is performed simply. Architectural adaptation is obviously more difficult than synaptic adaptation; nevertheless, the possibility of implementing it in on-line mode, owing to the orthogonal activation functions, allows the required approximating properties to be achieved in the learning process.
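The key point of architectural adaptation, namely growing the basis without disturbing the already-estimated weights, can be illustrated as follows. The sketch assumes the design columns are discretely orthogonal on the window (true, for example, for Chebyshev polynomials evaluated at Chebyshev nodes); `grow_basis` is a hypothetical helper for the example, not the paper's algorithm.

```python
import numpy as np

def grow_basis(Phi, y, w, new_col):
    """Append one new orthogonal activation function to the expansion.
    Because new_col is orthogonal (on the current window) to the existing
    columns of Phi, the previously estimated weights w remain optimal and
    only the single new weight has to be computed, as a projection."""
    w_new = (new_col @ y) / (new_col @ new_col)
    return np.append(w, w_new), np.column_stack([Phi, new_col])
```

Shrinking the basis is equally cheap: dropping an orthogonal column leaves the remaining weights unchanged.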

CONCLUSION
An evolving neural network is proposed that, due to the use of orthogonal activation functions, tunes both its synaptic weights and its structure during learning. Another important feature of the proposed network is the possibility of processing, in on-line mode, information corrupted by missing values in the data. The neural network under consideration is characterized by high speed and can process distorted nonlinear nonstationary stochastic and chaotic signals in real time.