Convergence analysis of a simple minor component analysis algorithm
Introduction
As an important feature extraction technique, minor component analysis (MCA) has been widely applied to total least squares (TLS) (Gao, Ahmad, & Swamy, 1992), moving target indication (Klemm, 1987), clutter cancellation (Barbarossa, Daddio, & Galati, 1987), computer vision (Cirrincione, 1998), curve and surface fitting (Xu, Oja, & Suen, 1992), digital beamforming (Griffiths, 1983), frequency estimation (Mathew & Reddy, 1994), and bearing estimation (Schmidt, 1986).
The minor component is the direction in which the data have the smallest variance. Although eigenvalue decomposition (EVD) or singular value decomposition (SVD) can be used to extract the minor component, these traditional matrix-algebraic approaches are usually unsuitable for high-dimensional online input data. Neural networks can solve the MCA task without computing the correlation matrix of the input data in advance, which makes neural network approaches more suitable for the online extraction of the minor component.
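For concreteness, the batch alternative mentioned above can be stated in a few lines. The following Python sketch (an illustration added here, not part of the paper's own material) estimates the correlation matrix from a block of samples and extracts the minor component by EVD; it is precisely the need to form and decompose R that online neural algorithms avoid.

```python
import numpy as np

def minor_component_evd(X):
    """Batch EVD baseline: X is a (num_samples, n) zero-mean data matrix."""
    R = X.T @ X / X.shape[0]              # sample correlation matrix R = E[x x^T]
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    return eigvecs[:, 0]                  # eigenvector of the smallest eigenvalue
```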
Many MCA neural network algorithms have been proposed and extensively analysed. However, some existing MCA algorithms suffer from a norm divergence problem (Cirrincione et al., 2002, Taleb and Cirrincione, 1999). To guarantee convergence, many stabilization methods have been used in developing MCA algorithms (Chen and Amari, 2001, Möller, 2004, Oja, 1992). However, these stabilization methods increase the computational complexity of MCA algorithms, so it is of considerable interest to develop convergent MCA algorithms with low computational complexity. Recently, two efficient MCA algorithms, called the Feng and AMEX algorithms, were proposed in Feng, Bao, and Jiao (1998) and Ouyang, Bao, Liao, and Ching (2001), respectively. The AMEX and Feng algorithms have simple expressions and low computational complexity. However, a divergence problem still arises in both algorithms when the correlation matrix of the input data is singular (Peng & Yi, 2006). In this paper, we propose a simple MCA algorithm that has lower computational complexity and a more satisfactory convergence property.
Almost all MCA neural network algorithms are described by stochastic discrete time (SDT) systems. Traditionally, the convergence of an SDT system is analysed via a corresponding deterministic continuous time (DCT) system. To use this DCT method, some restrictive conditions must be satisfied; one important condition is that the learning rate must approach zero (Ljung, 1977). However, in many practical applications, the learning rate is often taken to be a small constant because of round-off limitations and tracking requirements (Yi, Ye, Lv, & Tan, 2005). Hence, convergence of the DCT system does not imply convergence of the original SDT system when the learning rate is a constant (Zufiria, 2002). Recently, a deterministic discrete time (DDT) method has been used to study the dynamics of SDT systems (Yi et al., 2005, Zhang, 2003, Zufiria, 2002). The DDT method transforms an SDT system into a corresponding DDT system and does not require the learning rate to approach zero. DDT systems preserve the discrete time nature of the original SDT systems and can shed some light on their convergence characteristics, so it seems more reasonable to analyse the dynamics of an SDT system via the DDT method. In this paper, we analyse the convergence of the proposed MCA algorithm via a corresponding DDT system and obtain conditions that guarantee convergence.
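To make the SDT/DDT distinction concrete, the sketch below (an illustrative addition using the plain anti-Hebbian rule, not the algorithm proposed later) contrasts one stochastic step, driven by a random sample x(k), with the corresponding deterministic step obtained by replacing y(k)x(k) with its conditional expectation Rw(k); the learning rate eta stays a constant in both.

```python
import numpy as np

def sdt_step(w, x, eta):
    """One stochastic (SDT) anti-Hebbian step driven by a sample x(k)."""
    y = w @ x                 # neuron output y(k) = w(k)^T x(k)
    return w - eta * y * x    # w(k+1) = w(k) - eta * y(k) x(k)

def ddt_step(w, R, eta):
    """The corresponding deterministic (DDT) step, using E[y(k)x(k)] = R w(k)."""
    return w - eta * R @ w
```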
This paper is organized as follows. In Section 2, we review some existing MCA algorithms. In Section 3, a simple MCA algorithm is proposed to extract a minor component. Its convergence is analysed using the DDT method in Section 4. In Section 5, some simulation results are presented to illustrate the theoretical results achieved. Finally, some conclusions are given in Section 6.
Learning algorithms for MCA
Let us consider a single linear neuron with the following input-output relation: y(k) = w^T(k)x(k), k = 0, 1, 2, ..., where y(k) is the neuron output, the input sequence {x(k) ∈ R^n} is a zero-mean stationary stochastic process, and w(k) ∈ R^n is the weight vector of the neuron. Although linear neurons are the simplest units from which to build neural networks, they have many important applications in signal processing. Oja (1982) found that a simple linear neuron with an unsupervised constrained Hebbian learning rule can extract the principal component of the input data.
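As a minimal added illustration, the neuron's input-output relation is a single inner product (the dimension and random inputs below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
w = rng.standard_normal(n)   # weight vector w(k)
x = rng.standard_normal(n)   # a zero-mean input sample x(k)
y = w @ x                    # neuron output y(k) = w(k)^T x(k)
```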
The proposed MCA algorithm
By adding a penalty term to the anti-Hebbian rule, we can obtain an interesting MCA learning algorithm as follows: w(k+1) = w(k) - η y(k)x(k) + η(1 - w^T(k)w(k))w(k), where η > 0 is the learning rate.
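A minimal sketch of one stochastic update of this rule follows; it assumes the penalty term takes the common norm-stabilizing form (1 - w^T w)w written above, and the function name is an illustrative choice.

```python
import numpy as np

def mca_step(w, x, eta):
    """One update of the penalized anti-Hebbian rule sketched above."""
    y = w @ x                                  # y(k) = w(k)^T x(k)
    anti_hebbian = -y * x                      # anti-Hebbian term
    penalty = (1.0 - w @ w) * w                # norm-stabilizing penalty term
    return w + eta * (anti_hebbian + penalty)  # w(k+1)
```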
It is essential to analyse the convergence of the proposed algorithm and derive its convergence conditions. As discussed in Section 1, the deterministic discrete time (DDT) method is a more reasonable analysis approach than the traditional deterministic continuous time (DCT) method. The DDT system corresponding to the proposed algorithm is obtained by replacing the stochastic term y(k)x(k) with its conditional expectation Rw(k), where R = E[x(k)x^T(k)] is the correlation matrix of the input data.
Convergence analysis
In this section, the convergence of the DDT system (9) is analysed. We will prove that if the learning rate η satisfies a mild upper bound determined by λ_1 and the initial weight vector w(0) is not orthogonal to v_n, then the weight vector w(k) in (9) converges to the minor component of the input data, where λ_1 is the largest eigenvalue of the correlation matrix R and v_n is the eigenvector associated with the smallest eigenvalue of R.
For studying the dynamics of the DDT system (9), the following lemmas are useful.
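As a numerical illustration of the behaviour analysed in this section (an added sketch: the matrix R, the learning rate, the initial vector, and the iteration count are arbitrary choices, and the DDT map assumes the penalized update form given in Section 3), one can iterate the deterministic system and compare the limit direction with the eigenvector of the smallest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
R = A @ A.T
R /= 2.0 * np.linalg.eigvalsh(R)[-1]   # scale so the largest eigenvalue is 0.5
eigvals, eigvecs = np.linalg.eigh(R)
v_min = eigvecs[:, 0]                  # minor component direction

eta = 0.1                              # small constant learning rate
w = rng.standard_normal(5)
w /= 2.0 * np.linalg.norm(w)           # start inside the unit ball
for _ in range(20000):
    w = w - eta * (R @ w) + eta * (1.0 - w @ w) * w

cosine = abs(w @ v_min) / np.linalg.norm(w)
print(f"|cos(angle to minor component)| = {cosine:.6f}")  # close to 1 on convergence
```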
Simulation results
From the analysis in Section 4, the learning rate η must satisfy an upper bound determined by the largest eigenvalue λ_1 of the correlation matrix in order to guarantee convergence. In many applications, based on problem-specific knowledge, an upper bound of λ_1 can often be estimated without computing the correlation matrix (Zhang, 2003). Thus, from the application point of view, choosing a suitable learning rate η is an easy task; one such estimate is sketched below. Based on the selected learning rate η, the initial weight vector w(0) can then be chosen from the invariant set to guarantee convergence.
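One concrete way to obtain such an upper bound (an added, illustrative sketch) is the trace inequality λ_1 ≤ tr(R) = E[||x(k)||^2], whose right-hand side can be estimated by averaging ||x(k)||^2 over samples without ever forming R; the safety factor below is an arbitrary choice, not a value from the paper.

```python
import numpy as np

def estimate_learning_rate(samples, safety=0.1):
    """Pick eta from the bound lambda_1 <= tr(R) = E[||x||^2]."""
    trace_estimate = np.mean([x @ x for x in samples])  # estimate of E[||x(k)||^2]
    return safety / trace_estimate                      # eta well below 1/lambda_1

rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 5))   # synthetic zero-mean input stream
eta = estimate_learning_rate(X)
```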
Conclusions
A simple MCA algorithm for extracting a single minor component is proposed in this paper. The convergence of the proposed algorithm is analysed using a corresponding DDT system. The analysis shows that almost all trajectories starting from an invariant set converge to the minor component of the input data, provided the learning rate satisfies some mild conditions. Simulation results illustrate the theoretical findings.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant 60471055 and the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20040614017.
References (24)
- Chen & Amari (2001). Unified stabilization approach to principal and minor components extraction algorithms. Neural Networks.
- Oja (1992). Principal components, minor components and linear neural networks. Neural Networks.
- Xu, Oja, & Suen (1992). Modified Hebbian learning for curve and surface fitting. Neural Networks.
- Zhang (2003). On the discrete-time dynamics of a PCA learning algorithm. Neurocomputing.
- Barbarossa, Daddio, & Galati (1987). Comparison of optimum and linear prediction technique for clutter cancellation. IEE Proceedings, Part F: Communications, Radar and Signal Processing.
- et al. (2000). Algorithm for accelerated convergence of adaptive PCA. IEEE Transactions on Neural Networks.
- Cirrincione, G. (1998). A neural approach to the structure from motion problem. Ph.D. dissertation. LIS INPG...
- Cirrincione et al. (2002). The MCA EXIN neuron for the minor component analysis. IEEE Transactions on Neural Networks.
- Feng, Bao, & Jiao (1998). Total least mean squares algorithm. IEEE Transactions on Signal Processing.
- et al. (2005). Neural network learning algorithms for tracking minor subspace in high-dimensional data stream. IEEE Transactions on Neural Networks.
- Gao, Ahmad, & Swamy (1992). Learning algorithm for total least squares adaptive signal processing. Electronics Letters.
- Griffiths (1983). Adaptive array processing, a tutorial. IEE Proceedings, Part F: Communications, Radar and Signal Processing.