Convergence analysis of a simple minor component analysis algorithm

doi:10.1016/j.neunet.2007.07.001

Neural Networks

Volume 20, Issue 7, September 2007, Pages 842-850

https://doi.org/10.1016/j.neunet.2007.07.001 Get rights and content

Abstract

Minor component analysis (MCA) is a powerful statistical tool for signal processing and data analysis. Convergence of MCA learning algorithms is an important issue in practical applications. In this paper, we will propose a simple MCA learning algorithm to extract minor component from input signals. Dynamics of the proposed MCA learning algorithm are analysed using a corresponding deterministic discrete time (DDT) system. It is proved that almost all trajectories of the DDT system will converge to minor component if the learning rate satisfies some mild conditions and the trajectories start from points in an invariant set. Simulation results will be furnished to illustrate the theoretical results achieved.

Introduction

As an important feature extraction technique, minor component analysis (MCA) has been widely applied to total least squares (TLS) (Gao, Ahmad, & Swamy, 1992), moving target indication (Klemm, 1987), clutter cancellation (Barbarossa, Daddio, & Galati, 1987), computer vision (Cirrincione, 1998), curve and surface fitting (Xu, Oja, & Suen, 1992), digital beamforming (Griffiths, 1983), frequency estimation (Mathew & Reddy, 1994) and bearing estimation (Schmidt, 1986), etc.

Minor component is the direction in which the data have the smallest variance. Although eigenvalue decomposition (EVD) or singular value decomposition (SVD) can be used to extract minor component, these traditional matrix algebraic approaches are usually unsuitable for high-dimensional online input data. Neural networks can be used to solve the task of MCA without calculating the correlation matrix of input data in advance, which makes neural networks approaches more suitable for online extraction of minor component.

Many MCA neural networks algorithms have been proposed and extensively analysed. However, some existing MCA algorithms face the norm divergence problem (Cirrincione et al., 2002, Taleb and Cirrincione, 1999). In order to guarantee convergence, many stabilization methods have been used for developing MCA algorithms (Chen and Amari, 2001, Möller, 2004, Oja, 1992). However, the introduction of these stabilization methods increases the computational complexity of MCA algorithms. It is very interesting to develop convergent MCA algorithms with low computational complexity. Recently, two efficient MCA algorithms are proposed in Feng, Bao, and Jiao (1998) and Ouyang, Bao, Liao, and Ching (2001), called Feng and AMEX respectively. AMEX and Feng algorithms have simple expressions and lower computational complexity. However, there still exists a divergence problem in AMEX and Feng algorithms when the correlation matrix of input data is singular (Peng & Yi, 2006). In this paper, we will propose a simple MCA algorithm which has a lower computational complexity and a more satisfactory convergence property.

Almost all MCA neural networks algorithms are described by stochastic discrete time (SDT) systems. Traditionally, convergence of SDT system is analyzed via a corresponding deterministic continuous time (DCT) system. To use this DCT method, some restrictive conditions must be satisfied. One important condition is the learning rates must approach zero (Ljung, 1977). However, in many practical applications, the learning rate is often taken as a small constant due to the round-off limitations and tracking requirements (Yi, Ye, Lv, & Tan, 2005). Hence, convergence of DCT system does not imply convergence of the original SDT system when the learning rate is a constant (Zufiria, 2002). Recently, a deterministic discrete time (DDT) method is used to study dynamics of SDT systems (Yi et al., 2005, Zhang, 2003, Zufiria, 2002). This DDT method transforms SDT system into a corresponding deterministic discrete time (DDT) system and does not require the learning rate to approach zero. DDT systems preserve the discrete time nature of the original SDT systems and can shed some light on the convergence characteristics of SDT systems. It seems more reasonable to analyse dynamics of SDT system via DDT method. In this paper, we will analyse convergence of the proposed MCA algorithm via a corresponding DDT system and obtain the conditions to guarantee convergence.

This paper is organized as follows. In Section 2, we review some existing MCA algorithms. In Section 3, a simple MCA algorithm is proposed to extract a minor component. Its convergence is analysed using the DDT method in Section 4. In Section 5, some simulation results are presented to illustrate the theoretical results achieved. Finally, some conclusions are given in Section 6.

Section snippets

Learning algorithms for MCA

Let us consider a single linear neuron with the following input output relation: $y (k) = w^{T} (k) x (k), (k = 0, 1, 2, \dots),$ where $y (k)$ is the neuron output, the input sequence ${x (k) | x (k) \in R^{n} (k = 0, 1, 2, \dots)}$ is a zero mean stationary stochastic process and $w (k) \in R^{n} (k = 0, 1, 2, \dots)$ is the weight vector of the neuron. Although linear neurons are the simplest units to build neural networks, they have many important applications in signal processing. Oja (1982) has found that a simple linear neuron with an unsupervised

The proposed MCA algorithm

By adding a penalty term $[\frac{1}{2 η} - w^{T} (k) w (k)] \cdot w (k)$ to anti-Hebbian rule, we can obtain an interesting MCA learning algorithm as follows: $w (k + 1) = 1.5 w (k) - η y (k) x (k) - η [w^{T} (k) w (k)] w (k),$ where $η > 0$ is the learning rate.

It is essential to analyse convergence of the proposed algorithm and derive the convergence conditions. As discussed in Section 1, deterministic discrete time (DDT) method is a more reasonable analysis approach than the traditional deterministic continuous time (DCT) method. The DDT system

Convergence analysis

In this section, the convergence of DDT system (9) will be analyzed. We will prove that if $η λ_{1} < 0.5, {‖ w (0) ‖}^{2} \leq \frac{1}{2 η} and w^{T} (0) v_{n} \neq 0,$ then weight vector $w (k)$ will converge to minor component of input data in (9), where $w (0)$ is the initial weight vector, $λ_{1}$ is the largest eigenvalue of the correlation matrix $R$ and $v_{n}$ is the eigenvector associated with the smallest eigenvalue of the correlation matrix $R$ .

For studying the dynamics of the DDT system (9), the following lemmas are useful:

Lemma 1

For all $s \in [0, \frac{1}{2 η}]$ , it

Simulation results

From the analysis in Section 4, to guarantee the convergence, the learning rate $η$ should satisfy $0 < η λ_{1} < 0.5$ . In many applications, based on the problem-specific knowledge, an upper bound of $λ_{1}$ can often be estimated without computing the correlation matrix $R$ (Zhang, 2003). Thus, from the application point of view, choosing a suitable learning rate is an easy task. Based on the selection of the learning rate $η$ , initial weight vector $w (0)$ can be chosen from the invariant set $S$ to guarantee

Conclusions

A simple MCA algorithm is proposed to extract one single minor component in this paper. The convergence of the proposed algorithm is analysed using a corresponding DDT system. The convergence analysis shows that almost all trajectories starting from an invariant set converge to minor component of input data if the learning rate satisfies some mild conditions. Simulation results illustrate the theoretical results achieved.

Acknowledgements

This work was supported by National Science Foundation of China under Grant 60471055 and Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20040614017.

References (24)

T. Chen et al.
Unified stabilization approach to principal and minor components extraction algorithms
Neural Networks
(2001)
E. Oja
Principal components, minor components and linear neural networks
Neural Networks
(1992)
L. Xu et al.
Modified Hebbian learning for curve and surface fitting
Neural Networks
(1992)
Q. Zhang
On the dicrete-time dynamics of a PCA Learning Algorithm
Neurocomputing
(2003)
S. Barbarossa et al.
Comparison of optimum and linear prediction technique for clutter cancellation
IEE Proceedings. Part F-Communications, Radar and Signal Processing
(1987)
C. Chatterjee et al.
Algorithm for accelerated convergence of adaptive PCA
IEEE Transactions on Neural Networks
(2000)
Cirrincione, G. (1998). A neural approach to the structure from motion problem. Ph.D. dissertation. LIS INPG...
G. Cirrincione et al.
The MCA EXIN neuron for the minor component analysis
IEEE Transactions on Neural Networks
(2002)
D.Z. Feng et al.
Total least mean squares algorithm
IEEE Transactions on Signal Processing
(1998)
D.Z. Feng et al.
Neural network learning algorithms for tracking minor subspace in high-dimensional data stream
IEEE Transactions on Neural Network
(2005)

K. Gao et al.

Learning algorithm for total least squares adaptive signal processing

Electronics Letters

(1992)

J.W. Griffiths

Adaptive array processing, a tutorial

IEE Proceedings, Part F-Communications, Radar and Signal Processing

(1983)

Cited by (25)

Comparative analysis on thermal non-destructive testing imagery applying Candid Covariance-Free Incremental Principal Component Thermography (CCIPCT)
2017, Infrared Physics and Technology
Citation Excerpt :
Despite of the substantial PCT’s performance in the field, there is a need for better and more efficient technique implying PCA. An efficient method which does not involve the calculation of the covariance matrix and evaluates the principal component is the Incremental Principal Component Analysis (IPCA) technique [11–13]. However, this method is played by the problem of convergence to face high dimensional vectors.
Thermal and infrared imagery creates considerable developments in Non-Destructive Testing (NDT) area. Here, a thermography method for NDT specimens inspection is addressed by applying a technique for computation of eigen-decomposition which refers as Candid Covariance-Free Incremental Principal Component Thermography (CCIPCT). The proposed approach uses a shorter computational alternative to estimate covariance matrix and Singular Value Decomposition (SVD) to obtain the result of Principal Component Thermography (PCT) and ultimately segments the defects in the specimens applying color based K-medoids clustering approach. The problem of computational expenses for high-dimensional thermal image acquisition is also investigated. Three types of specimens (CFRP, Plexiglas and Aluminium) have been used for comparative benchmarking. The results conclusively indicate the promising performance and demonstrate a confirmation for the outlined properties.
Multi-view feature extraction based on slow feature analysis
2017, Neurocomputing
In this paper, we proposed to apply IncSFA to represent the feature of 3D model and employed graph matching to handle similarity measure problem between two different 3D model. First, we built the input data in order to guarantee it suitable for SFA mode according to structure information of 3D model. Second, SFA method utilizes iterations learning method to extract slow feature for each 2D views recorded from 3D model. Finally, weighted bipartite graph matching is leveraged to compute the similarity between query model and candidate model. Extensive comparison experiments were on the popular ETH dataset. The results demonstrate the superiority of the proposed method.
Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots
2017, Artificial Intelligence
In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? We propose here Continual Curiosity driven Skill Acquisition (CCSA). CCSA makes robots intrinsically motivated to acquire, store and reuse skills. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world model improvements, and used reinforcement learning to learn how to get these intrinsic rewards. CCSA also does this, but unlike previous implementations, the world model is a set of compact low-dimensional representations of the streams of high-dimensional visual information, which are learned through incremental slow feature analysis. These representations augment the robot's state space with new information about the environment. We show how this information can have a higher-level (compared to pixels) and useful interpretation, for example, if the robot has grasped a cup in its field of view or not. After learning a representation, large intrinsic rewards are given to the robot for performing actions that greatly change the feature output, which has the tendency otherwise to change slowly in time. We show empirically what these actions are (e.g., grasping the cup) and how they can be useful as skills. An acquired skill includes both the learned actions and the learned slow feature representation. Skills are stored and reused to generate new observations, enabling continual acquisition of complex skills. We present results of experiments with an iCub humanoid robot that uses CCSA to incrementally acquire skills to topple, grasp and pick-place a cup, driven by its intrinsic motivation from raw pixel vision.
Eigenfilter design of linear-phase FIR digital filters using neural minor component analysis
2014, Digital Signal Processing: A Review Journal
Citation Excerpt :
Consequently, the optimal filter coefficients of the LS design are equivalent to solving the eigenvector corresponding to the smallest eigenvalue of a real, symmetric, and positive-definite matrix, the elements of which are related to the filter specifications. Neural networks [20–31] exhibit massive connectivity and nonlinear properties and have been used to solve computationally burdensome optimization problems, particularly for real-time signal processing applications. Recent literature [18,32–37] has proposed various neural-network-based techniques for designing distinct types of digital filter demonstrating favorable performance in a massive parallelism.
This paper proposes a minor component analysis-based neural learning algorithm for designing linear-phase finite impulse response digital filters. The objective function to be minimized in the least-squares design can be formulated as the eigenvalue problem for solving an appropriate real, symmetric, and positive-definite matrix. To achieve the eigenfilter design, an alternative neural learning rule based on the minor component analysis algorithm is exploited. The optimal filter coefficients corresponding to the eigenvector of the smallest eigenvalue of the positive-definite matrix can be achieved in an iterative manner, avoiding the complex computation of eigenvalue decomposition. Furthermore, the learning step parameter that affects the convergence performance is investigated empirically. The simulation results indicate that the proposed neural-based approach can be applied to eigenfilter design and yields a lower computational complexity compared with traditional matrix algebraic-based approaches.
A neural networks learning algorithm for minor component analysis and its convergence analysis
2008, Neurocomputing
The eigenvector associated with the smallest eigenvalue of the autocorrelation matrix of input signals is called minor component. Minor component analysis (MCA) is a statistical approach for extracting minor component from input signals and has been applied in many fields of signal processing and data analysis. In this letter, we propose a neural networks learning algorithm for estimating adaptively minor component from input signals. Dynamics of the proposed algorithm are analyzed via a deterministic discrete time (DDT) method. Some sufficient conditions are obtained to guarantee convergence of the proposed algorithm.
Slow Down to Go Better: A Survey on Slow Feature Analysis
2024, IEEE Transactions on Neural Networks and Learning Systems

View all citing articles on Scopus

View full text

Convergence analysis of a simple minor component analysis algorithm

Abstract

Introduction

Section snippets

Learning algorithms for MCA

The proposed MCA algorithm

Convergence analysis

Simulation results

Conclusions

Acknowledgements

Neural Networks

Neural Networks

Neural Networks

Neurocomputing

Comparison of optimum and linear prediction technique for clutter cancellation

IEE Proceedings. Part F-Communications, Radar and Signal Processing

Algorithm for accelerated convergence of adaptive PCA

IEEE Transactions on Neural Networks

The MCA EXIN neuron for the minor component analysis

IEEE Transactions on Neural Networks

Total least mean squares algorithm

IEEE Transactions on Signal Processing

Neural network learning algorithms for tracking minor subspace in high-dimensional data stream

IEEE Transactions on Neural Network

Learning algorithm for total least squares adaptive signal processing

Electronics Letters

Adaptive array processing, a tutorial

IEE Proceedings, Part F-Communications, Radar and Signal Processing