Statistical mechanics of learning multiple orthogonal signals: Asymptotic theory and fluctuation effects

D. C. Hoyle and M. Rattray

Phys. Rev. E 75, 016101 – Published 9 January 2007

Abstract

The learning of signal directions in high-dimensional data through orthogonal decomposition or principal component analysis (PCA) has many important applications in physics and engineering disciplines, e.g., wireless communication, information theory, and econophysics. The accuracy of the orthogonal decomposition can be studied using mean-field theory. Previous analysis of data produced from a model with a single signal direction has predicted a retarded learning phase transition below which learning is not possible, i.e., if the signal is too weak or the data set is too small then it is impossible to learn anything about the signal direction or magnitude. In this contribution we show that the result can be generalized to the case where there are multiple signal directions. Each nondegenerate signal is associated with a retarded learning transition. However, fluctuations around the mean-field solution lead to large finite size effects unless the signal strengths are very well separated. We evaluate the one-loop contribution to the mean-field theory, which shows that signal directions are indistinguishable from one another if their corresponding population eigenvalues are separated by $O(N^{-τ})$ with exponent $τ > \frac{1}{3}$, where $N$ is the data dimension. Numerical simulations are consistent with the analysis and show that finite size effects can persist even for very large data sets.

Received 18 May 2006

DOI: https://doi.org/10.1103/PhysRevE.75.016101

Authors & Affiliations

D. C. Hoyle^*

North West Institute for Bio-Health Informatics, University of Manchester, School of Medicine, Stopford Building, Oxford Road, Manchester M13 9PT, United Kingdom

M. Rattray^†

School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester M13 9PL, United Kingdom

^*URL: www.nibhi.org.uk. Electronic address: david.hoyle@manchester.ac.uk
^†URL: www.cs.man.ac.uk/~magnus. Electronic address: magnus@cs.man.ac.uk

Issue

Vol. 75, Iss. 1 — January 2007

Images

Figure 1
(a) Learning curves, at fixed $N = 3200$, for the first two principal components. The population covariance contains two symmetry breaking directions, with $A_{1}^{2} = 20$, $A_{2}^{2} = 10$, and we have set $σ^{2} = 1$. Simulation values (solid symbols) are averages over 1000 data sets. The solid and dashed lines represent the theoretical results given by Eq. (23). Standard errors of the simulation averages are less than the size of the plotted symbols. (b) Learning curves, at fixed $N = 200$, for the first two principal components. All other parameters are the same as for (a). (c) Convergence to the asymptotic value of $R_{11}^{2}$, at fixed $α = 0.2$. The asymptotic value predicted by Eq. (23) is denoted by the horizontal dashed line, while the solid symbols represent simulation averages over 1000 data sets. Standard errors of the simulation averages are less than the size of the plotted symbols. The inset shows simulation estimates of the sample variance for $R_{11}^{2}$. (d) Learning curves, at fixed $N = 3200$, for the first three principal components. The population covariance contains three symmetry breaking directions, with $A_{1}^{2} = 20$, $A_{2}^{2} = 10$, $A_{3}^{2} = 20/3 = 1/0.15$. Other parameters and simulation settings are as for (a).
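The simulation set-up described in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code. It assumes that the population covariance has the form $C = σ^{2}(I + \sum_{m} A_{m}^{2} B_{m} B_{m}^{T})$ with orthonormal signal directions $B_{m}$, that $α$ is the ratio of the number of samples to the dimension $N$, and that the sample covariance is formed without mean subtraction. Eq. (23) is not reproduced on this page, so `asymptotic_overlap_sq` encodes the standard large-$N$ overlap for a spiked covariance model under that assumed parameterization, which may differ from the paper's own notation. Both function names are hypothetical.

```python
import numpy as np


def asymptotic_overlap_sq(alpha, A_sq):
    """Large-N squared overlap for a spiked covariance model (assumed form).

    Assumes spikes of strength A_sq (in units of sigma^2) and alpha = samples / N.
    Returns 0 below the assumed transition alpha * A_sq**2 = 1.
    """
    A_sq = np.asarray(A_sq, dtype=float)
    value = (alpha * A_sq**2 - 1.0) / (A_sq * (alpha * A_sq + 1.0))
    return np.where(alpha * A_sq**2 > 1.0, value, 0.0)


def simulate_overlaps(N, alpha, A_sq, sigma_sq=1.0, n_sets=20, seed=0):
    """Average squared overlaps between sample PCs and planted directions.

    Entry [k, m] of the result is the mean of (v_k . B_m)^2, where v_k is the
    k-th principal component and B_m the m-th planted signal direction.
    """
    rng = np.random.default_rng(seed)
    A_sq = np.asarray(A_sq, dtype=float)
    S = A_sq.size
    p = int(round(alpha * N))  # number of samples (assumed alpha = p / N)
    R_sq = np.zeros((S, S))
    for _ in range(n_sets):
        # Random orthonormal signal directions, one per column of B.
        B, _ = np.linalg.qr(rng.standard_normal((N, S)))
        # Each sample is isotropic noise plus Gaussian amplitudes along B,
        # giving covariance sigma^2 * (I + sum_m A_m^2 B_m B_m^T).
        noise = rng.standard_normal((p, N))
        z = rng.standard_normal((p, S))
        X = np.sqrt(sigma_sq) * (noise + (z * np.sqrt(A_sq)) @ B.T)
        # Right singular vectors of X are eigenvectors of X^T X / p,
        # ordered by decreasing sample eigenvalue.
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        R_sq += (Vt[:S] @ B) ** 2
    return R_sq / n_sets
```

Under these assumptions, `asymptotic_overlap_sq(0.2, [20.0, 10.0])` evaluates to 0.79 and 19/30 for the two signal strengths used in panel (a). Calling `simulate_overlaps(N=3200, alpha=0.2, A_sq=[20.0, 10.0])` gives finite-$N$ estimates to compare against those values. Using the singular value decomposition of the data matrix avoids forming the $N \times N$ sample covariance explicitly.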
Figure 2
Plots of overlaps $R^{2}$ between the first principal component and the signal directions for different system sizes $N$. We have fixed $α = 0.2$ and set $σ^{2} = 1$. The population covariance contains two signal directions with similar signal strengths $A_{1} = \bar{A} + Δ A$, $A_{2} = \bar{A} - Δ A$. The signal strength separation is an increasingly weak function of $N$, i.e., $Δ A \sim N^{-τ}$, $τ > 0$. The solid symbols show simulation averages for different values of $τ$. Simulation averages are over 1000 data sets, except for the largest value of $N$ for which simulation averages are over 100 data sets. (a) Plot of $R_{11}^{2}$. The upper dashed line shows the asymptotic value of $R_{11}^{2}$ predicted by Eq. (23) for $A_{1} = \bar{A}$, while the lower dashed line is drawn at half this value. (b) Plot of $R_{12}^{2}$. The dashed line is drawn at half the asymptotic value of $R_{11}^{2}$ predicted by Eq. (23) for $A_{1} = \bar{A}$.
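The experiment in this caption, where the separation between two signal strengths shrinks as $N^{-τ}$, can be explored with a short sketch along the following lines. It is an illustration under stated assumptions rather than the authors' procedure: the value of $\bar{A}$ and the prefactor of $Δ A$ are not given on this page, so the defaults `A_bar=3.0` and a unit prefactor are placeholders. The covariance is assumed to be $I + \sum_{m} A_{m}^{2} B_{m} B_{m}^{T}$ with $σ^{2} = 1$, and $α$ is assumed to be the ratio of samples to dimension. The function name is hypothetical.

```python
import numpy as np


def first_pc_overlaps(N, tau, A_bar=3.0, alpha=0.2, n_sets=20, seed=0):
    """Mean squared overlaps of the first PC with two nearly equal signals.

    Returns an array [R11^2, R12^2] averaged over n_sets simulated data sets.
    A_bar and the unit prefactor in delta are assumed placeholder values.
    """
    rng = np.random.default_rng(seed)
    delta = float(N) ** (-tau)  # assumed: Delta A = N^(-tau), prefactor 1
    A = np.array([A_bar + delta, A_bar - delta])
    p = int(round(alpha * N))  # number of samples (assumed alpha = p / N)
    acc = np.zeros(2)
    for _ in range(n_sets):
        # Two random orthonormal signal directions as columns of B.
        B, _ = np.linalg.qr(rng.standard_normal((N, 2)))
        # Isotropic unit-variance noise plus Gaussian amplitudes A_m along B_m.
        X = rng.standard_normal((p, N)) + (rng.standard_normal((p, 2)) * A) @ B.T
        # The leading right singular vector is the first principal component.
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        acc += (Vt[0] @ B) ** 2
    return acc / n_sets
```

Comparing, for example, `first_pc_overlaps(1000, tau=0.1)` with `first_pc_overlaps(1000, tau=1.0)` shows the qualitative contrast described in the caption: for well separated strengths the first principal component aligns mainly with the stronger signal, while for nearly degenerate strengths its overlap is shared between the two directions.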
Figure 3
Plot of $R^{2}$ for the first and third principal components. We have fixed $α = 0.5$ and set $σ^{2} = 1$. The population covariance $C$ contains four signal directions. $A_{1}$ and $A_{2}$ are of similar strength, with $A_{1} - A_{2} \sim N^{-1}$, while $A_{3}$ and $A_{4}$ are of similar strength, with $A_{3} - A_{4} \sim N^{-0.1}$. The solid symbols show averages over 1000 simulated data sets for different values of $N$. (a) shows $R_{11}^{2}$ and $R_{12}^{2}$. The dashed line shows the common predicted asymptotic value of $R_{11}^{2}$ and $R_{12}^{2}$. (b) shows $R_{33}^{2}$ and $R_{34}^{2}$. The dashed line shows the predicted asymptotic value of $R_{33}^{2}$.

Physical Review E

covering statistical, nonlinear, biological, and soft matter physics