Abstract
We propose a neighbourhood-preserving method called LMB for generating a low-dimensional representation of data points scattered on a nonlinear manifold embedded in a high-dimensional Euclidean space. Starting from an exemplary data point, LMB locally applies the classical Multidimensional Scaling (MDS) algorithm to small patches of the manifold and iteratively spreads the dimension reduction process. Unlike most dimension reduction methods, LMB does not require the reduced dimension as an input: it can determine a well-fitting dimension for the reduction from the pairwise distances of the data points. We thoroughly compare the performance of LMB with state-of-the-art linear and nonlinear dimension reduction algorithms on both synthetic and real-world data. Numerical experiments show that LMB efficiently and effectively preserves neighbourhoods and uncovers the latent embedded structure of the manifold. LMB also has a low complexity of \(O(n^2)\) for n data points.
Acknowledgments
This work is supported by National Natural Science Foundation (61472147) and National Science Foundation of Hubei Province (2015CFB566).
Appendix: A Brief Introduction of the MDS Algorithm
Let n be the size of the data set and D be the matrix of pairwise Euclidean distances. MDS generates coordinates X for all the data points with their centre at the origin, where each column of X represents a data point. MDS first computes the matrix \(X^TX\) by Eq. (9), in which the diagonal entries of the centering matrix H are \(1-\frac{1}{n}\) and the off-diagonal entries are \(-\frac{1}{n}\).
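The double-centering step described above can be sketched in NumPy. This is a minimal illustration of the standard identity \(X^TX = -\frac{1}{2}HD^{(2)}H\) (where \(D^{(2)}\) holds squared distances), not the paper's implementation; the variable names are ours:

```python
import numpy as np

# Small example data set: 6 points (columns) in 3-dimensional space.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 6))
# Pairwise Euclidean distance matrix D.
D = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)

n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n   # centering matrix: 1 - 1/n on the diagonal, -1/n elsewhere
B = -0.5 * H @ (D ** 2) @ H           # Gram matrix X^T X of the centred data

# B equals the Gram matrix of the data after centring at the origin.
Xc = X - X.mean(axis=1, keepdims=True)
print(np.allclose(B, Xc.T @ Xc))      # True
```

Double-centering removes the \(\Vert x_i\Vert ^2\) and \(\Vert x_j\Vert ^2\) terms of the squared distances, leaving only the inner products.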
\(X^TX\) is positive semi-definite and can be decomposed as in Eq. (10), which leads to Eq. (11).
If the data points are distributed in an m-dimensional space, the first m diagonal entries of \(\sigma \) are non-zero. Extracting the first d entries of \(\sigma \) and the corresponding eigenvectors generates an approximation of the original data in a lower d-dimensional space. The approximation is very accurate if the first d entries are the most significant ones and the rest are close to zero. If MDS is applied to the k-nearest neighbourhoods on a d-dimensional manifold, \(\sigma \) is a \((k+1)\times (k+1)\) matrix whose first d entries are significant while the rest are almost zero. However, if MDS is applied to all the data points of a manifold, \(\sigma \) is an \(n\times n\) matrix and, due to the global geometry of the manifold, the first m \((m>d)\) entries of \(\sigma _{n\times n}\) are significant. The first d dimensions alone are therefore not enough to represent the data points well, and applying MDS to the whole nonlinear manifold may yield an inaccurate approximation in the d-dimensional space.
© 2016 Springer International Publishing Switzerland
Ma, Y., He, K., Hopcroft, J., Shi, P. (2016). Nonlinear Dimension Reduction by Local Multidimensional Scaling. In: Zhu, D., Bereg, S. (eds) Frontiers in Algorithmics. FAW 2016. Lecture Notes in Computer Science(), vol 9711. Springer, Cham. https://doi.org/10.1007/978-3-319-39817-4_16
DOI: https://doi.org/10.1007/978-3-319-39817-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39816-7
Online ISBN: 978-3-319-39817-4