Abstract
Machine learning is a field of science in which a mathematical model learns to represent, classify, regress, or cluster data, or to make appropriate decisions. This book introduces dimensionality reduction, also known as manifold learning, which is a subfield of machine learning. Dimensionality reduction transforms data into a lower-dimensional subspace for a better representation of the data. This chapter defines dimensionality reduction and enumerates its main categories as an introduction to the following chapters of the book.
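As a rough illustration of what transforming data to a lower-dimensional subspace can look like in practice, the following is a minimal sketch using principal component analysis computed through a singular value decomposition. It is not taken from the chapter: the function name pca_project, the synthetic data, and the choice of two components are all assumptions made for this example, and PCA is only one of many possible methods.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal directions (PCA via SVD).

    Hypothetical helper written for this illustration only.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    return X_centered @ components.T, components


rng = np.random.default_rng(0)
# Synthetic data (an assumption): 200 points lying near a 2-D plane in 5-D space.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Z, components = pca_project(X, n_components=2)
print(Z.shape)  # (200, 2)
```

Here each 5-dimensional point is replaced by 2 coordinates. Because the synthetic data was built to lie close to a plane, little information is lost; real data may need more components or a nonlinear method.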
The world is in the Hilbert space,
And is vast and all-encompassing.
But it is so simple,
And falls on a low-dimensional submanifold.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ghojogh, B., Crowley, M., Karray, F., Ghodsi, A. (2023). Introduction. In: Elements of Dimensionality Reduction and Manifold Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-10602-6_1
DOI: https://doi.org/10.1007/978-3-031-10602-6_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10601-9
Online ISBN: 978-3-031-10602-6
eBook Packages: Computer Science (R0)