Abstract
Scatterplots are a well-established visualization technique for showing correlations between the variables on their axes and for revealing patterns or anomalies in multidimensional data sets. They are often used in the early stages of exploratory analysis. Their main drawback is that they scale poorly to many dimensions: a single plot in two-dimensional space can present only one pair of variables, one on the x-axis and one on the y-axis. Scatterplot matrices and multiple scatterplots show more variable pairs, but dividing the display among the plots leaves less space for each one. This chapter presents a comprehensive review of multidimensional visualization methods. We introduce a hybrid model to support multidimensional data visualization and, based on it, present a hybrid scatterplot visualization that enables individual scatterplots to show more information. In particular, we integrate star plots with scatterplots: the star plots show selected attributes of each item, supporting comparison among and within individual items, while the scatterplot shows the correlation among the data items. We demonstrate the effectiveness of this hybrid method through two case studies.
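The idea of placing a star plot at each scatterplot point can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names (`normalize_columns`, `star_glyph`, `hybrid_scatterplot`), the min-max scaling, and the glyph radius are all assumptions made for illustration. The sketch only computes the geometry, namely each item's position from two chosen attributes and the polygon vertices of its star glyph from the other selected attributes, and leaves rendering to any plotting library.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Min-max scale every attribute column to [0, 1] (assumed scaling)."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; place it mid-range.
        scaled_cols.append([(v - lo) / span if span else 0.5 for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def star_glyph(center: Point, values: Sequence[float], radius: float) -> List[Point]:
    """Vertices of a star plot around `center`.

    Each value in [0, 1] sets the spoke length on one of the evenly
    spaced axes; the first axis points up and the rest follow clockwise.
    """
    cx, cy = center
    n = len(values)
    vertices = []
    for i, v in enumerate(values):
        angle = math.pi / 2 - 2 * math.pi * i / n
        vertices.append((cx + radius * v * math.cos(angle),
                         cy + radius * v * math.sin(angle)))
    return vertices


def hybrid_scatterplot(rows: Sequence[Sequence[float]],
                       x_index: int,
                       y_index: int,
                       glyph_indices: Sequence[int],
                       radius: float = 0.05) -> List[Tuple[Point, List[Point]]]:
    """Pair each item's scatterplot position with its star-glyph polygon."""
    scaled = normalize_columns(rows)
    glyphs = []
    for item in scaled:
        center = (item[x_index], item[y_index])
        values = [item[i] for i in glyph_indices]
        glyphs.append((center, star_glyph(center, values, radius)))
    return glyphs
```

Each returned pair holds a point position, which encodes the correlation between the two axis variables, and a polygon that encodes the remaining selected attributes of that item. The polygon can be drawn as a filled or outlined shape centred on the point.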
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Nguyen, Q.V., Huang, M.L., Simoff, S. (2022). Using Hybrid Scatterplots for Visualizing Multi-dimensional Data. In: Kovalerchuk, B., Nazemi, K., Andonie, R., Datia, N., Banissi, E. (eds) Integrating Artificial Intelligence and Visualization for Visual Knowledge Discovery. Studies in Computational Intelligence, vol 1014. Springer, Cham. https://doi.org/10.1007/978-3-030-93119-3_20
DOI: https://doi.org/10.1007/978-3-030-93119-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93118-6
Online ISBN: 978-3-030-93119-3
eBook Packages: Intelligent Technologies and Robotics (R0)