Abstract
Data analysis in Machine Learning (ML) is a long process: it begins with defining the business objective and continues through collecting and preprocessing data, selecting, building, and testing models, and monitoring and validating the results against the stated objectives. When each step is done manually, the user waits longer for a result. In practice, analysts rarely compare accuracy across all available models, and they often run into errors that are difficult to diagnose and fix. The main objective of this paper is to present an instinctive data analysis tool that makes this process easier. The tool needs only a dataset; it performs the required analysis automatically and generates the result within a short period. It accepts different kinds of datasets, e.g., numerical, categorical, and unlabelled data. Around 40 regression and classification models are available for testing, covering the two main categories of ML techniques, supervised and unsupervised. For the demonstration, Kaggle datasets are used: the iris dataset for classification and the vegetable dataset for regression. The tool should be useful to individuals, software companies, budding ML engineers, and data scientists.
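The core idea of handing over a dataset and getting back a comparison of models can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rank_models`, `majority_class`, `nearest_centroid`, `one_nearest_neighbor`) are hypothetical, three toy classifiers stand in for the roughly 40 models mentioned above, and synthetic two-class data replaces the Kaggle datasets. A real tool would more likely rely on an established ML library.

```python
import random
from collections import Counter
from math import dist


def majority_class(train, point):
    """Baseline: always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def nearest_centroid(train, point):
    """Predict the label whose feature mean is closest to the point."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], point))


def one_nearest_neighbor(train, point):
    """Predict the label of the single closest training sample."""
    return min(train, key=lambda sample: dist(sample[0], point))[1]


def rank_models(data, models, test_fraction=0.3, seed=0):
    """Score every model on one shared hold-out split, best first."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    train, test = rows[:cut], rows[cut:]
    scores = {
        name: sum(predict(train, x) == y for x, y in test) / len(test)
        for name, predict in models.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic two-class data (an assumption, standing in for a user dataset).
rng = random.Random(42)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], "a") for _ in range(60)]
data += [([rng.gauss(4, 1), rng.gauss(4, 1)], "b") for _ in range(60)]

models = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
}

ranking = rank_models(data, models)
for name, accuracy in ranking:
    print(f"{name}: {accuracy:.3f}")
```

Scoring every candidate on the same hold-out split keeps the accuracies comparable, which addresses the point that analysts rarely check accuracy across all available models by hand.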
Acknowledgement
We would like to show our gratitude to Dr. J. Venkatesh from the Chennai Institute of Technology for sharing his pearls of wisdom with us during this research and for comments that greatly improved the manuscript.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Varshini, R.S., Madhushree, T., Priyadharshini, R., Priya, K.Y., Akshara, A.S., Venkatesh, J. (2022). Instinctive Data Analysis in Machine Learning and Summary Exhibitor. In: Kahraman, C., Tolga, A.C., Cevik Onar, S., Cebi, S., Oztaysi, B., Sari, I.U. (eds) Intelligent and Fuzzy Systems. INFUS 2022. Lecture Notes in Networks and Systems, vol 505. Springer, Cham. https://doi.org/10.1007/978-3-031-09176-6_19
DOI: https://doi.org/10.1007/978-3-031-09176-6_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09175-9
Online ISBN: 978-3-031-09176-6
eBook Packages: Intelligent Technologies and Robotics (R0)