Classification

Data-Driven Categorization of Objects in Tourism

Part of the book series: Tourism on the Verge (TV)

Abstract

Classification, the task of assigning objects to a given set of categories, is used in almost every field. One important sub-branch of classification consists of methods that learn classification functions from example data. The following chapter will provide an overview of the most basic concepts and methods of this type of data-driven classification. We will first highlight the basic ideas behind classification, along with some examples related to tourism. Thereafter, we will introduce measures of classification performance, which are necessary both to direct the data-driven training of classification functions and to evaluate classification results. As an essential part of this chapter, we will provide self-contained, yet stripped-down, descriptions of the most crucial data-driven classification methods. Specifically, we will focus on nearest neighbor classifiers, logistic regression, Naïve Bayes, decision trees and ensemble variants thereof, support vector machines, and finally, artificial neural networks. All of these concepts and methods will then be applied to a specific use case in an accompanying Jupyter notebook, demonstrating their practical implementation in Python with the machine learning framework scikit-learn.
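Since the chapter's methods are ultimately demonstrated with scikit-learn, the following minimal sketch may serve as a preview of the style of the accompanying notebook. It trains one representative of each classifier family named above and reports two common performance measures. The dataset is synthetic and all parameter settings (e.g., n_neighbors=5, a random forest standing in for the ensemble variants) are illustrative assumptions, not the notebook's actual tourism use case or configuration.

    # A minimal sketch of data-driven classification with scikit-learn.
    # NOTE: synthetic data and illustrative parameters only; the chapter's
    # accompanying notebook uses a tourism-specific use case instead.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import accuracy_score, f1_score

    # Synthetic two-class data standing in for a tourism example
    # (e.g., "guest books" vs. "guest cancels")
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # One representative of each classifier family covered in the chapter
    classifiers = {
        "nearest neighbors": KNeighborsClassifier(n_neighbors=5),
        "logistic regression": LogisticRegression(max_iter=1000),
        "naive Bayes": GaussianNB(),
        "random forest (ensemble)": RandomForestClassifier(random_state=0),
        "support vector machine": SVC(),
        "neural network": MLPClassifier(max_iter=1000, random_state=0),
    }

    # Train each classifier on the training split and evaluate it on the
    # held-out test split with two standard performance measures
    for name, clf in classifiers.items():
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.3f}, "
              f"F1={f1_score(y_test, y_pred):.3f}")

The train/test split matters here: evaluating on held-out data, rather than on the training data itself, is what makes the reported performance measures meaningful estimates of how each classifier generalizes.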



Author information

Corresponding author

Correspondence to Andreas Stöckl.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bodenhofer, U., & Stöckl, A. (2022). Classification. In R. Egger (Ed.), Applied Data Science in Tourism. Tourism on the Verge. Springer, Cham. https://doi.org/10.1007/978-3-030-88389-8_10
