Binarised regression tasks: methods and evaluation metrics

Hernández-Orallo, José; Ferri, Cèsar; Lachiche, Nicolas; Martínez-Usó, Adolfo; Ramírez-Quintana, M. José

doi:10.1007/s10618-015-0443-9

Published: 30 November 2015

Volume 30, pages 848–890, (2016)

Data Mining and Knowledge Discovery

Abstract

Some supervised tasks are presented with a numerical output but decisions have to be made in a discrete, binarised, way, according to a particular cutoff. This binarised regression task is a very common situation that requires its own analysis, different from regression and classification—and ordinal regression. We first investigate the application cases in terms of the information about the distribution and range of the cutoffs and distinguish six possible scenarios, some of which are more common than others. Next, we study two basic approaches: the retraining approach, which discretises the training set whenever the cutoff is available and learns a new classifier from it, and the reframing approach, which learns a regression model and sets the cutoff when this is available during deployment. In order to assess the binarised regression task, we introduce context plots featuring error against cutoff. Two special cases are of interest, the \( UCE \) and \( OCE \) curves, showing that the area under the former is the mean absolute error and the latter is a new metric that is in between a ranking measure and a residual-based measure. A comprehensive evaluation of the retraining and reframing approaches is performed using a repository of binarised regression problems created on purpose, concluding that no method is clearly better than the other, except when the size of the training data is small.
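The claim that the area under an error-against-cutoff curve equals the mean absolute error can be checked numerically. The following is a minimal sketch, assuming the curve plots, for each cutoff, the fraction of instances whose true and predicted values fall on different sides of that cutoff; this reading of the abstract, the function names, and the toy data are all illustrative assumptions, not the paper's own definitions or code.

```python
def binarised_error(y_true, y_pred, cutoff):
    """Fraction of instances placed on the wrong side of the cutoff."""
    wrong = sum((t >= cutoff) != (p >= cutoff) for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def area_under_error_curve(y_true, y_pred):
    """Integrate the error over all cutoffs.

    The error only changes when the cutoff crosses a true or predicted
    value, so the curve is piecewise constant and can be integrated by
    evaluating it once inside each interval between breakpoints.
    """
    points = sorted(set(y_true) | set(y_pred))
    area = 0.0
    for left, right in zip(points, points[1:]):
        midpoint = (left + right) / 2
        area += binarised_error(y_true, y_pred, midpoint) * (right - left)
    return area


def mean_absolute_error(y_true, y_pred):
    """Average absolute residual of the regression predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical toy data, chosen only for illustration.
    y_true = [1.0, 2.0, 6.0, 4.0]
    y_pred = [1.5, 3.0, 4.0, 4.0]
    print(area_under_error_curve(y_true, y_pred))
    print(mean_absolute_error(y_true, y_pred))
```

The two values coincide because an instance is misclassified exactly when the cutoff lies between its true and predicted value, an interval of length equal to the absolute residual.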

Notes

Note that some people can buy a house that is much cheaper than its maximum mortgage, especially if they buy it as an investment or to refurbish it afterwards.
It is worth noting that the training process is entirely repeated in the retraining alternative, having nothing to do with any kind of incremental learning or adaptation of the previous model. This use of the term ‘retraining’, understood as building a different model each time a new cutoff is set, can often be found in the active learning research field (Guo and Schuurmans 2008; Sammut and Webb 2011).
Note that region is here used to refer to an interval (continuous subset of values) within all the possible cutoff values. This interval will usually be narrow.
For the interested reader, it is worth mentioning that Theorem 1 is connected to Theorem 11 (and Corollary 12) by Hernández-Orallo et al. (2012), where the expected loss of the score-uniform threshold choice method for a uniform distribution of operating contexts (cost proportions or skews) is shown to be equal to \( MAE \). Two remarks are in order, though. First, here we are talking about the \( MAE \) of a regression model, while in Hernández-Orallo et al. (2012) the result holds for a soft classifier with estimated probabilities between 0 and 1, upon which the \( MAE \) is calculated. Second, here the decision rule takes the operating context into account (the cutoff is used at each point of the curve), while in Hernández-Orallo et al. (2012) the result is obtained by the score-uniform threshold choice method, which completely ignores the operating context. Nevertheless, this is still an interesting connection, as both assume a uniform distribution of contexts.
Quantile regression aims at estimating either the conditional median or other quantiles of the goal variable.
For example, consider three neighbours with outputs \(y= \{1, 2, 6\}\), and a cutoff \(c=2.5\). With equal weights and the mean as the prediction, regression gives the average \(\bar{y} = 3 \ge c = 2.5\) and predicts “above the cutoff”, but for classification there is only one neighbour above the cutoff, so it predicts “below the cutoff”. Only for \(k=1\) would both approaches always be equal.
See http://www.dsic.upv.es/~flip/BinarisedRegression/.
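The three-neighbour example in the notes above can be reproduced with a short sketch. It assumes equal neighbour weights, the mean as the regression prediction, and a simple majority vote for classification; the function names are hypothetical and are not taken from the paper or from any kNN package.

```python
def reframing_decision(neighbour_outputs, cutoff):
    """Regression first, cutoff afterwards: average the neighbours'
    numerical outputs and compare the average with the cutoff."""
    prediction = sum(neighbour_outputs) / len(neighbour_outputs)
    return prediction >= cutoff


def retraining_decision(neighbour_outputs, cutoff):
    """Cutoff first, classification afterwards: binarise each neighbour's
    output and take a majority vote over the resulting labels."""
    votes_above = sum(y >= cutoff for y in neighbour_outputs)
    return votes_above > len(neighbour_outputs) / 2


if __name__ == "__main__":
    neighbours = [1, 2, 6]
    cutoff = 2.5
    # True means "above the cutoff", False means "below the cutoff".
    print(reframing_decision(neighbours, cutoff))
    print(retraining_decision(neighbours, cutoff))
```

With a single neighbour both functions reduce to comparing that neighbour's output with the cutoff, so the two decisions coincide.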

References

Bache K, Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml
Bella A, Ferri C, Hernández-Orallo J, Ramírez-Quintana MJ (2014) Aggregative quantification for regression. Data Min Knowl Discov 28(2):475–518
Bi J, Bennett KP (2003) Regression error characteristic curves. In: Twentieth international conference on machine learning (ICML-2003). Washington, DC
Brooks AD (2007) knnflex: a more flexible KNN. R package version 1.1.1
Cohen I, Goldszmidt M (2004) Properties and benefits of calibrated classifiers. Knowl Discov Database 2004:125–136
Drummond C, Holte R (2000) Explicitly representing expected cost: an alternative to ROC representation. In: Knowledge discovery and data mining, pp 198–207
Drummond C, Holte R (2006) Cost curves: an improved method for visualizing classifier performance. Mach Learn 65:95–130
Fawcett T (2006) ROC graphs with instance-varying costs. Pattern Recognit Lett 27(8):882–891
Fawcett T, Provost F (1997) Adaptive fraud detection. Data Min Knowl Discov 1(3):291–316
Federal Financial Institutions Examination Council (2013) Home mortgage disclosure act (HMDA). http://www.ffiec.gov/hmda/
Ferri C, Hernández-Orallo J (2004) Cautious classifiers. In: Proceedings of the 1st international workshop on ROC analysis in artificial intelligence (ROCAI-2004), pp 27–36
Ferri C, Hernández-Orallo J, Modroiu R (2009) An experimental comparison of performance measures for classification. Pattern Recognit Lett 30(1):27–38
Flach P (2003) The geometry of ROC space: understanding machine learning metrics through ROC isometrics. In: Machine learning, proceedings of the twentieth international conference (ICML 2003), pp 194–201
Guo Y, Schuurmans D (2008) Discriminative batch mode active learning. In: Platt J, Koller D, Singer Y, Roweis S (eds) Advances in neural information processing systems, vol 20. Curran Associates, Inc, pp 593–600
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor Newsl 11(1):10–18
Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer series in statistics. Springer New York Inc., New York
Hernández-Orallo J (2013) ROC curves for regression. Pattern Recognit 46(12):3395–3411
Hernández-Orallo J (2014) Probabilistic reframing for context-sensitive regression. ACM Trans Knowl Discov Data 8(3)
Hernández-Orallo J, Flach P, Ferri C (2012) A unified view of performance metrics: translating threshold choice into expected classification loss. J Mach Learn Res (JMLR) 13:2813–2869
Hornik K, Buchta C, Zeileis A (2009) Open-source machine learning: R meets Weka. Comput Stat 24(2):225–232. doi:10.1007/s00180-008-0119-7
Hsu CN, Knoblock CA (1998) Discovering robust knowledge from databases that change. Data Min Knowl Discov 2(1):69–95
Kocjan E, Kononenko I (2009) Regression as cost-sensitive classification. In: International multiconference on information society, pp 38–41
Koenker R (2005) Quantile regression, vol 38. Cambridge University Press, Cambridge
Langford J, Oliveira R, Zadrozny B (2012) Predicting conditional quantiles via reduction to classification. arXiv:1206.6860
Langford J, Zadrozny B (2005) Estimating class membership probabilities using classifier learners. In: Proceedings of the tenth international workshop on artificial intelligence and statistics (AISTAT05), pp 198–205
Martin A, Doddington G, Kamm T, Ordowski M, Przybocki M (1997) The DET curve in assessment of detection task performance. In: Fifth european conference on speech communication and technology. Citeseer
Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
Piatetsky-Shapiro G, Masand B (1999) Estimating campaign benefits and modeling lift. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, p 193
Pietraszek T (2007) On the use of ROC analysis for the optimization of abstaining classifiers. Mach Learn 68(2):137–169
Prati RC, Batista GE, Monard MC (2011) A survey on graphical methods for classification predictive performance evaluation. IEEE Trans Knowl Data Eng 23:1601–1618. doi:10.1109/TKDE.2011.59
Rosset S, Perlich C, Zadrozny B (2007) Ranking-based evaluation of regression models. Knowl Inf Syst 12(3):331–353
Sammut C, Webb G (2011) Encyclopedia of machine learning. Springer, New York
Swets JA, Dawes RM, Monahan J (2000) Better decisions through science. Sci Am 283(4):82–87
Torgo L (2005) Regression error characteristic surfaces. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, pp 697–702
Torgo L, Gama J (1996) Regression by classification. In: Advances in artificial intelligence. Springer, pp 51–60
The keel-dataset repository (2002). http://www.keel.es/
Yang Y, Wu X, Zhu X (2006) Mining in anticipation for concept change: proactive-reactive prediction in data streams. Data Min Knowl Discov 13(3):261–289
Zillow (2013) Zillow API. http://www.zillow.com/howto/api/APIOverview.htm

Acknowledgments

We thank the anonymous reviewers for their comments, which have helped to improve this paper significantly. We thank Peter Flach and Meelis Kull for their insightful comments and very useful suggestions. This work was supported by the Spanish MINECO under Grant TIN 2013-45732-C4-1-P and by Generalitat Valenciana PROMETEOII2015/013. This research has been developed within the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Ministerio de Economía y Competitividad in Spain (PCIN-2013-037) and the Agence Nationale pour la Recherche in France (ANR-12-CHRI-0005-03).

Author information

Authors and Affiliations

Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera s/n, 46022, Valencia, Spain
José Hernández-Orallo, Cèsar Ferri, Adolfo Martínez-Usó & M. José Ramírez-Quintana
ICube, Université de Strasbourg, CNRS, 300 Bd Sebastien Brant, BP 10413, 67412, Illkirch Cedex, France
Nicolas Lachiche

Corresponding author

Correspondence to José Hernández-Orallo.

Additional information

Responsible editor: Eamonn Keogh.

About this article

Cite this article

Hernández-Orallo, J., Ferri, C., Lachiche, N. et al. Binarised regression tasks: methods and evaluation metrics. Data Min Knowl Disc 30, 848–890 (2016). https://doi.org/10.1007/s10618-015-0443-9

Received: 22 May 2014
Accepted: 13 November 2015
Published: 30 November 2015
Issue Date: July 2016
DOI: https://doi.org/10.1007/s10618-015-0443-9
