Abstract
Instance ranking problems aim to recover the ordering of the instances in a data set, with applications in scientific, social and financial contexts. In this work, we study the global robustness of parametric instance ranking problems in terms of the breakdown point, which measures the fraction of samples that need to be perturbed in order to drive the estimator to unreasonable values. Existing breakdown point notions do not yet cover ranking problems. We propose to define a breakdown of the estimator as a sign reversal of all of its components, which causes the predicted ranking to be potentially completely inverted; we therefore call it the order-inversal breakdown point (OIBDP). Based on a linear model, we study the OIBDP for several carefully distinguished ranking problems and provide least favorable outlier configurations, characterizations of the OIBDP, and sharp asymptotic upper bounds. We also compute empirical OIBDPs.
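As a purely illustrative companion to the abstract (not the paper's construction), the following Python sketch estimates an empirical order-inversal breakdown fraction for an ordinary-least-squares scoring estimator under a naive case-wise outlier placement. The names `empirical_oibdp` and `fit_linear_scorer`, the contamination scheme, and the toy data are all hypothetical assumptions; the returned fraction is only an upper bound on the smallest contamination fraction that flips every coefficient sign and is not the paper's least favorable configuration.

```python
# Illustrative sketch only: empirically probe when a sign reversal of ALL
# coefficients of a linear scoring estimator (the "order-inversal" breakdown
# event) can be triggered by replacing a fraction of the samples.
import numpy as np

rng = np.random.default_rng(0)

def fit_linear_scorer(X, y):
    """Ordinary least squares as a stand-in parametric scoring estimator."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def empirical_oibdp(X, y, max_frac=0.5, magnitude=1e6):
    """Smallest tried contamination fraction whose outliers flip every
    coefficient sign (crude upper bound; assumes non-zero clean coefficients)."""
    n, p = X.shape
    signs = np.sign(fit_linear_scorer(X, y))
    for k in range(1, int(max_frac * n) + 1):
        Xc, yc = X.copy(), y.copy()
        for i in range(k):
            # Naive placement: one huge-leverage point per coordinate, cycling
            # through the coordinates, with a response that opposes its sign.
            j = i % p
            Xc[i] = 0.0
            Xc[i, j] = magnitude * signs[j]
            yc[i] = -magnitude
        if np.all(np.sign(fit_linear_scorer(Xc, yc)) == -signs):
            return k / n
    return np.nan  # no full sign reversal found within max_frac

# Toy continuous-ranking data: responses follow a linear score of X.
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)
print("empirical OIBDP upper bound:", empirical_oibdp(X, y))
```

Because a linear scorer orders instances by X @ beta, reversing the sign of every component can completely invert the predicted ranking, which is exactly the breakdown event the OIBDP counts.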
Cite this article
Werner, T. Quantitative robustness of instance ranking problems. Ann Inst Stat Math 75, 335–368 (2023). https://doi.org/10.1007/s10463-022-00847-1