ABSTRACT
Recent studies have shown that boosting provides excellent predictive performance across a wide variety of tasks. In learning-to-rank, boosted models such as RankBoost and LambdaMART have been shown to be among the best-performing learning methods based on evaluations on public data sets. In this paper, we show how combining bagging as a variance reduction technique with boosting as a bias reduction technique can yield high-precision, low-variance ranking models. We perform thousands of parameter-tuning experiments for LambdaMART to obtain a high-precision boosted model. We then show that a bagged ensemble of such boosted LambdaMART models yields more accurate rankings while also reducing variance by as much as 50%. We report results on three public learning-to-rank data sets using four metrics. Bagged LambdaMART outperforms all previously reported results on ten of the twelve comparisons, and bagged LambdaMART outperforms non-bagged LambdaMART on all twelve comparisons. For example, wrapping bagging around LambdaMART increases NDCG@1 from 0.4137 to 0.4200 on the MQ2007 data set; the best prior result in the literature for this data set is 0.4134, by RankBoost.
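The core recipe described above, training a tuned boosted ranker on each of several bootstrap samples and averaging their scores, can be illustrated with a short sketch. The code below is a simplified illustration rather than the paper's implementation: it uses scikit-learn's GradientBoostingRegressor as a stand-in for LambdaMART (which optimizes a listwise ranking objective and is not shown here), it bootstraps at the document level rather than the query level, and the feature matrix X, relevance labels y, and all hyperparameter values are assumptions made only for illustration.

```python
# Minimal sketch of bagging boosted models for ranking.
# Assumptions: X is a NumPy array of document feature vectors and y the
# corresponding relevance labels; GradientBoostingRegressor is a pointwise
# stand-in for LambdaMART; hyperparameters are illustrative, not the
# values tuned in the paper.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def train_bagged_boosted_ensemble(X, y, n_bags=10, seed=0):
    """Train n_bags boosted models, each on a bootstrap sample of the data."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_bags):
        # Bootstrap sample: draw len(X) rows with replacement.
        idx = rng.integers(0, len(X), size=len(X))
        gbm = GradientBoostingRegressor(
            n_estimators=500,   # number of boosting iterations (illustrative)
            learning_rate=0.05,
            max_depth=4,
        )
        gbm.fit(X[idx], y[idx])
        models.append(gbm)
    return models

def bagged_scores(models, X):
    """Average the scores of the bagged models; documents are ranked by this score."""
    return np.mean([m.predict(X) for m in models], axis=0)
```

Averaging the scores of independently trained boosted models is what reduces variance; each individual boosted model is still responsible for driving down bias.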
REFERENCES
- Microsoft learning to rank datasets. http://research.microsoft.com/en-us/projects/mslr/.
- R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
- L. Breiman. Bagging predictors. Mach. Learn., 24:123--140, August 1996.
- L. Breiman. Random forests. Machine Learning, 45:5--32, 2001. doi:10.1023/A:1010933404324.
- L. Breiman. Using iterated bagging to debias regressions. Mach. Learn., 45:261--277, December 2001.
- C. Burges. From RankNet to LambdaRank to LambdaMART: An overview. Microsoft Research Technical Report MSR-TR-2010-82, 2010.
- C. J. C. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. In NIPS, pages 193--200, 2006.
- C. J. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. N. Hullender. Learning to rank using gradient descent. In ICML, pages 89--96, 2005.
- Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In ICML'07: Proceedings of the 24th international conference on Machine learning, pages 129--136, New York, NY, USA, 2007. ACM.
- O. Chapelle, Y. Chang, and T.-Y. Liu. The Yahoo! learning to rank challenge. http://learningtorankchallenge.yahoo.com, 2010.
- O. Chapelle and M. Wu. Gradient descent optimization of smoothed information retrieval metrics. Inf. Retr., 13:216--235, June 2010.
- K. Crammer and Y. Singer. Pranking with ranking. In Advances in Neural Information Processing Systems 14, pages 641--647. MIT Press, 2001.
- Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res., 4:933--969, 2003.
- J. H. Friedman. Stochastic gradient boosting. Technical report, Dept. Statistics, Stanford Univ., 1999.
- J. H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29:1189--1232, 2000.
- S. Geman, E. Bienenstock, and R. Doursat. Neural networks and the bias/variance dilemma. Neural Comput., 4:1--58, January 1992.
- R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 115--132, Cambridge, MA, 2000. MIT Press.
- T. K. Ho, J. Hull, and S. Srihari. Decision combination in multiple classifier systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(1):66--75, Jan. 1994.
- K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In SIGIR'00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, pages 41--48, New York, NY, USA, 2000. ACM.
- T. Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD'06, pages 217--226, New York, NY, USA, 2006. ACM.
- P. Li, C. J. C. Burges, and Q. Wu. McRank: Learning to rank using multiple classification and gradient boosting. In NIPS, 2007.
- T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225--331, 2009.
- D. Y. Pavlov, A. Gorodilov, and C. A. Brunk. BagBoo: a scalable hybrid bagging-the-boosting model. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM'10, pages 1897--1900, New York, NY, USA, 2010. ACM.
- T. Qin, T.-Y. Liu, J. Xu, and H. Li. LETOR: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval, 13:346--374, 2010. doi:10.1007/s10791-009-9123-y.
- D. Sculley. Combined regression and ranking. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD'10, pages 979--988, New York, NY, USA, 2010. ACM.
- D. Sorokina, R. Caruana, and M. Riedewald. Additive groves of regression trees. In Proceedings of the 18th European conference on Machine Learning, ECML'07, pages 323--334, Berlin, Heidelberg, 2007. Springer-Verlag.
- Y. L. Suen, P. Melville, and R. J. Mooney. Combining bias and variance reduction techniques for regression trees. In Proceedings of the European Conference on Machine Learning (ECML'05), pages 741--749, 2005.
- M.-F. Tsai, T.-Y. Liu, T. Qin, H.-H. Chen, and W.-Y. Ma. FRank: a ranking method with fidelity loss. In SIGIR'07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 383--390, New York, NY, USA, 2007. ACM.
- G. Valentini and T. G. Dietterich. Low bias bagged support vector machines. In Proceedings of the International Conference on Machine Learning, ICML'03, pages 752--759. Morgan Kaufmann, 2003.
- M. N. Volkovs and R. S. Zemel. BoltzRank: learning to maximize expected ranking gain. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML'09, pages 1089--1096, New York, NY, USA, 2009. ACM.
- G. I. Webb. MultiBoosting: A technique for combining boosting and wagging. Mach. Learn., 40:159--196, August 2000.
- Q. Wu, C. Burges, K. Svore, and J. Gao. Ranking, boosting and model adaptation. Microsoft Technical Report MSR-TR-2008-109, 2008.
- J. Xu and H. Li. AdaRank: a boosting algorithm for information retrieval. In SIGIR'07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 391--398, New York, NY, USA, 2007. ACM.
- Y. Yue, T. Finley, F. Radlinski, and T. Joachims. A support vector method for optimizing average precision. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR'07, pages 271--278, New York, NY, USA, 2007. ACM.