
Algorithm portfolio selection as a bandit problem with unbounded losses

Annals of Mathematics and Artificial Intelligence

Abstract

We propose a method that learns to allocate computation time to a given set of algorithms, of unknown performance, with the aim of solving a given sequence of problem instances in minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. We adopt instead an online approach, named GAMBLETA, in which algorithm performance models are iteratively updated and used to guide allocation on a sequence of problem instances. GAMBLETA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time, as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and the allocations proposed by the different portfolios were mixed, using a solver for the bandit problem with expert advice; but this required setting an arbitrary bound on algorithm runtimes, invalidating the optimal regret of the solver. In this paper, we propose a simpler version of GAMBLETA, in which the allocators correspond to the arms, such that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information and an unknown bound on losses. We devise a solver for this game, proving a bound on its expected regret. We present experiments based on results from several solver competitions, in various domains, comparing GAMBLETA with another online method.
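The core mechanism described above — selecting one portfolio per instance via a bandit game with partial information and an unknown bound on losses — can be sketched with a simplified Exp3-style weighting scheme. This is an illustrative sketch, not the authors' actual solver: the portfolio callables, the learning rate `eta`, and the running-maximum rescaling of observed runtimes (a crude stand-in for handling the unknown loss bound) are all assumptions made for the example.

```python
import math
import random

def exp3_select(weights):
    """Sample an arm index with probability proportional to its weight."""
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1

def run_gambleta_sketch(portfolios, instances, eta=0.1):
    """Exp3-style selection among time allocators (illustrative only).

    `portfolios` is a list of callables mapping a problem instance to the
    runtime needed to solve it (the observed loss); `instances` is the
    problem sequence. Only the loss of the played arm is observed
    (partial information); losses are rescaled by the largest loss seen
    so far, since no a-priori bound on runtimes is assumed.
    """
    k = len(portfolios)
    weights = [1.0] * k
    max_loss = 1.0  # running estimate of the unknown loss bound
    total_time = 0.0
    for inst in instances:
        i = exp3_select(weights)
        p_i = weights[i] / sum(weights)   # probability of the played arm
        loss = portfolios[i](inst)        # runtime of the chosen portfolio
        total_time += loss
        max_loss = max(max_loss, loss)
        # importance-weighted update on the played arm only
        est = (loss / max_loss) / p_i
        weights[i] *= math.exp(-eta * est)
    return total_time, weights
```

Run on a sequence where one hypothetical portfolio is consistently faster, the weight of the slower arm decays and most instances end up assigned to the faster one — the exploration-exploitation trade-off the abstract refers to.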



Corresponding author

Correspondence to Matteo Gagliolo.

Cite this article

Gagliolo, M., Schmidhuber, J. Algorithm portfolio selection as a bandit problem with unbounded losses. Ann Math Artif Intell 61, 49–86 (2011). https://doi.org/10.1007/s10472-011-9228-z
