
Algorithm portfolio selection as a bandit problem with unbounded losses

Annals of Mathematics and Artificial Intelligence

Abstract

We propose a method that learns to allocate computation time to a given set of algorithms, of unknown performance, with the aim of solving a given sequence of problem instances in minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. We adopt instead an online approach, named GAMBLETA, in which algorithm performance models are iteratively updated and used to guide allocation on a sequence of problem instances. GAMBLETA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time, as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and the allocations proposed by the different portfolios were mixed, using a solver for the bandit problem with expert advice; but this required setting an arbitrary bound on algorithm runtimes, invalidating the optimal regret of the solver. In this paper, we propose a simpler version of GAMBLETA, in which the allocators correspond to the arms, such that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information and an unknown bound on losses. We devise a solver for this game, proving a bound on its expected regret. We present experiments based on results from several solver competitions, in various domains, comparing GAMBLETA with another online method.
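The core mechanism described above — selecting one portfolio per instance via a bandit game with partial information and an unknown bound on losses — can be sketched with a simplified Exp3-style weighting scheme. This is an illustrative sketch, not the authors' actual solver: the portfolio callables, the learning rate `eta`, and the running-maximum rescaling of observed runtimes (a crude stand-in for handling the unknown loss bound) are all assumptions made for the example.

```python
import math
import random

def exp3_select(weights):
    """Sample an arm index with probability proportional to its weight."""
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1

def run_gambleta_sketch(portfolios, instances, eta=0.1):
    """Exp3-style selection among time allocators (illustrative only).

    `portfolios` is a list of callables mapping a problem instance to the
    runtime needed to solve it (the observed loss); `instances` is the
    problem sequence. Only the loss of the played arm is observed
    (partial information); losses are rescaled by the largest loss seen
    so far, since no a-priori bound on runtimes is assumed.
    """
    k = len(portfolios)
    weights = [1.0] * k
    max_loss = 1.0  # running estimate of the unknown loss bound
    total_time = 0.0
    for inst in instances:
        i = exp3_select(weights)
        p_i = weights[i] / sum(weights)   # probability of the played arm
        loss = portfolios[i](inst)        # runtime of the chosen portfolio
        total_time += loss
        max_loss = max(max_loss, loss)
        # importance-weighted update on the played arm only
        est = (loss / max_loss) / p_i
        weights[i] *= math.exp(-eta * est)
    return total_time, weights
```

Run on a sequence where one hypothetical portfolio is consistently faster, the weight of the slower arm decays and most instances end up assigned to the faster one — the exploration-exploitation trade-off the abstract refers to.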



Corresponding author

Correspondence to Matteo Gagliolo.

Cite this article

Gagliolo, M., Schmidhuber, J. Algorithm portfolio selection as a bandit problem with unbounded losses. Ann Math Artif Intell 61, 49–86 (2011). https://doi.org/10.1007/s10472-011-9228-z
