Dynamic Multi-Armed Bandits and Extreme Value-Based Rewards for Adaptive Operator Selection in Evolutionary Algorithms

Fialho, Álvaro; Da Costa, Luis; Schoenauer, Marc; Sebag, Michèle

doi:10.1007/978-3-642-11169-3_13

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5851)

Abstract

The performance of many efficient algorithms critically depends on the tuning of their parameters, which in turn depends on the problem at hand. For example, the performance of Evolutionary Algorithms critically depends on the judicious setting of the operator rates. The Adaptive Operator Selection (AOS) heuristic that is proposed here rewards each operator based on the extreme value of the fitness improvement recently achieved by this operator, and uses a Multi-Armed Bandit (MAB) selection process based on those rewards to choose which operator to apply next. This Extreme-based Multi-Armed Bandit approach is experimentally validated against the Average-based MAB method, and is shown to outperform previously published methods, whether using a classical Average-based rewarding technique or the same Extreme-based mechanism. The validation test suite includes the easy One-Max problem and a family of hard problems known as “Long k-paths”.
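
The two ingredients named in the abstract, an extreme-value reward and a bandit-based choice of the next operator, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the selection formula, so a standard UCB1-style rule is assumed here, and the class name `ExtremeValueBandit`, the sliding-window size `window`, and the factor `scaling` are hypothetical. The change-detection part implied by "Dynamic" in the title is not described in the abstract and is left out.

```python
import math
from collections import deque


class ExtremeValueBandit:
    """UCB1-style operator selector rewarded by the largest recent fitness gain.

    Illustrative sketch only: the window size, the scaling factor and the
    UCB1 formula are assumptions, not details taken from the abstract.
    """

    def __init__(self, n_operators, window=50, scaling=1.0):
        self.scaling = scaling  # assumed exploration/exploitation trade-off factor
        # One sliding window of recent fitness gains per operator (assumed size).
        self.gains = [deque(maxlen=window) for _ in range(n_operators)]
        self.counts = [0] * n_operators   # how often each operator was applied
        self.means = [0.0] * n_operators  # empirical mean reward per operator

    def select(self):
        """Return the index of the operator to apply next."""
        # Apply every operator once before trusting the confidence bounds.
        for op, count in enumerate(self.counts):
            if count == 0:
                return op
        total = sum(self.counts)
        scores = [
            self.means[op]
            + self.scaling * math.sqrt(2.0 * math.log(total) / self.counts[op])
            for op in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, op, fitness_gain):
        """Record the gain produced by `op` and return the reward it receives."""
        # Deteriorations count as zero improvement.
        self.gains[op].append(max(0.0, fitness_gain))
        # Extreme-value reward: the best gain in the window, not the average.
        reward = max(self.gains[op])
        self.counts[op] += 1
        self.means[op] += (reward - self.means[op]) / self.counts[op]
        return reward
```

In an evolutionary loop, one would call `select()` to pick a variation operator, apply it, measure the fitness improvement of the offspring over its parent, and pass that value to `update()`. Replacing `max(self.gains[op])` with the window mean would give an average-based reward of the kind the abstract compares against.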

Author information

Authors and Affiliations

Microsoft Research – INRIA Joint Centre, Orsay, France
Álvaro Fialho, Marc Schoenauer & Michèle Sebag
TAO team, INRIA Saclay – Île-de-France & LRI (UMR CNRS 8623), Orsay, France
Luis Da Costa, Marc Schoenauer & Michèle Sebag

Editor information

Editors and Affiliations

IRIDIA, CoDE, Université Libre de Bruxelles, Avenue F. Roosevelt 50, CP 194/6, 1050, Brussels, Belgium
Thomas Stützle

Cite this paper

Fialho, Á., Da Costa, L., Schoenauer, M., Sebag, M. (2009). Dynamic Multi-Armed Bandits and Extreme Value-Based Rewards for Adaptive Operator Selection in Evolutionary Algorithms. In: Stützle, T. (eds) Learning and Intelligent Optimization. LION 2009. Lecture Notes in Computer Science, vol 5851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11169-3_13

DOI: https://doi.org/10.1007/978-3-642-11169-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11168-6
Online ISBN: 978-3-642-11169-3
eBook Packages: Computer Science, Computer Science (R0)
