A learning classifier system for mazes with aliasing clones

Zatuchna, Zhanna V.; Bagnall, Anthony J.

doi:10.1007/s11047-007-9055-7

Published: 23 August 2007

Volume 8, pages 57–99 (2009)

Natural Computing

Abstract

Maze problems represent a simplified virtual model of real environments and can be used to develop core algorithms for many real-world applications related to navigation. Learning Classifier Systems (LCSs) are the most widely used class of algorithms for reinforcement learning in mazes. However, the best achievements of LCSs in maze problems are still mostly confined to non-aliasing environments, while LCS complexity seems to obstruct a proper analysis of the reasons for failure. Moreover, there is a lack of knowledge of what makes a maze problem hard for a learning agent to solve. To overcome this restriction we try to improve our understanding of the nature and structure of maze environments. In this paper we describe a new LCS agent that has a simpler and more transparent performance mechanism. We use the structure of a predictive LCS model, strip out the evolutionary mechanism, simplify the reinforcement learning procedure and equip the agent with the ability of Associative Perception, adopted from psychology. We then assess the new LCS with Associative Perception on an extensive set of mazes and analyse the results to discover which features of the environments play the most significant role in the learning process. We identify a particularly hard feature for learning in mazes, aliasing clones, which arise when groups of aliasing cells occur in similar patterns in different parts of the maze. We discuss the impact of aliasing clones and other types of aliasing on learning algorithms.
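
To make the notion of aliasing concrete, the following minimal sketch (not taken from the paper) assumes an agent that perceives only the eight cells surrounding its current position, and groups the free cells that look identical to such an agent. The maze layout, the cell symbols and the names `perception` and `aliasing_groups` are hypothetical, chosen for illustration only; the sketch shows plain aliasing cells and does not attempt to detect aliasing clones.

```python
from collections import defaultdict

# Hypothetical toy maze (an assumption, not one of the paper's mazes):
# '#' is a wall, '.' is a free cell, 'F' is the goal.
MAZE = [
    "#########",
    "#......F#",
    "#########",
]

# Offsets of the eight neighbouring cells, clockwise starting from north.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]


def perception(maze, row, col):
    """Return the symbols of the eight cells around (row, col) as a string."""
    return "".join(maze[row + dr][col + dc] for dr, dc in NEIGHBOURS)


def aliasing_groups(maze):
    """Group free cells that produce exactly the same perception string.

    Assumes every free cell is surrounded by maze cells, i.e. the outer
    border consists of walls, so neighbour lookups never leave the grid.
    """
    groups = defaultdict(list)
    for r, line in enumerate(maze):
        for c, symbol in enumerate(line):
            if symbol == ".":
                groups[perception(maze, r, c)].append((r, c))
    # Only groups with at least two cells are aliasing.
    return [cells for cells in groups.values() if len(cells) > 1]


if __name__ == "__main__":
    for cells in aliasing_groups(MAZE):
        print(cells)
```

In this corridor the four middle cells are indistinguishable to the agent, while the two end cells each have a unique perception because one has a wall to the west and the other has the goal to the east.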

Author information

Authors and Affiliations

School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, England
Zhanna V. Zatuchna & Anthony J. Bagnall

Corresponding author

Correspondence to Zhanna V. Zatuchna.

About this article

Cite this article

Zatuchna, Z.V., Bagnall, A.J. A learning classifier system for mazes with aliasing clones. Nat Comput 8, 57–99 (2009). https://doi.org/10.1007/s11047-007-9055-7

Received: 05 December 2006
Accepted: 18 July 2007
Published: 23 August 2007
Issue Date: March 2009
DOI: https://doi.org/10.1007/s11047-007-9055-7
