A learning classifier system for mazes with aliasing clones

Natural Computing

Abstract

Maze problems represent a simplified virtual model of the real environment and can be used to develop the core algorithms of many real-world applications related to navigation. Learning Classifier Systems (LCS) are the most widely used class of algorithms for reinforcement learning in mazes. However, the best LCS results on maze problems are still mostly limited to non-aliasing environments, while the complexity of LCS seems to obstruct a proper analysis of the reasons for failure. Moreover, little is known about what makes a maze problem hard for a learning agent to solve. To overcome this limitation we try to improve our understanding of the nature and structure of maze environments. In this paper we describe a new LCS agent that has a simpler and more transparent performance mechanism. We use the structure of a predictive LCS model, strip out the evolutionary mechanism, simplify the reinforcement learning procedure and equip the agent with the ability of Associative Perception, adopted from psychology. We then assess the new LCS with Associative Perception on an extensive set of mazes and analyse the results to discover which features of the environments play the most significant role in the learning process. We identify a feature that makes learning in mazes particularly hard, aliasing clones, which arise when groups of aliasing cells occur in similar patterns in different parts of the maze. We discuss the impact of aliasing clones and other types of aliasing on learning algorithms.
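To make the notion of aliasing cells concrete, the following minimal sketch groups the open cells of a toy grid maze by what an agent would perceive from them. It is an illustration only, not the agent described in the paper: the maze layout, the character encoding (`#` wall, `.` open, `F` goal), the eight-neighbour perception and the function names `perception` and `aliasing_groups` are all assumptions made for this example. It detects individual aliasing cells only; it does not attempt to detect aliasing clones.

```python
from collections import defaultdict

# Hypothetical toy maze (an assumption for illustration):
# '#' is a wall, '.' is an open cell, 'F' is the goal.
MAZE = [
    "#######",
    "#....F#",
    "#######",
]

# Offsets of the eight neighbouring cells, clockwise starting from north.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def perception(maze, row, col):
    """Return the string of the eight cells surrounding (row, col).

    Assumes the maze is fully enclosed by walls, so every open cell
    has all eight neighbours inside the grid.
    """
    return "".join(maze[row + dr][col + dc] for dr, dc in OFFSETS)


def aliasing_groups(maze):
    """Map each perception shared by two or more open cells to those cells."""
    groups = defaultdict(list)
    for r, line in enumerate(maze):
        for c, ch in enumerate(line):
            if ch == ".":
                groups[perception(maze, r, c)].append((r, c))
    # Keep only perceptions that occur in more than one position.
    return {p: cells for p, cells in groups.items() if len(cells) > 1}
```

In this corridor, the cells at `(1, 2)` and `(1, 3)` look identical to the agent even though they lie at different distances from the goal, so `aliasing_groups(MAZE)` reports them as one group. An agent that chooses actions from the current perception alone cannot tell such cells apart.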

Author information

Corresponding author

Correspondence to Zhanna V. Zatuchna.

About this article

Cite this article

Zatuchna, Z.V., Bagnall, A.J. A learning classifier system for mazes with aliasing clones. Nat Comput 8, 57–99 (2009). https://doi.org/10.1007/s11047-007-9055-7

  • DOI: https://doi.org/10.1007/s11047-007-9055-7

Keywords

Navigation