Elsevier

Applied Soft Computing

Volume 10, Issue 1, January 2010, Pages 36-43

Genotype representations in grammatical evolution

https://doi.org/10.1016/j.asoc.2009.05.003

Abstract

Grammatical evolution (GE) is a form of grammar-based genetic programming. A particular feature of GE is that it adopts a distinction between the genotype and phenotype similar to that which exists in nature, using a grammar to map from the genotype to the phenotype. Two variants of genotype representation are found in the literature, namely, binary and integer forms. For the first time we analyse and compare these two representations to determine if one has a performance advantage over the other. As such, this study seeks to extend our understanding of GE by examining the impact of different genotypic representations in order to determine whether certain representations, and associated diversity-generation operators, improve GE’s efficiency and effectiveness. Four mutation operators using two different representations, binary and Gray code, are investigated. The differing combinations of representation and mutation operator are tested on three benchmark problems. The results provide support for the use of an integer-based genotypic representation, as the alternative representations do not exhibit better performance, and the integer representation provides a statistically significant advantage on one of the three benchmarks. In addition, a novel wrapping operator for the binary and Gray code representations is examined, and it is found that across the three problems examined there is no general trend to recommend the adoption of an alternative wrapping operator. The results also corroborate earlier findings which support the adoption of wrapping.

Introduction

Grammatical evolution (GE) [20], [19], [25] is a form of grammar-based genetic programming. A special feature of GE is that, unlike genetic programming, it has a clear distinction between the genotype and phenotype. The mapping from the genotype to the phenotype is governed by a grammar, and this grammar can contain domain knowledge to bias the form a phenotypic solution can take. By separating the search and solution spaces, GE allows the implementation of generic search algorithms without a requirement to tailor the diversity-generating operators to the nature of the phenotype. A substantial literature has emerged on GE and its applications [20], [2], [24], [22], [4]. Some of the more recent developments of GE are focused on the various components of the GE approach, including the use of alternative search engines [15], [14], [18], the use of alternative grammar constructs [16], [5], [12], [21], and the examination of different mapping processes [17]. One aspect of GE which has seen less research is the impact of the choice of genotypic representation, and associated diversity-generation operators, on GE’s efficiency and effectiveness. A recent paper by Oetzel and Rothlauf [24] examined the locality properties of a binary representation in GE and found that a genotypic bit-mutation operator produced non-local changes in the phenotype. The authors of that study proposed that further research be undertaken to find other representations and associated mutation operators which would produce higher locality, suggesting that this would increase the performance and effectiveness of GE. This study addresses this research issue by investigating the impact of four mutation operators using two different representations, binary and Gray code, on the performance of GE. In addition, for the first time, a direct comparison is made between the two forms of genotypic representation adopted in the GE literature, namely integer versus binary codons. The combinations are tested using three standard benchmark problems: symbolic regression, the Santa Fe ant trail and the even-5-parity problem. Finally, a novel wrapping operator is proposed for the binary and Gray code representations and its performance is compared to the standard wrapping operator.
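To make the difference between the two bit-level representations concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard reflected binary Gray code and 8-bit codons; the excerpt does not state which Gray variant the study uses, and the function names are illustrative only.

```python
def binary_to_gray(n: int) -> int:
    """Return the reflected Gray code of a non-negative integer (assumed variant)."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def decode_codon(bits: str, gray: bool = False) -> int:
    """Decode a codon given as a bit string, e.g. '11000000', to an integer."""
    value = int(bits, 2)
    return gray_to_binary(value) if gray else value


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Neighbouring integers 127 and 128 differ in all 8 bits in plain binary,
    # but their Gray codes differ in exactly one bit.
    print(hamming(127, 128))
    print(hamming(binary_to_gray(127), binary_to_gray(128)))
```

Under this assumption, adjacent integer codon values are always one bit flip apart in Gray code, whereas in plain binary moving between adjacent values can require flipping many bits. The converse does not hold: a single bit flip in a Gray-coded codon can still change the decoded integer by a large amount.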

The remainder of the paper is structured as follows. Section 2 describes GE and provides background on earlier work on representations. Section 3 details the experimental approach adopted and the results, Section 4 discusses the findings, and finally Section 5 details conclusions and future work.

Section snippets

Background

This section provides an introduction to GE and to some prior work on the importance of representation in evolutionary algorithms. GE is a grammar-based form of genetic programming (GP) [9]. Rather than representing the programs as parse trees, as in GP, a linear genome representation is used. A genotype–phenotype mapping is employed such that each individual’s variable-length binary string contains in its codons (groups of 8 bits) the information to select production rules from a Backus Naur
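The codon-to-rule mapping described above can be illustrated with a minimal Python sketch. It assumes the commonly used GE mapping rule, in which the chosen production is the codon value modulo the number of productions for the current non-terminal, together with the standard wrapping operator, which reads the genome again from the start when the codons run out. The toy grammar, the function names and the `max_wraps` parameter are hypothetical, and the novel wrapping operator proposed in the paper is not shown because the excerpt does not describe it.

```python
from typing import Dict, List, Optional

# Toy grammar (an assumption for illustration): non-terminal -> productions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1.0"]],
}


def codons_from_bits(bits: str, codon_size: int = 8) -> List[int]:
    """Split a bit string into integer codons; a trailing partial codon is dropped."""
    last_start = len(bits) - codon_size
    return [int(bits[i:i + codon_size], 2) for i in range(0, last_start + 1, codon_size)]


def map_genotype(codons: List[int], start: str = "<expr>",
                 max_wraps: int = 2) -> Optional[str]:
    """Expand the leftmost non-terminal until only terminals remain.

    Returns None if the codon budget (genome length times max_wraps + 1)
    is exhausted before the derivation completes.
    """
    symbols = [start]
    output: List[str] = []
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:  # terminal: emit it
            output.append(symbol)
            continue
        choices = GRAMMAR[symbol]
        if used >= limit:  # invalid individual: ran out of codons
            return None
        codon = codons[used % len(codons)]  # index wraps to the genome start
        used += 1
        symbols = list(choices[codon % len(choices)]) + symbols
    return " ".join(output)


if __name__ == "__main__":
    print(map_genotype([220, 203, 16, 101, 53, 77]))  # no wrapping needed
    print(map_genotype([6, 9, 3]))                    # completes after one wrap
    print(map_genotype([6, 9, 3], max_wraps=0))       # fails without wrapping
```

In this sketch the three-codon genome only yields a complete expression when wrapping is allowed, which is the situation the wrapping operator is intended to address.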

Experimental setup and results

This section outlines the experimental setup used in this study, details the results of these experiments, and provides a discussion of the key findings.

Discussion

A number of experiments on different genotypic representations and operators have been presented, conducted on three diverse and well-understood standard benchmark problems from the Genetic Programming literature. One of the primary motivations for this study was the fact that at least two variants of Grammatical Evolution exist within the literature, which adopt different genotypic representations (i.e. binary and integer). For the first time, in this study we compare the performance

Conclusions and future work

The object of this study was to examine the impact of different genotypic representations in GE in order to determine whether certain representations, and associated diversity-generation operators, improve GE’s efficiency and effectiveness. Two main variants of genotype representation exist in the literature to date, namely, binary and integer encodings. This is the first time these two representations have been formally compared and analysed.

Four mutation operators using two different

Acknowledgement

This publication has emanated from research conducted with the financial support of Science Foundation Ireland.

References (28)

  • M. O’Neill et al., Grammatical swarm
  • W. Banzhaf, Genotype–phenotype-mapping and neutral variation—a case study in genetic programming
  • A. Brabazon et al., Biologically Inspired Algorithms for Financial Modelling (2006)
  • E. Burke et al., Evolutionary optimization in uncertain environments—a survey, IEEE Transactions on Evolutionary Computation (2005)
  • I. Dempsey, Grammatical evolution in dynamic environments, PhD Thesis, University College Dublin, Ireland,...
  • I. Dempsey et al., Meta-grammar constant creation
  • E. Galvan-Lopez et al., The importance of neutral mutations in GP
  • D.E. Goldberg, Genetic Algorithms (1989)
  • R.B. Hollstein, Artificial genetic adaption in computer control systems, PhD Thesis, University of Michigan,...
  • J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
  • W.B. Langdon et al., Foundations of Genetic Programming (1998)
  • B. Lewin, Genes VII (1999)
  • M. Nicolau et al., Introducing grammar based extensions for grammatical evolution
  • M. O’Neill, Evolutionary automatic programming in an arbitrary language: evolving programs with grammatical evolution,...