Genetic Algorithms and Group Method of Data Handling-Type Neural Networks Applications in Poultry Science

The book addresses some of the most recent issues, with the theoretical and methodological aspects, of evolutionary multi-objective optimization problems and the various design challenges using different hybrid intelligent approaches. Multi-objective optimization has been available for about two decades


Introduction
The necessity of modeling is well established, since the structural identification of a process is essential in analysis, control, and prediction. Computer modeling is becoming an important tool in many fields of science, including biology. In artificial intelligence research, 'intelligence' is increasingly regarded not as deliberative reasoning alone, but as the ability to exhibit adaptive behavior in a complex world. There have been extensive efforts in recent years to deploy population-based stochastic search algorithms, such as evolutionary methods, to design artificial neural networks, since such evolutionary algorithms are particularly useful for complex problems having large search spaces with many local optima (Iba et al., 1996). In recent years, artificial neural networks of various types, the GMDH type among them, have been applied successfully in a broad range of areas in engineering, biology, and economics.

Genetic algorithms
Nature employs the best cybernetic systems that can be conceived. In the neurological domain of living beings, in the ecological balance involving environmental feedback, or in the regulation of the temperature of the human body, the cybernetic systems of nature are fascinating in their accuracy and efficiency (Madala and Ivakhnenko, 1994).
In the 1950s and 1960s several computer scientists independently studied evolutionary systems with the idea that evolution could be used as an optimization tool for engineering problems in different systems, a system being a collection of interacting, diverse elements that function and communicate within a specified environment to process information and achieve one or more desired objectives (Mitchell and Forrest 1994).
Evolution can be considered the first and highest level of adaptation. It involves the adaptation of a species to global ecological and environmental conditions. This adaptation is a relatively slow process that operates over millennia, although the speed of genetic adaptation may differ widely between species.
Genetic algorithms (GAs) are currently the most prominent and widely used models of evolution in artificial-life systems. GAs have been used both as tools for solving practical problems and as scientific models of evolutionary processes. The intersection between GAs and artificial life includes both, although in this article we focus primarily on GAs as models of natural phenomena. GAs are optimization algorithms that work according to a scheme analogous to that of natural evolution. The literature shows that John Holland (Holland 1975) was the first to apply these principles of natural evolution to artificial systems, more precisely to optimization problems, and so introduced the notion of the GA. A general definition of these algorithms is (Koza 1980): "The genetic algorithm is a highly parallel mathematical algorithm that transforms a set (population) of individual mathematical objects (typically fixed length character strings patterned after chromosome strings), each with an associated fitness value, into a new population (i.e. the next generation) using operations patterned after the Darwinian principle of reproduction and survival of the fittest and after naturally occurring genetic operations (notably sexual recombination)." Genetic algorithms, as defined by Goldberg (Goldberg, 1989), are "...search algorithms based on the mechanics of natural selection and natural genetics." Goldberg offers four differences between genetic algorithms and other search methods:
1. Genetic algorithms work with a coded parameter set.
2. They search from a population of points in a solution space, rather than from a single point.
3. They use only directly available information provided through a fitness function.
4. They rely on probabilistic transition rules instead of deterministic rules.
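The four properties listed above can be seen together in a very small example. The following Python sketch is illustrative only: the function names (`run_ga`, `tournament`, `decode`), the toy fitness function, and all parameter values are assumptions made for this example and do not come from any of the studies cited in this chapter.

```python
import random

def tournament(pop, fitness, rng):
    # Probabilistic selection that uses only fitness values (properties 3 and 4).
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def run_ga(fitness, n_bits=16, pop_size=20, generations=50,
           p_crossover=0.7, p_mutation=0.02, seed=0):
    rng = random.Random(seed)
    # Property 1: each individual is a coded parameter set (a fixed-length bit string).
    # Property 2: the search starts from a population, not from a single point.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)[:]
            p2 = tournament(pop, fitness, rng)[:]
            if rng.random() < p_crossover:
                # One-point crossover (recombination of two parents).
                cut = rng.randint(1, n_bits - 1)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                # Bit-flip mutation applied independently to each gene.
                for i in range(n_bits):
                    if rng.random() < p_mutation:
                        child[i] = 1 - child[i]
                children.append(child)
        pop = children[:pop_size]
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best

def decode(bits, lo=-5.0, hi=5.0):
    # Map the bit string to a real number in [lo, hi].
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def fitness(bits):
    # Toy objective (an assumption for illustration): maximum at x = 2.
    x = decode(bits)
    return -(x - 2.0) ** 2

best = run_ga(fitness)
print(decode(best), fitness(best))
```

The GA never inspects the form of the objective; it only compares fitness values, which is why the same loop can be reused for very different problems by changing `decode` and `fitness`.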
The success of nature in solving many problems now recognized as very difficult for traditional approaches has led researchers to study the biological example. In various abstractions and formalizations, biological systems have been theoretically proven to provide robust solutions to these hard problems. However, the models used for biological problems are complex because of the characteristics and processes involved. This leads to the conclusion that biological activity generates information with special features, the most notable being the following (Fernández and Lozano, 2010):
1. The information obtained from the process has a non-homogeneous structure because of the complexity of living objects.
2. The information emerges from the dynamics of change associated with the functional properties of the studied phenomena.
What structure we need depends, of course, on our aims. We may distinguish roughly between operational and physiological models. An operational model aims to describe behavior realistically, but its structure is not intended to resemble the internal structure of a particular biological system. Such models are often referred to as black box models to indicate a lack of concern about underlying mechanisms. A physiological model, on the other hand, attempts to take into account more of the physiology that produces behavior, e.g., body and nervous system physiology (Dellaert, 1995).

Neural networks
Our brain contains about $10^{11}$ neurons, each of which is connected to an average of $10^{4}$ other neurons. This amounts to a total of $10^{15}$ connections. If these billions of connections were fully random, it can be shown that the brain would be many times larger than it actually is (Happel and Murre, 1994). Massive regressive events of neuronal connectivity in the vertebrate nervous system can be seen as part of the development and maturation of neural functions. The neuron has a set of nodes that connect it to inputs, outputs, or other neurons; these nodes are also called synapses (see Fig. 1).

Fig. 1. Schematic structure of a Neuron
A single neuron by itself is not a very useful pattern recognition tool. The real power of neural networks comes when we combine neurons into multilayer structures, called neural networks (NNs) (Fig. 2). In a nutshell, an NN can be considered a black box that is able to predict an output pattern when it recognizes a given input pattern. Once trained, the NN is able to recognize similarities when presented with a new input pattern, resulting in a predicted output pattern. The process of evolution is used as a 'real-world model' that serves as a source of ideas for solving practical and theoretical problems in modeling and optimization.
Researchers from a wide range of fields have discovered the benefits of applying NNs to pattern recognition problems in various systems, including biological systems. Artificial NNs (ANNs) are systems loosely modeled on the human brain and are considered a branch of the field known as "Artificial Intelligence" (AI). AI techniques have been applied significantly in this field in recent decades, and among them ANNs are characterized by their properties of learning and generalization. It is often necessary to take into account their potential for induction, which can be implemented in software (Miroslav Šnorek, 2006).
To understand the behavior of a model system, we need ways of describing behavior maps and state transition equations. Ideally, behavioral models should fulfill the following requirements (Dellaert, 1995):
1. Versatility
2. Robustness
3. Learning
4. Ontogeny
5. Evolution
NNs are a powerful technique for solving many real-world problems. They have the ability to learn from experience in order to improve their performance and to adapt themselves to changes in the environment. In addition, they are able to deal with incomplete information or noisy data and can be very effective in situations where it is not possible to define the rules or steps that lead to the solution of a problem. NNs are applied in many fields to model and predict the behavior of unknown systems, complex systems, or both, based on given input-output data. Using NNs does not require an a priori equation or model. This characteristic is potentially advantageous in modeling biological processes (Dayhof & DeLeo 2001).
There are several methods for obtaining inductive models. The Group Method of Data Handling, GMDH (Ivakhnenko, 1971), is well known and has recently gained popularity as a powerful self-organizing tool for expressing complex input-output dependencies.

Group Method of Data Handling method (GMDH)
Generally, the connection between input and output variables can be approximated by a Volterra functional series, the discrete analogue of which is the Kolmogorov-Gabor polynomial.
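For reference, the Kolmogorov-Gabor polynomial is usually written in the following form. The symbols ($n$ inputs $x_i$, output $y$, coefficients $a$) are notation chosen here for illustration. The second line shows the two-input quadratic form that GMDH-type networks commonly use as the building block ("partial description") from which the full polynomial is approximated layer by layer.

```latex
% Kolmogorov-Gabor polynomial for inputs x_1, ..., x_n and output y
y = a_0 + \sum_{i=1}^{n} a_i x_i
        + \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j
        + \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \cdots

% Quadratic partial description for one pair of inputs (x_i, x_j)
\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2
```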
Ivakhnenko (Ivakhnenko, 1966), inspired by the form of the Kolmogorov-Gabor polynomial, developed a new algorithm known as the Group Method of Data Handling (GMDH); methods of this family are also called inductive learning, self-organization, sorting-out, and heuristic methods. This approach is substantially different from the deductive methods commonly used for modeling. It is inductive in nature, i.e., it finds the best solution by sorting out possible variants. The frameworks of these methods differ slightly in some important respects (Madala and Ivakhnenko, 1994).
A major difficulty in modeling complex systems in such unstructured areas as economics, ecology, sociology, and others is that researchers introduce their own prejudices into the model. In the mid-1960s the Russian mathematician and cyberneticist A.G. Ivakhnenko introduced a method (Ivakhnenko, 1966), based in part on the Rosenblatt perceptron (Rosenblatt, 1958), that allows researchers to build models of complex systems without making assumptions about their internal workings. The idea is to have the computer construct a model of optimal complexity based only on data and not on any preconceived ideas of the researchers, that is, by knowing only the simple input-output relationship of the system. Ivakhnenko's GMDH algorithm constructs a self-organizing model (an extremely high-order polynomial in the input variables) that can be used to solve prediction, identification, control synthesis, and other system problems (Farlow, 1981).
The algorithm was developed for identifying nonlinear relationships between inputs and outputs. It provides an optimal structure, obtained through an iterative procedure in which partial descriptions of the data are refined by adding new layers. The number of neurons in each layer, the number of layers, and the input variables are determined automatically to minimize a prediction-error criterion, so that an optimal NN architecture is organized by heuristic self-organization, which is the basis of the GMDH algorithm (Ivakhnenko, 1971). This method is particularly successful in modeling problems with multiple inputs and a single output (Mutasem, 2004). The main idea of GMDH is to build an analytical function in a feedforward network based on a quadratic node transfer function whose coefficients are obtained by a regression technique (Farlow, 1984). By means of the GMDH-type NN algorithm, a model can be represented as a set of neurons in which different pairs in each layer are connected through a quadratic polynomial and thus produce new neurons in the next layer; such a model can therefore be used to map inputs to outputs. This NN identification process needs some optimization method to find the best network architecture. This class of ANN is considered a self-organizing approach in which gradually more complex models are generated and evaluated on their performance (Lemke and Mueller, 2003). The unique feature of the GMDH-type NN is that it facilitates the systematic and autonomous development of optimal complex models by performing both variable and structure identification.
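The quadratic node described above can be illustrated with ordinary least squares. The sketch below assumes NumPy is available; the function names (`quadratic_features`, `fit_neuron`, `predict_neuron`) and the synthetic data are hypothetical and serve only to show how the six coefficients of one neuron might be estimated by regression.

```python
import numpy as np

def quadratic_features(xi, xj):
    """Design matrix for y = a0 + a1*xi + a2*xj + a3*xi*xj + a4*xi^2 + a5*xj^2."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def fit_neuron(xi, xj, y):
    """Estimate the six coefficients of one quadratic neuron by least squares."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(xi, xj), y, rcond=None)
    return coeffs

def predict_neuron(coeffs, xi, xj):
    """Evaluate the fitted quadratic polynomial on new inputs."""
    return quadratic_features(xi, xj) @ coeffs

# Synthetic, noise-free data generated from a known quadratic (illustration only).
rng = np.random.default_rng(0)
xi = rng.uniform(-1, 1, 50)
xj = rng.uniform(-1, 1, 50)
y = 1.0 + 2.0 * xi - 0.5 * xj + 0.3 * xi * xj + 0.8 * xi**2

coeffs = fit_neuron(xi, xj, y)
print(np.round(coeffs, 3))
```

In a multilayer network, the output of `predict_neuron` for each surviving neuron would become an input variable for the next layer.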
When a genetic algorithm is incorporated into GMDH-type NNs, each neuron is represented as a string, and strings can be mutated or crossed with each other to form new generations. GAs have thus been used in feed-forward GMDH-type NNs so that each neuron searches for its optimal set of connections with the preceding layer (Vasechkina & Yarin 2001; Nariman-Zadeh et al. 2003).
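To make the string representation concrete, the sketch below decodes a genome under one possible encoding. This encoding is an assumption made for illustration, not a documented specification of any particular software: the string is read as a binary tree, a node whose two halves are identical (for example "aa") passes its input through unchanged, and any other node is a quadratic neuron combining its two halves. Under this assumption, the two genome strings reported later in this chapter for the turkey models, (abceadaa) and (eeabacdd), yield the same hidden-neuron counts as the text gives for those models.

```python
def count_hidden_neurons(genome):
    """Count hidden neurons encoded by a genome whose length is a power of two.

    Assumed encoding (illustrative): nodes of size 2, 4, ... are read left to
    right. A node with identical halves is a pass-through and adds no neuron.
    The full-length node is the output neuron and is not counted as hidden.
    """
    neurons = set()
    size = 2
    while size < len(genome):
        half = size // 2
        for start in range(0, len(genome), size):
            node = genome[start:start + size]
            if node[:half] != node[half:]:
                neurons.add(node)  # a set avoids counting a repeated neuron twice
        size *= 2
    return len(neurons)

print(count_hidden_neurons("abceadaa"))
print(count_hidden_neurons("eeabacdd"))
```

Because the whole network is a single string, standard one-point crossover and character mutation can act on it directly, which is what allows a GA to search over architectures.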
In the early stages of the development of GMDH theory, the similarity between NNs and multilayer GMDH algorithms was highlighted. Ivakhnenko (1970), in one of the introductory articles, claims that since the differences between the perceptron and GMDH are neither significant nor fundamental, it is appropriate to call GMDH systems "systems of perceptron type".
During the modeling procedure, the GMDH algorithm involves four heuristics that represent the main features of GMDH theory (Anastasakis and Mort, 2001):
1. Collect a set of observations that seems to be relevant to the object.
2. Divide the observations into two groups. The first will be used to estimate the coefficients of the model, while the second will separate the information embedded in the data into either useful or harmful. Strictly speaking: "no partition of the data, no GMDH".
3. Create a set of elementary functions whose complexity will increase through an iterative procedure producing different models.
4. According to Gödel's incompleteness theorem, apply an external criterion to choose the optimum model.
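The four heuristics above can be sketched for a single layer as follows. This is a minimal illustration assuming NumPy; the function names (`design`, `gmdh_layer`), the synthetic data, the 40/20 split, and the choice of validation mean square error as the external criterion are all assumptions for this example rather than the procedure of any cited study.

```python
import itertools
import numpy as np

def design(xi, xj):
    # Elementary function (heuristic 3): quadratic polynomial in two inputs.
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def gmdh_layer(X_train, y_train, X_val, y_val, keep=2):
    """Fit one quadratic neuron per input pair on the training group and rank
    the candidates by their error on the validation group (heuristic 4)."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        coeffs, *_ = np.linalg.lstsq(design(X_train[:, i], X_train[:, j]),
                                     y_train, rcond=None)
        val_pred = design(X_val[:, i], X_val[:, j]) @ coeffs
        val_mse = float(np.mean((y_val - val_pred) ** 2))
        candidates.append((val_mse, (i, j), coeffs))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]

# Heuristic 1: a set of observations (synthetic here; only inputs 0 and 2 matter).
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(60, 4))
y = 1.0 + X[:, 0] * X[:, 2] + 0.5 * X[:, 2] ** 2 + rng.normal(0, 0.01, 60)

# Heuristic 2: divide the observations into two groups.
X_train, X_val = X[:40], X[40:]
y_train, y_val = y[:40], y[40:]

survivors = gmdh_layer(X_train, y_train, X_val, y_val)
best_mse, best_pair, _ = survivors[0]
print(best_pair, best_mse)
```

Because candidates are ranked on data that were not used to estimate their coefficients, the layer selects input pairs that generalize, which is the role the external criterion plays in GMDH.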
Detailed descriptions of GMDH-type NN terminology, development, and application, with examples of the approach, have been reported by several researchers (Farlow, 1984; Mueller and Lemke, 2000; Lemke and Mueller, 2003; Nariman-Zadeh et al., 2005). Recently, the GEvoM software for GMDH-type NN training (GEvoM 2009) was developed at the University of Guilan, Iran.

Applications of GMDH-type algorithms in animal and poultry production systems
Contributions to GMDH-type NNs have come from many research areas in different disciplines, and recently the use of such self-organizing networks has led to successful application of the GMDH-type algorithm in a broad range of areas in engineering, science, and economics (Amanifard et al., 2008). However, very little research has been conducted on modeling animal and/or poultry growth and production using ANNs.
A series of studies have examined the potential use of ANNs in various poultry subjects, such as the prediction of ascites in broilers (Roush et al., 1996; Roush and Wideman, 2000), the estimation of production variables in the production phase of broiler breeders (Salle et al., 2003), and the comparison of Gompertz and NN models of broiler growth (Roush et al., 2006).
However, no attempt was made to use GMDH-type NNs in animal agriculture until 2007, when the results of the first study by my group at the University of Guilan, Iran, were published (Ahmadi et al., 2007). The idea behind this work was that, when considering the effects of nutrition on broiler performance, several nutrients may influence the breast meat yield, feed:gain ratio, and number of days required to reach market body weight; among them are metabolizable energy (ME) and amino acids (AA) such as lysine (Lys) and methionine (Met) (Hruby and Hamre, 1996; Gous, 1998). In terms of AA, whatever system is used to describe the essential AA requirements of broiler chickens, predicting performance in practical and useful terms, so as to decide on the most advantageous dietary AA patterns, is still difficult, even when the digestibility or availability of AA is specified (NRC, 1994; Sibbald, 1987). This difficulty is partly due to the nonlinearity of growth responses to changes in dietary AA concentrations (Hruby and Hamre, 1996; Phillips, 1981). A more useful method is to model the system, which in turn requires an explicit mathematical input-output relationship. Such explicit mathematical modeling is, however, very difficult and is not readily tractable in poorly understood systems. Alternatively, soft-computing methods, which concern computation in an imprecise environment, have gained significant attention. One such soft-computing method is the ANN, which has shown great ability in solving complex nonlinear system identification and control problems.
The optimal structures of the evolved 2-hidden-layer GMDH-type NN suggested by the GA for modeling the performance index (PI) as the system output were found with 2, 4, and 4 hidden neurons for growth periods 1, 2, and 3, respectively. In the first period, for which the GA suggested 2 hidden neurons to fit the network, the structure obtained was less complex than in the second and third periods. All models constructed from this data set were characterized by a superb response for all input variables from the learning set. The partial descriptions of the GMDH-type NN were found with 2 hidden layers and 2 hidden neurons for growth period 1, whereas they appeared with 2 hidden layers and 4 hidden neurons for growth periods 2 and 3. These results revealed the quantitative relation between the input (ME, Met, and Lys) and output (PI) variables under investigation, which means that the GMDH-type NN may be considered a promising method for modeling the relationship between dietary concentrations of nutrients and poultry performance, and can therefore be used in choosing and developing special feeding programs to decrease production costs. It can also enhance our ability to predict other economic traits, make precise predictions of nutritional requirements, and achieve optimal performance in poultry production systems. The concluding remarks of this study were reported as:
4-1-Knowledge of an adequate description of broiler ME and AA requirements can help in establishing specific feeding programs, defining optimal performance, and reducing production costs.
4-2-Calculated statistics indicate that GMDH-type NNs provide an effective means of efficiently recognizing the patterns in data and predicting a PI based on the inputs investigated.
4-3-The genetic approach could be used to provide optimal networks in terms of hidden layers, the number of neurons and their configuration of connectivity, or both, so that a polynomial expression for the dependent variables of the process can consequently be obtained.
4-4-The polynomials obtained could be used to optimize broiler performance based on nutritional factors by optimization methods such as the GA.
In animal and poultry production, feed composition is a very important item in diet formulation. Since conventional laboratory techniques for feed analysis are expensive and time consuming, it would be advantageous if a simple means of estimating feed composition could be developed. One year after the first work, another study was carried out (Ahmadi et al., 2008). The purpose of this study was to examine the validity of the GMDH-type NN with a genetic algorithm method for predicting the true metabolizable energy corrected for nitrogen (TMEn) of feather meal (FM) and poultry offal meal (POM) based on their chemical analysis.
All the TMEn prediction models previously reported for poultry by-product meals were based on regression analysis of their crude protein (CP), ether extract (EE), and ash content.
In this study, the soft-computing method of artificial NNs (ANNs) seemed to be more appropriate for predicting the TMEn of a feedstuff.
The parameters of interest in this multi-input, single-output system that influenced the TMEn were the CP, EE, and ash content of the samples. The raw data were divided into 2 parts, a training set and a validation set. Thirty input-output data lines (12 from FM and 18 from POM samples) were randomly selected and used as the training set for the GMDH-type NN model. The validation set consisted of the 7 remaining data lines (3 from FM and 4 from POM samples), which were used to validate the predictions of the evolved NN during the training process. The data set was imported into GEvoM for GMDH-type NN training (GEvoM, 2008). Two hidden layers were considered for the TMEn prediction model. A population of 15 individuals with a crossover probability of 0.7, a mutation probability of 0.07, and 300 generations was used to genetically design the NN (Yao, 1999). It appeared that no further improvement could be achieved for this population size. The fit of the predictive model was verified quantitatively using error measurement indices commonly used to evaluate forecasting models. The goodness of fit, or accuracy, of the model was determined by the $R^2$ value, adjusted $R^2$, mean square error, residual standard deviation, mean absolute percentage error, and bias (Oberstone, 1990).
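The fit statistics named above can be computed with a few lines of code. The sketch below uses common textbook definitions, which are assumptions here and may differ in detail from those in Oberstone (1990); in particular, bias is taken as the mean of observed minus predicted values. The numbers are invented for illustration and are not data from the study.

```python
import math

def fit_statistics(observed, predicted, n_inputs):
    """Goodness-of-fit indices for a model with n_inputs input variables."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    dof = n - n_inputs - 1                                 # residual degrees of freedom
    return {
        "r2": r2,
        "adjusted_r2": 1.0 - (1.0 - r2) * (n - 1) / dof,
        "mse": ss_res / n,
        "residual_sd": math.sqrt(ss_res / dof),
        "mape_percent": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "bias": sum(residuals) / n,
    }

# Hypothetical observed and predicted values (illustration only).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [10.5, 11.5, 14.0, 16.5, 17.5]
stats = fit_statistics(observed, predicted, n_inputs=3)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With three inputs such as CP, EE, and ash, the adjusted value penalizes the raw $R^2$ for the number of inputs, which matters when the validation set is as small as a few data lines.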
The results of this study revealed that the novel modeling approach of a GMDH-type NN with the evolutionary method of GA can be used to predict the TMEn of FM and POM samples based on their CP, EE, and ash content (see Fig. 3). The advantage of using the GMDH-type NN to predict an output from the input variables is that there is no need to preselect a model or to base the model entirely on the fit of the data. It is concluded that the GMDH-type NN may be used to accurately estimate the nutritive value of poultry meals from their corresponding chemical composition.
The success of poultry meat production has been strongly related to improvements in growth and carcass yield, mainly by increasing breast proportion and reducing carcass fat.
In addition to its measurement in the laboratory using wet chemistry, the carcass composition of broiler chickens has been predicted by means of allometric equations, real-time ultrasonography, radioactive isotopes, and specific gravity studies (Pesti & Bakalli 1997; Toledo et al. 2004; Rosa et al. 2007; Makkar 2008). Conventional laboratory techniques for determining carcass composition are expensive, cumbersome, and time consuming. Therefore, it would be useful if a simple means of estimating carcass composition could be developed. In this respect, the potential advantages of modeling growth are considerable.
The results of the two studies mentioned above prompted our group to undertake a third study in 2010 (Faridi et al., 2012), aimed at applying GMDH-type NNs to data from two studies with broilers in order to predict carcass energy (CEn, MJ/g) content and the relative growth (g/g of body weight) of carcass components (carcass protein, breast muscle, leg and thigh muscles, carcass fat, abdominal fat, skin fat, and visceral fat). The effective input variables involved in the prediction of CEn and carcass fat content using data from the first study were dietary metabolizable energy (ME, kJ/kg), crude protein (CP, g/kg of diet), fat (g/kg of diet), and crude fibre (CF, g/kg of diet). For this purpose, GAs were deployed to design the whole architecture of the GMDH-type NN, i.e. the number of neurons in each hidden layer and their configuration of connectivity, and to find the optimal set of coefficients of the quadratic expressions.
The quantified values of bias in this study showed very little under- and over-estimation by the models proposed by the GMDH-type NN, which revealed close agreement between the observed and predicted values of CEn and carcass components. The value of $R^2$, a measure of the relation between the actual and predicted values, was high for both studies, indicating a strong effect of all selected input variables on output prediction.
In conclusion, the results of this study showed that a GMDH-type NN modeling approach can be a simple but very effective method for predicting carcass composition in broiler chickens based on dietary input variables. This is in agreement with previous studies investigating the effects of different dietary nutrients on body composition in broilers (Fraps 1943; Donaldson et al. 1956; Kubena et al. 1972; Edwards et al. 1973).
Selection pressure applied by industry geneticists has greatly reduced feed conversion ratio and age to slaughter as well as increased growth rate and yield of edible meat for commercial turkeys.These genetic improvements have occurred along with improvements in nutrition and management (Havenstein et al., 2007).
Extensive research has been conducted to clarify the protein, essential amino acid, and energy requirements of poultry. To avoid the problems of conventional laboratory- and field-based techniques for determining nutrient requirements, an alternative method using GMDH-type NNs was proposed (Mottaghitalab et al., 2010). In determining nutrient requirements, the potential benefits of modeling growth in poultry are considerable. This approach has the potential to provide information in several areas of poultry production, including prediction of growth rate and market weights, determination of the factors that are truly of economic importance to the operation, general knowledge about the systems involved in production, and determination of more precise nutrient requirements based on sex, strain, protein versus fat accretion, parts yield, and feed intake.
The structures of the 2-hidden-layer GMDH-type NNs evolved for caloric efficiency (CE) and feed efficiency (FE) are shown in Figures 4 and 5, respectively. These figures correspond to the genome representations (abceadaa) and (eeabacdd) for the CE and FE models, respectively, and illustrate the generated relationship between the input variables and the output. As Figures 4 and 5 show, the optimal structure of the evolved 2-hidden-layer GMDH-type NN suggested by the GA was found with 5 and 4 hidden neurons for CE and FE, respectively. In most GMDH-type NNs, the neurons in each layer are connected only to neurons in the adjacent layer (Farlow, 1984), but in the GMDH-type NN developed here, variable a of the input layer for CE is connected to adaa in the second hidden layer by passing directly through the first hidden layer. The same happens for the d and e input variables in the FE model. Such repetition occurs whenever a neuron bypasses an adjacent hidden layer and connects to a neuron in a following hidden layer. It appears that all selected input variables in both models had a strong effect on output prediction, which is in agreement with previous studies (Lemme et al., 2006 for amino acid; Noy and Sklan, 2004 for energy and amino acid; Potter et al., 1966 and Waibel et al., 1995 for Met and Lys; and Bowyer and Waldroup, 1986 for protein). Figure 1 shows a very strong effect of age on CE. This result is similar to previous studies describing the growth pattern of animals with age using growth functions (Darmani-Kuhi et al., 2003; Schulin-Zeuthen et al., 2008). The calculated error measures for the CE model showed that the testing set for toms yielded lower values of mean square error, mean absolute deviation, mean absolute percentage error, and mean relative error, and a higher value of $R^2$, than the training set.
Conducting a sensitivity analysis (SA) on the polynomial equations obtained reveals the sensitivity of the model output to the input variables. Hence it is necessary to carry out a sensitivity analysis for any proposed model. In other words, SA increases confidence in the model and its predictions by providing an understanding of how the model responds to changes in its inputs. Moreover, SA identifies critical regions in the space of the inputs, establishes priorities for research, and simplifies the model (Castillo et al., 2008; Saltelli et al., 2008).
For this reason, and in line with previous work, another study was designed, titled "Sensitivity analysis of an early egg production predictive model in broiler breeders based on dietary nutrient intake" (Faridi et al., 2011).
Although NN and SA techniques have been applied successfully in a broad range of areas (Seyedan & Ching 2006; Lee & Hsiung 2009), the use of SA together with NN models appears to be uncommon in poultry science. The aim of that study was to use the GMDH-type NN to model early egg production (EEP) in broiler breeders (BB) based on the dietary intake levels of ME, CP, and the first two limiting amino acids, methionine (Met) and lysine (Lys). The SA method was used to evaluate the relative importance of the input variables for the model output and to determine the optimum levels of nutrient intake for obtaining the maximum EEP in BB.
In this study, the GMDH-type NN with the GA method was used to develop the EEP model for BB. By means of the GMDH algorithm, a model can be represented as a set of quadratic polynomials. GAs are then deployed to assign the number of neurons (polynomial equations) in the network and to find the optimal set of coefficients of the quadratic expressions.
The variables of interest in this model were the dietary intake levels of ME (MJ/bird/day), CP (g/bird/day), Met (g/bird/day), and Lys (g/bird/day), and weekly egg production (eggs/bird) during early production (from 24 to 29 weeks of age). The datasets were imported into the GEvoM software for GMDH-type NN training (GEvoM 2009).
The results of the developed GMDH-type NN models revealed close agreement between the observed and predicted values of EEP. They showed that the evolved GMDH-type NNs were successful in obtaining a model for the prediction of EEP in BB. All input variables were accepted by the model, i.e. the GMDH-type NN provides an automated selection of essential input variables and builds polynomial equations to model EEP.
The advantage of using GMDH-type NNs is that the polynomial equations obtained can be used to analyze the sensitivity of the output with respect to the input variables. SA examines how, and by how much, changes in the input variables modify the optimal objective function value and the point where the optimum is attained. The simple approach to SA is easy to do, easy to understand, easy to communicate, and applicable to any model.
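A simple one-at-a-time form of SA on a polynomial model can be sketched as follows. The polynomial `toy_model`, its coefficients, the baseline intake values, and the 10% perturbation are all hypothetical and are used only to show the mechanics; they are not the equations or values obtained in the studies described above.

```python
def one_at_a_time_sensitivity(model, baseline, rel_change=0.10):
    """Percent change in model output when each input is raised by rel_change
    while all other inputs are held at their baseline values."""
    base_out = model(baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + rel_change)
        result[name] = 100.0 * (model(perturbed) - base_out) / base_out
    return result

def toy_model(x):
    # Hypothetical quadratic polynomial standing in for an evolved GMDH model;
    # the coefficients are invented for illustration only.
    return (1.0 + 0.5 * x["ME"] + 0.2 * x["CP"]
            - 0.05 * x["ME"] ** 2 + 0.01 * x["ME"] * x["CP"])

baseline = {"ME": 2.0, "CP": 10.0}
sens = one_at_a_time_sensitivity(toy_model, baseline)
for name, pct in sens.items():
    print(f"{name}: {pct:+.2f}% change in output for a +10% change in input")
```

Because the model is an explicit polynomial, the same analysis could also be done analytically by differentiating it with respect to each input; the numerical version is shown because it applies to any model.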

Conclusion
In conclusion, genetic algorithms in general and GMDH-type NNs in particular may be used as powerful tools to enhance our ability to predict economic traits, make precise predictions of nutritional requirements, and achieve optimal performance in poultry nutrition and production.

Fig. 3. Comparison of observed and model-predicted TMEn values obtained from the training set (1 to 12 and 13 to 30 are feather and offal samples, respectively) and the validation set (31 to 33 and 34 to 37 are feather and offal samples, respectively).

Fig. 4. Evolved structure of the generalized group method of data handling-type NN for caloric efficiency in tom turkeys. The letters a, b, c, d, and e stand for the input variables age (wk), ME (kcal/g), CP (% of diet), Met (% of diet), and Lys (% of diet), respectively. This figure illustrates the generated relationship between the input variables and the output.

Fig. 5. Evolved structure of the generalized group method of data handling-type NN for feed efficiency in tom turkeys. The letters a, b, c, d, and e stand for the input variables age (wk), ME (kcal/g), CP (% of diet), Met (% of diet), and Lys (% of diet), respectively. This figure illustrates the generated relationship between the input variables and the output.