A Genetic Algorithm with Neighborhood Search to Solve Integer and Linear Programming Problems

In this paper, a metaheuristic algorithm that combines genetic and neighborhood search algorithms is proposed to solve integer linear programming problems. The individuals of the population are binary coded into a sequence of chromosomes (variables). Initially, each chromosome is five bits (genes) long, but chromosomes grow if required, up to 21 genes per chromosome, when searching for optima. The algorithm includes a test based on systematic neighborhood search to decide whether it continues or stops. The algorithm is able to solve maximization or minimization integer linear programming problems in standard or non-standard form and, with a simple adaptation, linear programming problems. A comparative study was conducted with three algorithms: LINGO, Simplex LP and Evolutionary. The last two are from the commercial Solver in the Excel spreadsheet software. The results show that the algorithm was able to find solutions similar to those of LINGO and Simplex LP and better than those of Evolutionary. A time study using problems from the literature with two, three, four, eight and twelve variables is included.


INTRODUCTION
This paper presents and discusses a first effort to deploy a piece of software capable of finding solutions to Integer Linear Programming (ILP) and Linear Programming (LP) problems. The test examples are limited to simple problems that range from two up to four variables and up to six constraints. The implementation is based on a variation of the classical Genetic Algorithm (GA) as first discussed in [1]. Research on this topic continues because of the difficulty of finding optimal solutions to ILP problems [2][3][4]. This paper reports the results of implementing a variation of the GA with a set of control parameters that allow the search for optima to use a normal or an aggressive GA mechanism, not only to escape from local optima but also to focus the generation of solutions. Furthermore, we deal with one of the open questions when applying the GA: how many generations should the GA run? The answer could be "as many as possible", but that is a vague response. Usually, implementations use a time limit; others terminate after a certain number of generations in which no improvement has been found, or when all individual chromosomes are identical [2,3]. In this implementation, a test that applies a systematic iterated neighborhood search is conducted [4]. If a better solution is found in the neighborhood of the current solution, the algorithm is triggered again; otherwise it ends.
The implementation includes the model definition (maximization or minimization) and six control parameters: number of generations (from 100 to 10000), number of ages (from 1 to 40), crossover level, mutation level, neighbor range and diversity level. The algorithm is able to run in a progressive way, i.e. if it is run again at the end of a run, it uses the results of the last run to continue searching for optima. This is where the history tracking control fits in, for the case where a completely new run is required.
For each age selected, the algorithm runs the specified number of generations. The algorithm keeps track of the fittest individual found in each age. The fittest individual of all ages is reported as the solution (the best solution found, which approximates the optimum if it is not the optimum itself), along with a list of the fittest individuals of all ages.
GAs are an active research topic and have been applied to optimization problems in non-linear programming, task scheduling, computer vision and multi-objective resource allocation [3][5][6][7] and, of course, integer linear programming [8,9], among others. Combining a GA with neighborhood search is not new; it has been used to solve problems related to resource scheduling, machine cell formation and the traveling salesman problem, among others [10][11][12]. Section 2 covers the variation of the GA mechanisms and how they were implemented. Section 3 describes the implementation software. Section 4 presents a comparative study with two other algorithms from commercial software, Microsoft Excel Solver. Section 5 includes a time study with problems from the literature ranging from two to twelve variables, covering maximization and minimization in standard and non-standard form. Finally, Section 6 gives conclusions and recommendations for further research.

THE GENETIC ALGORITHM MECHANISMS
In this section, the mechanisms are discussed as implemented. Further explanation of the classical GA mechanisms can be found in [1].

Initial Population
In generation zero, the population is generated randomly, but from age one onward the initial population of each age is generated based on a diversity level. A level of 100% means that the population is generated entirely at random. If this level is set to 50%, for example, each individual has a 50% chance of being generated at random and a 50% chance of being a neighbor of the current best solution, chosen at random within a maximum neighbor range.
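The diversity level can be sketched as follows. This is a minimal Python illustration, not the authors' code: the function name `initial_population`, the clamping of neighbors to the range 0 to `max_value`, and the default `max_value=31` (the largest five-bit value) are assumptions made for the example.

```python
import random

def initial_population(best, diversity, neighbor_range, size=50,
                       max_value=31, rng=random):
    """Build the starting population of an age after the first one.

    best: current best solution, a list of non-negative integers.
    diversity: probability in [0, 1] that an individual is fully random.
    neighbor_range: largest per-variable distance of a neighbor from `best`.
    max_value: largest value a variable may take (assumed: five bits).
    """
    population = []
    for _ in range(size):
        if rng.random() < diversity:
            # Fully random individual.
            individual = [rng.randint(0, max_value) for _ in best]
        else:
            # Random neighbor of the current best, clamped to the valid range.
            individual = [
                min(max_value, max(0, x + rng.randint(-neighbor_range, neighbor_range)))
                for x in best
            ]
        population.append(individual)
    return population
```

With `diversity=1.0` every individual is random; with `diversity=0.0` every individual lies within `neighbor_range` of the current best solution.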
In generation zero, an individual goes through chromosome repair if it is not feasible [13]. The algorithm uses Equation (1) to compensate an infeasible individual and make it feasible. The procedure is the same for minimization and maximization, but in maximization it decreases the variable values, x_j, if a less-than constraint is not met, whereas in minimization it increases the variable values only when a greater-than constraint is not met. The algorithm computes the sum of the coefficients c_ij at the start of a run. In this implementation, the use of Equation (1) has been observed to help accelerate the convergence of the algorithm. The population size is 50 and remains the same throughout the search for optima. In every generation, ten individuals are generated by applying crossover and/or mutation, and they replace the ten least fit individuals of the population.
Once the initial population is ready, the reproduction process (crossover and mutation) is called. By default, there is a 50% chance that a new individual is created by crossover and a 50% chance that it is created by mutation, but this balance can be adjusted with the controls provided.

Crossover
The algorithm selects two individuals from the population at random and at least one chromosome goes through crossover. A random position in the chromosome is selected, and the new individual inherits all the genes on one side of that position (left or right) from one parent and all the genes on the other side from the other parent. There is a 50% chance that the procedure includes all the chromosomes and a 50% chance that it uses a subset of them. There is also a 50% chance of switching which individual is parent one and which is parent two. Fig. 1 illustrates the crossover process.
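A minimal Python sketch of this crossover is given below. It assumes that chromosomes are stored as bit strings of equal length in both parents and that chromosomes not chosen for crossover are copied from parent one; the function names are hypothetical and the details may differ from the actual implementation.

```python
import random

def cross_chromosome(a, b, rng=random):
    """Single-point crossover of two equal-length bit strings."""
    point = rng.randint(1, len(a) - 1)   # cut position, never at either end
    return a[:point] + b[point:]

def crossover(parent1, parent2, rng=random):
    """Cross two individuals, each a list of bit-string chromosomes."""
    if rng.random() < 0.5:               # 50% chance of swapping parent roles
        parent1, parent2 = parent2, parent1
    n = len(parent1)
    if rng.random() < 0.5:
        chosen = set(range(n))           # cross every chromosome
    else:
        # Cross a random non-empty subset of the chromosomes.
        chosen = set(rng.sample(range(n), rng.randint(1, n)))
    return [
        cross_chromosome(c1, c2, rng) if i in chosen else c1
        for i, (c1, c2) in enumerate(zip(parent1, parent2))
    ]
```

Because the chosen subset is never empty, at least one chromosome of the child always mixes genes from both parents.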
Parents are selected randomly from the population: the fittest individual has a 1/10 chance of being selected for crossover or mutation, the individuals ranked from position 2 to 6 each have a 1/30 chance, and the remaining individuals each have a 1/60 chance.
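For a population of 50 these probabilities add up to one: 1/10 + 5 × 1/30 + 44 × 1/60 = 1. The rank-based selection can be sketched in Python as follows; the function names are hypothetical and the population is assumed to be sorted with the fittest individual first.

```python
import random

def selection_weights(size=50):
    """Selection probability by rank (index 0 is the fittest individual)."""
    weights = []
    for rank in range(size):
        if rank == 0:
            weights.append(1 / 10)   # fittest individual
        elif rank <= 5:
            weights.append(1 / 30)   # positions 2 to 6
        else:
            weights.append(1 / 60)   # remaining individuals
    return weights

def select_parent(population, rng=random):
    """Pick one individual from a population sorted best-first."""
    weights = selection_weights(len(population))
    return rng.choices(population, weights=weights, k=1)[0]
```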

Mutation
There are four mutation strategies, each with an equal chance of being used to modify a chromosome: one-gene mutation with value switching, one-gene mutation with random value setting, multiple-gene mutation and offset mutation. In offset mutation, the algorithm changes a chromosome of the current best solution to another value within a range limit of its current value: it randomly generates an integer offset from zero up to the range limit, and there is a 50% chance of moving up and a 50% chance of moving down from the current chromosome value. The algorithm then converts the resulting value to its binary equivalent. Fig. 2 illustrates all four mutation strategies; the dashed lines indicate alternatives that each have a 50% chance. The illustration depicts a chromosome 13 genes long at its center, from which new chromosomes can be generated using one of the four strategies.
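Offset mutation can be sketched in Python as shown below. The function name `offset_mutation` is hypothetical, and clamping the result to the values that `n_genes` bits can represent is an assumption of this sketch; the implementation described here may instead let the chromosome grow.

```python
import random

def offset_mutation(value, range_limit, n_genes, rng=random):
    """Shift an integer chromosome value by a random offset and re-encode it."""
    offset = rng.randint(0, range_limit)
    if rng.random() < 0.5:               # equal chance of moving up or down
        offset = -offset
    # Assumption: keep the result representable with n_genes bits.
    mutated = min(2 ** n_genes - 1, max(0, value + offset))
    return format(mutated, f"0{n_genes}b")   # binary string, zero padded
```

For example, with `value=12`, `range_limit=3` and `n_genes=5`, the result is the five-bit encoding of an integer between 9 and 15.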

Evaluation
Individuals are evaluated using the objective function, and a penalty is applied each time a constraint is not met. The size of the penalty is the sum of the amounts by which the violated constraints exceed their limits, multiplied by the number of constraints that are not met and by a sensitivity factor. The algorithm subtracts the penalty from the objective function value when searching for a maximum and adds it when searching for a minimum.
The sensitivity factor is set to the highest value among the objective function coefficients and the constraint critical values, but it is set to 100 if that value is lower than 100 and to 10 000 if it is greater than 10 000.
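The evaluation can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: constraints are represented as hypothetical `(coefficients, operator, limit)` tuples, only the operators `"<="` and `">="` are handled, and the function names are invented for the example.

```python
def sensitivity(objective, limits):
    """Largest objective coefficient or constraint limit, clamped to [100, 10000]."""
    return min(10_000, max(100, max(list(objective) + list(limits))))

def fitness(x, objective, constraints, maximize=True):
    """Objective value of x with the penalty for violated constraints applied."""
    z = sum(c * v for c, v in zip(objective, x))
    excess, violated = 0, 0
    for coeffs, op, limit in constraints:
        lhs = sum(a * v for a, v in zip(coeffs, x))
        gap = lhs - limit if op == "<=" else limit - lhs
        if gap > 0:                      # constraint not met
            excess += gap
            violated += 1
    factor = sensitivity(objective, [limit for _, _, limit in constraints])
    penalty = excess * violated * factor
    return z - penalty if maximize else z + penalty
```

For instance, when maximizing 3x1 + 2x2 subject to x1 + x2 ≤ 4 and x1 ≤ 2, the point (3, 2) exceeds each limit by one unit, so its penalty is 2 × 2 × 100 = 400 and its fitness is 13 − 400 = −387.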

THE IMPLEMENTATION
The implementation runs on any web browser with HTML5 and JavaScript support. All the tests were run on the Ubuntu Linux platform using the Firefox web browser. At present, the implementation supports up to twelve variables and eleven constraints. All the runs were done on a laptop with 3.9 GB of memory, an Intel® Core™ 2 Duo CPU T7250 @ 2.00GHz × 2, a GeForce 9200M and a 32-bit OS. Fig. 3 shows a screenshot of the implementation. The implementation is available to install on Android devices from Google Play under the name MathGO.
After the find solution button is clicked, the algorithm creates 50 individuals. Each individual has as many chromosomes as there are variables in the objective function. All chromosomes are initially five genes long, but they grow if required; a small chromosome length reduces the search space [6]. The algorithm assigns a zero or a one at random to every gene. All individuals go through the evaluation process. If the model selector is set to maximize, the algorithm sorts the population in descending order; if it is set to minimize, the algorithm sorts the population in ascending order.
The algorithm generates ten new individuals in every generation by applying the crossover and mutation mechanisms. The new individuals replace the last ten individuals of the population, and the population is then sorted again, so that the best solution is at the top of the list.
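One generation of this replacement scheme can be sketched in Python as follows. The names `next_generation`, `evaluate` and `make_child` are hypothetical; `make_child` stands for whatever combination of selection, crossover and mutation produces a new individual from the sorted population.

```python
def next_generation(population, evaluate, make_child, maximize=True, n_new=10):
    """Replace the n_new worst individuals and return the population best-first."""
    # Descending order for maximization, ascending order for minimization.
    ranked = sorted(population, key=evaluate, reverse=maximize)
    survivors = ranked[:-n_new]                       # drop the last n_new
    children = [make_child(ranked) for _ in range(n_new)]
    return sorted(survivors + children, key=evaluate, reverse=maximize)
```

The population size is unchanged by this step, which matches the fixed size of 50 used throughout the search.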

Fig. 3. Implementation display including a partial results list after a run
The implementation displays a coefficient matrix with input fields and control buttons. There is one button to select the model: maximize or minimize. When the model is switched between maximize and minimize, the operators of the constraints change automatically from less than to greater than, or vice versa. However, the inequality operators can also be changed independently as required. The algorithm is able to solve non-standard problems without the need to convert them into standard form. Below the constraints matrix there are five fields to define the number of generations, the number of ages, the crossover level, the mutation level and the neighbor range.
Once all the coefficients, critical constraint values and algorithm parameters are set, clicking the find solution button triggers the GA, which starts the search for optima. At the end of every age, the algorithm displays a list of the population, the fitness value of each individual, and the variables and their values up to that age. It is possible to change the parameter values while the algorithm is running, or to stop the run, adjust parameters or change coefficients, and then continue the run.
Currently the algorithm solves ILP problems, but it can be used to solve LP problems without changing the implementation. If the variables are required to have a precision of one, two or more decimal places, the decimal point of the constraint limits is moved to the right by one place for each decimal place required, adding zeros as needed. The variable values found are still integers; moving their decimal point to the left by the same number of places as in the critical constraint values gives the final solution with decimals, and the same is done for the objective function value. All the problems in the test set that required a continuous solution were solved using this adaptation.
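The adaptation can be illustrated with a small Python sketch. The helper names `scale_limits` and `unscale` are hypothetical, and the exhaustive search stands in for the GA only to keep the example self-contained.

```python
def scale_limits(limits, decimals):
    """Move the decimal point of the constraint limits to the right."""
    return [b * 10 ** decimals for b in limits]

def unscale(values, decimals):
    """Move the decimal point of an integer result back to the left."""
    return [v / 10 ** decimals for v in values]

# Toy problem: maximize x subject to 2x <= 5.
# As an ILP the optimum is x = 2; with one decimal place it should be 2.5.
limit = scale_limits([5], 1)[0]                      # 5 becomes 50
best = max(y for y in range(100) if 2 * y <= limit)  # integer optimum y = 25
solution = unscale([best], 1)                        # [2.5]
```

Scaling the limits by 10^d and dividing the integer solution by 10^d is equivalent to searching a grid with step 10^-d in the original problem, so the result approximates the LP optimum to d decimal places.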

Comparative Study
A comparative analysis with commercial software was conducted. Eleven four-variable problems were used in this comparative study; three of them (9, 10 and 11) are originally LP problems but were adapted so that they could be solved as ILP problems. All the problems are from [14]. LINGO and the standard Solver in Microsoft Excel were used for comparison with the solutions of the proposed algorithm (ILPGA). The two algorithms used from the Excel Solver are Simplex LP and Evolutionary [15]. The Excel Solver includes another algorithm, GRG Nonlinear, but it was not used in this comparative study. Table 1 lists the solutions obtained from the four algorithms. Column A lists the algorithms: Lingo, which uses branch and bound; SLP, short for Simplex LP, and E, short for Evolutionary, both from the Microsoft Excel Solver; and GA1, the proposed implementation (ILPGA), based on a variation of the genetic algorithm and neighborhood search. Column Z is the objective function value and the other columns are the variable values obtained after running each algorithm.
ILPGA is able to find the same solutions as LINGO and SLP, but it takes longer. There are problems where alternative variable values result in the same objective value, and ILPGA is able to list those alternatives. Table 1 lists only two alternatives for ILPGA, on problems 1 and 4. In the case of problem 1, the list of alternatives helps to deduce that any combination where x_1 plus x_3 is equal to 50 is a solution.
Only in problem 9 did the ILPGA and SLP outcomes differ, but SLP probably requires a different setting from the one used in this comparative study. Both solutions were different from the one given in [14]. The ILPGA solution seems to be the better one because it is lower than the one found using SLP and, unlike the solution given in [14], it does not violate any constraints. The SLP algorithm is based on the branch and bound algorithm, which is one of today's fastest algorithms [15].
Lingo also uses branch and bound, and its outcome is the optimal solution. Branch and bound easily outperforms any GA if the comparison is based on time alone, especially on the set of simple problems used here. However, as mentioned before, GAs have other characteristics that make them appropriate, such as the ability to find different solutions where they exist, flexibility and ease of coding, among others, although they certainly require more time and more runs than Lingo or SLP.
In the comparative study, SLP did not take more than 7 seconds to find a solution in the worst cases, Problems 9, 10 and 11. However, in Problem 9 SLP was not able to come up with a better solution even after multiple runs. ILPGA was able to do so, and it took, on average, 55, 36 and 97 seconds to find the solutions of problems 9, 10 and 11, respectively.

Time Study
Fig. 4 illustrates the average time from five runs of 117 problems: problems 1 to 59 have two variables, problems 60 to 102 have three variables, and problems 103 to 113 are the set of four-variable problems used in the comparative study. The problems include finding maxima and minima with standard and non-standard constraints. All these problems came from [14]; some are from the examples and others are from the exercises sections. The last four problems are the alternatives of the CAPLOC problem in [16]. This problem was divided into four alternatives: three alternatives with eight variables and ten constraints and one alternative with twelve variables and eleven constraints.
There were problems where no maximum exists. In these cases, the variables keep increasing their values until they reach the maximum number of genes, and the algorithm terminates the search for optima with a message indicating that no maximum exists and that the upper limit on the number of genes has been reached. The algorithm was set to run using 500 generations and 10 ages.

CONCLUSION
The GA1 algorithm performed better than the Evolutionary implementation in the standard Excel Solver and was able to find the same solutions as Simplex LP. In one problem, GA1 was able to come up with a better solution than Simplex LP.
More research is required to determine the parameter values that fine-tune the algorithm for a particular set of constraints and objective function.
Further research will consider larger problems; so far, the implementation is able to solve academic problems with up to twelve variables and eleven constraints.
The test that decides whether the algorithm keeps running requires more research because, as the number of variables increases, the number of iterations required by the test grows exponentially. At twelve variables, the test was set to search within a ±1 range, which reduces its power to find a better neighboring solution. For now, in cases like this, the alternative is to run the algorithm one or more additional times in the hope of finding a better solution.
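The growth can be made concrete with a short Python sketch. It assumes that the test examines every integer point within ±r of the current solution in each variable, which gives (2r + 1)^n − 1 neighbors for n variables; the function names and the non-negativity check are assumptions of this illustration.

```python
from itertools import product

def neighborhood_size(n_vars, r):
    """Number of points within ±r of a solution, excluding the solution itself."""
    return (2 * r + 1) ** n_vars - 1

def find_better_neighbor(best, evaluate, r, maximize=True):
    """Return the first neighbor that improves on `best`, or None if none does."""
    best_value = evaluate(best)
    for offsets in product(range(-r, r + 1), repeat=len(best)):
        if not any(offsets):
            continue                     # skip the solution itself
        candidate = [x + d for x, d in zip(best, offsets)]
        if min(candidate) < 0:
            continue                     # assumption: variables are non-negative
        value = evaluate(candidate)
        if (value > best_value) if maximize else (value < best_value):
            return candidate
    return None
```

Under this assumption, twelve variables with a ±1 range already give 3^12 − 1 = 531 440 neighbors to evaluate, and a ±2 range would give 5^12 − 1, roughly 244 million.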
Another line of research is improving Equation (1) so that it better compensates infeasible solutions in the early generations and helps accelerate the convergence of the algorithm.