Solution of a Generalized Problem of Covering by means of a Genetic Algorithm

An algorithm for solving a generalized covering problem is proposed. The generalization is formulated in terms of subsets of the given set that have minimal total weight. Unlike the standard covering problem, this problem requires selecting a covering such that the row-wise sum of the covering columns (taken from the covering matrix) gives a vector whose components are not less than the corresponding components of a given vector b, whose components are not all equal to one, rather than those of the all-ones vector. The proposed algorithm is based on a general genetic algorithm and uses a non-standard mutation operator with a variable rate.


Introduction
The covering problem is well known and many works have been devoted to its solution. However, because it is NP-hard, solving this problem in real time is often impossible. Therefore, developing efficient algorithms for the standard covering problem and its modifications remains a relevant task.
A generalized covering problem can be useful in practice. For instance, we may need to choose locations of service facilities so that each consumer has access to one of several nearest facilities, i.e. each consumer must be served by one or more facilities. We may need to determine the number of operators in a bank so as to avoid queues, and many other practical problems of this kind may arise. In the language of graphs, such a problem can be used to solve the problem of multiple centers (Christofides, 1975).
The standard covering problem is an NP-hard problem of combinatorial optimization (Garey and Johnson, 1979). Obviously, the generalized covering problem is also NP-hard. Exact algorithms for the standard covering problem use branch-and-bound techniques; their running time grows rapidly, and it becomes impossible to obtain optimal solutions in real time as the scale of the problem increases (Christofides, 1975; Minieka, 1978). The standard and generalized covering problems are mathematical models of frequent and realistic everyday problems. Therefore, it is important to solve them in real time. Heuristic algorithms are often used for such problems because they find nearly optimal solutions in reasonable time (Jacobs and Brusco, 1993). Approximate algorithms often use partial selection of covering sets, and the currently popular genetic algorithms (Holland, 1992) also belong to this class. Indeed, these algorithms often give near-optimal solutions (Goldberg and Holland, 1988; Beasley and Jörnsten, 1992; Haupt and Haupt, 2004; Rutkovskaya et al., 2008).
One of the first and best approximate algorithms was proposed by Chvatal (1979). It finds an approximate solution of the standard covering problem in polynomial time. Grossman and Wool (1997) consider the standard covering problem, propose a heuristic algorithm and argue that it gives better results than previous techniques. Lan et al. (2007) present a meta-heuristic algorithm for the standard covering problem. Ananiashvili (2015a) gives exact solutions of the minimum partition and covering problems using branch and bound. According to that work, compact packing of the columns of the covering matrix reduces the amount of random access memory used for calculations and the number of operations by a factor of approximately 32. Ananiashvili (2015b) proposes an approximate solution of the standard covering problem by means of a modified genetic algorithm and reports results on test problems. Azar et al. (2009) considered the general problem and gave a logarithmic approximation algorithm for it. Bansal et al. (2010) improved that result and gave a simple randomized constant-factor approximation algorithm for the generalized min-sum set cover problem. Lim et al. (2014) propose a greedy algorithm for the minimal covering problem. Yang and Leung (2003) and Umetani et al. (2013) consider the generalized weighted covering problem, which requires multiple covering of every element. For comparison, these authors use the commercial integer programming software CPLEX (ILOG CPLEX 7.0 User's Manual, ILOG) on their own test problems, because results of other authors were not known to them. Yesipov and Muraviev (2014) propose two algorithms: additive and genetic. The efficiency of these algorithms is tested on randomly generated matrices, and the vector of weights is also filled randomly.

Formulation of a Problem
The generalized minimal covering problem requires finding a covering that minimizes the function

$$f(x) = \sum_{j=1}^{n} c_j x_j \qquad (1)$$

subject to the constraints

$$\sum_{j=1}^{n} a_{ij} x_j \ge b_i, \quad i = 1, \ldots, N, \qquad x_j \in \{0, 1\}, \quad j = 1, \ldots, n, \qquad (2)$$

where $A = (a_{ij})$ is the $N \times n$ covering matrix with entries 0 and 1, $c_j$ is the weight of column $j$, and $b_i$, $i = 1, \ldots, N$, are given natural numbers with a small range (less than N). Note that in the standard covering problem every element of the vector b equals 1. The values $x_j$, $j = 1, \ldots, n$, are solutions of the given problem.
A vector R covers the vector b if $R \ge b$, i.e. $R_i \ge b_i$, $i = 1, \ldots, N$. In the standard minimal covering problem, it is enough to find a covering for which the logical row-wise sum of the selected columns of matrix A is equal to or greater than 1 in every row. The generalized minimal covering problem, however, requires selecting columns of matrix A whose arithmetic row-wise sum gives a vector R that covers the given vector b, whose components are not all equal to one. In both problems, the goal function $f(x)$ in (1) must be minimal for the selected columns.
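The difference between the two covering conditions can be illustrated with a minimal Python sketch. The function names (`column_sum`, `covers`, `cost`), the small matrix A and the weights c are hypothetical and chosen only for illustration; they are not taken from the test problems of this paper.

```python
def column_sum(A, x):
    """Arithmetic row-wise sum R of the columns of A selected by the 0/1 vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def covers(R, b):
    """True if R covers b, i.e. R[i] >= b[i] for every row i."""
    return all(r >= bi for r, bi in zip(R, b))


def cost(c, x):
    """Total weight of the selected columns."""
    return sum(cj * xj for cj, xj in zip(c, x))


# Hypothetical 3 x 4 covering matrix and column weights (illustration only).
A = [[1, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 1, 0]]
c = [2, 3, 4, 1]

b_standard = [1, 1, 1]      # standard problem: every b[i] equals 1
b_generalized = [1, 2, 1]   # generalized problem: row 2 must be covered twice

x = [0, 1, 0, 1]            # select columns 2 and 4
R = column_sum(A, x)        # R == [1, 1, 1]
print(covers(R, b_standard), covers(R, b_generalized), cost(c, x))
```

Here the selection x is a covering for the standard problem but not for the generalized one, because the second row is covered only once.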

The General Design of Genetic Algorithm
The proposed algorithm for solving the generalized minimal covering problem (1)-(2) follows the general principles of genetic algorithms (Goldberg and Holland, 1988). Its general scheme is shown in Fig. 1.

Fig. 1. General Genetic Algorithm
Let us consider the design used in the proposed algorithm. The first step of a genetic algorithm is the choice of an encoding scheme. Binary encoding is selected, i.e. every chromosome is an n-dimensional vector $x^k = (x^k_1, \ldots, x^k_n)$. The vectors $x^k$ are chromosomes that correspond to the individuals, and the values $x^k_j$ are genes that can be equal to 0 or 1. First, we select an initial population and compute the value of the fitness function (3) for each chromosome. Then we select the parents. When problems are solved by means of genetic algorithms, the parents are usually selected by roulette-wheel selection or some other technique (Haupt and Haupt, 2004). In this paper, selection is made as follows. At an odd iteration, the two individuals with the minimal values of the fitness function are selected from the population. At an even iteration, we select one chromosome with the minimal value of the fitness function and another chromosome with the maximal value. In this way we avoid rapid convergence to a solution that is far from the minimum. Then we breed the selected individuals with a one-point crossover operator. The individuals obtained after crossover are mutated. A fixed mutation rate $1/N$ is often used in genetic algorithms, but we choose a variable mutation rate, because the purpose of the crossover and mutation operators is to produce individuals that differ from the individuals of the population.
Besides, as we approach a local extremum, individuals differ only slightly from each other. Therefore, we must use a mutation operator with a variable rate to avoid selecting only closely related individuals. This rate depends on the individual as well as on the characteristics of its genotype, in particular the numbers of 1s and 0s in the genotype.
From the chromosomes obtained by crossover and mutation, we select the best one, i.e. the one with the minimal value of the fitness function. Then we find the individual with the maximal fitness value in the parent population and replace it with the individual selected from the offspring. The process of selecting parent chromosomes, crossover, mutation and replacing a parent individual with an offspring individual is repeated until the iterations are exhausted.
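The parent selection rule, the one-point crossover and the replacement step described in this section can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, chromosomes are plain lists of 0s and 1s, the fitness values are assumed to be already computed, and mutation and the restoration of coverage are omitted.

```python
import random


def select_parents(fitness, iteration):
    """Return the indices of two parents.

    Odd iteration: the two individuals with the smallest fitness values.
    Even iteration: the individual with the smallest fitness value and
    the individual with the largest one.
    """
    order = sorted(range(len(fitness)), key=lambda k: fitness[k])
    if iteration % 2 == 1:
        return order[0], order[1]
    return order[0], order[-1]


def one_point_crossover(p1, p2, rng=random):
    """Swap the tails of two chromosomes at a random cut point."""
    point = rng.randrange(1, len(p1))  # cut strictly inside the chromosome
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def replace_worst(population, fitness, child, child_fitness):
    """Replace the individual with the largest fitness value by the child."""
    worst = max(range(len(fitness)), key=lambda k: fitness[k])
    population[worst] = child
    fitness[worst] = child_fitness
    return worst
```

Since the fitness is minimized, the individual with the largest fitness value is the worst one, so the replacement step never removes the current best individual unless the population consists of a single chromosome.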

Formation of initial population
Let us assume that the columns of matrix A are arranged in increasing order of cost. The number of individuals in the population is denoted by S, and L is the population matrix of size $S \times n$. (Note: in the algorithms, array indexes are specified in parentheses.) Algorithm 1: Formation of initial population.
Step 1. Let us take: [...]
Step 3. Let us find the number $i_1$ of the first element of R for which: [...]
Step 6. Let us find the number $i_1$ of the first element of R for which $R(i) < b(i)$. If such a number does not exist, then the process is finished.
Step 3. If every [...] of population L and find dead-end coverings.

Crossover
After the population is formed, we can compute the value of the fitness function for each chromosome by means of equation (3). We select two parent chromosomes according to the rule described in the section "The General Design of Genetic Algorithm". Let us assume these chromosomes are: [...]. Offspring obtained by crossover may leave some nodes uncovered, so it is necessary to find and add covering sets for the uncovered nodes of those chromosomes. We propose an algorithm that guarantees this: Algorithm 3: Addition of subsets that cover uncovered nodes to chromosomes.
For legibility, chromosome [...]
Step 2. Let us find the minimal number $i_1$: [...]. If [...] for every $i = 1, 2, \ldots, m$, then the process is finished.
Step 3. Let us find the numbers $\{j_1, j_2, \ldots, j_p\}$ of those columns of matrix A that cover the node $r_{i_1}$. Let us select randomly any number $j_u$ from $\{j_1, j_2, \ldots, j_p\}$, i.e. $a(l, j_u) = 1$; take column $j_u$ into the current chromosome: [...].
Step 4. Let us assume $i = i_1 + 1$. If $i \le m$ then jump to step 2, otherwise the process is finished.
If vector R has no such element, then the result is empty.
The function numel determines the length of an array. If the argument of numel is an empty array, it returns 0.
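The idea of Algorithm 3, adding randomly chosen covering columns for every node that is still uncovered, can be sketched in Python as follows. This is a sketch under assumptions and not the MATLAB implementation used in the experiments: the function name `repair` is hypothetical, rows are scanned in increasing order, and for a row with $b_i > 1$ columns are added until the row is covered $b_i$ times.

```python
import random


def repair(A, b, x, rng=random):
    """Add randomly chosen columns to chromosome x until R >= b holds.

    A is a list of 0/1 rows, b the required coverage of each row and
    x a 0/1 chromosome. A repaired copy of x is returned.
    """
    x = list(x)
    n = len(x)
    # Current coverage R of every row by the selected columns.
    R = [sum(row[j] * x[j] for j in range(n)) for row in A]
    for i, row in enumerate(A):
        while R[i] < b[i]:
            # Columns that cover row i and are not yet in the chromosome.
            candidates = [j for j in range(n) if row[j] == 1 and x[j] == 0]
            if not candidates:
                raise ValueError(f"row {i} cannot be covered {b[i]} times")
            j = rng.choice(candidates)
            x[j] = 1
            # Adding column j changes the coverage of every row it covers.
            for l in range(len(A)):
                R[l] += A[l][j]
    return x
```

A chromosome repaired in this way is a covering, but it may contain excessive columns, which is why a deletion step is applied afterwards.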
After crossover offspring chromosomes are mutated.

Mutation
Let us select some number k from the set [...] and then form the vector of mutation probabilities: [...]. Let us select a gene for mutation whose index is a random normalized integer from the range [...]. Then we add columns that cover every uncovered node. Chromosomes obtained after this addition may not be dead-end coverings, so we use Algorithm 2 to delete excessive columns. This process is repeated until the iterations are exhausted.
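The deletion of excessive columns that turns a covering into a dead-end covering can be sketched in Python as follows. The sketch rests on assumptions that should be stated plainly: the function name `remove_redundant` is hypothetical, and the order in which columns are examined (from the last column to the first, i.e. from the most expensive one when columns are sorted by increasing cost) is our choice for illustration and is not necessarily the order used in Algorithm 2.

```python
def remove_redundant(A, b, x):
    """Delete columns from covering x while R >= b is preserved.

    Columns are examined from the last to the first. A column is
    deleted if every row stays sufficiently covered without it.
    """
    x = list(x)
    n = len(x)
    m = len(A)
    R = [sum(row[j] * x[j] for j in range(n)) for row in A]
    for j in reversed(range(n)):
        if x[j] == 1 and all(R[i] - A[i][j] >= b[i] for i in range(m)):
            x[j] = 0
            for i in range(m):
                R[i] -= A[i][j]
    return x


# Hypothetical 3 x 4 covering matrix (illustration only).
A = [[1, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 1, 0]]
print(remove_redundant(A, [1, 1, 1], [1, 1, 1, 1]))  # [1, 1, 0, 0]
print(remove_redundant(A, [1, 2, 1], [1, 1, 1, 1]))  # [1, 0, 1, 0]
```

One pass is enough to obtain a dead-end covering: deleting columns can only decrease R, so a column that could not be deleted when it was examined cannot become deletable later.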

The results of experiment
The algorithm is implemented in MATLAB. For the experiments we used the standard covering test problems from OR-Library, but because these problems assume $b_i = 1$, $i = 1, \ldots, N$, we generated random values $b_i \in \{1, 2, 3\}$, $i = 1, \ldots, N$. I developed AMPL models for the same problems and solved them on the NEOS Server (solvers Gurobi and CPLEX). Table 1 shows the results. Fmin denotes the best value found by the genetic algorithm described above and Fgurobi denotes the value returned by the Gurobi solver. For the problems whose coverage matrix has dimensions 100 × 1000, 200 × 1000, 200 × 2000, 400 × 4000 or 500 × 5000, the results obtained with the proposed algorithm and with the Gurobi solver are close to each other. For problem scpd1 (D,1), my algorithm gave a better result than the Gurobi solver. For the dimensions 1000 × 10000, my algorithm produced results, whereas the Gurobi solver did not return a result.
Table 2 gives the results for the standard covering problem on the test problems from OR-Library. For comparison, the column "Minimal value f*" of the table gives the known optimal solutions of these problems and the column "Fmin" shows the results of the proposed algorithm. These results are close to the optimum, and the relative error often does not exceed 2-5%. Sometimes the results are exact. On the basis of the experiments, we may conclude that the proposed algorithm operates reasonably efficiently. The tests were run on a standard computer with an Intel(R) Pentium(R) Dual CPU E2220 at 2.40 GHz and 2.00 GB of RAM.

Conclusion
This work considers the generalized multicover problem (1)-(2), which is solved by means of a genetic algorithm. Parents are selected according to the rule of proportional selection, with a probability determined for each individual. The probabilities are calculated on the basis of fitness. The essence of the new crossover operator is that the genes of the offspring are determined on the basis of the fitness of the parents and the frequency of these genes in the population. In addition, we use a mutation operator with a variable rate. After crossover and mutation, the new offspring may no longer be coverings, and therefore we apply a procedure which guarantees that the new set is a covering. I think that the algorithm and the results presented in this work will be of interest to researchers in this field.
The number S of individuals (chromosomes) in a population depends on the scale of the problem.
[...] the first row of matrix L, i.e. the first chromosome of the population, in the following way: Step 2. Let us find the first columns with numbers [...] kth gene of this chromosome. Note that the characteristics of a gene in this problem are its weight as well as the numbers of 1s and 0s in the corresponding genotype. Let us calculate the quantity $p_1(j)$ of 1s and the quantity $p_0(j)$ of 0s for each column $j$ of matrix A, $j = 1, 2, \ldots, n$. Besides, let us calculate the following values for each column:

Fig. 2. Comparative diagram of constraints and calculated values of the problem scpd1.
If in problem (1)-(2) we assume that $b_i = 1$, $i = 1, \ldots, N$, then we obtain the standard covering problem. In the proposed algorithm, we set $b_i = 1$, $i = 1, \ldots, N$, and use the test problems from OR-Library.

[...] such an element exists, then assume $i = i_1$ and jump to step 5. If such an element does not exist, then jump to step 4. When a population is formed with this technique, individuals may contain excessive genes, i.e. columns that can be deleted from a covering without destroying it. A covering J is dead-end if the set $J \setminus \{j\}$ is not a covering for any $j \in J$. Therefore, it is necessary to delete excessive columns from each individual of the given population and make the individuals dead-end. For this purpose the following heuristic algorithm is used: Algorithm 2: Deletion of excessive columns from the kth individual.
[...] the $r_l$th node in the kth chromosome. It means that we must find the quantity [...] in the kth chromosome, which is represented in the kth row of matrix L, when [...]. If [...], then jump to step 3. If $i \le m$ then jump to step 2, otherwise the process is finished.
Note: find(R<=b,1,'first') searches for the first element k of vector R, where $R(k) \le b(k)$.

Table 1. The results of experiments for the generalized covering problem.

Table 2. The results of experiments for the standard covering problem.