A new improved genetic algorithm approach and a competitive heuristic method for large-scale multiple resource-constrained project-scheduling problems

Article history: Received 6 May 2010; Received in revised form 28 June 2011; Accepted 29 June 2011; Available online 30 June 2011
The aim of this paper is to present a new genetic algorithm approach for large scale multiple resource-constrained project-scheduling problems (RCPSP). It also presents a heuristic approach to achieve proper solutions for large scale problems. This research area is very common in industry, especially when a set of activities needs to be finished as soon as possible subject to two sets of constraints: precedence constraints and resource constraints. The emphasis in this research is on investigating the complexity of scheduling problems and developing a new GA approach to solve this problem in such a way that the advantages of GA are appropriately utilized by applying a novel method to reduce the complexity of the problem. Computational results are also reported for the most famous classical problems taken from the operational research literature.
© 2011 Growing Science Ltd. All rights reserved.


Introduction
The RCPSP is a general scheduling problem that includes precedence and resource constraints, and it is known to be NP-hard (Philippe et al., 2008). The search space of the RCPSP is too large for exhaustive enumeration to attain the optimum solution. Moreover, the computational expense grows as the problem size increases or as additional constraints are added. In practice it is impossible to search the entire space, which makes minimization of the makespan tedious and time-consuming. Despite the difficulty of solving the RCPSP, it has been the focus of massive research effort, because even incremental improvements in project scheduling can lead to large savings in resources and production time (Wall, 1996).
The remainder of this paper is organized as follows. Section 2 reviews related work, covering exact and heuristic methods. Section 3 defines the problem and its formulation. Section 4 describes the structure of the proposed genetic algorithm (GA) and its operators, and Section 5 proposes a novel heuristic capable of achieving near-optimal solutions. Section 6 reports computational results on conventional benchmarks. Finally, Section 7 concludes the paper and outlines further possible applications of the proposed GA method.

Literature review
Solution methods for the RCPSP fall into two classes: exact and heuristic methods. Exact methods aim to find optimal solutions, but they are impractical for large-scale problems with a significant number of constraints. Exact methods can be classified into two groups: (1) mathematical programming and (2) enumerative techniques. Methods of solving such problems through mathematical programming are reviewed in Garfinkel and Nemhauser (1976) and Daellenbach and George (1979). They reported that mathematical programming (MP) methods generally require much computation time to solve real-world problems. Additionally, they noted that MP methods, being designed to handle all kinds of problems, do not exploit the particular properties of a given problem. Consequently, MP methods tend to need longer computations to find a solution than implicit enumeration algorithms designed specifically for a particular class of problems.
Enumerative techniques simply list, or enumerate, all possible schedules and then eliminate the non-optimal schedules from the list, leaving those that are optimal. Branch and bound search is a typical implicit enumerative method. Conway et al. (1967) introduced the use of branch and bound in scheduling. Patterson and Huber (1974) suggested two exact bounding algorithms, a minimum bounding algorithm and a maximum bounding algorithm.
In order to solve real-world problems appropriately, many researchers have switched to metaheuristics. Finding near-optimal results within limited computation time is the major characteristic of any heuristic algorithm. There are different heuristic approaches that deal with the RCPSP. Fayer (1990) reported fairly good performance of a simulated annealing approach on scheduling problems. More importantly, Fayer's implementation maintained precedence feasibility by restricting the neighborhood operator to precedence-feasible task swaps only, so that every step of the search obeys the precedence relations. Some researchers reported that they had achieved high-quality project scheduling results with Tabu search (Dell'Amico and Trobian, 1993; Nowicki and Smutnicki, 1996). Merkle and Schmeck (2002) reported excellent results in project scheduling with an ant algorithm. They presented a new ant colony procedure (AS-RCPSP) for project scheduling, which combines the direct (local) and summation (global) pheromone evaluation methods. The only difference between these two pheromone evaluation methods is that the second one weights each pheromone value in order to escape local minima. Furthermore, they discussed varying the strength of the heuristic influence and the rate of pheromone evaporation over the ant generations. In recent years, hybrid metaheuristic algorithms have also been developed and applied to the RCPSP; see, for example, Valls et al. (2005) and Kim et al. (2003).
In addition, several genetic algorithms have been applied to either the MRCPSP or the RCPSP. The next part briefly describes these approaches and related methods.

Previous genetic algorithms
One of the first attempts to apply GA to scheduling problems was made by Davis (1985). The main idea of his approach was to encode the representation of a schedule in a meaningful and legal way. In general, most GA operators produce illegal schedules when they are applied to a scheduling problem. David and Lingle (1985) introduced a new crossover operator (partially mapped crossover) for the sequencing problem. In the same year, Davis (1985) introduced order crossover (OC) for the sequencing problem. Two years later, Oliver (1987) developed another crossover operator called cycle crossover (CC). These specially developed crossover operators for the sequencing problem are discussed in the design of our proposed GA.
Genetic algorithms are natural candidates for parallel processing. One approach is to split the population into subpopulations and assign a processor to each subpopulation. A standard genetic algorithm is run on each processor and the subpopulations evolve. Periodic migration is permitted, in which some chromosomes from one subpopulation are transferred or copied to another. Grefenstette (1986) surveyed parallel processing of GAs.
Some researchers have commented on the control parameter settings for GA. However, these comments fit only some particular problems and not problems in general. Grefenstette (1986) believed the parameter settings of a GA should lie in a range rather than take particular values. He suggested the following settings: a population size from 30 to 80, a crossover rate from 0.45 to 0.95, and a mutation rate of 0.01.
Based on computational experience, Kolisch and Padman (1996) carried out a survey to investigate the performance of different algorithms. Their research revealed that the GA approach of Hartmann (2002) has been the best so far. The operators of our proposed genetic algorithm are modeled on this approach. ... (2009) blended a genetic algorithm with a novel method capable of producing active schedules. They also exploited a new fitness function to enhance the quality of their procedure. Agarwal et al. (2011) proposed a neurogenetic approach, a hybrid of GA and neural-network (NN) approaches. In their hybrid approach, the search process relied on GA iterations for global search and on NN iterations for local search. Akbari et al. (2011) presented an artificial bee colony as an alternative and efficient optimization strategy for solving the RCPSP and compared its performance with other metaheuristics on case studies from the PSPLIB library.

Problem definition
Consider a set of activities with identical durations that must be executed under precedence constraints. An activity cannot be interrupted while it is being processed. Although there is no limit on the number of activities processed simultaneously, the resources are renewable and available in constant quantities: no feasible solution may exceed the available resources, but every activity can eventually be executed by shifting it to a time period with sufficient available resources. The quantity $r_{ij}$ denotes the amount of resource j required by activity i, which is assumed constant over the activity's processing time. The LP model can be written as follows (1987).
The first constraint does not allow activities to start before the finish time of their precedent activities.
The second constraint ensures that adequate resources are available for the activities in each period. The last constraint requires completion times to be positive.
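The model itself is not reproduced in the text above. As a reading aid, a commonly used conceptual formulation that matches the three constraints just described is sketched below. All symbols other than $r_{ij}$ are assumptions introduced here for illustration: $f_i$ is the completion time of activity i, $d_i$ its duration, $P_i$ its set of immediate predecessors, $A(t)$ the set of activities in progress during period t, and $R_j$ the constant availability of resource j. Completion times are written as non-negative so that a zero-duration dummy activity is admissible.

```latex
\begin{align}
\min \quad & \max_{i} f_i \\
\text{s.t.} \quad
  & f_h \le f_i - d_i, && \forall i,\ \forall h \in P_i, \\
  & \sum_{i \in A(t)} r_{ij} \le R_j, && \forall j,\ \forall t, \\
  & f_i \ge 0, && \forall i,
\end{align}
\qquad \text{where } A(t) = \{\, i : f_i - d_i \le t < f_i \,\}.
```

The objective is the makespan, which is the quantity the introduction names as the minimization target.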

Design of the GA
A new GA approach for the RCPSP is developed in this section. In designing a GA, one should generally consider: (1) the GA's exploration and exploitation ability, (2) the convergence and diversity of the population, and (3) the nature of the multiple resource-constrained project scheduling problem, especially the feasibility of each solution with respect to the resource and precedence constraints. We take these issues into account in the design of our proposed GA and discuss the corresponding measures.

Encoding and representation of chromosome
For many optimization problems, GAs do not operate directly on the solutions of the problem. Instead, they make use of problem-specific representations of the solutions. The genetic operators modify the representation, which is then transformed into a solution by means of a so-called decoding procedure (Hartmann, 2002). We first discuss how to represent the sequence of a set of numbers in the chromosome, and then how to allocate resources and a start time to each activity. To facilitate the discussion, we use previous sequencing studies as a starting point.
During the last few decades, there have been two main chromosome encoding methods for representing the sequence of a set of numbers: (i) adjacency representation and (ii) path representation (Jean, 1996). Although the adjacency representation is suitable for some problems, such as the traveling salesman problem (TSP), it has some disadvantages when used for the RCPSP (Jean, 1996). Therefore, we choose the path representation for the chromosome in our proposed GA: we create an activity list that indicates the sequence of activities. Having settled how the sequence is represented in the chromosome, we next discuss how resources and a start time are allocated to each activity.
There are two possible schedule generation schemes (SGC): a serial schedule generation scheme and a parallel schedule generation scheme. The serial SGC constructs a schedule from an activity list as follows. First, activity 1 is started at time 0. Then the activities are scheduled in the order prescribed by the activity list, and each activity is assigned the earliest precedence- and resource-feasible start time.
In the parallel SGC, the first activity also starts at time 0. The difference is that the parallel scheme computes so-called decision points, which are the times at which activities to be scheduled are started. Each decision point is determined by the earliest finish time of the activities currently in process. For each decision point, the set of eligible activities is computed as the set of activities that could feasibly be started at that decision point. The eligible activities are selected successively and started until none of them is left. Then the next decision point and its set of eligible activities are computed. This is repeated until all activities are feasibly scheduled (Hartmann, 2002). Hartmann also pointed out that the activity list representation together with the serial SGC as decoding procedure leads to better results than other representations for the RCPSP. Based on this research, we choose the serial SGC in our proposed GA.
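The serial scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' C# implementation: the function name `serial_sgs`, the dictionary-based data layout, and the small five-activity project are assumptions made for the example. It also assumes integer durations, a precedence-feasible activity list, and that no single activity demands more than the capacity of any resource.

```python
from typing import Dict, List, Sequence


def serial_sgs(activity_list: Sequence[int],
               durations: Dict[int, int],
               preds: Dict[int, List[int]],
               demands: Dict[int, List[int]],
               capacities: List[int]) -> Dict[int, int]:
    """Decode a precedence-feasible activity list into start times."""
    usage: Dict[int, List[int]] = {}  # period -> amount used of each resource
    start: Dict[int, int] = {}
    finish: Dict[int, int] = {}

    def fits(act: int, t: int) -> bool:
        # The activity must fit in every period it would occupy.
        for period in range(t, t + durations[act]):
            used = usage.get(period, [0] * len(capacities))
            if any(u + r > c for u, r, c in zip(used, demands[act], capacities)):
                return False
        return True

    for act in activity_list:
        # Earliest precedence-feasible time: all predecessors have finished.
        t = max((finish[p] for p in preds.get(act, [])), default=0)
        # Shift right until the resource constraints hold as well
        # (assumes each demand is at most the corresponding capacity).
        while not fits(act, t):
            t += 1
        start[act], finish[act] = t, t + durations[act]
        for period in range(t, t + durations[act]):
            used = usage.setdefault(period, [0] * len(capacities))
            for k, r in enumerate(demands[act]):
                used[k] += r
    return start


# Hypothetical project: activities 1 and 5 are zero-duration dummies.
durations = {1: 0, 2: 3, 3: 2, 4: 2, 5: 0}
preds = {2: [1], 3: [1], 4: [2], 5: [3, 4]}
demands = {1: [0], 2: [2], 3: [2], 4: [1], 5: [0]}
capacities = [3]
schedule = serial_sgs([1, 2, 3, 4, 5], durations, preds, demands, capacities)
print(schedule)
```

In this example, activity 3 cannot run alongside activity 2 because together they would need 4 units of a resource with capacity 3, so it is delayed until activity 2 finishes.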

Crossover
For the RCPSP, a simple crossover reproduction scheme does not work because it makes the chromosomes inconsistent. In other words, some activities may be repeated while others are missed out, and hence the solutions cannot meet the precedence constraints. Traditional crossover operators such as one-point crossover are therefore regarded as inappropriate for scheduling problems. The drawback of the simple crossover mechanism is illustrated in Fig. 1.

Fig. 1. Illegal solutions created by inappropriate crossover
A simple crossover operator also cannot guarantee the precedence constraints, even when no activity is missed out or visited twice. For example, assume in Fig. 1 the precedence constraint that activity 5 must begin after activity 4 is finished. After the simple crossover, activity 5 may begin before activity 4. Besides the traditional one- and two-point crossovers, two crossover operators have been developed for the sequencing problem: partially mapped (PMX) and order (OX) crossovers. These operators guarantee that no activity is missing or visited twice, but they can still break the precedence constraints. The crossover operator used in this paper is derived from them and inherits their merits. There are three main differences between this operator and PMX and OX: (i) Both PMX and OX focus on missed or replicated activities but do not consider the precedence constraints, whereas the crossover operator used in this paper guarantees the precedence constraints. (ii) The crossover operator used in this paper produces only one offspring rather than two. (iii) This crossover changes the part between the two cut points rather than the parts outside them. The pseudo-code of the crossover operator is as follows.

Begin
  Set R = ∅
  Determine the crossover points x, y randomly
  For each a_i in parent1's chromosome
    If i < x or i > y
      Then copy a_i to the same position in the child's chromosome
      Else R = R ∪ {a_i}
  Sort R according to the positions of the associated activities in parent2's chromosome
  Insert R into the vacant positions in the child's chromosome
End.
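The pseudo-code above translates almost directly into Python. The sketch below is an illustration under stated assumptions: the function name `middle_reorder_crossover`, the 0-based inclusive cut points, and the optional `cut` and `rng` parameters are choices made here and are not taken from the paper. Both parents are assumed to be precedence-feasible activity lists over the same activities.

```python
import random
from typing import List, Optional, Sequence, Tuple


def middle_reorder_crossover(parent1: Sequence[int],
                             parent2: Sequence[int],
                             cut: Optional[Tuple[int, int]] = None,
                             rng: Optional[random.Random] = None) -> List[int]:
    """Keep parent1 outside the cut points; reorder the middle as in parent2."""
    rng = rng or random.Random()
    if cut is None:
        # Two distinct random positions, x < y (0-based, both inclusive).
        x, y = sorted(rng.sample(range(len(parent1)), 2))
    else:
        x, y = cut
    position_in_p2 = {act: idx for idx, act in enumerate(parent2)}
    # R: the activities between the cut points, sorted by their order in parent2.
    middle = sorted(parent1[x:y + 1], key=position_in_p2.__getitem__)
    return list(parent1[:x]) + middle + list(parent1[y + 1:])


# Hypothetical parents used only to illustrate the mechanics.
p1 = [1, 2, 3, 4, 5, 6, 7]
p2 = [1, 4, 2, 3, 6, 5, 7]
child = middle_reorder_crossover(p1, p2, cut=(1, 3))
print(child)
```

The child remains precedence-feasible whenever both parents are: any pair of activities is ordered either as in parent 1 (when at least one lies outside the cut points) or as in parent 2 (when both lie between them), and each of those orders respects the precedence relations.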
Based on the procedure of our proposed crossover operator shown above, let us set the following two parents to show the crossover operation in Fig. 2.

Fig. 2. Proposed crossover operator
The crossover of our proposed GA is similar to that of Hartmann (2002). However, whereas Hartmann's crossover can create two children, the crossover of this paper creates only one child. Another important difference is that our proposed crossover operator changes only the middle part of the schedule rather than the whole solution.

Mutation
Like the classic crossover operators, the classic mutation operator is inappropriate for scheduling problems because of the precedence constraints. The mutation used in this paper first randomly chooses two activities and, if possible, swaps them. The qualifier "if possible" is needed because the offspring after mutation may be invalid, so our proposed algorithm checks the precedence constraints: if the swap breaks them, the procedure abandons the swap. This procedure continues until the first feasible exchange is found or the number of attempts exceeds k. If k is large, finding feasible swaps can require significant running time; on the other hand, a very small k practically eliminates the chance of mutation in the population.
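A minimal Python sketch of this swap mutation follows. The names `is_precedence_feasible` and `swap_mutation`, the default value of `k`, and the small precedence structure are assumptions made for illustration; the paper does not state how feasibility is checked or which value of k is used.

```python
import random
from typing import Dict, List, Optional, Sequence


def is_precedence_feasible(order: Sequence[int],
                           preds: Dict[int, List[int]]) -> bool:
    """True if every activity appears after all of its predecessors."""
    position = {act: idx for idx, act in enumerate(order)}
    return all(position[p] < position[a]
               for a in order for p in preds.get(a, []))


def swap_mutation(order: Sequence[int],
                  preds: Dict[int, List[int]],
                  k: int = 10,
                  rng: Optional[random.Random] = None) -> List[int]:
    """Try up to k random swaps; return the first precedence-feasible one."""
    rng = rng or random.Random()
    for _ in range(k):
        i, j = rng.sample(range(len(order)), 2)
        candidate = list(order)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        if is_precedence_feasible(candidate, preds):
            return candidate
    # No feasible swap found within k attempts: leave the individual unchanged.
    return list(order)


# Hypothetical precedence relations: activity -> list of predecessors.
preds = {2: [1], 3: [1], 4: [2], 5: [3, 4]}
mutated = swap_mutation([1, 2, 3, 4, 5], preds, k=50, rng=random.Random(0))
print(mutated)
```

With these precedence relations only two swaps are feasible (activities 2 and 3, or activities 3 and 4), which illustrates why a very small k often leaves the individual unchanged.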

Selection
The parent selection operator used in this paper randomly chooses two individuals and compares them; the better individual is added to the mating pool. This is known as tournament selection in the literature; it not only favors better individuals but also gives a chance to those with worse fitness values. For each individual i, the fitness function is computed as follows.
$$\text{Fitness value}_i = \frac{1}{\cdots} \qquad (5)$$
Applying the crossover and mutation operators produces a child population of similar size. The child population is then combined with the parent population, and the best solutions survive into the next generation.
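The selection and replacement steps can be sketched as follows. This is an illustration only: the function names are hypothetical, and the fitness used in the example, the reciprocal of the makespan, is an assumption, since the denominator of the fitness expression is not legible in the text. The makespan values are made-up numbers for four toy chromosomes.

```python
import random
from typing import Callable, List, Optional, Sequence

Chromosome = List[int]


def tournament_select(population: Sequence[Chromosome],
                      fitness: Callable[[Chromosome], float],
                      rng: Optional[random.Random] = None) -> Chromosome:
    """Pick two distinct individuals at random and return the fitter one."""
    rng = rng or random.Random()
    a, b = rng.sample(range(len(population)), 2)
    if fitness(population[a]) >= fitness(population[b]):
        return population[a]
    return population[b]


def next_generation(parents: Sequence[Chromosome],
                    children: Sequence[Chromosome],
                    fitness: Callable[[Chromosome], float]) -> List[Chromosome]:
    """Merge parents and children and keep the best len(parents) individuals."""
    merged = list(parents) + list(children)
    merged.sort(key=fitness, reverse=True)
    return merged[:len(parents)]


# Hypothetical makespans for four toy chromosomes (illustrative values).
toy_makespan = {(1, 2, 3): 10, (1, 3, 2): 8, (2, 1, 3): 7, (3, 1, 2): 12}


def fitness(chromosome: Chromosome) -> float:
    # Assumed fitness: reciprocal of the makespan (shorter schedules score higher).
    return 1.0 / toy_makespan[tuple(chromosome)]


parents = [[1, 2, 3], [1, 3, 2]]
children = [[2, 1, 3], [3, 1, 2]]
survivors = next_generation(parents, children, fitness)
print(survivors)
```

Because each tournament compares only two randomly drawn individuals, a weaker individual still enters the mating pool whenever it happens to be drawn against an even weaker one.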

Population diversity
Choosing a good population size plays an important role in the convergence of the proposed GA. In this paper, in order to keep the population diverse, not only is mutation applied but some new individuals are also added in each generation. These new individuals may or may not be as good as the existing ones, but they help the population avoid premature convergence. Although the number of new individuals affects both the performance of the GA and the computation time, adding them appears to be the best option when the mutation scheme alone cannot maintain diversity because of the complexity caused by the precedence constraints.
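The text does not say how the new individuals are generated. One simple possibility, shown here purely as an assumption, is to draw random precedence-feasible activity lists by repeatedly picking a random activity whose predecessors have all been placed. The names `random_feasible_list` and `inject_immigrants` are hypothetical, and the precedence relations are assumed to be acyclic.

```python
import random
from typing import Dict, List, Optional, Sequence


def random_feasible_list(activities: Sequence[int],
                         preds: Dict[int, List[int]],
                         rng: Optional[random.Random] = None) -> List[int]:
    """Build a random activity list that respects the precedence relations."""
    rng = rng or random.Random()
    remaining = set(activities)
    placed: set = set()
    order: List[int] = []
    while remaining:
        # Eligible: every predecessor has already been placed.
        # (With cyclic precedence relations this list would become empty.)
        eligible = sorted(a for a in remaining
                          if all(p in placed for p in preds.get(a, [])))
        choice = rng.choice(eligible)
        order.append(choice)
        placed.add(choice)
        remaining.remove(choice)
    return order


def inject_immigrants(population: Sequence[List[int]],
                      activities: Sequence[int],
                      preds: Dict[int, List[int]],
                      count: int,
                      rng: Optional[random.Random] = None) -> List[List[int]]:
    """Return the population extended by `count` fresh random individuals."""
    rng = rng or random.Random()
    newcomers = [random_feasible_list(activities, preds, rng)
                 for _ in range(count)]
    return list(population) + newcomers


activities = [1, 2, 3, 4, 5]
preds = {2: [1], 3: [1], 4: [2], 5: [3, 4]}
population = inject_immigrants([[1, 2, 3, 4, 5]], activities, preds,
                               count=3, rng=random.Random(1))
print(population)
```

Generating feasible lists directly avoids the repeated feasibility checks that make mutation alone a weak source of diversity, at the cost of extra decoding work for each newcomer.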

Heuristic method
Heuristic methods are commonly applied to such problems because they are simple and fast enough to be used commercially.
In this paper a two-stage heuristic method is introduced. In the first stage, it calculates, for each activity, the longest path from that activity to the project finishing time and rates the activities according to the following expression.
Note that a dummy job with zero processing time is assumed, which must be executed last and has $\alpha_j = 0$.
In the second stage, activities are scheduled based on the results of the first stage. As summarized below, the procedure starts with an empty scheduled list. It then lists the jobs available at the current time, that is, those whose precedence constraints are satisfied. The next step considers the resource constraints and schedules the available activities in rated order. If an activity requires more resources than are currently available, the algorithm simply checks the next activity. When no activity is feasible at the current time, the time is advanced to the next event. An event occurs whenever an activity ends and thereby changes the resource levels. This procedure is repeated until all planned activities are scheduled.
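The two stages can be sketched in Python as follows. Since the rating expression is not reproduced in the text, the sketch assumes that the rating of activity j is its duration plus the largest rating among its successors, which gives the dummy end job a rating of 0, and that activities with larger ratings are tried first. Function names and the example project are hypothetical.

```python
from typing import Dict, List


def longest_path_ratings(durations: Dict[int, int],
                         succs: Dict[int, List[int]]) -> Dict[int, int]:
    """Stage 1 (assumed form): alpha_j = d_j + max rating over successors."""
    alpha: Dict[int, int] = {}

    def rate(j: int) -> int:
        if j not in alpha:
            alpha[j] = durations[j] + max(
                (rate(s) for s in succs.get(j, [])), default=0)
        return alpha[j]

    for j in durations:
        rate(j)
    return alpha


def two_stage_schedule(durations, preds, demands, capacities) -> Dict[int, int]:
    """Stage 2: event-driven scheduling in order of decreasing rating."""
    succs: Dict[int, List[int]] = {j: [] for j in durations}
    for j, ps in preds.items():
        for p in ps:
            succs[p].append(j)
    alpha = longest_path_ratings(durations, succs)
    start: Dict[int, int] = {}
    finish: Dict[int, int] = {}
    t = 0
    while len(start) < len(durations):
        progressed = True
        while progressed:  # repeat so zero-duration jobs release successors
            progressed = False
            used = [0] * len(capacities)
            for a, s in start.items():  # resources held at time t
                if s <= t < finish[a]:
                    for k, r in enumerate(demands[a]):
                        used[k] += r
            available = [a for a in durations if a not in start and
                         all(p in finish and finish[p] <= t
                             for p in preds.get(a, []))]
            for a in sorted(available, key=lambda a: (-alpha[a], a)):
                if all(u + r <= c
                       for u, r, c in zip(used, demands[a], capacities)):
                    start[a], finish[a] = t, t + durations[a]
                    for k, r in enumerate(demands[a]):
                        used[k] += r
                    progressed = True
        if len(start) < len(durations):
            # Next event: the earliest finish time after t (raises ValueError
            # if an activity can never fit, e.g. demand above capacity).
            t = min(f for f in finish.values() if f > t)
    return start


durations = {1: 0, 2: 3, 3: 2, 4: 2, 5: 0}
preds = {2: [1], 3: [1], 4: [2], 5: [3, 4]}
demands = {1: [0], 2: [2], 3: [2], 4: [1], 5: [0]}
plan = two_stage_schedule(durations, preds, demands, [3])
print(plan)
```

In the example, activity 2 is rated highest among the first available jobs and is started first; activity 3 is skipped at time 0 for lack of resources and is started at the next event, when activity 2 ends.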

Pseudo-code 2: Crossover operator
Kolisch and Sprecher (1996) presented a set of benchmark instances for the evaluation of scheduling techniques for the RCPSP, called PSPLIB, which is accepted in the literature as the standard set of benchmark problems. A feature of PSPLIB is that the instances are classified according to various indicators. Both the GA approach and the proposed heuristic developed in this research were coded in C# and run on a PC with a Core i5 2.53 GHz CPU and 4 GB of RAM. Table 1 compares the proposed GA with the MR heuristic. Note that the GA reports the best solution found after 1000 s; consequently, runs whose reported running time exceeds 1000 s could not complete their search. Although the GA obtains better results, its running time rises dramatically as the number of activities increases. In this sense the MR heuristic has acceptable performance. To distinguish algorithms that converge faster, the average deviation is usually computed as follows,
$$\text{Ave. Deviation} = \frac{1}{n}\sum_{j=1}^{n}\frac{D_j - D}{D}\times 100$$
where $D_j$ is the best solution found up to the j-th iteration, n is the number of iterations, and D is the optimal solution or the best solution found.
Fig. 3 and Fig. 4 show the running time of the proposed GA and the proposed heuristic, respectively. Although both the GA and the MR heuristic practically solve the problem in exponential running time, the MR heuristic is capable of dealing with larger problems properly. Table 2 to Table 4 compare our proposed GA with other metaheuristics in the literature. Table 2 reports the results of our proposed method for J30 problems, where only Ranjbar's (2008) algorithm performs better than our GA when the number of generations is five thousand. Although our GA could achieve better performance, it consumes excessive running time to find feasible solutions and add new members to the population. Similarly, Table 3 shows the results associated with J60. Because optimal solutions are not known for problems with more than 30 activities, the results are a lower bound for the average deviation from the optimal solution. The PSO algorithm presented by Tchomte (2007) outperformed the other algorithms, followed by the proposed GA. For J90 problems, the filter and fan search method proposed by Ranjbar (2008) demonstrates the best performance. Table 5 presents the lower bound for the average deviation from the optimal solution for large instances, i.e. J120. We report the performance of all methods for N=1000 and N=5000.
Finally, Table 6 indicates that our proposed GA has the best performance for large-scale problems, and it is observed that producing a large number of individuals while keeping the population diverse can improve the performance of the algorithm for larger problems.
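As a small worked example of the average deviation measure, the sketch below assumes it is the mean percentage gap between the best makespan D_j found up to iteration j and the reference value D (the optimal or best known solution). This reading of the formula, the function name `average_deviation`, and the numbers are assumptions for illustration and are not results from the paper.

```python
from typing import Sequence


def average_deviation(best_so_far: Sequence[float], reference: float) -> float:
    """Mean percentage gap of the running best makespan from the reference."""
    n = len(best_so_far)
    total_gap = sum((d_j - reference) / reference for d_j in best_so_far)
    return 100.0 * total_gap / n


# Made-up running-best makespans over four iterations, reference value 40.
history = [50, 46, 44, 44]
deviation = average_deviation(history, 40)
print(round(deviation, 2))
```

Under this reading, an algorithm that reaches good solutions early accumulates small gaps over most iterations and therefore scores a lower average deviation than one that reaches the same final solution late.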

Conclusions
In this paper, we have proposed a new GA method to solve large-scale multiple resource-constrained project-scheduling problems. The proposed method takes advantage of GA and adopts special designs for the specific requirements of the problem, such as the representation of the chromosome and the precedence and resource constraints. We have compared the performance of the proposed method with other available methods in the literature using some well-known benchmark problems. The proposed GA designs the chromosome representation and the crossover and mutation operators so that they obey the precedence and resource constraints. This means that after crossover, the child solutions still satisfy the precedence constraints. This is a crucial issue for the RCPSP that many alternative methods could not handle. The preliminary results indicate that the proposed method performs relatively well compared with other existing methods.

Fig. 3. Proposed GA running time
Fig. 4. MR heuristic running time

Table 1
Comparison between proposed algorithms


Table 2
Average deviation from the optimal solution for J30

Table 3
Lower bound for average deviation from the optimal solution for J60

Table 5
Lower bound for average deviation from the optimal solution for J120