Exploring Hybrid Genetic Algorithm Based Large-Scale Logistics Distribution for BBG Supermarket

Abstract: In large-scale logistics distribution from a single logistics center, methods based on the traditional genetic algorithm evolve slowly and easily fall into local optima. To address this issue, we propose a hybrid genetic algorithm for large-scale logistics distribution at BBG Supermarket. We integrate a greedy algorithm and a hill-climbing algorithm into the genetic algorithm: the greedy algorithm initializes the population, and the hill-climbing algorithm optimizes individuals in each generation after selection, crossover, and mutation. Our approach is evaluated on a dataset from BBG Supermarket, one of the top 10 supermarkets in China. Experimental results show that our method achieves shorter best and average total mileage than the standard genetic algorithm, tabu search, and ant colony optimization.


Introduction
The core of logistics distribution is the vehicle routing problem (VRP). Since the VRP was introduced by Dantzig [1] in 1959, a large body of research has accumulated. With accelerating urbanization, the service area of cities has expanded, and the distribution of goods for offline O2O platforms, such as BBG's Commercial Chain Co., Ltd., China, has become a research hotspot in the field of large-scale coordinated city logistics [2].
The traditional methods for solving the VRP focus on dynamic programming or heuristic algorithms. However, traditional methods such as the tabu search algorithm (TS) and ant colony optimization (ACO) easily fall into local optima, and their average convergence rate is slow. Overcoming these two deficiencies is an urgent problem in logistics distribution.
To address this issue, we propose a hybrid genetic algorithm for large-scale logistics distribution at BBG Supermarket. The method uses a greedy algorithm to initialize the population and applies a hill-climbing operation to the best individual in each generation after selection, crossover, and mutation. Validation on real large-scale commercial logistics distribution data shows that this method speeds up the evolution of the population, helps avoid local optima, and greatly improves the quality of the solution.

Related Work
A growing number of algorithms have been proposed to solve the VRP. Kim et al. [3] solved the dynamic vehicle routing problem with a Markov decision process model and used approximate dynamic programming to avoid the curse of dimensionality. Li et al. [4][5] proposed a multivariate optimization algorithm and a heuristic intelligent path planning method using Bezier curves; however, the search radius must be determined manually, and the efficiency and stability of the solution are difficult to guarantee. Li et al. [6][7] proposed a dual-chromosome genetic algorithm that improves solution performance through hill-climbing (HC) and simulated annealing (SA); however, compared with single-chromosome coding, it greatly enlarges the solution space, resulting in slow evolution. Podgorelec et al. [8] proposed a genetic algorithm for vehicle routing with multiple vehicles, focusing on the design of different selection, crossover, and mutation operators, but population evolution slows down as the problem scale increases. Lang et al. [9][10] designed different improvements to keep the genetic algorithm from converging prematurely, but the problem of slow convergence remains. Zhang et al. [11] improved the crossover operator and used large mutation operations to speed up the search and avoid local optima, but the complexity of the algorithm increased. Zhou et al. [12] improved the genetic algorithm by introducing the niche technique to enhance population diversity and global optimization ability. Zhang et al. [13] constructed a robust optimization method for the vehicle routing problem to reduce the influence of uncertainty factors on the logistics distribution system. Ning et al. [14] designed a multi-objective optimization model for logistics distribution disruption management and a user sensitivity decision model based on prospect theory for different types of disruption problems. Liao et al. [15] established an optimized mathematical model for logistics vehicle scheduling and solved it in two stages, thus constructing a hybrid genetic algorithm. With the development of trajectory data mining [16][17], spatiotemporal models [18][19][20], vehicle route recommendation [21][22], and future network technology [23], solution methods for the VRP are becoming more and more diverse.

The General Vehicle Routing Model
The VRP studied in this paper can be described as follows. Goods are delivered to customers by vehicles from a distribution center. There are $m$ vehicles in the distribution center; the maximum load capacity of vehicle $k$ is $Q_k$ and its maximum mileage is $L_k$ $(k = 1, 2, \ldots, m)$. The number of customers is $n$, and the demand of customer $i$ is $q_i$ $(i = 1, 2, \ldots, n)$. The distribution center is numbered 0, and $d_{ij}$ is the distance from customer $i$ to customer $j$. Under these constraints, the goal is to find vehicle routes with the shortest total length that satisfy the customers' needs. The following conditions must be satisfied:
1. There is only one distribution center, which is the starting point and the end point of each route. Each vehicle departs from the distribution center, completes its delivery task, and finally returns to the distribution center.
2. The sum of the customers' demands on each distribution route does not exceed the maximum loading capacity of the corresponding delivery vehicle.
3. The total length of each distribution route does not exceed the maximum mileage of the corresponding delivery vehicle.
Eq. (1) is the objective function, which minimizes the total mileage. The constraints are, in order: the maximum load constraint for each vehicle; the maximum mileage constraint for each vehicle; the customer constraints, which assign customers to vehicle $k$ while ensuring the vehicle's travel route; and the ranges of values for the variables $x_{ijk}$ and $y_{ik}$.
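For reference, a standard three-index VRP formulation that matches the symbols defined above can be written as follows. This is a sketch under assumptions, not necessarily the exact form of Eq. (1) and its constraints: it assumes $x_{ijk} = 1$ when vehicle $k$ travels directly from node $i$ to node $j$ and $y_{ik} = 1$ when customer $i$ is served by vehicle $k$, and it omits subtour-elimination constraints.

```latex
\begin{align*}
\min Z &= \sum_{k=1}^{m} \sum_{i=0}^{n} \sum_{j=0}^{n} d_{ij}\, x_{ijk}
  && \text{(total mileage)} \\
\sum_{i=1}^{n} q_i\, y_{ik} &\le Q_k, \quad k = 1, \ldots, m
  && \text{(maximum load)} \\
\sum_{i=0}^{n} \sum_{j=0}^{n} d_{ij}\, x_{ijk} &\le L_k, \quad k = 1, \ldots, m
  && \text{(maximum mileage)} \\
\sum_{k=1}^{m} y_{ik} &= 1, \quad i = 1, \ldots, n
  && \text{(each customer on one vehicle)} \\
\sum_{i=0}^{n} x_{ijk} &= y_{jk}, \quad j = 1, \ldots, n,\; k = 1, \ldots, m
  && \text{(one arrival at each served customer)} \\
\sum_{j=0}^{n} x_{ijk} &= y_{ik}, \quad i = 1, \ldots, n,\; k = 1, \ldots, m
  && \text{(one departure from each served customer)} \\
x_{ijk} \in \{0, 1\}, &\quad y_{ik} \in \{0, 1\}
  && \text{(variable ranges)}
\end{align*}
```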

Hybrid Genetic Algorithm Based Large-Scale Logistics Distribution
The genetic algorithm simulates the idea of survival of the fittest from Darwin's theory of evolution and, since it was introduced, has been widely used in image processing, machine learning, combinatorial optimization, industrial design, and other fields. As a global optimization algorithm, it is characterized by simplicity, robustness, and versatility. At the same time, genetic algorithms are prone to slow evolution and easily fall into local optima. Thus, this research designs a hybrid genetic algorithm (HCGAG) to solve the VRP; the details of HCGAG are shown in Algorithm 1.

Algorithm 1. Hybrid Genetic Algorithm
As the pseudocode of Algorithm 1 shows, the hybrid genetic algorithm is based on the standard genetic algorithm, with the greedy algorithm and the hill-climbing operator added to it. The specific implementation details are as follows.
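The overall loop described in the text (greedy initialization, then selection, crossover, mutation, and hill-climbing of the best individual in every generation) can be sketched as a small skeleton. This is an illustrative sketch, not Algorithm 1 itself: the function name `hybrid_ga`, the callable parameters, and the pairing of consecutive individuals for crossover are all assumptions.

```python
import random
from typing import Callable, List, Tuple

Individual = List[int]


def hybrid_ga(init_population: Callable[[], List[Individual]],
              fitness: Callable[[Individual], float],
              select: Callable[[List[Individual]], List[Individual]],
              crossover: Callable[[Individual, Individual],
                                  Tuple[Individual, Individual]],
              mutate: Callable[[Individual], Individual],
              hill_climb: Callable[[Individual], Individual],
              generations: int,
              crossover_rate: float,
              rng: random.Random) -> Individual:
    """GA skeleton with seeded initialization and hill-climbing of the best individual."""
    population = init_population()  # e.g. greedy nearest-neighbour seeding
    for _ in range(generations):
        population = select(population)
        offspring: List[Individual] = []
        # Assumption: consecutive individuals are paired for crossover.
        for a, b in zip(population[0::2], population[1::2]):
            if rng.random() < crossover_rate:
                a, b = crossover(a, b)
            offspring.extend([a, b])
        if len(population) % 2 == 1:  # odd population: carry the last one over
            offspring.append(population[-1])
        # The mutation probability is assumed to be handled inside `mutate`.
        population = [mutate(ind) for ind in offspring]
        # Hill-climb only the best individual of this generation.
        best_idx = max(range(len(population)),
                       key=lambda i: fitness(population[i]))
        population[best_idx] = hill_climb(population[best_idx])
    return max(population, key=fitness)
```

Passing the operators in as callables keeps the skeleton independent of any particular encoding, so each operator described in the following subsections can be plugged in separately.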

The Design of Coding
Using natural number coding [24], 0 denotes the distribution center and $1, 2, \ldots, n$ denote the customers. Routes are then planned according to the loading rate of the vehicle and the maximum mileage. For example, for a solution with 10 customer codes (Fig. 1), the resulting distribution scheme is: Route 1: 0 4 7 1 0; Route 2: 0 8 6 2 0; Route 3: 0 9 3 10 5 0.
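One way to turn such a customer sequence into routes is to fill each vehicle in order and start a new route when the next customer would violate a limit. The sketch below assumes identical vehicles; `decode_routes` and its parameters are hypothetical names, and the rule of checking the return leg to the depot before adding a customer is an assumption rather than a detail given in the text.

```python
from typing import List, Sequence


def decode_routes(chromosome: Sequence[int],
                  demand: Sequence[float],
                  dist: Sequence[Sequence[float]],
                  capacity: float,
                  max_mileage: float) -> List[List[int]]:
    """Split a customer permutation into depot-to-depot routes (depot is 0)."""
    routes: List[List[int]] = []
    route, load, length = [0], 0.0, 0.0
    for c in chromosome:
        last = route[-1]
        new_load = load + demand[c]
        # Mileage if c is appended and the vehicle then returns to the depot.
        new_total = length + dist[last][c] + dist[c][0]
        if len(route) > 1 and (new_load > capacity or new_total > max_mileage):
            # Close the current route and start a new one from the depot.
            routes.append(route + [0])
            route, load, length = [0], 0.0, 0.0
            last = 0
            new_load = demand[c]
        route.append(c)
        load = new_load
        length += dist[last][c]
    if len(route) > 1:
        routes.append(route + [0])
    return routes
```

A customer that exceeds a limit even on its own route is still placed alone on a route; such infeasible plans are left to the penalty term of the fitness function.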

GA Initialization with Greedy
In this paper, the greedy algorithm [25] is used to produce the individuals of the initial population. This method exploits the local optimization ability of the greedy algorithm, which not only guarantees the diversity of the initial population but also accelerates the optimization. For a population of $N$ individuals, each with $n$ genes, the greedy initialization proceeds as follows:
Step 1. Determine whether the number of individuals in the current initial population equals $N$. If yes, end; if not, go to Step 2.
Step 2. Randomly generate a number between 1 and $n$ as the current customer $C_c$ and add it to the individual. Then find the customer that is not yet in the individual and is closest to $C_c$, add it to the individual, and make it the current customer. Continue searching for the next nearest customer until all customers have joined the individual.
Step 3. Check whether the individual obtained in Step 2 is already in the population. If yes, go to Step 2; if not, add the individual to the population and go to Step 1.
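The three steps above amount to a nearest-neighbour construction from a random starting customer, repeated until the population is full. A minimal sketch follows; the function names are hypothetical, ties between equally close customers are broken by customer index (an assumption), and the population size is required not to exceed $n$ because each starting customer yields only one distinct individual under this tie rule.

```python
import random
from typing import List, Sequence


def greedy_individual(start: int, dist: Sequence[Sequence[float]]) -> List[int]:
    """Nearest-neighbour ordering of customers 1..n, beginning at `start`."""
    n = len(dist) - 1  # index 0 is the distribution center
    unvisited = set(range(1, n + 1)) - {start}
    tour, current = [start], start
    while unvisited:
        # Closest unvisited customer; ties broken by customer index.
        nearest = min(unvisited, key=lambda c: (dist[current][c], c))
        tour.append(nearest)
        unvisited.remove(nearest)
        current = nearest
    return tour


def greedy_population(size: int,
                      dist: Sequence[Sequence[float]],
                      rng: random.Random) -> List[List[int]]:
    """Build `size` distinct greedy individuals (Steps 1 to 3)."""
    n = len(dist) - 1
    if size > n:
        raise ValueError("at most n distinct greedy individuals exist")
    population: List[List[int]] = []
    while len(population) < size:                       # Step 1
        individual = greedy_individual(rng.randint(1, n), dist)  # Step 2
        if individual not in population:                # Step 3
            population.append(individual)
    return population
```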

Select Operator
The selection operator is based on individual fitness: individuals with higher fitness are more likely to be selected into the next generation. In this method, the fitness function of an individual (Eq. (1)) represents the total length of the routing plan for the logistics distribution. Here, $M$ represents the amount by which the total mileage of the current routing exceeds the maximum mileage of the selected vehicles. If $M$ is smaller than 0, it is set to 0, which means the current routing is feasible. If $M$ is greater than 0, the current routing is infeasible and is penalized.
Here, $p_w$ represents the penalty weight for each infeasible routing.
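A penalized fitness of this kind can be sketched as follows. The exact formula is not given in the text, so two choices here are assumptions: the penalty is the penalty weight multiplied by the excess mileage $M$ summed over routes, and the fitness is the reciprocal of the penalized cost so that shorter plans score higher. The function names are hypothetical.

```python
from typing import List, Sequence


def route_length(route: Sequence[int], dist: Sequence[Sequence[float]]) -> float:
    """Length of one depot-to-depot route."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


def penalized_cost(routes: List[List[int]],
                   dist: Sequence[Sequence[float]],
                   max_mileage: float,
                   penalty_weight: float) -> float:
    """Total mileage plus penalty_weight times the excess mileage M."""
    total, excess = 0.0, 0.0
    for route in routes:
        length = route_length(route, dist)
        total += length
        excess += max(0.0, length - max_mileage)  # negative M is clamped to 0
    return total + penalty_weight * excess


def fitness(routes: List[List[int]],
            dist: Sequence[Sequence[float]],
            max_mileage: float,
            penalty_weight: float) -> float:
    """Assumed reciprocal form: a lower penalized cost gives a higher fitness."""
    cost = penalized_cost(routes, dist, max_mileage, penalty_weight)
    if cost <= 0.0:
        raise ValueError("cost must be positive")
    return 1.0 / cost
```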
Step 1. Calculate the fitness value $f_i$ of each individual in the population and its proportion $p_i = f_i / \sum_{j} f_j$.
Step 2. According to the optimal individual retention strategy, the individual with the highest fitness value enters the population of the next generation directly.
Step 3. Determine whether the population of the next generation is full. If yes, go to Step 5; if not, go to Step 4.
Step 4. Randomly generate a decimal $r$ in (0.0, 1.0). If $p_1 + \cdots + p_{i-1} < r \le p_1 + \cdots + p_i$, insert individual $i$ into the next-generation population. Then go to Step 3.
Step 5. End the selection.
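These steps describe roulette-wheel selection with elitism. A minimal sketch, assuming positive fitness values and a single retained elite by default (`select` and `elite_count` are hypothetical names):

```python
import random
from bisect import bisect_left
from itertools import accumulate
from typing import List, Sequence


def select(population: Sequence[List[int]],
           fitnesses: Sequence[float],
           rng: random.Random,
           elite_count: int = 1) -> List[List[int]]:
    """Elitist roulette-wheel selection that keeps the population size."""
    size = len(population)
    total = sum(fitnesses)
    if total <= 0.0:
        raise ValueError("fitness values must sum to a positive number")
    # Step 2: the fittest individuals go straight to the next generation.
    ranked = sorted(range(size), key=lambda i: fitnesses[i], reverse=True)
    next_gen = [list(population[i]) for i in ranked[:elite_count]]
    # Step 1: cumulative proportions p_1, p_1 + p_2, ...
    cumulative = list(accumulate(f / total for f in fitnesses))
    while len(next_gen) < size:                 # Step 3
        r = rng.random()                        # Step 4
        # First index whose cumulative proportion is >= r; the clamp guards
        # against floating-point round-off in the last cumulative value.
        i = min(bisect_left(cumulative, r), size - 1)
        next_gen.append(list(population[i]))
    return next_gen                             # Step 5
```

Copying each selected individual avoids aliasing when the same parent is drawn more than once and later modified by crossover or mutation.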

Crossover Operator
The crossover operator randomly selects one substring in each parent individual and imposes it on the other parent to produce offspring individuals. The crossover procedure is illustrated in Fig. 2. First, two individuals are selected as the parent chromosomes, and a gene substring (shaded in the figure) is randomly chosen from each of them. Next, the selected substring of chromosome 1 is appended to the rear of chromosome 2, and the selected substring of chromosome 2 is appended to the rear of chromosome 1. After the substrings are crossed, the duplicated genes in the transitional chromosomes are deleted, which yields two offspring chromosomes. Compared with other crossover methods, this method can produce a certain degree of mutation while preserving part of the parents' chromosome segments.
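The procedure can be sketched as below. One detail is an assumption: when duplicates are removed, the copy inside the receiving chromosome is deleted and the appended substring is kept at the rear, since deleting the appended copy would simply reproduce the parent. The helper names `graft` and `random_substring` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def graft(receiver: Sequence[int], substring: Sequence[int]) -> List[int]:
    """Append `substring` to `receiver` and drop the receiver's duplicate genes."""
    moved = set(substring)
    return [g for g in receiver if g not in moved] + list(substring)


def random_substring(parent: Sequence[int], rng: random.Random) -> List[int]:
    """Pick a random non-empty contiguous substring of `parent`."""
    i, j = sorted(rng.sample(range(len(parent) + 1), 2))
    return list(parent[i:j])


def crossover(parent1: Sequence[int],
              parent2: Sequence[int],
              rng: random.Random) -> Tuple[List[int], List[int]]:
    """Exchange one random substring between two permutation chromosomes."""
    sub1 = random_substring(parent1, rng)
    sub2 = random_substring(parent2, rng)
    child1 = graft(parent1, sub2)  # substring of chromosome 2 at the rear of 1
    child2 = graft(parent2, sub1)  # substring of chromosome 1 at the rear of 2
    return child1, child2
```

Because each child keeps every gene exactly once, both offspring remain valid customer permutations.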

Mutation Operator
To ensure the diversity of individuals in the population, a mutation operator is introduced. The mutation probability $P_m$ determines whether the mutation operation is applied to an offspring generated by the crossover operator. In this paper, an individual selected for mutation undergoes $J$ gene swaps. The mutation operator is illustrated in Fig. 3. For example, if $J$ equals 2, the current chromosome is mutated by swapping its genes at different locations twice.
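A minimal sketch of this swap mutation, assuming each swap picks two distinct positions uniformly at random (`mutate` and its parameter names are hypothetical):

```python
import random
from typing import List, Sequence


def mutate(individual: Sequence[int],
           swaps: int,
           mutation_rate: float,
           rng: random.Random) -> List[int]:
    """With probability `mutation_rate`, apply `swaps` random gene swaps."""
    child = list(individual)  # leave the input untouched
    if rng.random() < mutation_rate:
        for _ in range(swaps):
            i, j = rng.sample(range(len(child)), 2)  # two distinct positions
            child[i], child[j] = child[j], child[i]
    return child
```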

Hill-Climbing Operator
This paper uses the hill-climbing operator [6] to perform hill-climbing on the optimal individual of each generation. While preserving the global search ability of the genetic algorithm, the method exploits the local search ability of the hill-climbing algorithm, which accelerates convergence and improves the ability to escape local optima. The specific steps are as follows:
Step 1. Randomly select two genes in the optimal individual $C_{best}$ and exchange their positions to generate a new individual $C'_{best}$.
Step 2. Calculate the fitness $P'_{best}$ of the individual $C'_{best}$ after transposition and the fitness $P_{best}$ of the original individual $C_{best}$. If $P'_{best} > P_{best}$, then $C_{best} = C'_{best}$; otherwise, keep $C_{best}$.
Step 3. Determine whether the maximum number of climbs has been reached. If yes, the climb is over; if not, go to Step 1.
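The three steps translate directly into a short loop. In this sketch the fitness function is passed in as a callable and higher values are taken to be better; `hill_climb` and its parameter names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def hill_climb(best: Sequence[int],
               fitness: Callable[[List[int]], float],
               max_climbs: int,
               rng: random.Random) -> List[int]:
    """Swap-based hill-climbing that only accepts strict improvements."""
    current = list(best)
    current_fit = fitness(current)
    for _ in range(max_climbs):                      # Step 3
        i, j = rng.sample(range(len(current)), 2)    # Step 1
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_fit = fitness(candidate)           # Step 2
        if candidate_fit > current_fit:
            current, current_fit = candidate, candidate_fit
    return current
```

Since a swap is kept only when it raises the fitness, the returned individual is never worse than the one passed in.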

Experiment 1: Comparison with Other Genetic Algorithms
The simulation experiment was run on an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz with 8 GB of memory under the 64-bit Windows 10 operating system, using the jdk1.8.0 programming environment. The algorithm parameters are set as follows: the population size is 20, the crossover rate is 0.9, the mutation rate is 0.09, the number of gene transpositions is 5, the fitness function penalty weight is 1000 km, the maximum number of iterations is 2000, and the number of hill-climbing steps is 20.
In this research, the actual distribution data are provided by the BBG commercial logistics management system; BBG is one of the top 10 commercial chain companies in China. The rated volume of a distribution vehicle is 7.3 m³, the maximum loading rate is 0.85, and the maximum mileage is 600 km. Tab. 1 shows the number of customers and the quantity of goods delivered at different scales. The experiment plans the distribution routes separately with the standard genetic algorithm (SGA), the genetic algorithm that uses the greedy algorithm to produce the initial population (GAG), and the hybrid genetic algorithm (HCGAG). Because the results of a genetic algorithm vary between runs, each algorithm was run about 10 times. Our proposed method (HCGAG) has very good practical application value in BBG's large-scale logistics distribution. Tab. 2 compares the experimental data of SGA, GAG, and HCGAG for the distribution path. The column labels Best, Worst, Avg, and Time denote the path length of the best solution, the path length of the worst solution, the mean path length, and the average solution time, respectively. It can be seen from Tab. 2 that the best, worst, and average results of HCGAG are better than those of GAG and SGA, but the average time used by HCGAG is higher. The HCGAG algorithm thus brings a significant performance improvement at the cost of only a small amount of additional computing time. This time cost can be easily offset by upgrading computer hardware, which makes sense in business practice. We further analyze the advantages of the HCGAG algorithm through graphical demonstrations. Fig. 4 compares the average mileage, and Fig. 5 compares the average running time of the three algorithms. It can be seen from them that the quality of the solution is greatly improved with little increase in time consumption: the population is initialized by the greedy algorithm on top of the standard genetic algorithm, and the optimal individual in each generation is then hill-climbed, which improves the total mileage compared with the original algorithm.
Figs. 6 to 11 compare the convergence of mileage for SGA, GAG, and HCGAG in the 6 instances as the number of iterations increases. In these figures, the convergence of SGA is unstable in all 6 instances and significantly inferior to that of GAG and HCGAG. GAG and HCGAG converge to a similar degree at first. However, after the convergence of GAG levels off, the total mileage of HCGAG continues to converge toward the optimal value as the number of iterations increases.

Experiment 2: Comparison with Other Intelligent Optimization Algorithms
To further validate the effect of the proposed hybrid genetic algorithm on large-scale logistics distribution problems, the tabu search algorithm (TS), ant colony optimization (ACO), and the hybrid genetic algorithm (HCGAG) were each run about 10 times to solve the delivery paths for the 188 customers in instance 2 of Tab. 1. TS and ACO are both intelligent optimization algorithms that are commonly used to solve the VRP. The experimental results are shown in Tab. 3. It can be seen from Tab. 3 that the HCGAG algorithm not only has the least computing time but also performs best on the best solution and the average solution. Although the ACO algorithm performs best on the worst solution, it costs much more computing time than HCGAG. Fig. 12 shows the convergence of mileage for the tabu search algorithm, ant colony optimization, and the hybrid genetic algorithm. It is clear from the figure that the proposed hybrid genetic algorithm converges faster than the tabu search algorithm and ant colony optimization. Furthermore, the final delivery mileage of HCGAG is clearly shorter than that of the other two algorithms. In sum, our proposed algorithm performs better than the other two intelligent optimization algorithms in logistics vehicle routing.

Conclusions
This paper focused on the large-scale logistics distribution problem of a single logistics center and proposed a hybrid genetic algorithm. The method initializes the population with the greedy algorithm and hill-climbs the optimal individual in each generation after selection, crossover, and mutation. The method was validated on the actual dataset provided by the BBG commercial logistics management system. In both experiments, the proposed method achieved shorter best and average total distribution mileage than the compared methods. The convergence analysis shows that the proposed method speeds up the evolution of the genetic algorithm and helps it escape local optima. These results indicate that the proposed method has practical significance for solving large-scale logistics distribution problems in a single logistics center.

Conflicts of Interest:
The authors declare no conflict of interest.