Expectation-Maximization Algorithm of Gaussian Mixture Model for Vehicle-Commodity Matching in Logistics Supply Chain



Introduction
With the development of Internet technology, a series of new solutions and adjustments have been made to the layout of the logistics industry. The nationwide distribution centers of producers have gradually given way to personalized orders, which drives the upgrading of logistics services and changes the whole logistics system. The data of the logistics system and customer transactions can track the execution of order fulfillment from order allocation to order delivery on the Internet platform, so as to achieve collaborative management, service efficiency, and system cost control [1]. Firms need to balance order service, logistics capacity, vehicle arrangement, inventory control, distribution cost, and so on, in order to form an orderly and uniform operation process [2]. By mode of transportation, China's logistics can be divided into highway, railway, water transport, aviation, and pipeline. E-commerce enterprises have limited physical resources to support offline services (e.g., vehicles and channels), so it is difficult for them to deal with abnormal situations in the transportation process [3], such as product deterioration caused by a mismatch between chilled or fresh products and refrigerated vehicles. Matching vehicles and commodities improves the service of logistics carriers as a whole, which in turn affects consumers' evaluation of and feedback on the quality of commodities [4]. The vehicle-commodity matching problem (VCMP) has therefore attracted wide attention from logistics companies and e-commerce enterprises.
Logistics system modeling and simulation technology are feasible and reliable means of finding optimal solutions for order allocation and logistics distribution [5-7]. Yang et al. [5] formulated a cooperative rich vehicle routing problem (CoRVRP) for three typical logistics providers in the last-mile rural logistics system and solved it with a new branch-price-and-cut algorithm. Patidar et al. [6] designed vehicle routes for the collection of agrifood products from farmers to the market in India based on GA and PSO. Goswami et al. [7] developed a Bayesian game-theoretic framework for the product portfolio planning problem to find the right product portfolio set for manufacturers, considering the duopolistic market and the product type. In terms of the problems addressed, existing research has paid attention to the optimization of the logistics service process and the balance of interests of logistics service providers, but there is a gap in the matching subprocess of micrologistics service. In particular, vehicles are assets of enterprises, while goods represent the needs of consumers, so the matching process is a way to improve service by making the connection between service links more efficient and personalized.
Planning distribution resources (e.g., vehicles and commodities) with intelligent algorithms and heuristic methods has become a challenge of great concern to scholars [8-11]. Regarding evolutionary algorithms, in the research of Garg [9], particle swarm optimization (PSO) operates in the direction of improving the vector, and the genetic algorithm (GA) modifies the decision vectors using genetic operators. Garg [8] proposed an algorithm that combines the gravitational search method with genetic operators to upgrade solutions by selection, crossover, and mutation. According to Garg's research, the initialization of parameters can influence the design of the whole algorithm to a large extent. Therefore, our research structures a prearrangement model based on the Gaussian mixture model (GMM) and expectation-maximization (EM), so that the input parameters of the algorithm are stable.
Overall, our study makes three main contributions. First, VCMP is proposed to upgrade the service of logistics providers by matching vehicles and commodities. In contrast to the global perspective of logistics system optimization, our research focuses on the service details of matching. Second, our model is based on the standard GMM to optimize the input of the designed algorithm, forming a joint GMM-EM prearrangement model. Third, the GA algorithm is stable and can converge to the same solution repeatedly [8]. The PSO algorithm ranks second in convergence speed, but it is not stable, and its final convergence result is easily affected by parameter size and initial population [9]. The convergence speed of the EA algorithm is relatively slow, but EA handles noise well, whereas the GA algorithm has difficulty with this kind of noise problem.
This research presents a novel expectation-maximization (EM) algorithm based on GMM for VCMP in the logistics supply chain. The remainder of this paper is organized as follows. Section 2 reviews the literature and divides the application scenarios of VCMP into two categories. Section 3 describes VCMP as a binary complete digraph. On the basis of Section 3, Section 4 proposes a microoptimization model of VCMP. Section 5 designs the algorithm for VCMP in order to improve the delivery speed with stable service quality. In Section 6, a numerical example is applied to the proposed models and algorithm. The conclusions and future research directions are presented in Section 7.

Literature Review
The application scenarios of VCMP in the logistics industry can be divided into two categories. Scenario I. Logistics companies and production enterprises have opened their own capacity pool systems. Learning from the research results of traditional industries helps to meet industry challenges in the era of networking. Ren et al. [3] optimized capacity allocation for cross-border e-commerce related 3PFL operations. Moghaddam [12] constructed a fuzzy multiobjective mathematical model for uncertain customer order demand in the reverse logistics system, considering the supplier's ability and the proportion of returned products. Cont et al. [13] studied the order allocation model under stochastic dynamic constraints in the financial market and used matrix calculation and the Laplace transform to calculate the probability of effective order allocation. Building on Cont et al., Bayraktar and Ludkovski [14] considered the problem of optimally clearing limit orders and modeled the arrival of orders as dependent on price strength. Mafakheri et al. [15] proposed a two-stage dynamic planning method for supply chain management to solve the problem of multisupplier ranking and then introduced the supplier parameters into an order allocation model to maximize the utility of the company. Kannan et al. [16] further expanded the research of Mafakheri to obtain a set of systematic methods. Azadnia et al. [17] proposed a comprehensive method based on a rule-weighted fuzzy algorithm, combined with a multistage approach. The fuzzy analytic hierarchy process and multiobjective mathematical programming were used to solve the multiproduct batch problem. Past research has mainly addressed the optimization of different logistics services and the balance of interests of service providers, but it lacks a matching subprocess of logistics service. Our research focuses on this gap and seeks a matching mechanism between vehicle service and commodity service, so that a compact and personalized service process is designed and provided to consumers.
Scenario II. The public platform gathers social capacity resources with vehicle matching software. Research that uses intelligent algorithms to solve the integration problem of order and logistics distribution is becoming increasingly prominent. Dávid and Krész [4] introduced the schedule assignment problem for public transit in the fleet of a transportation company. Torfi et al. [18] used FMCDM to determine the weights of multiple objectives in the location path problem and found the route from DCs to customers that minimizes the total distribution network cost. Marinakis [19], for example, proposed an improved particle swarm optimization algorithm for the discrete optimization of the location path with random demand.
Heuristic methods can effectively deal with the problem of order allocation and location selection. Macedo et al. [20] proposed a metaheuristic algorithm for system exploration based on different neighborhood structures, which decomposes the integration problem into two subproblems, that is, the vehicle routing problem and the location problem, so as to ensure shorter order processing time. The order allocation problem is becoming more systematic and more integrated with downstream links such as path planning and inventory allocation. Foreman et al. [21] studied the supply optimization of Dell's transportation network based on order components. Yue et al. [22] found that manufacturing enterprises can calculate all possible combinations of total cost and punctuality probability by order.
Therefore, the portfolio method can not only ensure a low cost in the manufacturer's order purchase process but also meet the customer's requirements without failure. In recent years, several studies [23-26] have examined supplier selection for decision-makers, considering the price, the quality of purchased parts, the reliability of on-time delivery, and the risk factors of delayed delivery. According to Ren and Croson [27], different decision-makers often make suboptimal decisions in the face of changing supply chains and inventory allocation. Pan et al. [28] established a multiobjective linear order allocation model for information service enterprises with the objective of minimizing the discount cost, taking into account factors such as capacity and price. Hall et al. [29] found that, in a multiproduct supply chain, manufacturers receive orders from several distributors. If the available production capacity cannot meet all orders, distributors need to plan the distribution of capacity in advance before the order is reallocated. Garg [9] designed particle swarm optimization (PSO) for improving the vector and the genetic algorithm (GA) for modifying the decision vectors. Garg [8] proposed the gravitational search method and genetic operators to upgrade solutions by selection, crossover, and mutation. According to existing research, the optimization of the algorithm depends on the model design and structure. Therefore, our research structures GMM-EM as the prearrangement so that the input parameters are suitable for the proposed hybrid evolutionary algorithm.
To sum up, research so far shows that order allocation in the logistics system is often studied in combination with other classical service processes. There are three gaps in the existing research. Firstly, not enough integrated service factors, such as order allocation, vehicle service, and personalized service, are considered in the logistics system. Secondly, most studies ignore the role of the matching problem in the whole distribution process. Thirdly, the research and application of heuristic methods need to be improved so that they combine better with the model. This paper aims at the cost optimization of the whole logistics operation process of e-commerce enterprises from order allocation to order delivery. VCMP is presented for commodities and vehicles to reduce the cost of the logistics system. GMM-EM is designed to solve the parameter estimation and thereby optimize the algorithm.
VCMP is designed to reduce and eliminate the information asymmetry between supply and demand, so that order allocation can work at the right time and in the right place and use the optimal solution of vehicle-commodity matching.

Vehicle-and-Commodity-Matching Problem (VCMP)
VCMP is described as a binary complete digraph in which the node set is $V$ and the arc set is $E$. Vehicles with different service levels go back and forth between the warehouses and consumers to fulfill orders. Therefore, the vehicles must start from the assigned warehouse and choose an appropriate order distribution path. The nonnegative weight $W_{ij}$ of each arc $(i, j) \in E$ is the transportation cost between the warehouses and the consumers. The variables' descriptions are shown in Table 1.
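The digraph description can be made concrete with a small sketch. It assumes, purely for illustration, that every node has planar coordinates and that the arc weight is the Euclidean distance; the function name `complete_digraph` and the node ids are hypothetical and do not come from the paper.

```python
import math

def complete_digraph(nodes):
    """Build a complete digraph over warehouses and consumers.

    nodes maps a node id to (x, y) coordinates. The result maps every
    ordered pair (i, j) with i != j to a nonnegative weight W_ij, here
    assumed to be the Euclidean distance between the two nodes.
    """
    return {
        (i, j): math.dist(nodes[i], nodes[j])
        for i in nodes
        for j in nodes
        if i != j
    }

# One warehouse and two consumers give 3 * 2 = 6 directed arcs.
nodes = {"w1": (0, 0), "c1": (3, 4), "c2": (0, 1)}
arcs = complete_digraph(nodes)
```

In practice the weights would come from the enterprise's cost data rather than from geometry; the distance is only a stand-in.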
As shown in Figure 1, the matching process of VCMP is divided into four parts: customers, orders, vehicles, and warehouses. Firstly, customers place orders on the Internet platform, and the e-commerce enterprise serves the orders, forming feedback information on order delivery. Customers' order information is divided into a sequential order flow from order 1 to N. Secondly, an order allocation process forms a commodity flow corresponding to the order sequence. The matching step is then started for commodities and vehicles. The matching process invokes GMM-EM as a pretreatment. Thirdly, the path selection is computed from the initial location of the vehicle and the fixed location of the warehouse, supported by a new improved evolutionary algorithm. Fourthly, the selected warehouse is identified to generate the order fulfillment solution for the customer.
The matching process of VCMP in Figure 1 is described by order allocation, matching, and path out. Finally, logistics and delivery are enabled for the Internet platform, and the execution information is fed back.

Modeling
There are many factors to be considered in the integration of the logistics system for order allocation in multiple warehouses [30]. The logical relationship framework is shown in Figure 2. A system optimization model is proposed.
In Figure 2, the matching problem of VCMP is based on the data set of the enterprise. There are five kinds of data in the matching database, which are analyzed and mined to form a matching scheme. The model is built on the rationality of implementation in the sets of order, path, warehouse, and vehicle. In the process of modeling, capacity constraints are considered from two aspects, warehouse and vehicle. For the customer, the model considers the requirement of service time because logistics time is an important factor in measuring service standards. Figure 2 summarizes the constraints that we need to consider during the modeling process.

Table 1: Descriptions of the variables.
Unit fixed cost of warehouse
$y_{ij}^{q}$: 0-1 variable; 1 means that vehicle $q$ travels from $i$ to $j$; otherwise, it is 0
$H_{q}$: Unit fixed cost of freight vehicles
$W_{ij}^{q}$: The variable cost of the transportation process from $i$ to $j$
$\alpha_{ip}$: 0-1 variable; 1 means that order $i$ is assigned to warehouse $p$; otherwise, it is 0
$\beta_{ij}^{q}$: The load of vehicle $q$ from node $i$ to node $j$
The service start time for vehicle $q$ from node $i$ to $j$ under the time window constraint, $t_i \in t_i^{q}$
$S_p$: The upper limit load of warehouse

The total cost (TC) of the distribution process includes three parts: the fixed cost of distribution warehouses, the fixed cost of freight vehicles, and the variable cost of the transportation process. The objective function of the integration problem is given by equation (1).
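The three-part cost structure can be illustrated with a short sketch. It is not the paper's equation (1), whose exact form is not available here. It assumes that a warehouse's fixed cost is charged once when the warehouse is open and that a vehicle's fixed cost is charged once when the vehicle travels at least one arc. The names `total_cost`, `warehouse_fixed`, `vehicle_fixed`, `arc_cost`, and `routes` are hypothetical.

```python
def total_cost(open_warehouses, warehouse_fixed, vehicle_fixed, arc_cost, routes):
    """Sum the three cost parts of a candidate distribution plan.

    routes maps a vehicle id q to the list of arcs (i, j) it travels,
    that is, the arcs whose 0-1 routing variable equals 1.
    """
    # Part 1: fixed cost of every open warehouse.
    warehouse_part = sum(warehouse_fixed[p] for p in open_warehouses)
    # Part 2: fixed cost of every vehicle that is actually used (assumption).
    used = [q for q, arcs in routes.items() if arcs]
    vehicle_part = sum(vehicle_fixed[q] for q in used)
    # Part 3: variable transportation cost over all travelled arcs.
    transport_part = sum(arc_cost[q][arc] for q in used for arc in routes[q])
    return warehouse_part + vehicle_part + transport_part

# One open warehouse, one vehicle, and a closed route p1 -> a -> b -> p1.
tc = total_cost(
    open_warehouses={"p1"},
    warehouse_fixed={"p1": 100, "p2": 80},
    vehicle_fixed={"q1": 20, "q2": 15},
    arc_cost={"q1": {("p1", "a"): 5, ("a", "b"): 3, ("b", "p1"): 4}},
    routes={"q1": [("p1", "a"), ("a", "b"), ("b", "p1")], "q2": []},
)
```

In this example the three parts are 100, 20, and 5 + 3 + 4 = 12, so `tc` is 132.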

Rationality Constraint.
Although splitting an order lets commodities reach consumers earlier, it also increases the pick-up frequency and the logistics time cost. It also wastes resources in e-commerce distribution. Based on these two considerations, and in order to avoid repeated distribution of orders, each order is processed by one vehicle only once.
The order distribution process is a closed loop, so it is necessary to ensure that each vehicle starts from a warehouse and returns to the same one. Subloop elimination constraints are also imposed. A warehouse that serves a customer's order is marked with the supply status $x_p \geq \alpha_{ip}$. The allocation constraints of warehouse and vehicle for the order response are expressed as equations (6) and (7), respectively. Both ensure that each order is allocated only once.
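Two of the conditions above can be checked directly on a candidate assignment: each order is allocated to exactly one warehouse, and a warehouse that receives an order must be open ($x_p \geq \alpha_{ip}$). The following is a minimal sketch; the nested-dictionary layout and the name `check_allocation` are assumptions made for illustration.

```python
def check_allocation(alpha, x):
    """Check the allocation conditions on a candidate solution.

    alpha[i][p] is 1 if order i is assigned to warehouse p, else 0.
    x[p] is 1 if warehouse p is open (supplying), else 0.
    """
    # Each order is allocated exactly once.
    allocated_once = all(sum(row.values()) == 1 for row in alpha.values())
    # Supply status: x_p >= alpha_ip for every order i and warehouse p.
    supply_linked = all(
        x[p] >= a for row in alpha.values() for p, a in row.items()
    )
    return allocated_once and supply_linked

alpha = {"o1": {"p1": 1, "p2": 0}, "o2": {"p1": 0, "p2": 1}}
ok = check_allocation(alpha, {"p1": 1, "p2": 1})
closed = check_allocation(alpha, {"p1": 1, "p2": 0})
```

Here `ok` is true, while `closed` is false because order `o2` is assigned to a warehouse that is not open.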
Unreasonable routes are avoided by tracking how the carried load changes at each node. Because an order cannot be split, each return of the vehicle to the warehouse means that a batch of orders has been processed. At that point, the vehicle load should be equal to 0.

Capacity Constraints.
The total load of orders shall not exceed the transportation capacity of vehicles, where $N_q$ is the remaining load of the vehicle. The warehouse inventory configured by the e-commerce enterprise is sufficient to meet the order allocation.
For order allocation, it is necessary to consider the load capacity limit of the allocated freight vehicles and the load boundary constraint.
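The load conditions can be sketched for a single closed route as follows. The sketch assumes that the vehicle leaves the warehouse carrying the whole batch and unloads each order's demand on arrival, so the load on the arc back to the warehouse is zero. The function name `route_loads` and the data layout are hypothetical.

```python
def route_loads(route, demand, capacity):
    """Return the load carried on each arc of a closed route.

    route starts and ends at the same warehouse; every other node is an
    order with a demand. Raises ValueError if the route is not closed or
    if the batch exceeds the vehicle capacity.
    """
    if route[0] != route[-1]:
        raise ValueError("route must return to its starting warehouse")
    load = sum(demand[node] for node in route[1:-1])
    if load > capacity:
        raise ValueError("orders exceed the vehicle capacity")
    loads = []
    for i, j in zip(route, route[1:]):
        loads.append(((i, j), load))   # load carried while travelling i -> j
        load -= demand.get(j, 0)       # unload order j on arrival
    return loads

loads = route_loads(["w", "a", "b", "w"], {"a": 3, "b": 2}, capacity=10)
```

For this route the loads on the three arcs are 5, 2, and 0, which matches the requirement that the vehicle is empty when it returns.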
The total supply of the configured warehouse cannot exceed its actual total capacity. $S_i$ is the upper limit load of point $i$, and $S_p$ is the upper limit load of a warehouse point, so $S_p x_p$ must be at least the total load assigned to warehouse $p$.

A further constraint limits the number of open warehouses.

Time Window Constraints.
If the vehicle arrives at node $i \in V$ before time point $l_i$, it must wait until time point $l_i$ to provide delivery service. $h_i$ is the end of the time window.
M is an artificial variable.
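The waiting rule can be illustrated with a small sketch that walks along a route and computes the service start time at each node. It assumes zero service duration and known travel times, neither of which is specified in the text; the names `service_start_times`, `travel_time`, and `window` are hypothetical.

```python
def service_start_times(route, travel_time, window, depart=0):
    """Compute service start times along a route under time windows.

    window[j] = (l_j, h_j) is the time window of node j. A vehicle that
    arrives before l_j waits until l_j. Returns a dict of start times,
    or None if some node is reached after the end of its window.
    """
    clock = depart
    starts = {}
    for i, j in zip(route, route[1:]):
        clock += travel_time[(i, j)]
        earliest, latest = window[j]
        clock = max(clock, earliest)  # wait if the vehicle is early
        if clock > latest:
            return None               # window missed, route infeasible
        starts[j] = clock
    return starts

travel = {("w", "a"): 2, ("a", "b"): 3}
starts = service_start_times(["w", "a", "b"], travel, {"a": (5, 8), "b": (6, 9)})
late = service_start_times(["w", "a", "b"], travel, {"a": (5, 8), "b": (6, 7)})
```

In the first call the vehicle reaches `a` at time 2, waits until 5, and serves `b` at 8. In the second call `b` closes at 7, so the route is rejected.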

Model Analysis and Function Transformation.
Equation (1) is the objective function of the integration problem. Equations (2)-(20) are the constraints of the logistics service process model. The model is simplified as follows.
Note that $M$ is an artificial variable. Constraints (3) and (7) are simplified. Let $i$, $j$, and $k$ be nodes on an effective equivalent route; constraints (3) and (7) only allow the same vehicle to serve such a route.

Algorithm Design
In order to improve the delivery speed while keeping service quality stable and continuous, VCMP is realized by clustering the characteristics of related data (vehicles and commodities). The basic idea of the algorithm is as follows. Firstly, the parameters of the Gaussian model are estimated for each distribution vehicle from the initial parameters and the results of the previous iteration. Secondly, the parameters of the Gaussian model are estimated again based on the estimated weight values. Finally, the above steps are repeated until the fluctuation is very small and the extreme value is reached.
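The iteration described above follows the usual EM scheme for a Gaussian mixture. The following is a minimal one-dimensional sketch in plain Python, not the authors' implementation: the quantile-based initialization, the variance floor of 1e-6, the log-likelihood stopping rule, and the name `gmm_em_1d` are all assumptions made for illustration. The paper works with multivariate features and covariance matrices.

```python
import math

def gmm_em_1d(xs, k, max_iter=200, tol=1e-9):
    """Fit a 1-D Gaussian mixture with k components by EM."""
    n = len(xs)
    ordered = sorted(xs)
    # Initialization (assumed): equal weights, means at evenly spaced
    # quantiles, and the overall sample variance for every component.
    weights = [1.0 / k] * k
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    variances = [sum((x - centre) ** 2 for x in xs) / n + 1e-6] * k
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: posterior probability of each component for each sample.
        resp, ll = [], 0.0
        for x in xs:
            joint = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = max(sum(joint), 1e-300)
            resp.append([p / total for p in joint])
            ll += math.log(total)
        # M-step: update weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = (
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
            )
        # Stop when the log-likelihood no longer changes noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Hard class label per sample: the component with the largest posterior.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels

# Two well-separated groups of a single feature, centred near 0 and 10.
xs = [0.1 * i - 1 for i in range(21)] + [10 + 0.1 * i - 1 for i in range(21)]
weights, means, variances, labels = gmm_em_1d(xs, 2)
```

The returned `labels` play the role of the matching class assigned to each sample; with real data the single feature would be replaced by the extracted vehicle and commodity features.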
The specific implementation steps are as follows.
Step 1. Initialize the parameters of the Gaussian model.
Step 2. Set a posterior probability of $\alpha_q$.
Step 3. Update the Gauss weight, mean value, and covariance matrix.
Step 4. Repeat Step 2 and Step 3 to update the three parameters until the algorithm converges.
The matching process of GMM-EM is shown in Figure 3. The first step is to train on samples of order data, path data, consumer data, warehouse data, and vehicle data, forming the initial data sets. The second step is to extract the eigenvalues of each data set for feature extraction. The results of the second step are then fed into GMM-EM as input data, and the fourth step produces the classification of vehicle and commodity data. The final step is to obtain the matching classification value from 1 to H, which provides input for the path decision. The whole block diagram shows the basic logic of preprocessing.
Figure 4 gives an example of the pruning and inserting process in the self-adaptive neighborhood search algorithm. Figure 4(a) shows the initial location of warehouses (1, 2, and 3) and consumers (A, B, C, D, E, F, G, and H). We need to find the matching path of the three types of vehicles (oversize vehicles, medium-sized vehicles, and small vehicles) through the evolutionary algorithm to connect the consumers and the warehouses.
Figure 4(b) is the initial distribution scheme. The distribution area radiates outward with the warehouse as the center. The discrete points scanned by the distribution area radius are recorded as the selected consumer points. An initial population of size n is generated. The parent population produces offspring through the competitive selection strategy. The crossover operation produces new offspring, and the partition operation then divides the offspring nodes into the path selection.
The adaptive neighborhood search algorithm trains the offspring and inserts them into the population. In the process of learning and training, the adaptive weight updates the probability associated with the adaptive neighborhood search. Figure 4(c) shows the warehouse selection operation of the hybrid evolutionary algorithm. In terms of adaptability, the evolutionary algorithm focuses on the selection of excellent offspring and their behavior chain, which is suitable for solving integrated optimization problems.
The initial solution is obtained by initializing the population data for the location path selection of VCMP. The adaptive neighborhood search algorithm serves as the learning and training stage of the solution process, and a descendant partition is then carried out to obtain the path solution for vehicle distribution, which needs to traverse the warehouses to determine the route. Finally, more solutions are obtained from the evolutionary stage. The GMM-EM operator strengthens the evolutionary process to obtain the feasible solution after evolution. In the evolutionary process, new offspring are generated to join the population; the number of offspring that can be added is m, so the upper limit of the population is m + n. If the number of iterations does not exceed the upper limit and the population size reaches the upper limit m + n after training, the surviving offspring are generated. In the mutation stage, a feasible solution is generated by randomly selecting evolved individuals from the population according to their probability.
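The population mechanics described above (n parents, up to m offspring, a pool capped at m + n) can be sketched with a generic evolutionary loop on a single-warehouse delivery tour. This is a simplified stand-in, not the paper's hybrid evolutionary algorithm: it omits the adaptive neighborhood search and the GMM-EM operator, and it assumes tournament selection for the competitive selection strategy, a slice-preserving crossover, and swap mutation. All function names are hypothetical.

```python
import math
import random

def tour_cost(order, depot, points):
    """Length of the closed tour depot -> customers in `order` -> depot."""
    path = [depot] + [points[c] for c in order] + [depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def tournament(population, cost, rng, size=2):
    """Competitive selection: the cheapest of `size` random individuals wins."""
    return min(rng.sample(population, size), key=cost)

def crossover(p1, p2, rng):
    """Keep a random slice of p1 and fill the other positions in p2's order."""
    a, b = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[a:b + 1])
    rest = iter([g for g in p2 if g not in kept])
    return [p1[i] if a <= i <= b else next(rest) for i in range(len(p1))]

def mutate(order, rng, rate=0.2):
    """Swap two positions with probability `rate`."""
    order = list(order)
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def evolve(points, depot, n=20, m=20, generations=100, seed=0):
    """Evolve visiting orders; returns the best order and its cost."""
    rng = random.Random(seed)
    customers = list(points)

    def cost(order):
        return tour_cost(order, depot, points)

    population = [rng.sample(customers, len(customers)) for _ in range(n)]
    for _ in range(generations):
        offspring = [
            mutate(crossover(tournament(population, cost, rng),
                             tournament(population, cost, rng), rng), rng)
            for _ in range(m)
        ]
        # The pool holds at most n + m individuals; the n cheapest survive.
        population = sorted(population + offspring, key=cost)[:n]
    return population[0], cost(population[0])

points = {"A": (0, 1), "B": (1, 1), "C": (1, 0), "D": (2, 2), "E": (0, 2)}
best_order, best_cost = evolve(points, depot=(0, 0))
```

Because the n cheapest individuals of the pooled population always survive, the best cost never gets worse from one generation to the next.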

Experimental Data and Parameter Setting.
The case data come from Suning cloud store's plan to build an e-commerce shopping platform for a city. The goal is to integrate the O2O platform of commodity management, order information, logistics supply chain, and service delivery into one regional management system.
There are 6 open transfer warehouses in the city, with cost ranges of [36000, 50000], [80000, 120000], [16000, 25000], [85000, 100000], [16000, 28000], and [82000, 100000], respectively. The operating cost of each warehouse is distributed within an interval. 100 samples of consumer orders are selected randomly, and the changes in the service capacity gradient of the corresponding warehouses are shown in Table 2. Table 2 shows the changes in the number of consumer orders that each warehouse currently serves.
The number of consumers served reflects the service capability of the warehouse. For example, when the service size is 10, the cost range of warehouse no. 1 is [90, 110], but when the service size is 20, the cost fluctuation range of warehouse no. 1 is [160, 220].

Experimental Results.
According to the characteristics of load and fuel consumption, vehicles are divided into large, medium, and small vehicles. The service scheme of personalized matching of consumer orders is designed to make full use of existing data resources and improve the accuracy of VCMP. The GMM-EM algorithm is used to match vehicles and commodities according to time window, distribution distance, and path characteristics.
According to Table 3, the hybrid evolutionary algorithm is tested on the sample sets in four situations. The differences in total cost, error rate, and running time are calculated on the comparison sample set for the four situations, which are based on three degrees of evolution (training, strengthening, and mutation), and are reported in Table 4. As the complexity of the algorithm increases from no evolution to complete evolution, the operation time increases, the error rate gradually decreases, and the planning cost also decreases as accuracy increases.
Table 4 selects 10 groups of customer nodes randomly to verify that situation 4 (in Table 3) of the algorithm is optimal in terms of calculation accuracy, with an average error of 1.53% (in Table 4). The accuracy in situation 4 is higher than that in the other three cases, although this comes with a longer running time. For VCMP, accuracy is the most important. Therefore, the proposed algorithm improves the matching accuracy at the cost of time.

Comparison and Sensitivity Analysis.
The constraints are classified and solved according to the following four divisions, based on the attribute analysis of logistics service for VCMP. As shown in Table 5, the Lagrangian relaxation degree of different divisions is obtained by relaxing the constraints on variable $y_{ij}^{q}$; the four kinds of constraints (I, II, III, and IV) have similar solution spaces. For example, formulas (2) and (3) in Table 5 both express constraints related to the binary variables selected for the order. Table 6 shows the comparison results between Lagrangian relaxation and the algorithm in this paper.
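For readers unfamiliar with the technique, the general idea of Lagrangian relaxation can be stated in generic form. The symbols $c$, $A$, $b$, $Y$, and $\lambda$ below are illustrative placeholders and not the paper's notation: the complicating constraints $Ay \le b$ are moved into the objective with nonnegative multipliers, and the relaxed problem yields a lower bound on the optimal cost of a minimization problem.

```latex
% Generic Lagrangian relaxation of a minimization problem
% (illustrative symbols, not the paper's notation).
\begin{aligned}
z^{*} &= \min_{y \in Y} \bigl\{\, c^{\top} y \;:\; A y \le b \,\bigr\}, \\
L(\lambda) &= \min_{y \in Y} \bigl\{\, c^{\top} y + \lambda^{\top} (A y - b) \,\bigr\},
  \qquad \lambda \ge 0, \\
L(\lambda) &\le z^{*} \quad \text{for every } \lambda \ge 0 .
\end{aligned}
```

The bound holds because any $y$ that satisfies $Ay \le b$ makes the penalty term $\lambda^{\top}(Ay - b)$ nonpositive, so the relaxed objective cannot exceed the original cost at that $y$.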
The first two columns of Table 6 show the set sizes of customer quantity and warehouse quantity, respectively. Table 6 shows that, in terms of accuracy, the HEA error fluctuation is below 8% and the error variation range of LP relaxation is below 16% as the customer sample size increases to 150 in increments of 10 and 50.

Conclusion
With the development of Internet technology, scholars and managers have paid close attention to the data resources from trading and logistics and to customers' service demand. In order to upgrade the efficiency of the logistics industry and solve the predictive problem of Internet-based logistics planning, a nonlinear mixed-integer programming model is proposed to reduce the total cost, which considers the online order allocation process in VCMP. The logistics system integration problem is broken into several sequential subproblems, for example, order response, selection of warehouses, distribution of vehicles, path planning, and order delivery. The complex problem is then described as a directed graph so that all events and objects in VCMP can be represented as points and vectors mathematically. The detailed solution is summed up as follows. Firstly, vehicle classification is handled by the GMM-EM algorithm, which solves the parameter estimation of VCMP, so that the VCMP process is optimized by preprocessing. Secondly, in view of the features of the problem, a new HEA is designed, based on the idea of adaptive searching schemes, to solve multistage integration problems.
The warehouse and path planning with time windows in the order allocation process is compared with the traditional logistics planning method. The results show that the performance of HEA is superior. Finally, experimental analysis validates the solution, supporting the rationality of the model and the feasibility of the algorithm in the logistics integration system. The research results indicate that intelligent algorithms can be applied to solve new problems in the era of big data and logistics distribution systems. In future research, the optimization of the heuristic algorithm and the study of the matching method are both valuable directions.

Figure 4(d) is the result of the evolutionary algorithm.

Figure 4: Adaptive neighborhood search process of evolutionary algorithm.

Figure 5: Sensitivity analysis of LP and HEA to objective function.

Table 2: Gradient warehouse capacity and service scale.

Table 3: Test of hybrid evolutionary algorithm.

Table 4: The operation results of the hybrid evolutionary algorithm in four cases.

Table 6: Comparison results of HEA and LP relaxation.