Location optimization of fresh food e-commerce front warehouse

Abstract: The emergence of COVID-19 and the maturation of cold chain technology have aided the rapid development of the fresh produce e-commerce industry. Taking into account the characteristics of consumers' demand for fresh products, this paper constructs a location allocation model of front warehouses for fresh e-commerce with the objective of minimizing the total cost. An improved immune optimization algorithm is proposed, and its effectiveness is demonstrated by a real case study. The results show that the improved immune optimization algorithm outperforms the traditional genetic algorithm in terms of solution accuracy; the proposed location model can effectively help fresh produce e-commerce enterprises open new front warehouses when demand is increasing, and it supports economic decision-making for front warehouse layout.


Introduction
Online sourcing of fresh food for daily needs has grown in popularity as a way to avoid human contact in public places and reduce the risk of infection during epidemics. Farmers' markets have moved to e-commerce platforms, and fresh food retail e-commerce has emerged as a popular new retail type [1]. The impact of the epidemic will continue to accelerate the development of fresh food e-commerce, with the fresh food e-commerce industry expected to exceed a trillion yuan in 2023 and to maintain a high growth trend [2]. There are three major forms of logistics network and operational models in the retail form of fresh food e-commerce: (1) Front warehouse mode. Fresh e-commerce platforms set up multiple small warehouses (front warehouses) near communities where consumers are concentrated, and couriers provide customers with instant delivery service from these front warehouses [3]. (2) Warehouse and store integration mode. The store serves as a small fresh food supermarket and also fulfills online orders. Consumers can purchase from the store or place orders online, with express delivery [4]. (3) Community group buying mode. Fresh e-commerce platforms initiate group purchases, aggregate orders and distribute products to customers in the community through group buying leaders, while the fresh platform is responsible for the procurement, transportation and after-sales service of fresh food [5].
Front warehouse mode can solve the last-mile problem and ensure freshness and timely delivery [6], and it is distinguished from the other two e-commerce modes by its small scale, proximity to the consumer, high layout density and efficient distribution [7]. However, compared to the other two operating modes, front warehouse mode increases the length of the supply chain and incurs higher fulfillment costs (including warehouse rent, inventory costs and order delivery costs), which have become one of the main bottlenecks restricting the development of fresh e-commerce corporations operating in front warehouse mode [8]. Consequently, research on the location optimization of fresh food e-commerce front warehouses is of great application value. This paper presents a case study of an e-commerce company selecting sites for new front warehouses using a location optimization model with the goal of minimizing the total distribution system cost. The location and allocation decision of front warehouses can be described as follows: given a set of candidate front warehouse locations and a set of customer locations, select a certain number of candidates to open according to the distribution of customer locations and the quantity of demand, and allocate the customer orders to nearby open front warehouses so that the cost of the distribution network is minimized.
The remainder of the paper is structured as follows. Section 2 is a review of the literature. Section 3 discusses the problem description, assumptions and mathematical model. Section 4 presents the solution approach, Section 5 gives a real case study, and Section 6 concludes the paper with limitations and recommendations for future research.

Research on sustainable closed loop supply chain network and location problem
The concepts of sustainable development and circular economy have received extensive attention. Scholars delve into sustainable closed-loop supply chain networks, and have adopted sustainability factors, such as carbon emission, energy consumption and environmental pollution, into their research to respond to real challenges [9]. Goodarzian et al. investigated a citrus fruit supply chain network in Iran to minimize the total cost and CO2 emission simultaneously [10]. Momenitabar et al. designed a sustainable bioethanol supply chain network to cope with the demand increase of bioethanol [11]. Fresh food requires low-temperature preservation during transportation, which greatly meets the quality requirements of consumers but also creates huge energy consumption and carbon emissions [12]. Therefore, research on the sustainability of cold chain transportation is particularly important. Govindan et al. [13] and Wang et al. [14] both studied the cold chain location and routing problem (VRP) from the perspective of low carbon and environmental protection to minimize the impact of carbon emissions on the environment. Wang et al. constructed a bi-objective model to realize carbon emission reduction in fresh agricultural cold chain logistics [15]. Liu designed a dynamic model to reflect the carbon emission cost affected by time-varying vehicle speed [16].
Research methods for the e-commerce warehouse location problem are by now fairly mature. They are mainly based on quantitative analysis, which quantifies costs, service level, demand, time windows and other elements to formulate a location model, and on the design of specific algorithms to solve it. For example, Nuno et al. presented a solution for selecting a location for the installation of an e-commerce warehouse with the goals of lowering logistics costs and ensuring service improvements, and the problem was solved using multiple criteria decision-making [17]. Wang used the effective covering model to conduct an empirical analysis on the location selection problem of marine product e-commerce distribution centers, taking into account factors such as the surrounding environment, public facilities and population [18]. Ahmadi et al. established a mixed integer programming location model with the goal of maximizing the total profit of serving customers, considering the price elasticity of demand [19]. Macedo et al. proposed a heuristic algorithm based on skewed universal variable neighborhood search to solve the VRP for perishable goods with time constraints [20].

Research on algorithms of location model
Both exact and heuristic algorithms can solve location problems. Exact algorithms can obtain the optimal solution and are suitable when the problem size is not large. The location problem is a classic NP-hard problem, so when the scale of the problem is large, heuristic algorithms are used to efficiently obtain a Pareto optimal solution of an instance [21]. Torabi et al. proposed a new fuzzy-stochastic hybrid programming model based on two-stage scenarios and designed an effective multi-step solution to solve practical problems [22]. Erdin and Akbas treated the location of storage facilities as a multi-criteria problem, and applied multi-criteria decision-making and the TOPSIS (technique for order preference by similarity to an ideal solution) method to plan the location of storage facilities [23]. Silva et al. used three linear relaxation heuristic algorithms and a hybrid of a genetic algorithm with variable neighborhood search to solve the problem. The research showed that the hybrid algorithm performed better than the linear relaxation heuristic algorithms [24]. Considering random demand, Silva et al. designed location models for the internal distribution and outsourcing operation strategies of e-commerce enterprises, and solved the models with exact and heuristic methods, respectively [25].

Research gaps
Most of the literature only considers the process from front warehouses to customers, neglecting the process from the urban distribution center to front warehouses. Improving the distribution efficiency of a logistics system involves every link of the supply chain, so this paper treats the front warehouse location problem as a three-tier, two-stage logistics network. Although there is extensive research on fresh food e-commerce, research in specific fields is limited, especially quantitative research on front warehouse location. The main difference is that the front warehouse of fresh food e-commerce studied in this paper is the end node in the logistics process, providing last-mile service, with a small service scope and scattered service objects. Therefore, it is necessary to depict practical problems in fine detail and systematically consider factors such as cost and environment.

Contribution
This work presents two major contributions. First, a front warehouse location model with the objective of minimum total cost is established; the model and case study describe the location of front warehouses in detail and consider the location cost comprehensively. Second, an improved immune algorithm is applied to solve the model, and a real case study shows that it performs better than the traditional genetic algorithm (GA), which supports the validity of the proposed algorithm. The case study also gives a reference for optimizing the front warehouse layout of fresh food e-commerce enterprises. In short, this study is of practical and academic significance.

Problem description
Figure 1 depicts the operating mode of a fresh produce e-commerce business with front warehouses, which includes a point of origin, regional warehouses, front warehouses and customer points. Regional warehouses are large warehouses that are set up for each province or city, with good cold chain storage and processing facilities, and they are typically located far from urban centers. Front warehouses are typically located within the city or in densely populated areas of the city, making it easier to deliver fresh goods to customers in a shorter period of time. First, the e-commerce company buys fresh produce directly from the source and transports it to a regional warehouse where it is processed and packaged. The fresh goods are then transported to front warehouses, where they are temporarily stored to fulfill customer orders. When a customer order is received, a front warehouse with available stock fulfills the order quickly. To meet the needs of consumers, two specific issues should be addressed. First, in order to be profitable, the total delivery cost should be reduced. Second, all demand points should be covered.
This model divides logistics distribution into two phases: First, products are supplied from urban distribution centers to front warehouses, and then, products are delivered from front warehouses to consumers.

Problem hypothesis
Before developing a model, the following assumptions are made in order to solve the problems listed above:
(1) Front warehouses are chosen only from the list of alternative addresses.
(2) Distribution vehicles are of the same model and have limited capacity.
(3) The operating costs of front warehouses and vehicles are not taken into account.
(4) Customer service time, and loading and unloading time are not taken into account.
(5) Only one supply point can serve a front warehouse, and two or more front warehouses cannot serve a demand point at the same time.
(6) A constant temperature is maintained throughout the delivery process.
(7) Vehicles travel at a fixed speed.

Parameter description
The sets, parameters and decision variables of the model are defined as follows.
Sets: the set of candidate front warehouses, indexed by i; the set of demand points, indexed by j = 1, 2, ⋯, J; the set of vehicles; the set of supply vendors, indexed by k.
Costs: the fixed cost of establishing a front warehouse in area i; the total cost for establishing front warehouses; the unit transportation cost from supplier to front warehouse; the unit transportation cost from front warehouse to demand point.
Vehicle and temperature parameters: the capacity of a vehicle; the speed of a vehicle; the outdoor ambient temperature; the refrigeration temperature; the energy consumption factor.
Distances and times: the distance from supply vendor k to front warehouse i; the distance from front warehouse i to demand point j; the travel time of a vehicle from supply vendor k to front warehouse i; the travel time of a vehicle from front warehouse i to demand point j.
Quantities: the supply quantity of vendor k; the cargo volume from supply vendor k to front warehouse i; the demand delivered from front warehouse i to demand point j; the demand of point j.
Decision variables (each equals 1 if the stated condition holds and 0 otherwise): candidate site i is selected as a front warehouse; a vehicle transports goods from supply vendor k to front warehouse i; a vehicle transports goods from front warehouse i to demand point j; demand point j is served by front warehouse i.

Model formulation
The front warehouse location allocation problem (FWLAP) model for fresh food logistics constructed in this paper takes the minimum total cost as the objective function. First, the components of the objective function are analyzed, and then the specific composition of the FWLAP model is determined.

Analysis of objective function
(1) Infrastructure and operating costs of the fresh logistics distribution system. The infrastructure cost of the fresh food logistics distribution system mainly includes the purchase cost of land, the construction cost of the front warehouse distribution center, the salary and management expenses of workers employed by the distribution center and the expenses incurred by water and electricity consumption. The facility cost is given by Eq (1).
(2) Fresh logistics distribution and transportation cost. The transportation cost of fresh food logistics and distribution includes the transportation cost from the supply point to the distribution point and the cost from the distribution point to the demand point. The total transportation and distribution cost of the two stages is given by Eq (2).
(3) Energy consumption cost. Vehicles loaded with fresh products need to keep the products in a low-temperature environment during the distribution process, so they consume much more energy than ordinary vehicles. Maintaining the low-temperature environment requires fuel and power, and the required temperature is determined by the product type, so the fuel and power consumption also differs between product types. If the outdoor ambient temperature is below zero, there is no need to refrigerate fresh foods such as vegetables and fruits, and the energy consumption cost will be very low. The energy consumption cost is given by Eq (3), where the temperature term either takes the value 0 or is determined by the outdoor ambient temperature and the reefer temperature required.
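As a hedged illustration of how the three cost components described above could be written, the block below uses assumed notation that is not taken from the paper: $x_i \in \{0,1\}$ marks whether candidate site $i$ is opened, $F_i$ is its fixed cost, $c_1$ and $c_2$ are the unit transportation costs of the two stages, $d_{ki}$ and $d_{ij}$ are distances, $q_{ki}$ and $q_{ij}$ are shipped quantities, $\theta$ is the energy consumption factor, $T_{\text{out}}$ and $T_{\text{ref}}$ are the outdoor and refrigeration temperatures, and $t_{ki}$ and $t_{ij}$ are travel times on the arcs that are actually used. The product forms, and in particular the form of the energy term, are assumptions made for this sketch.

```latex
\begin{align*}
C_{\text{facility}}  &= \sum_{i \in I} F_i \, x_i \\
C_{\text{transport}} &= \sum_{k \in K} \sum_{i \in I} c_1 \, d_{ki} \, q_{ki}
                      + \sum_{i \in I} \sum_{j \in J} c_2 \, d_{ij} \, q_{ij} \\
C_{\text{energy}}    &= \theta \, \max\{0,\; T_{\text{out}} - T_{\text{ref}}\}
                        \Bigl( \sum_{k \in K} \sum_{i \in I} t_{ki}
                             + \sum_{i \in I} \sum_{j \in J} t_{ij} \Bigr) \\
\min \; Z            &= C_{\text{facility}} + C_{\text{transport}} + C_{\text{energy}}
\end{align*}
```

Under this assumed form, the energy term vanishes whenever the outdoor temperature is at or below the required refrigeration temperature, which matches the qualitative statement that cold weather makes the energy consumption cost very low.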

FWLAP model formulation
Through detailed analysis of the targets, the FWLAP model is formulated as Eqs (4) to (19). Eq (4) is the objective function, which minimizes the total cost. Eq (5) indicates that the cost of the selected front warehouses cannot exceed the total cost available for establishing front warehouses. Eq (6) means that the quantity of goods provided by a supplier to a front warehouse cannot exceed its supply capacity. Eq (7) states that goods cannot be retained in a front warehouse and are all used for distribution. Eq (8) indicates that the requirements of each demand point must be satisfied. Eq (9) indicates that at least one front warehouse is opened. Eq (10) requires that the total front warehouse supply is equal to the total demand of the customer points. According to Eq (11), there is no lateral transportation of goods between front warehouses. Eq (12) states that a vehicle is assigned to one of the front warehouses. Eq (13) ensures that the vehicle transportation path remains continuous. The capacity limit of vehicles is represented by Eq (14). Eq (15) indicates that a demand point can be served by multiple delivery vehicles. Eqs (16) and (17) show that vehicles are only assigned to open front warehouses. Eq (18) states that only one front warehouse serves a demand point. The domains of the decision variables are given by Eq (19).
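To make the structure of the objective concrete, the following is a minimal sketch of how a candidate solution could be costed. It rests on simplifying assumptions that are not from the paper: cost is fixed cost plus unit cost times distance times quantity, each demand point is assigned to its nearest open front warehouse, there is a single supply point, and vehicle routing, capacities and energy cost are ignored. The function name `total_cost`, the data layout and all numbers are hypothetical.

```python
import math

def total_cost(open_sites, sites, demands, supplier, c1, c2):
    """Fixed cost plus two-stage transport cost for a set of opened sites (simplified)."""
    cost = sum(sites[i]["fixed"] for i in open_sites)
    load = {i: 0.0 for i in open_sites}
    for d in demands:
        # Assign each demand point to its nearest open front warehouse.
        i = min(open_sites, key=lambda s: math.dist(sites[s]["xy"], d["xy"]))
        cost += c2 * math.dist(sites[i]["xy"], d["xy"]) * d["qty"]
        load[i] += d["qty"]
    for i, q in load.items():
        # Everything shipped out of a warehouse must first be supplied to it.
        cost += c1 * math.dist(supplier, sites[i]["xy"]) * q
    return cost

# Hypothetical toy instance: two candidate sites, two demand points, one supplier.
sites = {
    0: {"xy": (0.0, 0.0), "fixed": 100.0},
    1: {"xy": (10.0, 0.0), "fixed": 120.0},
}
demands = [
    {"xy": (1.0, 0.0), "qty": 5.0},
    {"xy": (9.0, 0.0), "qty": 4.0},
]
supplier = (5.0, 0.0)

both = total_cost([0, 1], sites, demands, supplier, c1=1.0, c2=2.0)
only0 = total_cost([0], sites, demands, supplier, c1=1.0, c2=2.0)
print(both, only0)
```

In this toy instance, opening only one site is cheaper because the saved fixed cost outweighs the longer delivery distance; the trade-off between these two terms is what the location model balances.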

Algorithm overview
There are numerous methods for resolving location issues, including analytical methods, optimal planning methods, comprehensive factor evaluation methods and so on. These methods, however, have limitations. The analytical method is limited to one distribution center location model. Although there is no limit on the number of distribution centers, the optimal planning method is not appropriate for complex models. The result of the comprehensive factor evaluation method is subjective. As a result, the proposed model is solved using a heuristic algorithm in this study. The genetic algorithm, immune algorithm and neural network are examples of intelligent optimization algorithms that are refined through repeated experimental simulations under various conditions or variants. Among them, immune algorithm research began in the 1980s, and research results show that it outperforms the classical genetic algorithm in solving location problems [26]. In this study, the immune algorithm is used to solve the proposed model.
The immune algorithm is an intelligent optimization algorithm that is derived by simulating the body's powerful immune system [27]. The biological immune system has powerful learning, recognition and memory abilities: it identifies harmful information from the outside world, analyzes the information and activates defense mechanisms to minimize system damage and ensure robustness and safety. The immune optimization algorithm uses the immune system's diversity generation and retention mechanism to maintain population diversity. This helps overcome the problem of premature convergence ("premature maturity"), which is difficult to avoid in the process of finding the optimal solution, and makes it possible to reach the global optimal solution.

Algorithm implementation steps
The flow chart of the immune algorithm is shown in Figure 2.
Step 1: Problem analysis. The goal of the research is to select several suitable locations from multiple alternatives in order to establish front warehouses with the minimum total cost.
Step 2: Antigen identification. The magnitude of the affinity between the antibody and the antigen is used to identify it.
Step 3: Generation of initial antibody population.
Step 4: Determine the expected probability of reproduction for each individual and evaluate each one in the population.
Step 5: Rank all individuals based on their expected reproduction probability, then choose the top N to form the parent population, and select the top m to store in the memory.
Step 6: Check to see if the loop termination condition is met. To complete the iterative loop, the maximum number of iterations is used as the termination condition.
Step 7: Generation of the new population. The new generation is made up of two parts: one is obtained through antibody selection, crossover and mutation operations, and the other is taken from the memory bank. After the new generation is generated, the cyclic operation continues.
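The seven steps above can be sketched as a compact loop. This is an illustrative toy under stated assumptions, not the authors' MATLAB implementation: antibodies are sorted lists of opened site indices, the affinity to the antigen is taken as 1 / cost, parents are drawn uniformly from the ranked parent set (rather than by roulette), crossover samples sites from the union of two parents, mutation swaps in one unused site, and all parameter values (`alpha`, `threshold`, the 0.3 mutation rate) and the objective `toy_cost` are hypothetical.

```python
import random

def immune_search(objective, n_sites, n_open, pop_size=20, memory_size=4,
                  generations=30, alpha=0.95, threshold=0.7, seed=0):
    """Toy immune-algorithm loop; returns the best antibody (sorted site list) found."""
    rng = random.Random(seed)

    def new_antibody():
        return sorted(rng.sample(range(n_sites), n_open))

    def similar(a, b):
        # Antibody-antibody affinity: share of common sites, compared with a threshold.
        return len(set(a) & set(b)) / n_open > threshold

    def reproduction_scores(pop):
        # Step 4: combine affinity to the antigen (1 / cost) with inverse concentration.
        affinity = [1.0 / objective(a) for a in pop]
        conc = [sum(similar(a, b) for b in pop) / len(pop) for a in pop]
        inv = [1.0 / c for c in conc]
        sa, si = sum(affinity), sum(inv)
        return [alpha * f / sa + (1 - alpha) * i / si for f, i in zip(affinity, inv)]

    # Step 3: random initial population (memory bank assumed empty at the start).
    population = [new_antibody() for _ in range(pop_size + memory_size)]
    best = min(population, key=objective)
    for _ in range(generations):                       # Step 6: fixed iteration budget
        scores = reproduction_scores(population)
        order = sorted(range(len(population)), key=lambda i: -scores[i])
        ranked = [population[i] for i in order]
        parents, memory = ranked[:pop_size], ranked[:memory_size]   # Step 5
        children = []
        for _ in range(pop_size):                      # Step 7: selection, crossover, mutation
            p1, p2 = rng.sample(parents, 2)
            child = rng.sample(sorted(set(p1) | set(p2)), n_open)
            if rng.random() < 0.3:
                unused = [s for s in range(n_sites) if s not in child]
                if unused:
                    child[rng.randrange(n_open)] = rng.choice(unused)
            children.append(sorted(child))
        population = children + memory                 # offspring plus memory bank
        best = min(population + [best], key=objective)
    return best

def toy_cost(antibody):
    # Hypothetical objective: opening low-numbered sites is cheaper.
    return 1 + sum(antibody)

best = immune_search(toy_cost, n_sites=8, n_open=3)
print(best, toy_cost(best))
```

The fixed seed makes the run reproducible; in practice the objective would be the total cost of the location model rather than `toy_cost`.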
The specific calculations used in the implementation of the immune algorithm are as follows.
(1) Generation of the initial antibody population. If the memory bank is empty, the initial antibody population is chosen at random from the feasible solution space. If the memory bank is not empty, the initial antibody population is chosen from it. The initial antibody population in this paper is generated at random using simple coding, and different antibodies indicate different location schemes. The number of front warehouses determines the antibody length in this study, and the antibodies represent the set of selected front warehouses. For example, if there are 5 alternatives and 2 of them should be chosen to establish front warehouses, the antibody [2, 3] indicates that locations 2 and 3 are to open as front warehouses.
(2) Affinity calculation. The degree of recognition of an antigen by an antibody is expressed by the affinity between the antibody and the antigen. The affinity function is given by Eq (20), in which the affinity is computed from the objective function of front warehouse location. The affinity between antibodies is used to measure their similarity, and the algorithm employs Forrest's R-site contiguity method. The principle of the R-site contiguity method is based on the partial matching rule, and the key is determining the R-value. If the codes of two individuals are identical in R or more consecutive positions, the antibodies are similar; otherwise, they are not. In this paper, the following formula is used to calculate the antibody affinity.

$$S_{v,s} = \frac{k_{v,s}}{L} \qquad (21)$$

where $k_{v,s}$ denotes the number of identical digits between antibodies v and s, and L is the length of the antibody, which is determined by the number of front warehouses selected. For example, the two antibodies [2, 3, 7, 8, 5, 4] and [4, 3, 8, 9, 6, 1] have three values in common (3, 4 and 8), so their affinity is 3/6 = 0.5.
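The worked example above can be checked with a few lines of code. This is a minimal sketch that assumes antibodies contain no repeated values (each value is a distinct site index), so that the number of identical digits is the size of the set intersection; the function name `antibody_similarity` is hypothetical.

```python
def antibody_similarity(v, s):
    """Share of values the two antibodies have in common, relative to antibody length."""
    if len(v) != len(s):
        raise ValueError("antibodies must have the same length")
    return len(set(v) & set(s)) / len(v)

# The two antibodies from the example share the values 3, 4 and 8.
example = antibody_similarity([2, 3, 7, 8, 5, 4], [4, 3, 8, 9, 6, 1])
print(example)
```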
(3) Calculation of antibody concentration. Antibody concentration denotes the proportion of similar antibodies to the total number of antibodies in the population and characterizes antibody strength. When an external antigen or a new antibody attacks the immune system, the body's immune response becomes intense, and the number of antibodies changes. As more antibodies are produced, the affinity of most antibodies increases; once it exceeds a certain threshold, these antibodies are suppressed and their concentration falls. The formula for calculating antibody concentration is as follows.

$$C_v = \frac{1}{N} \sum_{s=1}^{N} B_{v,s} \qquad (22)$$

where N denotes the total number of antibodies, and $B_{v,s}$ is a binary variable: $B_{v,s} = 1$ when the affinity $S_{v,s}$ between antibodies v and s is greater than T, and $B_{v,s} = 0$ otherwise. T is a predetermined threshold value.
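As an illustration of the concentration calculation, the sketch below counts, for each antibody, the share of antibodies in the population whose similarity to it exceeds the threshold T. It assumes that an antibody is compared with every member of the population including itself, and that similarity is the share of common values; both choices, the function name `concentrations` and the sample population are assumptions made for this example.

```python
def concentrations(population, threshold):
    """Fraction of the population that is similar (above threshold) to each antibody."""
    def similarity(v, s):
        return len(set(v) & set(s)) / len(v)

    n = len(population)
    return [
        sum(1 for s in population if similarity(v, s) > threshold) / n
        for v in population
    ]

# Hypothetical population: the first two antibodies share two of three sites.
pop = [[1, 2, 3], [1, 2, 4], [5, 6, 7]]
conc = concentrations(pop, threshold=0.5)
print(conc)
```

Here the first two antibodies are similar to each other and to themselves, so each has concentration 2/3, while the third is similar only to itself and has concentration 1/3.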
(4) Expected probability of reproduction. The expected reproduction probability P of each individual is determined jointly by the affinity between the antibody and the antigen and by the antibody concentration, as expressed in Eq (23).
Eq (23) shows that the expected reproduction probability increases with an individual's fitness and decreases with its concentration. This suppresses the production of individuals with high concentration while promoting the production of individuals with high fitness, ensuring the population's diversity. The coefficient in Eq (23) is a constant.
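One way to realize the behavior described above (probability rising with fitness and falling with concentration) is sketched below. The exact form is an assumption made for this example and is not claimed to be the paper's Eq (23): the probability is a weighted sum of the normalized affinity and the normalized inverse concentration, with a hypothetical weight `alpha`.

```python
def expected_reproduction(affinities, concentrations, alpha=0.95):
    """Weighted sum of normalized affinity and normalized inverse concentration."""
    inverse = [1.0 / c for c in concentrations]   # rarer antibodies score higher
    total_aff = sum(affinities)
    total_inv = sum(inverse)
    return [
        alpha * a / total_aff + (1 - alpha) * i / total_inv
        for a, i in zip(affinities, inverse)
    ]

# Two antibodies with equal affinity; the first one is less common in the population.
probs = expected_reproduction([0.5, 0.5], [0.25, 0.75])
print(probs)
```

With equal affinities, the less concentrated antibody receives the larger probability, which is the diversity-preserving effect the text describes; the values sum to 1 by construction.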
(5) Immune operations.
Selection operation. The selection operation is carried out using a roulette selection mechanism. The probability of an individual being selected is the ratio of its expected reproduction probability to the sum of the expected reproduction probabilities over the whole population.
Crossover operation. A crossover operation is one in which genes or gene segments from two chromosomes are exchanged in a specific manner to create a new individual. This paper uses multi-point crossover so that the good genes of the parents are inherited and new good individuals are produced.
Mutation operation. Antibodies are drawn at random from the antibody population generated by the crossover operation, and one of them is mutated to create a new individual. The most commonly used mutation operations in practice are basic position mutation, flip mutation, adaptive crossover mutation, and so on. In this paper, the flip mutation operation is used to ensure population diversity. The new antibody population produced by the mutation operation is combined with the antibodies previously stored in the memory bank.
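The three operators can be sketched as follows for antibodies coded as lists of distinct site indices. This is an illustrative sketch under stated assumptions: crossover is shown with two cut points plus a repair step that replaces duplicated sites, and mutation is shown as replacing one gene with an unused site, which is a simple substitute and not the flip mutation used in the paper. All function names are hypothetical.

```python
import random

def roulette_select(population, probabilities, rng):
    """Pick one antibody with chance proportional to its expected reproduction probability."""
    return rng.choices(population, weights=probabilities, k=1)[0]

def repair(child, n_sites, rng):
    """Replace repeated sites so that the antibody again holds distinct sites."""
    seen, result = set(), []
    for gene in child:
        if gene in seen:
            free = [s for s in range(n_sites) if s not in seen and s not in child]
            gene = rng.choice(free)
        seen.add(gene)
        result.append(gene)
    return result

def two_point_crossover(p1, p2, n_sites, rng):
    """Swap the segment between two cut points, then repair duplicates."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    return repair(p1[:a] + p2[a:b] + p1[b:], n_sites, rng)

def replace_mutation(antibody, n_sites, rng):
    """Replace one randomly chosen gene with a site not yet in the antibody."""
    unused = [s for s in range(n_sites) if s not in antibody]
    child = list(antibody)
    if unused:
        child[rng.randrange(len(child))] = rng.choice(unused)
    return child

rng = random.Random(1)
parent_a, parent_b = [0, 1, 2, 3], [2, 3, 4, 5]
child = two_point_crossover(parent_a, parent_b, n_sites=8, rng=rng)
mutant = replace_mutation(child, n_sites=8, rng=rng)
picked = roulette_select([parent_a, parent_b], [0.0, 1.0], rng)
print(child, mutant, picked)
```

The repair step matters for location coding because an antibody that lists the same site twice would describe fewer open warehouses than intended.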

Implementation of improved immunization algorithm
The traditional immune optimization algorithm conservatively retains a fixed number of excellent individuals in the memory bank. As the number of iterations increases, the population quality improves, but the algorithm cannot retain more excellent individuals, which slows convergence and affects computational accuracy. This paper proposes an improved immune algorithm that retains excellent individuals dynamically by establishing a dynamic retention mechanism. The number m of excellent individuals in the memory bank is set to change dynamically, and it increases as the population's overall quality improves. The population quality function that determines m is given by Eq (24), where p is the number of demand points, q is the number of distribution centers, k is the size of the memory bank, and the remaining term is the descending rank of the distance between nodes in the set of distances between demand points and temporary distribution stations.
Traditional immunization relies on a certain probability to retain the best individual, which increases the number of iterations and computational effort while also affecting computational accuracy. As a result, the population in this paper is selected using the Metropolis criterion. The i-th individual in the population is compared with the (i+1)-th; the (i+1)-th individual is accepted when the rule in Eq (25) holds, and otherwise the i-th individual is retained. Eq (25) is an exponential acceptance rule in which the fitness values of the two individuals enter the exponent, w is the temperature constant and R represents a random number uniformly distributed on [0, 1].

Table 1 shows the data for each candidate front warehouse, including capacity, coordinates and available area. The amount of fresh produce that could be accommodated in the alternative temporary front warehouses was calculated assuming that the height of the temporary front warehouses ranged from 4 to 6 meters and that the storage capacity was 1000 kg per 7 cubic meters. The distribution mode is multi-vehicle distribution, in which vehicles begin at the distribution point, proceed directly to the demand point, and return to the distribution point once the task is completed.
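The capacity conversion stated above (1000 kg per 7 cubic meters, warehouse heights of 4 to 6 meters) can be illustrated with a short calculation. This sketch assumes that the whole volume, floor area times height, is usable for storage; the floor area of 2000 square meters is only an example input, and the function name `storage_capacity_kg` is hypothetical.

```python
def storage_capacity_kg(floor_area_m2, height_m, kg_per_7_m3=1000.0):
    """Storage capacity implied by the rule of 1000 kg per 7 cubic meters."""
    volume_m3 = floor_area_m2 * height_m
    return volume_m3 / 7.0 * kg_per_7_m3

# Example input: a 2000 m^2 warehouse at the lower and upper height bounds.
low = storage_capacity_kg(2000, 4)
high = storage_capacity_kg(2000, 6)
print(round(low), round(high))
```

Under these assumptions, such a warehouse could hold between roughly 1,143 and 1,714 tons of fresh produce.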

Basic data
The urban distribution center is located on the west side of the city and has a capacity of 40,000 tons; it is currently the largest integrated cold storage facility in Handan city. Its geographical coordinates are (114.455441, 36.636189). The layout of the alternative front warehouses with a floor area above 2000 m² is shown in Figure 3.

Data of demand points
The main urban area of Handan is subdivided into 86 demand points (demand areas) of varying size based on its geographical characteristics, population distribution and population density, and the population of each demand point is obtained from city statistics. Assuming a daily demand for fresh food of 2 kg per person, the daily demand of each point is calculated from its population, and the specific data are shown in Table 2. Areas with high population density are kept small in order to balance the demand across demand points, and vice versa. This avoids the overloaded vehicles and late deliveries that high demand in an area would cause, and it reduces the number of idle and empty vehicles in low-demand areas. The numbering and layout of the demand points are depicted in Figure 4.

Data of parameters
According to the case description, there are 86 demand points and 15 locations where front warehouses can be opened. In this paper, traffic congestion is ignored, vehicles are assumed to travel at a constant speed, and the distance between nodes is calculated using the Euclidean distance formula. Each vehicle is of the same type, has an on-board capacity of 2000 kg and travels at a speed of 30 km/h. Table 3 shows the parameters of the immune optimization algorithm.
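The case parameters above translate into simple per-demand-point quantities. The sketch below is illustrative only: it uses the stated 2 kg per person per day, 2000 kg vehicle capacity and 30 km/h speed, but the population figure and the planar coordinates (assumed to be already expressed in kilometers) are hypothetical, as are the function names.

```python
import math

def daily_demand_kg(population, kg_per_person=2.0):
    """Daily fresh food demand of a demand point."""
    return population * kg_per_person

def vehicle_loads(demand_kg, capacity_kg=2000.0):
    """Number of full or partial vehicle loads needed to deliver the demand."""
    return math.ceil(demand_kg / capacity_kg)

def travel_time_h(a, b, speed_kmh=30.0):
    """Travel time over the Euclidean distance between two points given in km."""
    return math.dist(a, b) / speed_kmh

# Hypothetical demand point with 3500 residents, 5 km from its front warehouse.
demand = daily_demand_kg(3500)
loads = vehicle_loads(demand)
hours = travel_time_h((0.0, 0.0), (3.0, 4.0))
print(demand, loads, round(hours * 60))
```

For this hypothetical point, the demand is 7000 kg per day, which requires four vehicle loads, and a one-way trip takes 10 minutes.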

Algorithm validity analysis
The model formulated in Section 3 is solved in MATLAB R2020a with both the traditional genetic algorithm and the improved immune optimization algorithm proposed in this paper. The optimal result is obtained from hundreds of runs in MATLAB. Figures 5 and 6 compare the convergence curves of the improved immune algorithm and the genetic algorithm.
The comparison between Figures 5 and 6 shows that the genetic algorithm obtains an optimal fitness value of 19.03 after a larger number of evolutions. The improved immune optimization algorithm in this paper establishes a dynamic retention mechanism to retain the best individuals and uses the Metropolis criterion to select individuals. This accelerates convergence: the algorithm first finds a local optimum quickly and then reaches the global optimum with fewer iterations. The global optimal fitness value is 13.8278, which shows the feasibility and efficiency of the algorithm in this paper. The convergence curves and the layout of the location schemes show that the genetic algorithm falls into a local optimum after a dozen iterations and produces site selection results with 9 front warehouses, whereas the improved immune optimization algorithm successfully jumps out of the local optimum and achieves the global optimum solution, producing location results with 7 front warehouses. Moreover, the location cost of the immune algorithm is 851,460 yuan, and that of the genetic algorithm is 1,037,230 yuan. Consequently, the results show that the immune algorithm outperforms the genetic algorithm in efficiently solving the location problem proposed in this paper. Table 4 shows the selected front warehouses, the demand points they serve and the total cost generated by the two algorithms.
According to the location results, front warehouses 7, 8 and 12 handle the majority of the distribution tasks. According to the maps of the 86 demand points and 15 alternative front warehouses (Figures 3 and 4), these three front warehouses are located in the central urban area, which is the most densely populated area. Because there is less available space in the central urban area to open a front warehouse, these front warehouses are overcrowded. According to field research, traffic in Handan city's urban area is in good condition, and small and medium-sized vehicles can pass freely with little congestion; therefore, even with more demand to meet, same-day or next-day delivery of fresh food can be achieved, and the solution solves the delivery problem of fresh food to communities.
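The reported figures can be put in relative terms with a short calculation. This sketch only rearranges the numbers stated above (location costs of 1,037,230 and 851,460 yuan, fitness values of 19.03 and 13.8278); the function name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of the improved value relative to the baseline."""
    return (baseline - improved) / baseline

cost_cut = relative_reduction(1_037_230, 851_460)    # genetic vs. improved immune algorithm
fitness_cut = relative_reduction(19.03, 13.8278)
print(f"{cost_cut:.1%}", f"{fitness_cut:.1%}")
```

On these figures, the improved immune algorithm lowers the location cost by about 18% and the best fitness value by about 27% relative to the genetic algorithm.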

Model validity analysis
To summarize, the model can be used to aid decision-making for fresh produce e-commerce front warehouses or distribution nodes in logistics networks, particularly when the decision objective is to minimize the cost of delivery system.

Conclusions
This paper investigated the location of new front warehouses opened by fresh produce e-commerce platforms to meet increased demand in the epidemic and post-epidemic era, considering infrastructure and operating costs, fresh logistics distribution and transportation cost and energy consumption cost. The improved immune algorithm and the genetic algorithm were both used to solve the actual case, and the results showed that the improved immune algorithm outperforms the genetic algorithm in terms of solution accuracy and results. The managerial implications are as follows.
The results of the case study show that, since the front warehouse serves last-mile delivery, a front warehouse in the main urban areas with a large population serves more communities and is utilized more efficiently. In contrast, the number of front warehouses in remote communities with smaller populations is small, but the coverage is wider, and the utilization rate and unit distribution cost of front warehouses in remote communities are higher. Therefore, e-commerce enterprises should take into account the population distribution of the community, the total demand quantity, the service radius and the comprehensive cost when locating front warehouses. If the unit distribution cost of a front warehouse is too high in communities with small populations, the enterprise should consider combining front warehouse mode with community group purchase and customer pick-up modes to reduce distribution costs and achieve profitability.
In summary, the model has broad applicability in the decision-making process for location selection in the pursuit of managing a sustainable operation distribution system.
The main limitations of this paper are as follows. First, the delivery time factors of fresh produce e-commerce operations are not taken into account, and the time window requirement of customer demand is not specified. Future research directions therefore include taking into account the cost of operation and the environmental impact of carbon emissions in the model formulation, as well as meeting customer time windows to provide better customer service.

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.