Estimating optimal substitution scale of urban gasoline taxis by electric taxis in the era of green energy: a case study of Zhengzhou City

ABSTRACT Electric Taxis (ETs) are the most favored alternatives to Gasoline Taxis (GTs) in cities that aim to reduce environmental pollution. How to develop a reasonable scale on which GTs are substituted by ETs remains a challenge to governments due to the dynamics and complexity of the taxi system. To address this challenge, this paper develops a discrete-event-based simulation framework to simulate participants in the system and estimate the results under different substitution scales, which are helpful to understanding the status changing law of entities under different substitution scales, such as the operating indices of ETs, the unsatisfied travel requirements of passengers, and the usage state of charging facilities. The framework abstracts the behavioral process of ETs into three elements, namely, entity, behavior, and event. The entities are constructed from the information derived from the trajectory data. The behaviors are defined by rules following behavioral logic under anxiety psychology, which is caused by the limited range of ETs. The events are triggered based on rules from reality. With the help of this framework, a multi-objective optimization model is developed to obtain the optimal substitution scale of GTs in the case study area of Zhengzhou City. Overall, the approach could provide a practical tool to address this challenge, which could support further studies of the effect of ETs on urban taxis.


Introduction
Urban transportation is facing increasing air pollution and a steadily diminishing supply of fossil fuels (Li et al. 2021b), which leads to increased attention to alternative renewable sources (Kaltschmitt, Streicher, and Wiese 2007;Liu et al. 2021a) and seeking green energy.The electricity of public transport (i.e.taxi, bus) is a critical step for the city government to reduce urban air pollution and decrease the risk of energy dependence on fossil fuels.Therefore, increasing the penetration rate of EVs in public transport is an important effort to achieve green energy.Electric Vehicles (EVs) have been viewed as effective alternatives to GTs (Wang et al. 2015;He et al. 2016;Kong, Ma, and Wang 2016;Shareef, Islam, and Mohamed 2016;Li et al. 2021b) and buses.In urban areas of developing countries like China, the total number of taxis is much higher than buses.It is a particularly important issue to promote ETs in urban areas.Many Chinese cities have implemented ETs aggressively, such as Beijing, Shenzhen, Shanghai, et al.These city governments offer many incentives to encourage taxi drivers to purchase electric vehicles.However, it is still not clear how charging activities affect taxi drivers' daily operational behaviors, which is of great concern to drivers, especially in cities with under-developed charging infrastructure, like Zhengzhou, Wuhan, Changsha, Hefei, Chongqing, etc., in China.The affections of charging issues are mainly related to the scale of the ET fleet.Also, taxi electrification is challenging and complicated owing to the various influencing factors involved, for example, the acceptance of ETs in the taxi industry, which in turn depends on factors such as the load of power systems, energy consumption, and operation mileage of the ETs (Liang, Correia, and van Arem 2016;Yang et al. 2016;Sun et al. 2017;Tu et al. 2019;Skippon and Chappell 2019).Any reasonable measure on it can bring positive social feedback and further promote taxi electrification (Azadfar, Sreeram, and Harries 2015).In addition, the process could bring a series of social problems, such as weakening the taxi fleet's ability to satisfy travel demands, reducing the income of drivers, and arousing public complaints regarding government decisions (Liu, and Liu 2016).Therefore, estimating the optimal substitution scale of urban GTs by ETs is still a critical challenge for the scientific decision-making of urban transportation agencies in under-developed cities.
Two categories of methods could be used to determine the number of ETs or EVs that can be accommodated in a city: static mathematical and dynamic simulation methods.In the aspect of static mathematical methods, the direct approach is to predict the future trend according to the historical changing laws of the number of vehicles.For example, the Gompertz model described the growth law of EVs to predict the number of EVs in the next decade (Li, Wang, and Zhang 2014).A supply and demand balance model (Lu and Wang 2004) determined the optimal number of taxis by considering taxi resource utilization.It considers a linear model to describe the relationship between the number of taxis and the influencing factors including mileage, empty-loaded rate, the number of passengers, operating time, and operating speed.In addition, the fluctuating taxi travel demands along with the associated costs (Yang, Wong, and Wong 2002;Liu and Wang 2004) are considered to improve the flexibility of this model.To improve the prediction accuracy limited by linear models, researchers employed a wavelet neural network (Yang, Hua, and Zhang 2012) to approximate the nonlinear relationship between the number of taxis and the influencing factors.Furthermore, a Gray model (Zhang and Liu 2015) was employed to predict the varying trend in the number of taxis, and its prediction error is then employed to train a back propagation (BP) neural network for improving prediction.In short, mathematical models abstract the growth law of the number of taxis into a set of static quantity relations with affecting factors.Future mathematical models could be improved by considering the dynamic and complex mechanism involved in the growth in the number of taxis and EVs (Teodorovic 2003).
Dynamic simulation methods (i.e.Monte Carlo, Multi-Agent Simulation (MAS), network equilibrium, and system dynamics) are promising to determine the number of taxis or EVs because they could consider the operating environment and behavioral characteristics of participants (Brodeur and Nield 2018;Wang et al. 2019), which can be formulated and virtually implemented by computers to estimate the potential and spatiotemporal effects on taxi services.First, the Monte Carlo and Markov chain methods are used to simulate the drive circle of EVs (Wang et al. 2019).For example, Monte Carlo (Hastings 1970) is used to select track pieces that are divided by the TP to form a realistic drive circle, which ensures that the velocity state with high TP is selected at a high probability.The Markov chain method (Gasparini 1997) is used to describe the characteristics of variations in the velocity of EVs through a Transition Probability (TP) matrix approach.Second, the MAS method (Lioris, Cohen, and Arnaud 2009) is used to simulate the taxi operation process under different fleet sizes and to support a detailed analysis in the simulation program.Moreover, MATSim, a commercial traffic simulation platform, is used to simulate the micro behavior of ETs in reality, which uses a dynamic algorithm to match a taxi and a passenger under different electric states (Bischoff and Maciejewski 2014).Since the time and location of vehicles could affect the coming load of the grid, the dynamic simulation methods are widely applied in revealing its spatiotemporal characteristics (Veneri 2017).For example, an adaptive electric vehicle charging model based on MAS (Xydas, Marmaras, and Cipcigan 2016) is proposed to obtain actual electricity demand, generate a pricing strategy for maximizing profits automatically, and simulate the EVs charging decision by rationally responding to new prices.Similarly, an agent-based simulation method (Dallinger and Wietschel 2012) is used to forecast the power price directly and to dynamically adjust the charging behavior of vehicles to reduce the fluctuation of the power grid load.An improved charging mode for EVs based on dynamic simulation (Rao et al. 2015) is introduced to develop induction strategies for different charging scenarios for improving the charging efficiency of EVs.And the scale of hydrogenation stations was investigated based on MAS (Grüger et al. 2018).Third, in the aspect of network equilibrium and system dynamics, a network equilibrium model (Yang and Wong 1998) is proposed to simulate the cruising behavior of taxis in a road network and is used to explore the service performance under different fleet sizes for given equilibrium state parameters.A two-layer model (Wong, Wong, and Yang 2001) was then proposed to improve the previous network equilibrium model by considering traffic congestion and variations in travel demands.Several system dynamics models (Li et al. 2019) were used to simulate and quantitatively evaluate the energy efficiency, carbon emission, and economy of ET systems under different pricing strategies.In addition, reinforcement learning algorithms (Gong et al. 2018) are also used to simulate the dynamic decision-making process of ET, to obtain the charging loads of Charging Stations (CSs) on both temporal and spatial scales.In short, these simulation approaches could generate the travel and charging demands of ETs for determining the number of taxis or EVs according to mathematical probability and the simplified topological structure of road networks.They could be improved by deriving actual demands and spatiotemporal characteristics from the real trajectory data of urban taxicabs in the determination process of the number of taxis or EVs.
The urban taxi electrification is limited to consider in the previous research works due to the application scenario of government, and the existing gap of optimization approaches or tools in considering some complicated factors like real taxi travel demands, facilities of power system, the balance between limited range and income of the ETs.It is still challenging for governments to develop a reasonable scale on which GTs can be substituted by ETs.To address this challenge, the research problem of this paper is to quantify the influence of the electrification scale on the state of the taxi system and further solve the optimal electrification scale of taxis, aiming to provide a basis for the electrification of urban taxis.In this study, there is a basic assumption that the approach to taxi electrification in urban areas is to replace gasoline cars with an equal number of electric cars.This paper proposes an approach by integrating the simulation of the ET's behaviors with a discrete event-based framework.Within this proposed approach, taxi trajectory data obtained by Global Positioning System (GPS) sensor is used to derive real spatiotemporal demands of taxis such as the starting and ending times of each trip taken in each taxi, time-varying locations of taxis, and distribution of travel demands (Paulista, Peixoto, and Jose 2019).A discrete-event-based simulation framework is introduced to simulate the passenger seeking/servicing/rejecting behavior and Charge Station (CS) searching/queuing behavior through a set of rules following the behavioral characteristics of ETs that differ from GTs (Zhao et al. 2014) while meeting the derived real demands of taxis.Based on this simulation framework, a dynamic simulation process is developed under the constraints of real road networks, the location, and state of Charging Piles (CPs), the running states of ETs, and real spatiotemporal traveling demands of passengers derived from GPS trajectories.Through this simulation process, this study could obtain the quantitative impacts on multiple participants caused by the number of ETs, such as the number of unmet travel requirements of the passengers, electricity consumption of ETs, and usage duration of charging facilities.At last, this study implements a multi-objective optimization model to balance the interests of participants based on the proposed simulation approach.This approach is helpful to determine a reasonable substitution scale for GTs by ETs for supporting the plan of taxi electrification by urban transportation agencies.Besides this, it could also provide a practical tool to support further studies on the ET market or intelligent transportation systems as well as the influence of ET on urban taxis or public transportation services.
The remainder of this paper is structured as follows: Section 2 introduces the proposed methodology including the simulation framework and system elements; Section 3 introduces the experimental design and simulation result analysis under different scenarios; Section 4 presents a multi-objective optimization model to optimize the number of ETs constrained by current electricity facilities.The conclusions of the study are drawn in Section 5.

The proposed approach
To estimate the optimal substitution scale of urban GTs by ETs, this study proposes an integrated approach, which consists of three steps, namely, i) deriving the real travel demands from taxi GPS trajectories (Keler, Krisp, and Ding 2019), ii) designing a discrete-event-based simulation framework to simulate (ET, passenger and charging facility) entities (operational and charging), behaviors and (passenger waiting, Origin-Destination (OD) accepting/rejecting, charging and waiting for charging) events, iii) solving the optimization problem of estimating optimal substitution scale with multi-objectives, for example, consumed power, waiting-time of charging, load pressure of charging facilities, the average daily income of ET.
The first step is to derive the real travel demands from taxi GPS trajectories.The starting and ending time of trajectory data collection is 00:00:00 to 23:59:59 on 1 July 2018.The data involves 8650 taxis, with a total of 6,814,976 track point records.Each taxi GPS trajectory is defined as a series of GPS points with attributes (vehicle ID, timestamp, longitude, latitude, speed, direction, and service state) (see Table 1).The service state is no service (0) or in service (1).The GPS trajectory point with changing service state from 0 to 1 is viewed as the start point of a trip serviced by this taxi.The following series of GPS trajectory points with changing state from 1 to 0 is viewed as the end point of this trip.As a result, the GPS trajectory point pair {start point, end point} is an actual passenger trip serviced by the taxi.All taxi trajectories are filtered by this step to derive all traveled trips by taxi passengers.The second and third steps are critical ones in the proposed approach, which will be described in the following subsections.

Simulating the ET's behaviors using a discrete event-based framework
The ET system could be viewed as discrete for its state changes at discrete points in time.Discrete eventbased simulation (Paulista, Peixoto, and Jose 2019) is a computational modeling approach, which is suitable to simulate the dynamic evolution of the ET system in a sequence of discrete points.The evolution of the ET system is based on a logic loop as shown in Figure 1.The loop revolves around three system elements (entities, behaviors, and events) and consists of four steps, namely, modeling interactions between entities, triggering behaviors related to interactions, triggering events according to behavior, and changing the state of entities after events.The proposed simulation framework based on the logic loop is given in Figure 2.This framework comprises four modules, namely data pre-processing, simulation, record, and analysis.The data preprocess module derives the information of taxis, travel demands, and charging facilities from raw taxi trajectory data and Point Of Interest (POI) data, which would be used in the simulation module.The simulation module, the core part of this framework, contains the special behavioral mechanism caused by ETs' characteristics such as range anxiety.The record module is used to record the necessary information when events occur.It also reclaims ETs after finishing the simulation and recycles unmet ODs that have not been served after a long wait.The analysis module supports the statistics of ETs including operation indices (i.e.income and mileage utilization), the service level to travel demands, the charging pressure of charging facilities, and the cost for ETs to find a CS.Within the proposed framework, a time synchronization mechanism called Double Clock (DC) is used to establish the time order among all the participants.The DC refers to the System Clock (SC) and Individual Clock (IC).The SC serves as the unique time reference in the simulation.The IC exists in each entity, which is responsible for recording the end time of the continuous activities.Thus, IC is always ahead of SC.For example, for an ET, IC records the end time of charging or carrying passengers, and the ET will be locked until SC is pushed to IC to avoid access conflict by other entities.
The following subsections introduce the details of the simulation module: i) constructions of entities, ii) interactions between entities and corresponding behaviors, iii) triggering events according to behaviors, iv) changing the state of entities by events in the framework.

Entities
Definition 1.An ET entity corresponds to an ET in the real world, which contains ET's necessary attributes.An ET entity is defined as: where VID is the unique ID of the ET.T s and T e refer to the time points when the ET enters and leaves the simulation program, respectively.L ET is the location of the ET (i.e.latitude and longitude).T ET ind is the individual clock for the ET.S taxi oc means the occupancy or service state of the ET (0 or 1).SOC refers to the remaining capacity of the battery.V ET s represents the current speed of the ET.M rem is the remaining millage with SOC.Definition 2. A passenger entity corresponds to an OD derived from taxi GPS trajectory.The entity is devised as follows: where ODID is the unique ID of the OD.L OD s andL OD e are the locations of the origin and destination.C OD mil and C OD t record the mileage and time cost of the OD.C OD m is the money that the passenger pays for the OD.T gen and T dis are the time points when the OD generates and disappears.T dis is calculated by T gen plus T ε w .Here T ε w is the waiting time length of passengers, which should be reflected in simulation because passengers wait for taxis within limited patience.In this study, T ε w is set to 12 min according to the study (Zhao et al. 2014).If an OD is not served by an ET after T dis , then it is regarded as unmet and gets removed from the simulation program by the record module.
Definition 3. Charging facility entities correspond to CS and its CPs.A CS is equipped with more than one CP.Here, CS and CP entities are formulated as follows: where CSID is the unique ID of a CS to link with its CPs.L cs is the location of the CS entity, which is shared by all CPs within it.T cp ind is the individual clock of a CP, usage state S cp occ = {0, 1} (0 means the CP is idle, and 1 means the CP is in service).P is the electric power of the CP.

Interactions between entities and corresponding behaviors
ETs interact with service facilities and public transportation in both the virtual space and physical spaces (Wu, Gui, and Yang 2020).In the simulating process, interactions are always initiated by ETs, and they occur in two scenarios.One is in the operating process, ETs in service interact with ODs and decide whether to serve them or not by forecasting the SOC after arriving at the L OD e .The other interaction occurs when ETs need to be recharged.These ETs interact with CPs to obtain the S cp occ and the distance to the CSs, and this information helps decide which CS to choose.
In the operating process, there are OD seeking, serving, and rejecting behaviors could be triggered.Seeking behavior is implemented by choosing a strategy to find an OD, and after that, the serving/ rejecting behavior is triggered to serve the OD or not.
Definition 4. OD-seeking behavior refers to strategies adopted by ETs to find ODs and get divided into moving and waiting in this study.Drivers adopting the moving strategy will go to the nearestL OD s of the OD, while drivers adopting the waiting strategy will stay in places.Which strategy to take depends on the decision process as follows: First, supposing the current system clock is T sys , the simulation module selects ODs generated before T sys and not gotten serviced yet (Equation ( 5)).Second, the path distance between ETs and ODs will be calculated using the Dijkstra algorithm constrained by the road network, to select the nearest OD for each ET (Equation ( 6)).Furthermore, the time cost to the nearest OD will be estimated according to the speed of the ET (Equation ( 7)).If the time cost is so long that the OD will disappear after the ET arrives, it will be viewed as the ODs near the ET are sparse, which would result in a low success rate for the moving strategy.Thus, the waiting strategy then gets adopted to save electricity.If the ET can arrive before the T dis and the travel cost M cost (km) is less than the acceptable distance MaxDis of the ET (km) extracted from trajectory data, the moving strategy will be adopted according to Equation ( 9).where T move is the time cost of moving, distance L ET ; L OD s À � represents the path distance between the locations of ETs and ODs, T arrival is the expected arrival time of the ET to the nearest L OD s of the OD.
Definition 5. OD choosing behavior (OD serving/ rejecting) refers to serving the OD or rejecting it.Generally, rejecting ODs is illegal for GTs, but it is common among ETs.According to the survey of ET owners, 80% of them stated the experience of rejecting ODs (SMN 2018).And the main reason for this is the anxiety about SOC (Noel et al. 2019), it happens when the remaining SOC is lower than a psychological threshold.Moreover, the distance to the nearest CS of the L OD e also affects whether to serve the OD or not when the ET is in a low SOC.In the paper (Tian et al. 2014), it is demonstrated that the last OD accepted before the ET is charged is always near a CS, and drivers reject ODs that are inconvenient for recharging when the SOC is low.Rejecting ODs would prolong the waiting time of passengers and lead to higher cost and lower operating efficiency of ETs, thus it should be reflected in the simulation.
Therefore, the simulation rules of OD choosing behavior consider the anxiety about SOC and the charging convenience around L OD e of ODs.Firstly, the degree of driver's anxiety is divided into three levels based on SOC, namely, no anxiety, anxiety, and risky to shut down.To set the SOC break points of these levels, it is realized that the SOC of ETs is evenly distributed between 30% and 50% before charging (Bi et al. 2017).This demonstrates that drivers do not consider charging when the SOC is above 50% because they have no anxiety; however, they generally charge the ETs before the SOC is below 30% because this would be risky to shut down.Therefore, the anxiety levels of drivers are summarized in Table 2.For the first level, drivers choose ODs without considering the charging problems.For the second level, drivers select ODs prudently considering SOC.For the last level, drivers solely consider the charging problem and no longer serve ODs.
The flow of ETs' behavior in the operating process including OD seeking and choosing behavior is shown in Figure 3.When choosing an appropriate OD for an ET, the first to consider is whether the distance of the OD is beyond the max acceptable distance MaxDis of the ET.If so, the ET rejects the OD; otherwise, the continuing judgments are executed: If it is estimated that SOC >50% after reachingL OD e of the OD, it would not cause anxiety, and the ET accepts the allocated OD.If the distance of the OD is extremely long resulting that the ET would be risky to get out of service (SOC < 30%) after finishing the service, the driver then refuses the OD.If 30% < SOC < 50%, the driver will further evaluate the charging convenience L OD e of the OD.If SOC after the service is sufficient enough to move from L OD e of the OD to the nearest CS, the driver will accept the OD; otherwise, he/she will refuse it.
When ETs need to charge, the CP searching and CS seeking behaviors could be triggered.The former aims to find an idle CP for the ET.If it fails, the latter would be triggered to choose an appropriate CS for the ET to wait in queue.The following part defines these behaviors individually.Definition 6. CP searching behavior refers to the process that ETs search for idle CPs.In simulation, these ETs are able to get the location of the nearest CS with idle CPs, which is already realized in the real world.If the CS is too far to reach before the SOC falls to 30%, the CS-seeking behavior then be triggered.
Definition 7. CS seeking is defined as the process of choosing an appropriate CS for the ET, within which the waiting time length is considered.ETs go to the nearest CS first, if there are already several ETs in the line at the CS, the driver would make a comparison between the number of queuing cars (N qc ) and CPs (N cp ).If N qc < N cp , the waiting time for the ET in the current CS will not exceed the charging time of one car, the ET would choose the CS.Otherwise, the ET would go to the next CS nearly, which would be repeated until an appropriate CS is found or the SOC of the ET is not sufficient to reach the next CS.The flow of behaviors above is shown in Figure 4.

Events triggered by behaviors
Events would be triggered by specific behaviors to change states of relevant entities.There are 5 events considered in this framework, namely, passenger waiting, OD accepting, OD rejecting, charging, and waiting for charging events.These events are described in detail below.
Definition 8.The passenger waiting event is triggered when an OD is served by an ET or is reclaimed by the record module after the system time is beyond T dis .Meanwhile, the necessary information about the event would be recorded as Equation ( 10): Where t p wait is the time cost of passenger waiting calculated by Equation ( 11), and the T sys refers to the system time when the event occurs.
Definition 9.The OD accepting event is defined as that an ET serves an OD.This occurs simultaneously with the passenger waiting event.The attributes of the event as Equation ( 12) shows.
where T OD s is the system time when the event occurs, and the T OD e is the end time obtained by forecasting through Equation ( 13).C OD mil =V ET s refers to the duration of the OD allocated to the ET.Definition 10.The OD rejecting event is the opposite of the OD accepting event.Both of them are results of OD choosing behavior.And this event is triggered when the ET considers the OD is not appropriate to serve and is formed as Equation ( 14).
Definition 11.The charging event is defined as the process that ET accesses the power grid to charge.The attributes are as shown below: where the T charge s is the system time when the ET accesses the CP.The t charge is the time cost for charging and is estimated by Equation ( 16) where the charging process is considered as a linear process: where k denotes the adjustment coefficient of the target SOC for charging.M is the driving range under full power; Q 100 is the power consumption of 100 km, which is 19.5 kW•h in (Autotimes 2016).As most ET batteries are lithium batteries, it is conducive to extending the battery life by charging and discharging between 25% and 75% (Meng et al. 2016), so this study considers 75% of the battery capacity as the charging target SOC, meaning that k = 0.75.
Definition 12.The waiting event for charging is defined as the whole process of ET from joining in a waiting queue to recharging.The event is formed as below: Where T queue s is the system time when the ET joins in the waiting queue of the CS and T queue e is the time when the ET begins to recharge.N records the CS searching times before the ET joins the waiting queue in the CS.

Changing the status of entities after events
When an OD is accepted by an ET, the passenger waiting event gets triggered, and the served OD would be removed from the OD list.Simultaneously, the serving event occurs, and the status of ET is updated as follows: the S ET oc switches to 1, and the M rem is reduced by subtracting the Distance of the OD, besides, the L ET is updated as the L OD e of the OD, and the T ET ind would be pushed forward to the T OD end .The Speed is updated by the average speed of the OD, which is derived from the GPS data and able to reflect the actual traffic condition at that time.When an OD is rejected by an ET, the rejecting event then gets triggered, and the ET would seek for another passenger at next system time, and the T ET ind is updated as Equation ( 24) shows to ensure the ET would be called next time.
When an available CP is assigned to the ET, the charging event would be triggered.Both the ET and CP are locked until the charging ends, the S ET oc switches to 1, the T ET ind and T cp ind are pushed forward to the end of the charging, and the M rem is updated as pre-set value.
When an appropriate CS is found for the ET to join in the waiting line, the waiting event for charging would be triggered and make the ET get registered in the waiting queue of the chosen CS.The whole simulation flow is depicted in Figure 5.The specific steps are described as follows: (1) When the simulation starts, the elements including the road network, system clock, and all entities are first initialized.
(2) It is determined whether to terminate the simulation.
(3) The members of ET and passenger entities are updated.The external entities activated by the T sys will join in the simulation from the preprocessing module.The internal entities that have completed the simulation are reclaimed by the record module.(4) The status of the entities is updated.If it is time for some ETs to finish the serving of passengers, then these ETs are unlocked (S ET occ ¼ 0Þ:If it is time for some other ETs to complete the charging process, then S cp occ is changed to be available.And the ETs are unlocked and their SOC is updated. (5) Vacant ETs with sufficient SOC would trigger operational behaviors to serve passengers.As a result, the OD accepting or rejecting event is triggered.Once vacant ETs remain low SOC, they have to trigger charging behaviors to replenish electricity.The charging or waiting event occurs depending on the available CPs.
As for ETs that have been waiting in line at a CS, the CS would identify if there are any CPs to be released.All ETs are sorted by their waiting start time in ascending order.They are matched to the available CPs according to the first-in-first-out rule.At the same time, the charging events are triggered.As for the ET seeking for a CP, if there are available idle CPs, the ET then goes charging directly and the charging event then happens; otherwise, the ET would search for an appropriate CS to wait in line and the waiting event then occurs.(6) All the events that occurred in step 5 should be recorded.(7) The system clock is updated and the process returns to step 2.

Optimizing the substitution scale with multi-objectives conditions
It is a multi-objective optimization problem to optimize the substitution scale of urban GTs by ETs.By considering the ETs' mission of saving energy, the function as public transport, and the greatest concerned charging issue, the optimal objectives are set up to maximize saved gasoline Q T by ET fleet, maximize its service level S ET , and minimize the waiting duration for charging T wait (Equation ( 28)).In addition, the revenue of ETs' R ET should be about the same as GTs' R gasoline , which is considered as the constraint condition.Maximize: Subject to: where Q T is the total amount of gasoline that should be consumed without substitution, which is calculated as Equation ( 30), N ET is the substitution scale.The q gasoline avg is the gasoline consumption of a GT calculated by the product of fuel consumption per 100 km q 100 and daily driving range M: Refer to BYD F3, a common type of GT, q 100 is set as 5.9 L/100 km.M is set as 257 km according to statistics of its operating data.S ET is defined as the rate of served travel demands by ETs and GTs under the same fleet size (Equation ( 32)).
T wait is the total waiting duration of ET fleet in a day, which reflects the service pressure of charging facilities (Equation ( 33)).R ET (Equation ( 34)) is a function of N ET describing the relationship between the average profit of ETs r ET (Equation ( 35)) and the ET fleet size.
The changing regulation of S ET with N ET is shown in Figure 6.In Figure 6(b), as N ET increases, the competition on travel demands gets more intense, and S ET declines.There appear two break points because of the charging issue.Before the first break point N bp1 , N ET is relatively small and the charging demands could be met well, so that the operation process is almost unaffected and S ET maintains a high level.After N bp1 , some ETs that need to be recharged must wait for idle CPs, which causes S ET declines until the system is saturated at the second break point P bp2 .When N ET � N bp2 , the system is saturated and S ET stays at a low level.The piecewise regression function between S ET and N ET is shown as Equation (32) where N bp1 ; N bp2 ; α; β; k; b are undetermined parameters.By the first derivative of S ET , N bp1 andN bp2 are obvious to find (see Figure 6(a)).
T wait is related to the number of CPs and the layout of CSs, and these factors are taken into consider in the simulation framework.According to the simulation results, T wait stays at about 0 until N ET grows to the threshold N bp3 , and then T wait begins to increase as shown in Figure 7 Charging issues cost amount of time, so the operation efficiency is reduced, which is adverse to r ET .On the other hand, ETs have a lower travel-mile-cost, and it is an obvious advantage to increase r ET , which is formulated as Equations ( 35)-(37).In Equation ( 36), M 0 is the base mile of ETs, i 0 is the relative base fare, and i 1 is the price per kilometer beyond M 0 .In Equation ( 37), e 100 is the electricity consumption per 100 km of an ET, and Price e is the unit price of electricity.All of the parameters above are set as Table 3 shows.
As shown in Figure 8, R ET declines with the number of ET growing to the system saturation point N bp2 .The piecewise regression function between R ET and N ET is shown as Equation ( 33) where a 1 ; b 1 ; c 1 ; k 1 ; b 2 are undetermined parameters.Taking R ET � R gasoline as constraint condition refers to there is an intersection point N is of R ET and GTs' revenue before N bp2 , thus the feasible section of To solve multi-objective optimization problems, Multi-Objective Particle Swarm Optimization (MOPSO) (Coello and Lechuga 2002) is commonly adopted, which is used to solve the model.MOPSO adopts random mechanisms to update the particle location and realize the global search.This study implements this algorithm by taking N ET as the location of a particle to fit in the proposed MOOP model.The difference from MOPSO is that this study used the following strategy to avoid premature in the computation iteration of finding optimal substitution scale.where, t is the number of iterations, v i t ð Þ is the speed of the particle i after the update of the t iteration, v i t À 1 ð Þ is the speed of the particle i of t À 1 generation, and ω is the inertia weight.c 1 and c 2 refer to the individual learning factor and the group cognition factor, respectively.x i is the current position of the particle i and is described in this paper as the substitution scale of taxi electrification.x i t À 1 ð Þ is the position of the particle i of t À 1 generation.x t i pbest is the historical optimal solution found for the particlei up to generationt.x t gbest is the historical optimal solution found by particle swarm search up to generation t.Here, the randf is a random function.

Study area and data
The study area (Figure 9) is Zhengzhou City, China.The data used in the simulation include city-wide road network datasets, daily GPS trajectory data of all 8650 taxicabs, and POI data of CS.Road network dataset comprises four kinds of the road segments (expressway, trunk road, sub-trunk road, and branch road).
The trajectory data (July 1-15 September 2018) record the location of taxis and other attributes versus time by a series of points.The attributes include VID, V ET s , direction, and S ET occ .The direction is indicated by azimuth, which is an integer between 0 and 360.The CS POI data involve the spatial distribution of CS and the number of CPs each CS contains, as shown in Figure 10.
The datasets mentioned above were preprocessed at first.A network dataset in ArcGIS based on road network data was built.The network dataset is used to calculate the path distance between the entities in the simulation.The trajectory data are categorized into files according to the license plate number.Each file corresponds to the trajectory information of one taxi.These 8650 files help extract information regarding the  operation of each taxi and the orders they complete to form the taxi and OD pools.The operation information includes the starting and ending time of taxis' operations, which will be used to control the entry or exit of the ET entity from the simulation program, as well as the starting position of the taxi, which determines the position of the ET entity to be initialized in the simulation system.Compared with the random distribution, this real spatial distribution of ETs is more conducive to improve the simulation accuracy.The extracted order information includes the origin and destination, time and mileage costs, and income of the order.At last, the unique IDs to CSs and CPs were assigned for identification, and the inclusion relationship between them was established.

System initialization and parameter settings
The initialization sets up an original state for the simulation, which involves the simulation environment and entities.First, the network analysis tool is activated to initialize the road network, which is used to plan the moving route of ETs.Then, charging facility entities are established according to their spatial position and quantity.And ET entities are set up by selecting fixed amount taxis from the initial state pool of taxis randomly.And this state pool is built up by the real operation data of GTs, which contains the initial position and operation state of these taxis.As for ETs' initial battery state, we suppose ETs are fully recharged.Simultaneously, passenger entities are initialized in the corresponding position.
Table 4 gives the setting of parameters involved in the simulation program.The starting time of the system clock is 00:00, the ending time is 23:59, and the time step is 1 min in this simulation.The starting and ending times are set with limit of a day.The driving range of the ET is set to 240 km by referring to the performance parameters of BYD-E6 in the literature (MAIGOO 2018).The charging power of the CPs is set

The optimize the substitution scale in simulations
In order to explore the range of the optimal substitution scale, by using the concept of linear fitting, this study conducted some simulation under the constraint of N ET growing from 0 to 8000 at the interval of 1000.The result shows the market is already saturated when N ET reaches 3000.To explore the changing rules of system status among range [0, 3000], further experiments were conducted at the interval of 100 within this range.Therefore, 30 kinds of substitution scales are simulated.Based on the simulation results, the parameters without dimension of the proposed MOOP model (see Table 5) are figured out according to the linear fitting approach.In this study, the average revenue of GT R gasoline is about 395 CNY.According to the constraint R ET � R gasoline , the value range of N ET is solved as [0,811] through Equation ( 32).Supposing the OSS as N � .When solving N � with the MOPSO algorithm, the number of iterations is set to 150, and the number of effective solutions reserved for each iteration is set to 200, of which the average value N i is considered as the temporary optimal value for iteration i.Further, as the number of iterations increases, N i tends to stabilize (Figure 11).According to the results, this study finds the final solution of this model as that N � = 778.

Operating indices of ETs
This study used six indices to describe the operation of taxis, including OD number, income, total mileage, loaded mileage, occupied time, and the mileage utilization ratio of a taxi.These indices are calculated for GTs and ETs individually.The results in Figure 12 show that the indicators of GTs only fluctuate slightly with the increase in fleet size, and the operation performance is relatively stable.For GTs, they could complete approximately 27 orders every day on average and have the income about 488 CNY.Their total mileages per taxi are about 252 km per day.Within the total mileage, the loaded mileage is 183 km, and the serving time is about 383 min.The mileage utilization rate is about 0.7.As for the ETs, all indices decrease with the increasing number of ETs.Among these indices, the OD number, income, total mileage, and loaded mileage are below those of GTs because the running range is limited and the charging consumes a long time.Moreover, the loaded time of ETs is the same as GTs and even larger under the scenario of a small fleet size because ETs tend to select shortdistance ODs for reducing the risk of stalling.These ODs are distributed mostly in urban areas with high crowd mobility and dense traffic.Therefore, it leads to a lower speed and longer loaded time of ETs.Furthermore, the mileage utilization of ETs is generally higher than GTs, which is due to the order allocation algorithm for ETs following the proximity principle, a common strategy adopted by ETs in reality to avoid power wastage caused by extremely long moving distances.In contrast, fuel consumption lightly contributes to the decision taken by GTs, which enables them to obtain objective income even though the mileage utilization is low.The comparison of mileage usage rate demonstrates that ETs are optional for ODs.Finally, the results of all indices present a downward trend with the increasing number of ETs, and the rate of decline gradually decreases.As the number of ETs increases to about 3000, these indices tend to be stable and ETs in the market are in a saturated state.

Unmet travel demands of passengers
Unmet travel demands indicate ODs that are served by GTs under practical scenarios, however, are not met by ETs in the simulation.Here, we analyze this index under two situations, the saturation state of ETs in the market and the complete substitution state of GTs by ETs, to satisfy the requirements of decision makers.The ET fleet size corresponding to the saturation state is about 3000.The study area is divided into 100 traffic zones by main roads, as shown in Figure 13, for further revealing the spatial

Usage state of charging facilities
The Usage rate R cs occ describes the busyness of a CS on the timescale of one hour.And R cs occ is calculated by Equations ( 37) and ( 38).Relevant usage rates of CSs in one day under optimal substitution scenario (ET fleet   It is obvious that CSs are much busier in scenario (b).And almost all the CSs reach high R cs occ at 6:00 and continue until 22:00 because charging demands generated by 3000 ETs grow fast beyond the serving ability of CSs.As for scenario (a), the charging peak period is about 15:00 to 18:00.Many CSs only show slight busyness all day, and some CSs are busy only within charging peak period, reflecting that they are totally able to handle these charging demands.Moreover, it is noticed that No. 28 CS is always the first to get fully occupied and lasts for the longest time in two scenarios, which is because this CS is only equipped with one CP and needs to be expanded.
To quantify the charging pressure on the power grid brought by the ET fleet under the optimal substitution scenario, the stacked area chart of power consumption is shown in Figure 19, where the outermost contour exhibits the varying trend of total power consumption.The first peak appears at 6:00 ~ 7:00, this is the end of the first battery cycle of some ETs.These ETs prefer operating at night.And the battery cycle is defined as the period from an ET start operating with full-battery until it needs to recharge.After 13:00, the power consumption rapidly increases, peaking at 840 kW•h at 15:00.These charging demands are generated by the main body of ETs.They start operating in the early morning and need to recharge at noon.Then, the charging heat decreases gradually, reflecting that the charging facilities can absorb these charging demands.

Charging queuing behavior of ETs
The searching times before the ET find a suitable one reflecting the sufficient degree of charging facilities relative to charging demands.Here, the frequencies of different searching times under various ET fleet sizes are compared in Figure 20.We can see that when the number of ETs is 500, all the ETs could directly access a free CP when they need to charge.As the number of ETs grows, the searching times for a suitable CS increase significantly.Moreover, the increment of the frequency corresponding to different searching times varies considerably.To find out the reason behind this, this study adopted the natural discontinuity point method to divide the searching times into three levels according to their increment of frequency as follows: low-value (0-3), median-value (4-7), and high-value (>7).These three levels refer to the charging difficulty from low to high.The frequency in the low-value area is mainly limited by the number of charging facilities.Thus, it does not vary considerably, irrespective of the number of ETs.The frequency in the high-value area is limited by the residual running range of ETs, which is rarely sufficient to search for CSs more than seven times.Therefore, this study concludes that the standard to measure the adequacy of charging facilities should depend on if a suitable CS could be found within three searching times.

Optimized results under different CS layouts
The proposed method firstly simulates the behaviors of the ET fleet, and further figures out the OSSs by the MOOP model.The case study applies this method under a fixed CS layout.This method is available for different CS layouts, which is practical for decision makers to understand the rationality of CS planning schemes.For example, according to the "Zhengzhou 13th five-year plan for electric vehicle charging infrastructure development" (ZZPG 2015), hereafter referred to as plan, we get information of CSs from 2016 to 2020, including the completion time, the number of CSs and CPs, and the locations of CSs.The quantities and relevant layouts of charging facilities from 2016 to 2020 are shown in Table 6 and Figure 21 respectively.
The growth of charging facilities could enlarge the market capacity of ETs.To further explore the OSSs  under CS layouts from 2016 to 2020, we formulate and solve the relevant MOP models based on the method proposed in Section 2.2.And the iterations of OSSs from 2016 to 2020 based on the MOPSO algorithm are shown in Figure 22.The comparison of OSSs is drawn in Figure 23.Firstly, we can see there is a significant increase in OSS because of the growth of charging facilities.Then, the relationship between OSS and charging facilities is unlinear, which is because not only the quantity but also the spatial distribution of CSs also impacts the charging efficiency of the system.Moreover, charging facilities in plan until 2020 seem still not sufficient to realize the complete electrification.To sum up, this application on OSS under different CS layouts shows that the method proposed in this paper could supply a long-term support for government decision-making.

Comparison with a mathematical optimization model
This study chose a mathematical model as the reference model (Sathaye 2014), which is developed to describe the relationship between system states and ET fleet size.As the influencing factors in this model are similar to those considered in the proposed model of this paper, the two models are highly comparable.The reference model takes the cost as the optimization objective as Equation (39), including the management cost of taxi companies, the time cost of passengers, the construction cost of charging facilities, and the distance cost of taxis.The reference model supposes that the ETs should be in one of the three states (empty-driving, passenger-carrying, and charging), and the number of vehicles in each state is stable when the system is in equilibrium.Therefore, the number of ETs in each state could be firstly solved and then add them up to get the total number.Therefore, the reference model is mathematically expressed as Equations ( 39)-( 49) (Sathaye 2014).( 42)  where the parameters involved in the model are explained in Table 7, and the values are set considering the specific situation of the research area.
This study adopts the exhaustive method to list all of the M E and the corresponding cost under the optimal taxi fleet to elect the optimal M E with the minimum cost.Table 8 shows the distribution of the number of ETs in the three kinds of state solved by the reference model.
To make a further comparison, this study takes three indices, including the empty-driving ratio, the gasoline saved, and the utilization rate of CPs, to represent the system state under the two substitution  solutions as Table 9 shows.The actual empty-driving ratio of taxi fleet is about 60% according to the taxi trajectory data, and this index of the proposed approach is about 54%, while the index of the reference model is about 10%.Therefore, the proposed approach shows a more moderate and reasonable description of the ETs' system.On the other hand, the saved gasoline of the two models is 11,018.7 L/d and 4380.8L/d, respectively, demonstrating that the proposed approach would bring much more benefits to energy and the environment.In the aspect of the utilization of CPs, the two models are similar.In short, compared with the reference model, the proposed approach shows a more reasonable abstraction of the ET's system, and its solution brings more benefits to resources and the environment.Moreover, the proposed approach is suitable for dynamic application scenarios and could show more details of the system both in spatial and timescales, while mathematical modeling methods under a static assumption are weak in considering spatiotemporal factors.

Conclusion and discussion
This paper presents an approach based on the discreteevent to simulate the behaviors of ETs in the market, and eventually to estimate the substitution of GTs by ETs under various scales.This approach considers real datasets, including road network, GPS trajectory data, and CS POI data, to construct a realistic virtual scene for the simulation, which is easily expanded and updated.The contribution of the paper could be summarized as follows: (1) a discrete-event-based approach is proposed to simulate the spatiotemporal behavior of ETs, which supports to obtain the quantitative impacts on multiple participants caused by the number of ETs.
(2) the status changing law of entities under different substitution scales in the study area, such as the operating indices of ETs, the unsatisfied travel requirements of passengers, and the usage state of charging facilities are revealed, which provide practical guidance to the decisionmaking of the optimal substitution scale, and (3) a multiobjective optimization model is proposed to optimize quantity of ETs in cities, which help to find the optimal substitution scale of GTs by ET.This proposed approach provides scientific support for policy-making, exhibiting crucial practical significance in the current context of taxi electrification.Moreover, the proposed approach could be used in many other application scenarios, such as identifying a better CS planning scheme by replacing the input datasets of the road network or charging facilities.Based on the spatiotemporal distribution of charging demands obtained by simulation, studies on CS location and capacity determination could be conducted.Further, the system state under the coexistence of multiple vehicles can also be simulated by considering the behavior of other types of vehicles.
The simulation framework proposed in this paper can provide a general tool for different cities to solve the problem of taxi electrification-scale optimization and provide support for decision makers.In order to achieve this goal, on the one hand, the simulation framework has great flexibility in design and provides many adjustable parameters.The entities' attributes, the triggering conditions of behaviors and events can be set according to the specific situation of the city.On the other hand, input data of the simulation framework are easy to obtain, mainly including network analysis data set generated based on urban vector road network data, gasoline taxi track data, longitude and latitude of charging stations and the number of charging piles.
While this study provides a practical tool to support the policy-making for the urban taxi electrification problem, there are still several further improvements.First, from the perspective of travel demands, only ODs that have been satisfied in real life are considered in the simulation.Those unmet ODs get neglected because of the difficulty in collecting the necessary data.Second, as for simulation rules, it is necessary to supplement the nonoperational behavior of taxis, such as dining and shift taking.Further, attention should be paid to the impact of such behaviors on drivers' operation and recharging decisions, which involve the experience of the drivers.Third, the simulation does not consider subsidy policies for ETs, which plays a vital role for promoting the popularization of ETs at the present stage.Fourth, owing to the lag in the charging facilities, taxi electrification should still be further improved.In other words, ETs and GTs will still coexist for a long time.Therefore, an effective business model for ETs to lessen their disadvantages in market competition should be designed, and the effectiveness of the model through simulations should be verified.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Figure 1 .
Figure 1.The logic of ET simulation.

Figure 4 .
Figure 4. Simulation flow chart of CS searching behavior.
(b), and P bp3 is determined by the first derivative of T wait as shown in Figure 7(a).The piecewise regression function between T wait and N ET is shown as Equation (33) where N bp3 ; μ; σ; a 0 ; b 0 ; c 0 are undetermined parameters.

Figure 6 .
Figure 6.The changing regulation of S ET under different fleet sizes.

Figure 7 .
Figure 7.The changing regulation of T wait under different fleet sizes.

Figure 9 .
Figure 9. Study area and the road network.

Figure 8 .
Figure 8.The changing regulation of R ET under different fleet sizes.

Figure 11 .
Figure 11.Iteration of optimal substitution scale of taxi electrification.

Figure 12 .
Figure 12.Comparison of operation indices under different ET fleet sizes.

Figure 14 .
Figure 14.Temporal distribution of travel demands.
size is 778) and saturation scenario (ET fleet size is 3000) are visualized as Figure 18(a,b) shows, respectively.And the shade of the color indicates the utilization rate.where T cp occ represents occupied time of CP within one hour, R cp occ is the time utilization of a CP, and n is the number of CPs contained in a CS.

Figure 15 .
Figure 15.Spatial distribution of travel demands.

Figure 16 .
Figure 16.Spatial and temporal distribution of unmet travel demands.

Figure 17 .
Figure 17.Distribution of business districts.

Figure 18 .
Figure 18.The usage of CSs under optimal substitution scenario and saturation substitution scenario.

Figure 19 .
Figure 19.Css loads in temporal and spatial scales.

Figure 20 .
Figure 20.Frequency for each CS searching times in various electrification scale.

Figure 22 .
Figure 22.Iterations of OSS for 2016 to 2020 based on MOPSO algorithm.

Table 1 .
An example of taxi GPS trajectory for identifying serviced trips by taxi.

Table 2 .
Anxiety scale of the driver.

Table 4 .
Setting of parameters involved in the simulation program.

Table 5 .
Parameters of the MOOP model derived from simulations.

Table 6 .
The comparison of the quantity of charging facilities.

Table 7 .
Parameters mentioned in the model.

Table 8 .
Optimal number of the reference model.

Table 9 .
Comparison of the two solutions.Empty-driving ratio (%) Gasoline saved (L/d) Utilization rate of CPs