Optimal placement and sizing of charging infrastructure for EVs under information-sharing

In the next decade, charging demand from an increasing number of electric vehicles will require the charging infrastructure to be further developed. However, planning an optimal charging infrastructure is a complex problem, as it involves the representation of charging demand in space and time, interaction with supply through queuing models, and optimisation of the placement and sizing of charging stations. The paper takes on this challenge by proposing how trip diaries can be used to develop a space–time demand simulator for electric vehicle movements and how this simulator can be integrated with models for optimal locations of charging stations. In the paper, charging demand is integrated with an information-sharing system, which passes waiting time predictions from the system to the users. An approximation of expected waiting time, depending on generic station-specific inputs, is derived from queuing theory. The methodology is applied to the city of Copenhagen, and it is found that information sharing leads to better utilisation of charging capacity. Even in a situation where 50% of the population share information, the system performance is almost on par with a situation where all agents are informed. The paper underlines the need for information sharing in the planning of future charging systems.


Introduction
The planning of charging infrastructure and the planning of road networks involve many of the same challenges. Both systems represent interactions between demand and supply, which, if demand exceeds capacity, imply congestion. In the route choice literature, it has been widely recognised that 'selfish routing' renders unnecessary queuing in road networks (Roughgarden and Tardos, 2002; Li et al., 2017). Better solutions can be obtained if users are provided with intelligent route guidance, where users are given different route options. This causes a spreading of demand, which in turn lowers congestion. It is here argued that for the charging of EVs, the problem of concentrated demand is even more serious, and the effect of being able to spread demand is therefore potentially larger. The main reason for this is that, as opposed to route choice problems where the number of reasonable alternative routes is often limited, the battery flexibility of EVs combined with a diverse charging infrastructure provides many good alternative charging opportunities for users.
In the paper, we investigate this hypothesis by presenting a model framework consisting of four main modelling stages. In the first stage, charging demand is simulated based on a microscopic space-time model. The model is essentially formulated as a sampler from a large-scale trip diary. In the second stage, demand is linked with a given charging infrastructure. This requires modelling of the queuing dynamics at the level of the individual chargers. The third stage involves the prediction of a waiting time distribution that can be shared across the system. This approximation is derived from queuing theory and is shown to be effective as a means to reduce waiting time in the system. In the fourth and final stage, the demand-supply system is embedded in a black-box optimisation framework to explore optimal placement and sizing of chargers. The system is analysed for a realistic case study in the city of Copenhagen, and conclusions are drawn as regards the value of information, the robustness of the solution approach and the general implications of the findings.
The paper is organised in the following way. First, we provide a literature review in Section 1.1. Then we present the methodology in Section 2. In Section 3 we describe a large-scale application for the Copenhagen region and offer a discussion of results in Section 4. Finally, in Section 5 we offer a conclusion and a discussion of future research.

Literature review
The design of charging infrastructure for EVs has been approached from mainly two different methodological areas. In the operational research literature, the challenge of designing an optimal charging supply has been approached as a location problem, where the P-median location problem (Hakimi, 1964) is closely related. The formulation of the problem is equivalent to placing a constrained number of facilities at $P$ locations such that transportation costs for a set of travelling customers are minimised. A first formulation of the location problem as a P-median problem is considered in Hengsong et al. (2010). It is here proposed to formulate the problem as a multi-objective optimisation problem that accounts for such factors as electricity supply, priority with respect to the transformation of existing gas stations to charging stations and EV adoption. The algorithm is applied to the area of Chengdu in China. Another related approach is that of set covering models. The introduction of set covering models to cover a maximum number of origin-destination (OD) pairs in flow-capturing allocation models is considered in Kuby and Lim (2005). The maximum coverage model is later extended by Wang and Lin (2009) and Wang and Wang (2010) to represent a flow-based set covering model where different types of chargers and facility budget constraints are allowed. In these types of models, the location problem is formulated as a mixed integer problem, such as in Wang and Lin (2013). While these can be solved for small-scale problems, the problem is NP-hard and has inspired researchers to consider relaxations and alternative heuristics, such as in Kadri et al. (2020), who use genetic algorithms.
https://doi.org/10.1016/j.techfore.2022.122205. Received 4 February 2022; Accepted 15 November 2022.
In Xiangyu and Rui (2020) the maximum set covering model is combined with queuing models of the M/M/s type. Hence, arrivals are assumed to follow a Poisson process with exponentially distributed inter-arrival times and service times. The model considers reservation services as a means to reduce waiting time. Another interesting contribution is due to Yıldız et al. (2019). While their approach is closely related to the set covering type of models, it applies a stochastic programming approach to allow for heterogeneous demand, account for uncertainties in multiple factors, and be able to integrate with other systems without suffering from aggregation bias. Another recent contribution is that of Tran et al. (2020), who use a cross-entropy based method to solve the charger location problem as a bi-level optimisation problem, first by minimising the overall system cost and secondly by solving a lower-level assignment equilibrium problem. A common characteristic of models from the operational research community is that user charging behaviour is formulated in simplistic ways and often without allowing for systematic randomness. Moreover, to be able to present closed-form MIP formulations, there are limitations as regards the formulation of demand and the corresponding queuing systems, as presented in Xiangyu and Rui (2020).
Another way of approaching the problem is to use simulation-based techniques. These studies can be divided into meta-simulation models and true bottom-up simulations. The first type of model represents demand and supply at semi-aggregate levels, such as in Levinson and West (2017). While these models are often comprehensive in their scope and the markets they integrate, they are simplistic in terms of geography, queuing behaviour and charging behaviour. Other examples include Márquez-Fernández et al. (2019) and Gjelaj et al. (2020). In the latter case, the charging infrastructure is integrated with the power grid; however, as opposed to a bottom-up approach, demand is simplistic. For simulation models of the bottom-up type, the idea is to simulate demand for every agent for a given charging infrastructure. By monitoring system performance (e.g., waiting time and detours) over a period of time, based on interactions between the demand simulator and a disaggregate queuing mechanism, optimal charging infrastructures can be approximated. The main motivation for applying a micro-simulation approach is the ability to frame the problem as a probabilistic problem from the bottom up and derive implied probability distributions for waiting time and other KPIs, such as the maximum waiting time or the 99th percentile of the waiting time. This makes it possible to investigate more precisely what infrastructure is required to attain service levels in situations with peak demand.
The advantage is that demand can be heterogeneous and formulated in completely flexible ways and linked to an entirely data-driven queuing system of the G/G/c type. A few papers, such as Pruckner et al. (2017), use a discrete event simulator to represent demand and analyse capacity utilisation in a charging system. However, their approach does not consider proper queuing dynamics at the level of the chargers and refrains from using optimisation when evaluating locations for chargers. A more advanced approach is due to Yang et al. (2019), who simulate a charging system with M/M/c queuing technology and a Markovian demand representation. While their model is used to explore optimal pricing and waiting time distributions, the location choice is absent from the analysis. Other agent-based models with an explicit representation of supply and demand are presented in van der Kam et al. (2019) and in Viswanathan et al. (2016). However, queuing behaviour is of a more simplistic type and their focus is mostly on the integration with the power grid.
In this paper, we consider the problem from a simulation-based bottom-up perspective, but with integration to queuing dynamics and optimisation of charging locations. The paper contributes to the literature in the following respects. Firstly, while the demand simulation in the paper is deliberately simplistic, it shows how realistic demand formations can be generated from large-scale trip diaries. These agents can then be calibrated to any preferred future target or integrated with behavioural models. Secondly, we allow for the most advanced type of queuing dynamics, of the G/G/c type, to link demand and supply at the charger level. Consequently, it allows us to propose, as the first paper, how waiting time information can be approximated and shared under realistic assumptions, e.g. based on knowledge of queue lengths at the stations, possibly facilitated by camera technology. A final contribution is that we show how the microscopic system can be embedded in an optimisation framework to identify an optimal charging supply. While this has been done before, it has, to the best of our knowledge, not been implemented with the level of detail provided here.

System overview
Fig. 1 illustrates the model structure that consists of three main modules: (i) demand simulation, (ii) queuing simulation and (iii) supply optimisation. The queuing and optimisation modules are generic models and can be applied in combination with any microscopic demand prediction. The demand simulation, on the other hand, involves several sub-processes such as agent sampling, space partitioning, calibration, route choice calculation and the recalculation of utility functions with respect to the charging decision. The latter process is based on an updating principle of the corresponding utility functions in order for these to be consistent with the waiting time performance at the level of chargers. All of these processes define the bottom-up simulator, which is then fed to the queuing simulation and subsequently used in the optimisation algorithms.
The process starts with a list of agent trips with departure and arrival time and origin and destination sampled according to a large-scale trip diary. These trips are then passed to a discrete event simulator. At the same time, a charging infrastructure is introduced in the geographical space, and the event-based simulator of charging demand uses this information to align demand and supply. Based on the demand and the supply of charging outlets, the model renders a waiting time distribution as well as other specific KPIs of interest. All of these quantities are then passed on to the optimiser. The optimiser changes the charging infrastructure and feeds it back to the demand simulator. This process is run iteratively until the optimisation converges or terminates based on a stopping criterion.
The system is simulated over a period of time, in our case a week, and all relevant KPIs are then measured over the entire period as is common for dynamic simulation systems.

Information sharing
The idea of using information sharing in the context of an EV charging system is that the choices of well-informed users will prevent queuing at the level of chargers and improve the overall system performance. More specifically, by informing users about waiting time at the chargers, users are able to use their battery flexibility (Hipolito et al., 2022) in combination with a geographically scattered charging infrastructure to circumvent queuing. It will cause the utilisation percentage at the stations to follow a largely uniform distribution pattern. However, information sharing can take many forms depending on the updating frequency and the precision and the amount of information that is shared. In this paper, we envision a situation where stations have equipment to monitor the length of the queues at every charger but do not know the state-of-charge or battery capacity of the incoming cars. Nor is information available on how long cars are expected to charge. Based on this information, we develop a prediction model for the waiting time (refer to Section 2.4) that can be passed on to the users. The updating frequency is a one-shot calculation when the cars depart from home or arrive at the borders of the Copenhagen area.
The information sharing in the paper is thereby formed in three stages, as illustrated in Fig. 2. Firstly, each car in the system is assigned a departure time. This information is passed on to a virtual data sharing hub prior to departure. The data hub returns a 'snapshot' of the expected waiting time for each charging station at the time of the request and the travel time detour of visiting a given station, provided that the car travels according to a shortest route between the origin and the destination. Based on these data, the driver decides on a charging strategy based on his/her indirect utility function.
The waiting time in the system is re-calculated for small time-steps over the day and constantly fed to the data sharing hub to inform users at any time. Any station can represent many chargers and even different types of chargers, as will be discussed in Section 2.5.

Charging demand simulation
Charging demand is simulated by considering two levels of modelling: (i) a system level (macro level), and (ii) a charger/station level (micro level). The system level demand describes the overall need for EV charging in some constrained geographical area for some finite time period. The overall demand for charging is formed by using a sampling approach, which is similar to the approach in Rich et al. (2022), although the specific sample is different. The following stages are considered.
1. Sampling of a list of vehicle trips from the Danish trip diary (Christiansen and Baescu, 2020). Each trip is defined according to zone origin and destination, departure and arrival time, and day variation (refer to Fig. 4). Trips are re-sampled from the trip diary with replacement to make sure that the number of trips corresponds to the trip matrices from the National Transport model (Rich and Hansen, 2016).
2. Every trip is assigned a shortest path from the origin to the destination by using Dijkstra's algorithm (Dijkstra, 1959).
3. Based on national projections for the number of EVs, we sample a proportional share of EV trips. That is, if EVs are expected to constitute a market share of 20%, we sample a corresponding share of 20% from the list of vehicle trips in (1).
4. Sampling of driving range and initial State-of-Charge for all vehicles is based on their expected home-charging opportunities and the length of the trip. While in this paper we did not use the steady-state approximation provided in Hipolito et al. (2022), we project driving range by a shifted-mean approach as also described in Rich et al. (2022).
5. Final simulation of EV trips that will need to charge based on (1)-(4) and a generic probability model that links state-of-charge (SoC) with the probability of charging.

The final list of EV trips in (5) is fed to the event-based micro simulator, where the charging needs for all agents over one week are modelled in the system. The main reasoning behind this bottom-up approach is twofold. Firstly, as described above, it is possible to align and calibrate demand at the macro level by accounting for spatial and temporal variation that conforms to (real) traffic patterns from a survey. Secondly, it is possible to introduce stochastic variation at the micro level by utilising the detailed properties of the sampled agents. As an example, it is possible to simulate variations in SoC as a function of home charging opportunities. All of this makes it possible to simulate complex microscopic behaviour conditional on proper macro scenarios.

A straightforward alternative to the sampling-based approach described above is to apply an agent-based transport model. Irrespective of which approach is used, a generic way of transforming trips into charging demand is to consider a generalised probability function $P(x)$, which defines the probability that an agent $n$ for a given trip needs to charge, $P_n(ch = 1 \mid x)$. Agents from the transport model or the survey data are then sampled from $P(x)$ to fit the margins for a given geographical area or intersection. We will refer to $x$ as a feature set for agent $n$, which could include information concerning home-charging availability, range capabilities of the cars, the composition of the car fleet and behavioural variation for different types of drivers.

Provided that $P_n(ch = 1 \mid x)$ is known, users face the problem of deciding where and when to charge. This problem involves a more formal representation of space and time. Consider $n = 1, \ldots, N$ EVs in demand for charging. Following a similar notation as in Sheffi (1984), for each individual, the relevant charging supply is represented by a charging station $s$ on link $l$ belonging to a path $r$ on a route bundle $R(od)$ between the origin of the trip $o$ and the destination of the trip $d$. Links refer to a connected road network that is used in the Danish National Transport model. Formally, we may write $s_l \in r \in R(od)$ and define $r^{*}$ as the shortest path between $o$ and $d$, whereas $r(s_l)$ represents the shortest path when charging at station $s_l$. While the supply can be considered independent of $t$, the decision to charge also depends on the time of charging. This is because the available queuing capacity will vary through time, as it depends on the demand at time $t$.
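The sampling stages listed above can be illustrated with a minimal Python sketch. It is not the simulator used in the paper: the function names, the toy diary records and the rule `linear_charge_prob` that maps SoC to a charging probability are all hypothetical, and the SoC is simply stored on each trip instead of being drawn from home-charging opportunities.

```python
import random

def sample_ev_charging_trips(trips, n_target, ev_share, charge_prob, seed=0):
    """Resample trips with replacement, keep an EV share, then draw the
    charging decision for each EV trip (stages 1, 3 and 5, simplified)."""
    rng = random.Random(seed)
    # Stage 1: bootstrap the diary so the trip count matches a target total.
    resampled = [rng.choice(trips) for _ in range(n_target)]
    # Stage 3: keep each trip as an EV trip with probability ev_share.
    ev_trips = [t for t in resampled if rng.random() < ev_share]
    # Stage 5: a trip needs charging with probability charge_prob(SoC).
    return [t for t in ev_trips if rng.random() < charge_prob(t["soc"])]

def linear_charge_prob(soc):
    """Hypothetical rule (an assumption): lower SoC, higher charging need."""
    return max(0.0, min(1.0, 1.0 - soc))

# Toy diary with two trips; fields are illustrative only.
diary = [{"o": 1, "d": 2, "dep": 8.0, "soc": 0.2},
         {"o": 2, "d": 3, "dep": 17.5, "soc": 0.9}]
charging = sample_ev_charging_trips(diary, 1000, 0.2, linear_charge_prob)
```

In a fuller version, the shortest-path assignment of stage 2 and the SoC sampling of stage 4 would be applied to the resampled trips before the final charging draw.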
Provided that the charging decision can be formulated as a random utility maximisation problem with indirect utility function $V_{n,s_l}$ and an error-term structure that renders a multinomial logit probability model, the probability that a person will choose a given charging outlet $s_l$ on a route $r$ between $o$ and $d$ is given by

$$P_n\big(s_l \mid r \in R(od)\big) = \frac{\exp(V_{n,s_l})}{\sum_{s' \in S(od)} \exp(V_{n,s'})}, \tag{1}$$

where $S(od)$ denotes the charging outlets available on the route bundle $R(od)$. In addition, we note that the feature set $x$ for the probability $P_n(ch = 1 \mid x)$ involves a spatial and temporal mapping. Hence, $P_n(ch = 1 \mid x)$ is the probability that a given vehicle will charge on a given path between $o$ and $d$ at time $t$. It is here noted that vehicles, when choosing a specific charging location (as described below), are allowed to leave the path to utilise charging options further away at the expense of a detour. Moreover, to simplify the expression of demand $D(s_l \mid r \in R(od))$, the flow of vehicles can be defined as $Q_{od}^{t}$, which will typically be formed from macro constraints as described above. The total absolute charging demand for charging outlets is then defined as

$$D(s_l, t) = \sum_{o,d} Q_{od}^{t}\, P_n(ch = 1 \mid x)\, P_n\big(s_l \mid r \in R(od)\big). \tag{2}$$

The demand is thereby calculated by summing the charging needs with respect to a given charging station over all vehicles that pass that station and are in need of charging on the specific path between $o$ and $d$ and at a given time $t$. As a result, the model is calibrated with respect to the external $Q_{od}^{t}$ information provided by an arbitrary transport model (Rich and Hansen, 2016). At this stage we can think of time $t$ as a discrete representation that enables us to measure the queuing dynamics and at the same time makes it possible to calibrate the temporal loading of the system. Technically, however, it is formulated as a discrete event-based simulator to avoid unnecessary computational overhead from calculating intermediate stages in the state-space.
It should be noted that while the notion of supply at a given location $l$ is here expressed in a binary format for each agent, i.e. the choice of a given outlet, it naturally translates to demand for capacity at all locations when aggregated over agents. This aggregation is essentially the 'bridge' between the microscopic simulator and the optimisation framework, where capacity is optimised in a broader sense in the space-time domain.
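As an illustration of how a multinomial logit choice over charging outlets can be combined with OD flows into station-level demand, consider the following Python sketch. The function names, the dictionary-based data layout and the use of a single scalar charging probability are assumptions made for the example, not the paper's implementation.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for a dict station -> utility."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    expv = {s: math.exp(v - v_max) for s, v in utilities.items()}
    total = sum(expv.values())
    return {s: e / total for s, e in expv.items()}

def station_demand(flows, p_charge, choice_probs):
    """Expected charging demand per station for one time step: sum over OD
    pairs of flow * P(needs charging) * P(station | OD)."""
    demand = {}
    for od, flow in flows.items():
        for station, p in choice_probs[od].items():
            demand[station] = demand.get(station, 0.0) + flow * p_charge * p
    return demand

# Toy example: two stations with utilities expressed as negative minutes lost.
probs = mnl_probabilities({"A": -5.0, "B": -12.0})
demand = station_demand({("o1", "d1"): 100.0}, 0.1, {("o1", "d1"): probs})
```

The station with the smaller time loss attracts the larger share, and the demand summed over stations equals the OD flow times the probability of needing to charge.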

The charging decision formulated as a utility maximisation problem
While the previous section expressed demand in very general terms, it is relevant to consider the form of $V_{n,s_l}$ in more detail. The problem each individual faces, given that charging is required on the trip, is to choose the charging station that involves a minimal detour and a minimum of charging time. Hence, each charging station will represent a choice in the space-time domain. This can be formulated as a utility maximisation problem over a set of alternatives and with a known utility function represented as

$$s_l^{*} = \arg\max_{s_l \in S(od)} \big( V_{n,s_l} + \varepsilon_{n,s_l} \big), \tag{3}$$

with the indirect utility function given by

$$V_{n,s_l} = -T_d\big(r^{*}, r(s_l)\big) - T_c(s_l), \tag{4}$$

where $T_d$ is a function representing the extra travel time of a given charging detour, and $T_c$ is a function that describes the time of charging. The latter involves direct charging time and waiting time. While $T_d$ is relatively easy to measure through a route choice algorithm if road congestion is disregarded, $T_c$ involves further consideration. One of the challenges is that $T_c$ is the result of a heterogeneous queuing process. Hence, at most, we can assume that there exists a good predictor of $T_c$ at the point in time when the charging decision is formed. Clearly, the quality of this predictor depends on the amount of information that can be made available from the system and when this information is passed on to the users.
At the general level, we can think of the waiting time $W$ in the system as a function of the expected time a vehicle is charging, $E[T_s]$ (measured over all vehicles in the system), and a size variable $\omega$ derived from the system that represents the general loading of the system. At this stage, $\omega$ is unknown but is dependent on capacity and charging demand. Hence,

$$W = E[T_s]\,\omega. \tag{5}$$

If $T_e$ is the time a vehicle has charged, and because there is no knowledge of arrival times in the system, we have that

$$T_e \sim \mathcal{U}(0, T_s). \tag{6}$$

The share of charge time remaining is then defined as

$$u = \frac{T_s - T_e}{T_s} \sim \mathcal{U}(0, 1). \tag{7}$$

Clearly, this formulation of $W$ is not very useful from an operational perspective. To make it operational, further information is required from the system. Firstly, it is important to know the state of the infrastructure, e.g. the number of chargers at a station. A second reasonable assumption is that the number of vehicles at each station can be monitored. This could be based on camera technology or on electronic tags. The problem is then to translate the observed demand into a waiting time distribution for each charger. Consider a station with $c$ charging outlets, a number of vehicles at the station equal to $v$ and a queue size equal to $q = \max(0, v - c)$. A new vehicle arriving at the station faces one of four scenarios:

1. the number of outlets is greater than the number of cars ($c > v$)
2. the number of outlets is equal to the number of cars ($c = v$)
3. the queue is greater than zero but smaller than the number of outlets ($0 < q < c$)
4. the queue is greater than or equal to the number of outlets ($q \geq c$)

In the first scenario, vehicles have waiting times equal to zero, since a charger is immediately free. Hence, if we think in terms of the general loading variable, we have that $\omega = 0$. In the second scenario, where capacity meets demand, the vehicle must wait for one vehicle to finish charging. In this situation $\omega$ is given by the expectation of the first order statistic:

$$\omega = E[u_{(1)}] = \frac{1}{c + 1}. \tag{8}$$

In the third scenario, the queue is non-empty and the newly arriving vehicle must wait for $q + 1$ vehicles to leave the system. A generalisation of Eq. (8) is then to consider any order statistic $k \in \{1, \ldots, c\}$ with

$$\omega = E[u_{(k)}], \quad k = q + 1. \tag{9}$$

Over uniformly distributed i.i.d. variables, the order statistics have a beta distribution,

$$u_{(k)} \sim \mathrm{Beta}(k,\, c + 1 - k), \tag{10}$$

and the expectation of $u_{(k)}$ is given by

$$E[u_{(k)}] = \frac{k}{c + 1}. \tag{11}$$

The contribution to the incurred waiting time from the vehicles currently charging at the station is then given by

$$E[T_s]\,\frac{q + 1}{c + 1}. \tag{12}$$

In the fourth scenario the contribution from vehicles that are already in the queue must also be considered. This contribution is given by

$$E[T_s] \left\lfloor \frac{q}{c} \right\rfloor, \tag{13}$$

where $\lfloor \cdot \rfloor$ is the 'floor function'. Hence $\lfloor q/c \rfloor$ returns the greatest integer less than or equal to $q/c$. The waiting time function $W(c, v)$ can now be expressed in a compact form for all scenarios. The factor is given by

$$\delta(c, v) = \begin{cases} 0 & \text{if } v < c, \\ 1 & \text{if } v \geq c, \end{cases} \tag{14}$$

and returns 0 when the station capacity is not satiated and 1 otherwise.
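The order-statistic argument can be checked numerically. The Python sketch below is an illustration only: it draws the remaining charge shares of a given number of busy chargers as i.i.d. uniform variables, sorts them and averages the k-th smallest value, which should approach k divided by the number of chargers plus one. The function name and the number of draws are arbitrary choices.

```python
import random

def mean_order_statistic(k, c, n_draws=20000, seed=1):
    """Monte Carlo estimate of the expected k-th smallest of c i.i.d.
    Uniform(0, 1) remaining-charge shares."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        shares = sorted(rng.random() for _ in range(c))
        total += shares[k - 1]   # k-th order statistic (1-indexed)
    return total / n_draws

# Compare simulated and theoretical values for a station with 4 chargers.
for k in range(1, 5):
    print(k, round(mean_order_statistic(k, 4), 3), k / 5)
```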
The fourth scenario can be explained by an example. If the queue size is $q = 4$ and $c = 4$, then the first term contribution is equal to $E[T_s]\,\tfrac{1}{5}$, which corresponds to the order statistic with $k = (q \bmod c) + 1 = 1$. The second term is equal to $\lfloor q/c \rfloor = \lfloor 4/4 \rfloor = 1$. This is because one needs to finish the batch of 4 before adding the first contribution.
The general waiting time formula can then be expressed as follows:

$$W(c, v) = \delta(c, v)\, E[T_s] \left( \frac{(q \bmod c) + 1}{c + 1} + \left\lfloor \frac{q}{c} \right\rfloor \right). \tag{15}$$

The intuition behind this expression for queuing waiting time is most easily explained if we think of the queue as sets of vehicle batches of size $c$. The term $(q \bmod c) + 1$ matches any newly arriving vehicle to a charger in $\{1, \ldots, c\}$. The vehicle then has to wait for the vehicle at charger $(q \bmod c) + 1$ to finish and wait for each full batch of vehicles currently in the queue. Note that the queue is configured as a single queue with multiple servers. While in the paper we allow queues to be 'infinite', this is obviously not realistic when considered from a practical planning perspective. However, it is technically straightforward to implement a queuing capacity constraint, although it will require further knowledge about the specific queuing space for each location. While the predictor $W$ is based on a realistic amount of information in the system, it is not based on perfect information. Hence, it could be improved if more or better information could be passed on to the users. As an example, it could be improved if exact arrival times were known or if the initial SoC of vehicles was known upon arrival. Generally speaking, if stations are large with many chargers, the precision of the estimator is expected to be relatively good due to the law of large numbers. On the other hand, if stations are smaller, random effects will increase the variance of the predictor. Depending on what information is available in the future, it could present a case for developing larger stations simply because the predictor gets better.
The term $T_c(s_l)$ represents the total charging time, including waiting time (before charging) and time spent while charging. However, because utility is a relative measure and the time spent on charging is not assumed to depend on the place of charging, the latter term cancels out. Hence, the utility function consists of $-T_d(r^{*}, r(s_l))$ and $W$ only. As cars are forced to charge (no need to calibrate the charging decision) and no other features are introduced, e.g. facilities at charging stations, utility is here translated to a simple time loss without the need for additional parameters.
The real sensitivity with respect to time for charging and detours may differ from what is assumed here, and it will be relevant for future studies to integrate such information into the utility functions, e.g. as in Rich et al. (2022).
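A compact way to read the predictor is as a small function of the number of outlets and the number of vehicles observed at the station. The Python sketch below follows the batch interpretation described above (an order-statistic term for the charger the new vehicle is matched to, plus one mean charging time per full batch in the queue). The function and argument names are illustrative assumptions, not the paper's code.

```python
def expected_wait(mean_charge_time, c, v):
    """Predicted waiting time for a vehicle arriving at a station with
    c outlets and v vehicles already present (charging or queuing)."""
    if c <= 0:
        raise ValueError("a station needs at least one outlet")
    if v < c:                            # scenario 1: a charger is free
        return 0.0
    q = v - c                            # vehicles waiting in the single queue
    order_term = (q % c + 1) / (c + 1)   # expected share left at matched charger
    batch_term = q // c                  # full batches ahead in the queue
    return mean_charge_time * (order_term + batch_term)

# Waiting time grows step-wise as the queue fills up (30-minute mean charge).
for vehicles in range(3, 11):
    print(vehicles, expected_wait(30.0, 4, vehicles))
```

A finite queuing space could be added by returning an infinite or prohibitive waiting time once the queue exceeds a station-specific limit, which would require the additional location data mentioned above.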
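With utility reduced to a pure time loss, the station choice of an informed driver can be sketched as picking the station with the smallest sum of detour and predicted waiting time, whereas an uninformed driver only sees the detour. The Python sketch below is a deterministic simplification that omits the random error term of the logit model; the function name and the toy numbers are assumptions.

```python
def choose_station(detour_min, wait_min, informed=True):
    """Pick the station with the smallest time loss. Uninformed drivers
    only know detours; informed drivers add the shared waiting time."""
    def time_loss(station):
        shared_wait = wait_min[station] if informed else 0.0
        return detour_min[station] + shared_wait
    return min(detour_min, key=time_loss)

# Toy snapshot from the data hub: station A is close but congested.
detours = {"A": 2.0, "B": 5.0}
waits = {"A": 20.0, "B": 0.0}
informed_choice = choose_station(detours, waits, informed=True)
uninformed_choice = choose_station(detours, waits, informed=False)
```

In this toy case the informed driver accepts a slightly longer detour to avoid the queue, which is the mechanism by which shared information spreads demand across stations.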

Optimisation of charging supply
The charging supply is formally represented by a set of charging stations at geographical locations, each comprising a set of chargers. The optimisation of the charging supply then represents the task of identifying an optimal configuration $H$ based on the demand formation $D(s_l, t)$ for all stations at all links for all time periods. As illustrated in Section 2.2, demand can be naturally linked with information from the queuing system that predicts the level of service of the system as a function of demand.

The queuing system
In the previous section we developed a closed-form expression for the expected waiting time provided that information concerning waiting time for the different charging stations could be shared. However, it is important to state that while such a formulation is useful from the perspective of the user, it does not mean that demand can be expressed in a simple parametric form. The traditional notation of queuing systems (Kendall, 1953) defines three main ingredients: (i) the demand distribution, (ii) the service distribution and (iii) the explicit representation of capacity. The most widely studied type of system is known as the M/M/c system. These systems assume Poisson arrival processes, exponentially distributed service times and a fixed capacity. The systems are defined for known analytical distributions with time-invariant parameters and have corresponding analytical solutions or, at worst, bounded moments for the relevant statistics such as the waiting time $W$. In this case, however, it is not possible to state an analytical form of the demand and service distributions. The queuing system is therefore referred to as a generalised queuing system, i.e. a G/G/c system, with G symbolising the generalised nature of the underlying distributions. While the generalised nature of the system requires fewer assumptions with respect to the parametric form, the system has no analytical closed form. As a result, it requires approximations based on simulation. In this case, we apply a discrete event simulator, which will be briefly described in the application section.
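To make the idea of a simulated queue with arbitrary arrival and service patterns concrete, the Python sketch below processes one station with a single first-come-first-served queue and several chargers. It is a generic textbook construction based on a heap of charger release times, not the simulator used in the application; the function name and inputs are assumptions.

```python
import heapq

def simulate_station(arrivals, service_times, c):
    """Single FIFO queue with c chargers; returns waiting time per vehicle.
    arrivals must be sorted; service_times[i] belongs to arrival i.
    No distributional form is assumed for either input."""
    free_at = [0.0] * c            # time at which each charger becomes free
    heapq.heapify(free_at)
    waits = []
    for arrive, service in zip(arrivals, service_times):
        earliest = heapq.heappop(free_at)    # next charger to be released
        start = max(arrive, earliest)        # wait only if all are busy
        waits.append(start - arrive)
        heapq.heappush(free_at, start + service)
    return waits

# Three simultaneous arrivals at a two-charger station, 10 minutes each.
waits = simulate_station([0.0, 0.0, 0.0], [10.0, 10.0, 10.0], c=2)
```

Because arrival and charging times are passed in as data, the same routine can be fed with simulated agent trips, and the resulting waiting times can be summarised as means, maxima or upper percentiles.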

A structural optimisation approach
The placement and sizing of charging stations will affect the demand in space and time. Generally speaking, if the number of stations is decreased, it will increase the expected waiting time in the system as well as increase detours and the expected travel time associated with these detours. On the other hand, there will be costs associated with establishing a given station. This may depend on the geographical location, the number of chargers and their type.
It is possible to formulate the charging station location problem as a general optimisation problem, which may have different forms depending on the perspective, e.g. user, operator or system. In this context, the problem will be formulated as a system optimum problem, where the objective is to maximise total benefits for all users. It is assumed that consumer benefits can be monetised and compared to infrastructure cost in a combined objective function.
As describe in Section 2.2, the notion of demand at the level of agents is a binary choice of choosing a given charging option.However, in the context of supply, it is necessary to think in terms of capacity.Let  ∈  = {1, 2, … , L} be a set of possible charging locations in a road network.For each possible location we define a discrete capacity function ℎ() that represent the number of chargers at a given location .The minimum number of stations is 0 and represent the situation where the location is not activated as a charging station.The maximum number of stations at location  is   and is thereby location specific.Mathematically, ℎ() is a simple integer mapping ℎ() ∶  ↦ {0, … ,   } and the total installed capacity  is then represented as  = ∑  ℎ  .For simplicity of notation we let  = {ℎ(1), … , ℎ()} define a specific charging configuration within the possible sets of configurations Based on the notion of capacity, the optimisation problem can be stated as a generic non-linear optimisation problem.The -function formally represent a mapping from {, ()} to a station specific waiting time distribution, although it has no explicit form.
Here $c_l(h_l)$ represents the cost of installing a given capacity $h_l$ at location $l$, and the total cost of installing all chargers is given by the sum $\sum_l c_l(h_l)$ over all locations. The problem of finding an optimal supply is highly complex. It requires determining a trajectory of equilibrium points in the space-time domain, which is the result of interactions between demand at all times and the underlying queuing system. Due to the G/G/c nature of the queuing system and the non-linear demand functions, it represents a highly non-linear system without a closed form expression.
In the following we consider two ways of simplifying the optimisation problem. A first observation is that complications arise from having a generic cost function. Hence, provided that the intention of the optimisation problem is to approximate an overall level of supply (number of chargers required) and indicate a number of good charging locations, rather than pinpointing exact locations, assuming costs to be uniform is a reasonable simplification. While a joint optimisation of sizing and placement under uniform costs is demonstrated in the paper (refer to Fig. 13), it is still relevant to consider further simplifications in order to make the approach useful for large-scale applications. A natural way of simplifying the problem is to decompose it into a placement problem and a sizing problem.
• In the first stage, the spatial representation of stations is optimised conditional on a given uniform sizing for all stations and with uniform cost functions.
• In the second stage, the optimal number of stations to attain a certain system performance level is identified.
The first stage problem is equivalent to the problem represented in Eqs. (16)-(18) where $c_l(h_l) = h_l c$ and with the addition of a constraint that restricts each $h_l$ to be either zero or the given uniform station size.
The problem essentially results in binary solutions of whether to open or close a given station. The motivation for first considering a conditional optimisation problem where the station size is fixed is the following: provided we have no information about the cost function $c_l(h_l)$ and, as a result, assume a uniform cost structure $c_l(h_l) = h_l c$, the system waiting time is strictly monotone and decreasing in the installed capacity $C$ (i.e., its derivative with respect to $C$ is negative). This is a consequence of the charging utility function, the queuing mechanism and the assumption of a fixed demand. If the charging capacity $C$ is increased, the waiting time, and thereby the disutility of waiting, is reduced. Since the total demand is assumed to be unaffected, this renders an improved system performance, which can be compared to the (uniform) marginal cost $c$ of increasing capacity. While the first stage problem represents a complex non-linear distribution problem, the second stage involves a scaling problem, which can be solved with a Greedy heuristic.
The above optimisation process is illustrated in Fig. 3, where the first stage optimisation is represented by an iterative process in which supply and demand are fed to the queuing models and the equilibrated level-of-service is subsequently fed to the optimisation of charger locations. The second stage optimisation is simpler in that we here consider the sole problem of scaling the supply to represent a proper level-of-service. This is typically based on examining the elbow curve as commented above.
A generalisation of the first simplification is to apply Lagrange relaxation, where the installed capacity is penalised. Here we consider a Lasso-type penalty (Tibshirani, 1996) added to the objective function. For a binary solution space this is equivalent to penalising the $\ell_0$ norm, while for the integer problem considered here, it is the $\ell_1$ norm. The solution will generally find an equilibrium point where the marginal utility of added capacity equals the penalty term. Provided that the system has a certain cost boundary, it is possible to replace the linear penalty with a flexible super-linear function of the installed capacity, whose multiplier is interpreted as a standard shadow price for the constraint that $\sum_l h_l$ stays below this boundary.
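To make the penalised formulation concrete, the following minimal Python sketch evaluates an objective of the form "expected waiting time plus a penalty per installed charger". The function `toy_wait` is a hypothetical stand-in for the simulator, and the candidate configurations and the penalty value are purely illustrative; none of them are taken from the paper.

```python
from typing import Callable, List, Sequence


def penalised_objective(
    h: Sequence[int],
    expected_wait: Callable[[Sequence[int]], float],
    penalty: float,
) -> float:
    """Expected waiting time plus an l1 penalty on installed capacity."""
    if any(x < 0 for x in h):
        raise ValueError("capacities must be non-negative integers")
    # For non-negative integers the l1 norm is simply the sum of chargers.
    return expected_wait(h) + penalty * sum(h)


def toy_wait(h: Sequence[int]) -> float:
    """Hypothetical waiting time (minutes) that falls as capacity grows."""
    return 60.0 / (1 + sum(h))


# Four illustrative configurations for three candidate locations.
configs: List[List[int]] = [[0, 0, 0], [4, 0, 0], [4, 4, 0], [4, 4, 4]]
best = min(configs, key=lambda h: penalised_objective(h, toy_wait, penalty=1.0))
```

With a penalty of one, capacity is added only as long as an extra charger reduces the toy waiting time by more than one minute on average, which mirrors the equilibrium interpretation given above.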

Black-box optimisation
The vast majority of papers in the research literature formulate the charging location problem as a problem that is solved by a dedicated mathematical program (refer to Section 1.1). While this brings about some advantages, in that it is possible to provide formal guarantees for convergence and optimality, it also introduces the need for many undesirable simplifications. First and foremost, it requires a closed form expression of the charging demand. However, this is not possible if we apply a generalised queuing system, where arrivals to stations are entirely data-driven to mimic the temporal properties of arrivals in the sampled trip diary data. Moreover, even if a parametric queue (e.g. M/M/q) was applied, the combination of non-linear queuing functions (across time and space) embedded in a non-linear demand model (in the form of a logit model) would be impossible to solve on a large scale. It is our assessment that black-box optimisation is a much more natural choice in this situation, with fewer compromises, compared to an exact mathematical programming approach.
As discussed, the black-box optimisation of $(h_1, \dots, h_L)$ involves iterations between demand and a generalised queuing system for which explicit arrival and service time functions cannot be stated. In other words, it represents a black-box optimisation problem, which is essentially approximated by the simulator. The optimisation could involve looking at many different variables. However, based on our system optimum perspective, it is natural to focus on the expected waiting time and the travel time increase that may result from possible detours when choosing a station. The simulation framework allows us to do inference with respect to the outputs and to consider other statistics, such as the 90th percentile of these attributes. This can be useful as a means to understand how the system should be designed to handle peak-loading. In fact, charging systems designed to satisfy average service levels will be very different from systems designed to handle degrees of peak-loading (Rich et al., 2022).
While the black-box optimisation problem requires a simulation process, different strategies for searching the solution space exist. Recently, methods from the machine learning literature, inspired by hyperparameter optimisation problems, have become increasingly popular. Examples include 'surrogate modelling' (Ky et al., 2016) and 'Bayesian optimisation' (Frazier, 2018). A more traditional approach is that of population based evolutionary algorithms (Sloss and Gustafson, 2020). However, the latter type of algorithms, also known as 'genetic algorithms' (Beasley et al., 1993), generally scale poorly when the solution spaces are large. A different approach is to apply 'worst case solutions' based on rejection sampling as a means to evaluate the constraints. For high dimensional, highly constrained problems, however, this approach is infeasible in practice. Many black-box optimisation strategies are heavily dependent on the specific application, and it is often not straightforward to identify a proper solution to a specific problem. The present charging location optimisation is no exception, and many different strategies have been tested while preparing the paper. In conclusion, two search principles have been found effective for the problem at hand: a Greedy-type heuristic (Silver, 2004) and a cross-entropy approach (Rubinstein, 1999). In addition, to compare these algorithms with a traditional genetic-type algorithm, the binary genetic algorithm based on Willighagen and Ballings (2015) was applied as well.
Greedy-type heuristics (Bendall and Margot, 2006) provide a simple yet effective way of approximating the optimum in this case. While this family of heuristics cannot guarantee a global optimum, the algorithms work particularly well for self-organised systems and when the geographical distribution of charging stations is defined according to some rigorous optimisation process as described above. The algorithm aims at locating a fixed capacity over the possible links in the charging network. The most classical example of such algorithms is based on a forward selection rule that allocates capacity by (i) starting from an empty space, (ii) implementing a simple selection rule, and (iii) introducing a stopping condition. The selection rule defines how elements are selected from an incrementally increasing set of possible solutions and how these are evaluated with respect to their contribution to the objective function. It is not immediately clear how such a rule should be derived, as it is linked to two effects: (i) minimisation of detours and (ii) spill-over effects from the capacity of other stations.
Due to this, it is proposed to formulate the heuristic as a backwards selection problem. The algorithm starts from a maximum capacity, which is constituted by a maximum number of possible stations and their corresponding capacity. The algorithm then evaluates the performance loss from eliminating one charging outlet at a time in a sequential manner. If, as an example, the simulated utilisation of a given outlet is a sufficient statistic for its contribution to the objective function, the solution to a problem with $C^*$ outlets can be found by applying a recursive elimination algorithm. If the capacity of each charging station is identical, the algorithm works well, as the selection rule can be formulated in a straightforward manner. However, for problems with variable station capacity and/or unequal cost structures, it is less trivial to design an efficient selection rule. In particular, it is a challenge to formulate such a rule in ways that guarantee the feasibility of the solution. The development of more sophisticated selection rules is beyond the scope of this paper. Instead, we apply a structured optimisation approach based on cross-entropy when considering more complex problems.
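The backward selection idea can be sketched in a few lines of Python. This is a minimal illustration under the assumption stated above, namely that simulated utilisation is a sufficient statistic for an outlet's contribution. The function names and the toy utilisation model (fixed demand per station, no spill-over between stations) are hypothetical and far simpler than the simulator used in the paper.

```python
from typing import Callable, Dict


def backward_greedy(
    capacity: Dict[str, int],
    target_outlets: int,
    simulate_utilisation: Callable[[Dict[str, int]], Dict[str, float]],
) -> Dict[str, int]:
    """Remove one outlet at a time from the least utilised active station."""
    if target_outlets < 0:
        raise ValueError("target_outlets must be non-negative")
    h = dict(capacity)  # work on a copy of the maximum configuration
    while sum(h.values()) > target_outlets:
        util = simulate_utilisation(h)  # re-simulate after every removal
        active = [loc for loc, c in h.items() if c > 0]
        worst = min(active, key=lambda loc: util[loc])
        h[worst] -= 1
    return h


# Hypothetical demand per station (vehicles per hour), held fixed for the toy.
DEMAND = {"A": 10.0, "B": 4.0, "C": 0.5}


def toy_utilisation(h: Dict[str, int]) -> Dict[str, float]:
    """Per-outlet load at each station; closed stations get zero."""
    return {loc: (DEMAND[loc] / c if c > 0 else 0.0) for loc, c in h.items()}


result = backward_greedy({"A": 4, "B": 4, "C": 4}, 6, toy_utilisation)
```

Because utilisation is re-simulated after every removal, a full simulator plugged in as `simulate_utilisation` would also capture the demand that shifts to neighbouring stations when an outlet is closed.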
The cross-entropy optimisation scheme was formulated in the seminal paper by Rubinstein (1999) and represents a global search algorithm suited for large-scale combinatorial optimisation problems. The motivation for considering this approach in this particular context is due to Tran et al. (2020), where the approach was applied to a largely similar problem, although with a different demand side representation. In the current paper we apply the 'CEoptim' R package implementation (Benham et al., 2015) for discrete problems. It is beyond the scope of the current paper to discuss the cross-entropy formulation in more detail. However, the approach is a combination of importance sampling in the solution space and the evaluation of a fitness function, which involves minimising a cross-entropy function to guide the search direction.
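For readers unfamiliar with the method, the following Python sketch shows a generic cross-entropy search over binary open/close decisions. It is not the CEoptim implementation used in the paper: the Bernoulli sampling distribution, the smoothing factor, the variance-based stopping rule and the separable toy cost function are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple


def cross_entropy_binary(
    cost: Callable[[List[int]], float],
    n_sites: int,
    n_samples: int = 100,
    elite_frac: float = 0.1,
    smoothing: float = 0.7,
    tol: float = 1e-3,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Minimise cost(x) over x in {0, 1}^n_sites with a Bernoulli CE scheme."""
    rng = random.Random(seed)
    p = [0.5] * n_sites  # opening probability per candidate location
    n_elite = max(1, int(elite_frac * n_samples))
    best_x: List[int] = [0] * n_sites
    best_cost = cost(best_x)
    for _ in range(max_iter):
        samples = [[1 if rng.random() < pj else 0 for pj in p]
                   for _ in range(n_samples)]
        scored = sorted(((cost(x), x) for x in samples), key=lambda t: t[0])
        if scored[0][0] < best_cost:
            best_cost, best_x = scored[0]
        # Refit the sampling distribution to the elite (lowest-cost) samples.
        elite = [x for _, x in scored[:n_elite]]
        freq = [sum(x[j] for x in elite) / n_elite for j in range(n_sites)]
        p = [smoothing * f + (1.0 - smoothing) * pj for f, pj in zip(freq, p)]
        # Stop when the largest Bernoulli variance falls below the tolerance.
        if max(pj * (1.0 - pj) for pj in p) < tol:
            break
    return best_x, best_cost


# Toy cost: leaving one of the first three sites closed costs 3, opening any
# site costs 1, so the minimum opens exactly the first three sites.
BENEFIT = [3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def toy_cost(x: List[int]) -> float:
    return sum(b * (1 - xi) + 1.0 * xi for b, xi in zip(BENEFIT, x))


best_x, best_cost = cross_entropy_binary(toy_cost, n_sites=8)
```

In the charging application, `cost` would be replaced by a call to the simulator, which is what makes each iteration computationally expensive.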

Application
The micro demand simulator is based on a large-scale Danish travel diary that contains around 250,000 trips made by car (Christiansen and Baescu, 2020). The data include a spatial and temporal logging of individuals and detailed socioeconomic data that describe the individuals and the households.

Demand simulation
When the population is large, it may be possible to apply conditional sampling or sampling on the basis of a probabilistic description of demand. A detailed explicit modelling of charging demand, however, is outside the scope of this paper. Rather, the paper focuses on the balancing of demand and supply at the charger level, e.g. by using queuing models, and the subsequent optimisation of the location of the chargers.
As a result, we apply a simpler sampling scheme where we essentially force all agents to charge. Hence, rather than using up-sampling to fulfil certain margins, we estimate the total number of trips that are expected to charge. This is based on a future population of EVs, the trips they carry out, and their expected battery range. Provided that the feature set represents a reasonable approximation of the population, and that we can assume that EVs in future scenarios are distributed approximately evenly, except for some dependencies on home charging availability, this should be a reasonable approximation of charging demand in space and time. Fig. 4 illustrates the simulated traffic intensity based on sampling of 35,000 agents from the TU data, while Fig. 5 illustrates the urban area.
To limit the modelling area, we consider the urban area of Copenhagen and Frederiksberg and a 2030 scenario with 380,000 EVs based on Energistyrelsen (2020). Trips with a destination inside this area are differentiated according to three trip types: short, medium and long.
For short range trips, i.e. intra-city trips that start inside the red area in Fig. 5, we assume that no home-charging is generally available and calculate the induced demand for charging as a function of average trip length, maximum range, average effective range and number of trips per week. It can be added that 80% of the households in Copenhagen and Frederiksberg cannot charge at home according to a recent study by DEA (2019).
For medium range trips, i.e. trips from the suburbs and rural areas (blue area in Fig. 5), home-charging is considered widely available. A small share of 10% is sampled without home charging to represent forgetfulness and unavailability.
Long range trips (starting outside the blue area in Fig. 5) are forced to charge. Hence, we do not consider availability of charging between their origin and the urban destination. In general, such trips are more interesting from the perspective of modelling charging infrastructure for the highway and state road network.
While the demand may represent a reasonable scaling of the expected charging demand in space and time, it does not allow a detailed model specification of demand. Hence, the only demand-supply interaction is the trade-off between where to charge and the waiting time in the system.
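The three trip types can be turned into a simple sampling rule, sketched below in Python. The 10% share for medium range trips and the forced charging of long range trips follow the text. The rule for short range trips (charging probability equal to trip length divided by effective range) and the default values for trip length and range are assumptions made only for this illustration, since the exact functional form is not reproduced here.

```python
import random


def must_charge(
    trip_type: str,
    rng: random.Random,
    trip_km: float = 10.0,               # hypothetical average trip length
    effective_range_km: float = 250.0,   # hypothetical effective range
    p_no_home_medium: float = 0.10,      # share without home charging (text)
) -> bool:
    """Decide whether a sampled trip generates public charging demand."""
    if trip_type == "long":
        return True  # long range trips are forced to charge
    if trip_type == "medium":
        # home charging is widely available, except for a small share
        return rng.random() < p_no_home_medium
    if trip_type == "short":
        # Assumption: without home charging, roughly one public charge is
        # needed per effective range driven.
        return rng.random() < min(1.0, trip_km / effective_range_km)
    raise ValueError(f"unknown trip type: {trip_type!r}")


rng = random.Random(1)
medium_flags = [must_charge("medium", rng) for _ in range(20000)]
medium_share = sum(medium_flags) / len(medium_flags)
```

The simulated share of medium range trips that charge should be close to the 10% stated above; the short range rule would need to be replaced by the calibrated demand calculation in a real application.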

Definition of the initial supply
To discretise the search space with respect to the optimal charging infrastructure, we consider fixed coordinates located on the links of the existing road network. These points are then down-sampled by a simple exclusion process to ensure that the size of the search space becomes manageable. To achieve this, we start in a single point and eliminate all points that are within a 500-metre radius. This process is iterated until no further points can be added without violating the 500-metre rule. The discretised search space is presented in Fig. 6.
Clearly, the location of charging supply will often be subject to additional constraints that relate to power grid proximity, physical space and land-use regulation. However, in this setting we will not consider these other restrictions.
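The exclusion process can be sketched as a greedy thinning of candidate points. The Python example below assumes that the coordinates are given in a projected system with metre units and that points are processed in input order; both are assumptions, and the treatment of points at exactly 500 m (kept here) is a choice made for the sketch.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def thin_points(points: List[Point], min_dist: float = 500.0) -> List[Point]:
    """Keep a point only if it is at least min_dist from all kept points."""
    kept: List[Point] = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept


# Five candidate points along a straight link, spaced 300 m apart.
candidates = [(0.0, 0.0), (300.0, 0.0), (600.0, 0.0), (900.0, 0.0), (1200.0, 0.0)]
search_space = thin_points(candidates)
```

The result depends on the starting point and the processing order, so different orderings may yield slightly different, but equally valid, search spaces.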

Convergence and computational complexity
As the system represents a black-box model of interacting agents, for which guarantees of a local optimum cannot be provided, it is relevant to consider the convergence and the computational complexity of the system in more detail.
In the simplest case, where there is a fixed set of $L$ potential locations for which we model whether stations are to be activated or not, the number of combinations required for an exhaustive search is $2^L$. In our case this is equal to $2^{179}$, or approximately $7.7 \times 10^{53}$. In the combined problem, the search space is much larger, as the sizing of the different locations needs to be included. In principle, the number of combinations is the product over all locations of the number of admissible capacity levels, although it will be significantly lower if boundaries are added to the sizing of the different locations. In any case, the solution space is extremely large and requires significant computational effort when solved as a black-box problem.
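The orders of magnitude involved can be checked with a few lines of Python. The binary case uses the 179 candidate locations mentioned above. The figure of four states per location in the joint case (closed plus three size categories) is an assumption based on the small, medium and large discretisation used for the joint problem.

```python
import math

n_sites = 179

# Binary open/close decision per location.
binary_combinations = 2 ** n_sites            # exact integer
binary_log10 = n_sites * math.log10(2)        # about 53.9

# Assumed joint problem: closed, small, medium or large at each location.
states_per_site = 4
joint_log10 = n_sites * math.log10(states_per_site)  # about 107.8
```

Even with the coarse three-category sizing, the joint search space is roughly the square of the binary one, which illustrates why exhaustive search is not an option.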
For this reason, and as described above in Section 2.5.2, two solution approaches were investigated. The first is an optimised approach where both location and sizing are modelled jointly. In order to limit the search space, the sizing problem is discretised into only three categories, which refer to small, medium and large stations. In the second approach, we decompose the problem into a distribution problem and a sizing problem. The first part of the problem is solved by applying black-box optimisation (cross-entropy optimisation), while the latter part is solved by a Greedy-type heuristic. Below, we consider the convergence and the variance of the objective value for the cross-entropy optimisation of the distribution problem. Hence, in this case, we assume an initial uniform supply of four chargers per potential charging location and uniform costs. The optimiser then determines which stations to close and open conditional on the queuing dynamics, trip patterns and agent interactions in the system due to information-sharing.
The cross-entropy optimiser involves two iterative search loops. First, it explores different distribution patterns in the search space by random sampling. In the next iteration, it starts from the best distribution and applies a new stratified sampling in the neighbourhood of this point. The convergence is based on a standard variance criterion. It implies that the solution for a certain stage is said to converge if a certain lower bound variance criterion for the distribution is met.
As seen in Fig. 7, the convergence of the problem appears after around 11 outer iterations, after which the number of activated stations is 23. After five iterations the optimum stabilises and only small fluctuations are observed hereafter. A similar type of convergence is seen for other types of problems, such as the joint problem presented in Fig. 13, although the added complexity of sizing requires a few more iterations.
Because of the information-sharing and the uniform cost structure, specific geographical placements are less important. It generally means that there are many good solution candidates which will render approximately the same objective value. In other words, the information-sharing and the self-organisation of agents that follows from this principle give rise to important robustness with respect to the performance of the system.

Results and discussion
A principal aim of the paper is to investigate how information sharing affects the performance of the system. In an ideal world, such a result should be compared with a baseline reflecting current, real-world performance. However, modelling such a baseline is complicated and is an entire research project in itself. As an alternative, we consider a worst-case scenario without any information sharing and where users do not change decisions as a function of the waiting time they accidentally experience at the charger. This is compared to a system where information is shared at the start of each trip.
In Figs. 8 and 9 a supply configuration of 20 randomly sampled points is shown and the simulated system performance is depicted. For simplicity, all stations are assumed to have the same capacity of four chargers. The worst-case scenario can be considered as a scenario where the waiting time component of Eq. (4) cancels out and where only detour minimisation is considered.
The introduced information-sharing has two major effects on system performance as a whole. While in the worst-case scenario there was a tendency to form (almost) infinite queues at overloaded stations, this phenomenon is now completely avoided. As a result, we see a very significant reduction of the waiting time in the system. A second and related effect is that the utilisation of stations is now balanced to a much higher degree. Hence, we are seeing largely similar effects as would be seen in road networks where demand is spread uniformly.
To compare the heuristic algorithm introduced in Section 3 to the more rigorous optimisation schemes, we consider a location problem with a fixed penalty of one. This corresponds to a system where adding a charging station should decrease the expected waiting time by one minute. The optimisation thereby resembles the Lagrange relaxation problem in Eq. (20). In Figs. 10 and 11, the optimised solution based on the cross-entropy approach is shown to the left, while the heuristic Greedy solution is shown to the right. Nearly identical performance is obtained when locating an equivalent number of stations (23). Furthermore, the heuristic solution achieves a distribution of capacity utilisation that is very similar to the fully optimised version. This is a consequence of the backward selection rule described in Section 2.5.3.
Another observation is that the spatial configuration is significantly different while system performance is similar. As noted earlier, the introduced information sharing ensures that the system is essentially self-organising and that most solutions are near optimal as long as excess supply is avoided. The importance of the information sharing depends on the quality of the predictor of the waiting time distribution. Hence, the better the predictor, the better the system performance.
However, the placement of chargers cannot be entirely arbitrary, and good solutions will share some general properties. The most obvious of such properties is that a share of the stations should always be located next to major inflow corridors. In our case, this is highlighted by the stations located at the border of the inner zone, where the major highways connect to the inner zone.
It is also interesting to consider how an optimal system is affected by adding one extra station, in other words, the marginal systemic waiting time effect of adding an extra station. We consider two different performance metrics: the expected waiting time and the 99% quantile of the waiting time. The latter is of interest for understanding extreme queuing.
The functions have a highly sub-modular form with decreasing marginal return. In fact, this type of function is well known from the machine-learning literature, where it is referred to as the 'elbow principle'. Fig. 12 clearly shows the separation from super-linear to sub-linear form, which happens approximately where the colour changes.
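One common way to locate the elbow numerically is to pick the point that lies furthest from the straight line joining the two ends of the curve. The Python sketch below applies this rule to hypothetical waiting times; the data values are invented for illustration, and the rule itself is an assumption, as the analysis above relies on visual inspection of Fig. 12.

```python
from typing import Sequence


def elbow_index(xs: Sequence[float], ys: Sequence[float]) -> int:
    """Index of the point furthest from the chord between the end points."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two equally long sequences with >= 3 points")
    x_span, y_span = xs[-1] - xs[0], ys[-1] - ys[0]
    if x_span == 0 or y_span == 0:
        raise ValueError("end points must differ in both coordinates")
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Rescale both axes to [0, 1]; the chord then runs from (0, 0) to
        # (1, 1) and the distance to it is proportional to |u - v|.
        u = (x - xs[0]) / x_span
        v = (y - ys[0]) / y_span
        d = abs(u - v)
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical expected waiting times (minutes) by number of stations.
stations = [5, 10, 15, 20, 25, 30, 35, 40]
waiting = [60.0, 25.0, 10.0, 5.0, 3.5, 3.0, 2.8, 2.7]
elbow_stations = stations[elbow_index(stations, waiting)]
```

Rescaling the axes matters because stations and minutes are measured in different units; without it, the selected point would depend on the choice of units.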
The function serves the purpose of providing an estimate of the total (supply) capacity in the system. This information represents a simple expected threshold of the needed capacity and provides decision makers with a trade-off between investment cost and system performance. It can also be used to develop more advanced heuristics, for example for heterogeneous capacity problems, or as a means to provide a 'warm start' for a global optimisation process.
Another interesting question is whether allowing for different sizes of stations will improve system performance even further. We consider a base station with capacity $h_l = 4$ and a scenario where the capacity configuration is a set with $h_l \in \{4, 5, 6\}$. A uniform cost structure is imposed such that the cost of adding an extra station is linearly proportional to the marginal cost of a single station. The problem corresponds to the objective function in Eq. (20) with this extended capacity set. The results are shown in Fig. 13. The waiting time for the scenario is 0.1 min higher while using half a station less. The latter is roughly worth 0.5 min of waiting time performance based on the estimated performance functions in Fig. 12. In conclusion, for this problem, relaxing the assumption of uniform station size leads to a modest increase in performance.
Clearly, in practice, it seems unlikely that the cost structure is uniform over the entire solution space. Rather, we would expect the initial cost of the first station to be high, while subsequent stations have a decreasing marginal cost. This leads to a problem where the cost of capacity grows less than proportionally, with a marginal cost that is constant or even decreasing as a function of capacity. Further, it is likely that there are large geographical variations depending on the proximity to the power grid. While such charging infrastructures are very interesting, it is difficult to compare them with a standard baseline. It is outside the scope of the paper to develop such cost formation and to optimise demand under such formations.

Information sharing
In the paper we envision that users are provided with information about the state of the system when they start their trip. However, it is interesting to examine what happens if only a certain share of the population is provided with waiting time information. This scenario is relevant because it illustrates the effects of having parallel systems or drivers who do not use a given information-sharing platform.
We examine the effect by sampling different proportions of the population with information-sharing, and then by Monte-Carlo experiments, investigate the impact on waiting time in the system.
Fig. 14 illustrates an interesting pattern: when approximately 50% of the population has access to information-sharing, the system is actually close to its waiting time minimum. This is because the self-organisation principle will tend to 'solve' the biggest inefficiencies early on. Hence, the first 10% with information will circumvent the biggest queues, and the value of providing information to the next 10% is thereby much smaller. This finding is obviously very interesting from a practical perspective, as it provides a good case for information-sharing even in the absence of a system where everybody is informed.
The self-organised nature of agent interactions, in the way they share information, gives rise to emergent behaviour in the form of effectively flatter utilisation curves and the non-linear effect of the value of information in the system. Hence, Fig. 14 is a classical example of emergent behaviour in an agent-based system.
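The partial-information experiment can be mimicked with a deliberately crude toy model, sketched below in Python. Informed agents join the currently shortest queue, while uninformed agents choose a station according to fixed, skewed preferences. There is no service process, no detour cost and no time dimension, so the queue position on arrival is only a rough proxy for waiting time; all parameter values are hypothetical and unrelated to the Copenhagen case.

```python
import random
from typing import Dict


def mean_queue_position(
    share_informed: float,
    n_agents: int = 2000,
    n_stations: int = 20,
    seed: int = 0,
) -> float:
    """Average number of vehicles already queued when an agent arrives."""
    rng = random.Random(seed)
    # Skewed station popularity for agents without information (assumption).
    weights = [1.0 / (s + 1) for s in range(n_stations)]
    queue = [0] * n_stations
    total = 0
    for _ in range(n_agents):
        if rng.random() < share_informed:
            # Informed agents join the currently shortest queue.
            s = min(range(n_stations), key=lambda j: queue[j])
        else:
            s = rng.choices(range(n_stations), weights=weights)[0]
        total += queue[s]
        queue[s] += 1
    return total / n_agents


shares = [i / 10 for i in range(11)]
results: Dict[float, float] = {sh: mean_queue_position(sh) for sh in shares}
```

Plotting `results` against the informed share gives a curve that can be compared qualitatively with Fig. 14, although a meaningful comparison would require repeated Monte-Carlo draws with different seeds and the full queuing dynamics.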

Conclusions and future work
In this paper we have considered the optimisation of charging infrastructures for electric vehicles under information-sharing. The paper approaches the problem from a micro-simulation bottom-up perspective, where demand is simulated in space-time and further linked with advanced queuing systems of the G/G/c type. As the queuing dynamics of this type cannot be represented in any explicit form, the paper integrates the event-based simulator with black-box optimisation in order to find optimal locations for chargers.
By sharing waiting time information with all agents in the system, it is found that capacity utilisation improves significantly and that congestion in the form of waiting time is reduced. This is because of 'self-organisation' in the system, where EV users are able to utilise range flexibility to advance or postpone their charging decision to avoid queuing. This finding suggests that it is important to incorporate information sharing when designing charging systems, as information sharing of some kind must be expected in the future. If information sharing is not considered, it may lead to unnecessary investments and over-supply of charging stations.
Due to the aforementioned self-organisation principle, it is found that system performance is less dependent on specific solutions. Hence, solutions based on a backward searching Greedy heuristic are on par with solutions based on a cross-entropy formulation, where, in the latter, both the location of stations and the capacity at each station are optimised.
In the paper we also examined the effect when only a proportion of the population has access to information sharing. It illustrates that when only approximately half of the population has access to information-sharing, the system is actually close to its waiting time minimum. This is because the self-organisation principle will tend to 'solve' the biggest inefficiencies early on. As an example, if information is provided only to 10% of the population, these will circumvent the biggest queues, whereas the value for the next 10% will be smaller. This finding underlines that, from a practical perspective, information-sharing will be efficient even in the absence of a system where everybody is informed.

Future work
While the proposed modelling strategy is both flexible and scalable, it is based on several simplifying assumptions which could be relevant to extend as part of further research activities.
Firstly, in our case, the demand simulator is largely based on a probabilistic sampling scheme from a trip diary. While this renders a detailed space-time demand that serves our purpose, it hinges on the assumption that behaviour will remain unchanged from what it is today. The model could be extended in a straightforward manner by integrating more complex models of charging behaviour and models of EV ownership if necessary. Hence, rather than sampling from calibrated data, demand would be the result of sampling from a set of discrete choice models. Such models could be based on utility functions for when and where to charge and incorporate all relevant details of these choices. As opposed to the current setup, where we model only the place of charging on a given designated trip, this would greatly increase the choice set.
A second interesting topic for further research is to consider information sharing in more detail, as it can be defined in many different ways. In this paper, we propose a relatively simple information setup, where users are provided with information regarding the system waiting times when starting the trip. While this is likely a realistic assumption for many, it could be relevant to investigate how system performance would be affected by having information closer to real time. This could be investigated by providing agents with information every 5 or 10 min, simply by re-calculating the system waiting times and offering these to the users. It could also be interesting to assess an extreme scenario with perfect information, that is, knowledge about battery levels when arriving at the station, the required state of charge that agents want to achieve, their charging speed and their battery size. This would make it possible to estimate an upper bound for what can be achieved on the basis of information-sharing.
Thirdly, it would be relevant to consider the design of the charging system from a rigorous cost-benefit assessment perspective, that is, by investigating the trade-off between waiting time costs and the cost of infrastructure. This could then be used to design infrastructures that are optimal from a societal welfare perspective. Clearly, this would necessitate a better representation of infrastructure costs and of how such costs vary in space as a function of the distance to the power grid.
Fourthly, it would be interesting to investigate the sensitivity of the system to a wider range of inputs, such as range and speed of charging. Moreover, it would be interesting to extend the system to a larger area to avoid boundary effects and to investigate a similar setup for a state-road system for long-range trips.

Fig. 4. Weekly traffic intensity over the day based on the trip diary data for the Greater Copenhagen Area.

Fig. 5. Outline of the area for optimisation and the Greater Copenhagen Area. The inner red area is the area in which we optimise the charging infrastructure. As described above, trips from the different areas are classified differently with respect to home-charging availability. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 7. Convergence of the objective function, with 100 draws per iteration. Blue dots indicate the expected number of stations for a given iteration. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 12. Performance of the system as a function of the number of stations.

Fig. 13. Solution to location and sizing problem with linear cost structure.

Fig. 14. Performance of the system as a function of the share of the population that uses information sharing.