Synchronizing Energy Production and Vehicle Routing

— The emergence of locally produced renewable energies induces the appearance of a new generation of local energy players, which are at the same time producers and consumers. In case of time dependent solar energy production, it raises the question of synchronizing production and consumption. We deal here with this issue, in the context of an experimental Solar Hydrogen (H 2 ) production platform. More precisely, we try here to simultaneously schedule a H 2 fueled vehicle which follows a pre-computed route while being compelled to periodically refuel, and the H 2 production micro-plant which is required to produce related energy under time dependent production costs and productivity rates, both processes being subject to storage capacity constraints. In order to do it, we design a global dynamic programming (DP) algorithm for the resulting NP-Hard problem. This DP algorithm involves a 2D time space which links energy consumption by the vehicle and its production by the micro-plant. Since the number of states induced by this DP algorithm becomes an issue as soon as the size of the problem increases, we first propose a theoretical Polynomial Time Approximation Scheme (PTAS), next design several practical pruning devices and finally perform numerical tests in order to check their efficiency.


I. Introduction
The rise of renewable energy sources (photovoltaic, wind, hydrogen or biomass based: see [6,7,13,21]), which aim at replacing CO 2 emissions by clean electrical power, together with the current scarcity of related charge/recharge infrastructures, have been motivating during the last decades an interest from O.R researchers to green issues.
Most contributions were related to electric or hybrid vehicles where the goal is to minimize energy consumption (Green VRP, Pollution-Routing Problem and Hybrid Vehicle Problem: see [14,15,28]). Related models consider recharge transactions submitted to time windows or shared access constraints and aim at minimizing some mixed cost which partially reflects environmental concerns. Green VRP is for instance introduced in [14] by Erdogan et al. with the purpose of promoting either the minimization of fuel consumption or the use of alternative-fuel powered vehicles. Their model is an extension of standard VRP involving limited fuel tank capacities and an objective function which mixes total travel distance and the number of refuelling transactions. Numerical handling relies on a MILP: Mixed Integer Linear Programming formulation and local search heuristics. In [15], Franceschetti et al. tackle the pollution-routing problem which aims at minimizing a cost that depends on driver's wages and fuel consumption while considering time-dependent traffic congestion. They use an ALNS: Adaptative Large Neighbourhood Search metaheuristic in order to handle a model whose main complexity lies in the time dependency of the costs of the arcs which define the transit network. In [28], Raylan et

al. address what they call Green Vehicle
Routing, which means the problem of routing a fleet of vehicle inside a network whose arcs are provided with time dependent lengths, while minimizing a cost based on CO 2 emission due to both the routes and the vehicle loads. They manage resulting model through multi-start ILS. In [17], Kara et al. follow a similar approach and propose a variant of the well-known CVRP: Capacitated Vehicle Routing Problem by setting a new cost function, which also involves the loads of the vehicles and their impact on energy spending. They propose an ILP to solve their model. In [29], Sachenbacher et al. also deal with time dependency, but they introduce specific recharging schemes and consequently adapt shortest path computation through an adaptation of A* algorithm. In [19], Kuo addresses time dependency through the prism of energy consumption and sets a routing model involving time windows, where the drivers may reach an optimal tradeoff between energy consumption, transportation distance and transportation time by conveniently controlling speed. He designs a simulated annealing algorithm which computes routing strategies with lower fuel consumption but longer transportation times and transportation distances. In [30]. Schneider et al. apply a hybrid VNS&Tabu scheme to the handling of a VRP model with time windows, which includes specific battery recharging schemes. In [22], Lin et al. provide a survey on the Green Vehicle Routing problem and the Pollution Routing problem, which both make speed control and traffic jam avoidance part of the models.
In [18], Koç, Jabali, et al. deal with electric vehicle routing problems which refers not only to the assignment of the customers to the vehicles and their sequencing, but also to where and how much the vehicles are recharged: The feasibility of the routes is therefore constrained by the characteristics of recharging configurations (kind of charging functions: linear, piecewise linear,…), and the autonomy of the vehicles. This last issue is also addressed in our paper. In [18] authors consider customer sequencing as part of the problem, and study an Electric-VRP model with non-linear charging functions, multiple charging technologies and variable charging quantities, while explicitly accounting for the number of chargers available at privately managed charging stations. They handle their model first through a MILP formulation, and next by through a metaheuristic scheme which alternatively generates routes and schedules the refueling transactions along those routes.
Still, not all contributions put the focus on optimization: In [31], Waraich et al. propose a multi-agent discrete event simulation model in order to evaluate the impact of the time dependence of both demands and costs on the behavior of a fleet of last mile electric vehicles. In [20], Lajunen uses a vehicle simulation method in order to compare the reductions of energy consumption and emissions induced by different configurations of urban shuttle&bus fleets. They take into account capital costs, operating costs and costs induced by energy storage devices.
While transportation has been paid most attention by O.R. communities, some authors have been addressing the issue of scheduling an industrial process (see [4,11]) while taking into account temporal variations of the energy costs, access restrictions and environmental concerns. In [24], Moon et al. Deal with parallel machine scheduling while taking into account time dependent energy costs, and in [25], they address flexible Job Shop with the same kind of hypothesis. In both case, they use MILP formulation and genetic algorithms. In [26], Mustapha et al. schedule an industrial production process subject to piecewise linear electric cost while using MILP models and dynamic programming algorithms. In [27], Pechman and Schöler focus on the management of energy consumption processes which involve peaks and breaks, as in steel industry or inside large buildings. In [16], Irani and Pruhs address the specific case of information processing and propose a survey about existing scheduling models and algorithms for the minimization of energy consumption inside a data center or a cloud architecture. In [1] Albers deals with the same issue, and provides a survey about algorithmic scheduling and resource allocation techniques, mostly based on heuristic decision rules designed for dynamic contexts, for reducing energy consumption of current computers, including techniques like sleep states and power-down mechanisms, dynamic speed scaling and temperature management.
While most articles have been related to applications, we should mention the existence of several theoretic studies which deal with complexity and approximation issues, for models which put at stake the cost of idleness and the impact of time dependencies (see [1,10,16]). Typically, in [2,3,11], respectively Angel et al, Baptiste and Demaine et al., rely on dynamic programming algorithms in order to explore models involving energy costs induced by set up transactions and gaps in the use of resources, which may be solved in pseudo-polynomial time. By the same way, in [8,9], Chretienne et al. propose polynomial time algorithms for specific scheduling models (oven management for the tyre industry,…) whose costs are mainly due to breaks inside energy consumption processes. On another side, main energy producers have been for a long time carrying on systematic studies about big grain energy production planning (gas, electricity, dam or nuclear plant management), with the purpose of meeting large scale uncertain aggregated demands (see [24,25]). In [5] L.Benini et Al provide a survey about dynamic power management (DPM), while focusing on multi-level systems which require synchronizing coarse grain production with distributed fine grain distribution.
A current trend in Energy Economics is towards the decentralization of Energy production. This is due to both the deregulation of energy markets and to the rise of technologies, which more and more allow local players to become small energy producers (solar, wind, hydrogen, …), while simultaneously remaining consumers. Those local producer/consumers may be factories, farms and even individual householders. This raises complex questions to traditional energy producers, who lost their monopolistic position but remain key players, because of their control on backbone networks and main hydraulic and nuclear plants. In the context of the activities of Labex IMOBS3 in Clermont-Ferrand, France, devoted to Innovative Mobility Technologies and Services, we are participating into a project about the design and control of a local microplant for Solar Hydrogen (H 2 ) production. This micro-plant is asked to provide autonomous vehicles with electrical power obtained from hydrogen fuel cells. While most H 2 production is usually performed through power costly electrolysis processes, we assume here that it relies on solar power and photolysis (see [7,13]), which involve a marginal amount of external electric power but also make the productivity of the process deeply dependent on the intensity of solar illumination. In [21], one may find a review about both existing and prospective ways to generate hydrogen from sunlight and water. While Solar Hydrogen currently remains an uncertain goal, at least from a cost perspective, the scientific principles behind its generation are well understood: Solar Hydrogen is the largest renewable carbon-free resource among the other renewable energy options, and it may be produced in an almost fully autonomous way.
According to this paradigm, energy production/consumption becomes endogenous, and performed according to a kind of closed loop. It induces a need for high level synchronization between both heterogeneous production and consumption processes. Still, very few works address the issue of synchronizing endogenous transportation and energy production processes (see for instance the survey of Drexl [12], in order to get a review of the contributions related to synchronization in the case of vehicle routing alone). It comes that present work is about the simultaneous management of, on one side, a fleet of small electric vehicles provided with H 2 power cells, which are required to perform local logistic tasks inside a restricted area, and, on the other side, a micro-plant in charge of producing the H 2 fuel which is going to be periodically loaded into those vehicles. Taken as a whole, this management involves forecasting, safety (related to vehicle autonomy) ensuring and synchronizing: one must match the pickup and delivery activity of the vehicles with the H 2 production/stock strategy of the micro-plant. Nevertheless, we only address a very specific part of this problem: we consider only one vehicle, which is required to perform tasks according to a pre-fixed order. This vehicle starts its route with some H 2 fuel load, and its tank has a limited capacity. Therefore, it must as in [18] periodically go back to the micro-plant in order to refuel. The micro-plant has a limited production/storage capacity, which depends on solar illumination. Our goal is to simultaneously schedule both the refueling transactions of the vehicle and the production/storage activity of the micro-plant, while minimizing production economical cost and the duration of the vehicle tours.
As mentioned in the title of the paper, we focus here on synchronization, and neither deal with uncertainty nor with the design of the vehicle route. Still, since this very generic word synchronization may refer to very distinct contexts, among them real time contexts where agents involved into the process are provided with some level of autonomy, we must explain what we mean here: We do not deal here with real time synchronization of both players (vehicle and micro-plant) and on the various protocols required in order to ensure those players to meet when necessary, even in the case of unforeseen events. Our point of view here remains a standard Combinatorial Optimization one, and suppose the existence of a central manager in charge of a priori scheduling the activity of both the vehicle and the micro-plant. It comes that Synchronization refers here to the design of a centralized static scheduling algorithm and to the way we link together decisions related to the vehicle with those related to the micro-plant in order to ensure that refueling transactions will be performed in a way consistent with production, consumption and storage constraints.
So the first contribution of the paper consists of a static model related to resulting collaborative scheduling problem. We check that this model is NP-Hard, and propose a dynamic programming scheme (DPS), close to the schemes designed in [3,11] for energy consumption planning models involving break and set up costs. Synchronization is contained here into the specific structure of this DPS, which involves compound time and state sets in order to link together the vehicle and micro-plant processes.
Then this DPS allows us to state a PTAS (Polynomial Time Approximation Scheme) result. Still, in practice, this theoretical result does not keep it from generating to many states as soon as the size of the model increases. So we introduce pruning devices, close to the ones which were introduced in [23] by Lozano and Medeglia in their Pulse algorithm for the Constrained Shortest Path problem. They are either logical based devices, which aim at anticipating inconsistencies, or upper bound based, involving the precomputation of a lower bound together with a feasible initial solution. Part of the study is devoted to an experimental evaluation of the filtering power of those devices.
The paper is organized as follows: Section II provides the Synchronous Management of Energy Production and Consumption (SMEPC) model, together with an ILP formulation. Section III first checks that the SMEPC model is NP-Hard, next proposes a global DPS for this model, and finally states a Polynomial Time Approximation Scheme (PTAS) like result. Section IV introduces the filtering devices: logical ones, lower/upper bound based filtering devices and heuristic ones. Section V is devoted to numerical experiments, which mainly aim at evaluating the impact of the various filtering devices and the quality of the upper bound heuristic.

II.1. The Model.
We consider here some vehicle which has to perform internal logistics tasks, while following a route  which starts from some Depot node and ends in the same way after going through stations j = 1, …, M, according to this order. Start-node Depot has label 0 and End-node Depot has label M + 1. The time required by the vehicle in order to go from j to j + 1 is equal to tj, taking into account the time spent by the vehicle in order to perform local tasks. The vehicle may leave Depot at time 0 and should finish its route no later than some threshold time TMax.
It happens now that our vehicle is powered by hydrogen (H 2 ) fuel. The capacity of its tank is denoted by C Veh and we know, for any j = 0,..., M, the H 2 amount ej required in order to move from station j to station j + 1. The initial H 2 load of the vehicle is denoted by E0, and the vehicle is required to end its trip with at least the same energy load E0. It comes that the vehicle will have to periodically refuel. Refueling transactions take place at a micro-plant, close to Depot: The time required by the vehicle in order to move from station j to the micro-plant (from the micro-plant to j) is denoted by dj (d*j); by the same way, the energy required in order to move from j to the micro-plant (from the micro-plant to j) is denoted by j (*j). Quantities dj, d*j and tj, as well as quantities ej, j*j are non null and satisfy the Triangle Inequality (see Figure 1).   On another side, the micro-plant produces H 2 in situ from water through a combination of photolysis and electrolysis. Resulting H 2 is stored inside a tank located directly in the micro-plant, whose capacity (in energy units) is denoted by C MP . We suppose that: -The time space {0, ..., TMax} is divided into periods Pi = [p.i, p.(i + 1)[, i = 0, …, N -1, all with a same length equal to p such that TMax = N.p. For the sake of simplicity, we identify index i and period Pi. If the micro-plant is active at some time during period i, then it is active during the whole period i, and produces Ri hydrogen fuel units, with production rate Ri depending on period i. At time 0, the current load of the micro-plant tank is equal to H0 ≤ C MP and the micro-plant is not active. We impose that the same situation holds at time TMax. Figure 3 displays an example of production performed by the micro-plant: periods that are underlined in red corresponds to the periods when the micro-plant is activated. Periods in blue corresponds to the periods when the micro-plant is active. -Because of safety concerns, the vehicle cannot refuel while the micro-plant is producing. Any vehicle refueling transaction should start at the beginning of some period i = 0, …, N -1, and end at the end of period i. Since vehicle refueling and energy production are mutually exclusive, the vehicle may wait at the micro-plant before being allowed to refuel.
-Producing H 2 fuel has a cost, which may be decomposed into 2 components: o An activation cost, constant and denoted by Cost F , which is charged every time the micro-plant is activated. This activation cost is due to the fact that ensuring the safety of activation requires some human intervention. o A time-dependent production cost Cost V i, which corresponds to the power consumed during period i, provided that the micro-plant is active during this period: Cost V i is independent on the amount of hydrogen really produced during period i and is only related to the cost of the electrolysis process which, combined to some photolysis process, will derive hydrogen from water. The electricity amount required by this electrolysis process is independent on the amount of hydrogen effectively produced, which depends on the efficiency of the photolysis process (the lightning). It comes that Cost V i only reflects the time-indexed prices charged by the electricity provider at period i.     Remark 1: We focus here on synchronization mechanisms and so consider a deterministic version of our problem. Still, in practice, a key issue is about the uncertainty related to the production ratio Ri, i = 0, ..., N -1. This issue will be addressed in a future work.
The following Table 1 summarizes the input data for the SMEPC problem: Vehicle related input M: number of stations (Depot excluded) DepotDepot = M + 1): vehicle tour (without refueling) TMax: maximal time for the vehicle to achieve its tour C Veh : vehicle tank capacity E0: initial vehicle hydrogen load tj: required time to go from station j to station j + 1 dj: required time to go from station j to the micro-plant d*j: required time to go from the micro-plant to station j ej: required energy to go from station j to station j + 1 εj: required energy to go from station j to the micro-plant ε*j: required energy to go from the micro-plant to station j

II.2. A Mathematical Programming Oriented Formulation
Though Integer Linear Programming is not well-fitted to SMEPC handling, we first propose a Mathematical Programming oriented formulation, which allows to clearly identify main variables and constraints, and which comes as follows: Those variables will be constrained as follows (we use here ILP: Integer Linear Program formalism, and explain the way some of those constraints must be understood as the linearization of logical implications Big M technique): -Production constraints: o For any i = 0, …, N -1: Explanation: Those 3 constraints linearize the quadratic constraint: (1xj) ; Explanation: Those 2 constraints linearize, through Big M technique, the quadratic constraint: Explanation: those 3constraints are the translation of the following logical implication: o TM + 1 ≤ TMax.
Remark 2: Constraint (E1) means that at any time, the vehicle must be able to go to the micro-plant and refuel, and relies on the Triangle Inequality for energy coefficients ej and j.
In order to get a complete formulation, we must explain the way the activities of both the vehicle and the micro-plant are synchronized. This requires introducing a synchronization variable U = (Ui,j, i = 0, …, N -1, j = 0, ..., M) with {0, 1} values, which will tell us, in case the vehicle decides to refuel between station j and station j + 1, during which period i it will do it.
Then we complete our SMEPC formulation with the following synchronization constraints: -Synchronization constraints: o For any j = 0, …, M: o Explanation: This constraint is the linearization, through Big M technique, of the following implication:

Remark 3:
In practice, we should be able to replace the '≤' in Constraint (E2) by a '=' symbol, since in most case, a good strategy for the vehicle will make it to refuel as much as possible. Still we must be sure that we avoid producing more than necessary.

III. A Dynamic Programming (DPS) Scheme
We first easily check that, even when Cost F is null, SMEPC can be reduced to a Knapsack problem, and so we state:

Proof.
Let us consider the following specific instance of SMEPC: -N is given as an even number N = 3.Q + 1, where Q is some parameter supposed to be ≥ 2; Then we see that the micro-plant must produce H hydrogen energy units during periods 0, …, Q -1, in such a way that vehicle may leave at date Q -1 and achieve its tour in 2Q + 1 periods (including p = 1 time unit to refuel, see Figure 6) and return to Depot with a tank loaded with one hydrogen energy unit. It comes then that under those hypothesizes, minimizing the economic cost of the process means solving the following Knapsack instance: {Compute a {0, 1}-valued vector Z = (Z0, …, ZQ -1) such that: We link periods i and stations j by introducing relations (<<, >>, ==) which express the relative positioning of period i and the time T at which the vehicle is at station j. Namely, for any i {0,…, N -1} and T  {0, …, TMax}, we set: Those relations are going to help us in defining the state variables of our DPS scheme. As a matter of fact, we say that, for any such a time pair (i, j), a state is a 4-uple s = (Z, T, V Tank , V Veh ), with: -Z = 1 ~ the micro-plant is active at the end of period i -1, that means at time p.i.
-V Tank and V Veh are respectively the loads of the micro-plant tank at the beginning of period i and the load of the vehicle tank when it arrives at station j; -T is a value in 0, …, TMax with the meaning: o If T >> i, then the vehicle is on the road to j, which it shall reach at time T; o If T << i, then the vehicle is between j and the micro-plant, possibly waiting for being refueled; o If T == i, then the vehicle is in j, and must choose between keeping on to j + 1 or moving to the micro-plant in order to refuel.

Remark 4:
We see that according to this explanation, relative positioning of T and i with respect to the ==, << and >> relations acts as an implicit state variable, which will help us in identifying the decisions which may be associated with a current time pair (i, j) and a current state s, together with their impact.
Decisions. Then a decision D related to a time pair (i, j) and a state s = (Z, T, V Tank , V Veh ) is a 3-uple D = (z, x, ) in {0, 1} 3 , with the meaning: z = 1 ~ the micro-plant decides to produce during period i; x refers to a decision taken as soon as T == i: in such a case, x = 0 means that the vehicle moves from station j to station j + 1 without refueling; and x = 1 means that it refuels at the micro-plant while traveling from j to j + 1. In the case when x = 0, the induced transition will turn j into j' = j + 1 and i into i' = i + 1, except in the case when T + tj == i: in this last case, no real decision is taken with respect to the production and z is equal to 0 by default. If Not (T == i) then x becomes irrelevant and its value is by default equal to 0, no transition being associated with it. - = 1 ~ the vehicle is located at the micro-plant and decides to refuel during period i, forbidding the micro-plant to be active during this period. This means that T << i and that the difference p.i -T is at least equal to the time required in order to move from j to the micro-plant.
Decision is taken at the end of period i -1. Before providing the detail of preconditions, transitions and costs related to those states and decisions, we may come back to the example of Section II.1 and describe the way states and decisions will evolve according to the feasible solution described in Figure 7 below:     If (T == i), then the vehicle starts travelling from j to j + 1. We shift from (i, j) to (i + 1, j + 1) and resulting state is s'= (1, T + tj, V Tank + Ri, V Vehej). Transition cost is (Cost F .(1 -Z) + Cost V i.) + .tj.
z = 1, x = 1,  = 0: The micro-plant produces during period i and the vehicle decides to move from j to the micro-plant in order to refuel. It requires: We shift from (i, j) to (i + 1, j) and resulting state is z = 0, x = 0,  = 0: The micro-plant does not produce during period i and the vehicle keeps on its way. It requires: o If T == i, then x = 0 means that the vehicle is fueled enough in order to move from j to j + 1: V Vehej -j + 1 ≥ 0.
o If T + di ≤ p.i then the vehicle previously decided to refuel between j and j + 1, and is currently located at the micro-plant. Decision  = 0 is feasible if the vehicle has enough time to keep waiting, that means: p.(i + 1) + p + d*j + 1 +  k ≥ j+1 tk ≤ TMax. If T >> i or T << i, then we shift from (i, j) to (i + 1, j), resulting state is s'= (0, T, V Tank , V Veh ), and transition cost is null. If T == i then we shift from (i, j) to (i + 1, j + 1) but in the case when T + tj == i: in this case i remains unchanged. In any case resulting state is s' = (0, T + tj, V Tank , V Vehej) and transition cost is .tj.
z = 0, x = 1,  = 0: The micro-plant does not produce during period i and the vehicle moves towards the micro-plant in order to refuel. It requires: Then we shift from (i, j) to (i + 1, j) and resulting state is s'= (0, T, V Tank , V Veh ). Transition cost is null.
z = 0, x = 0,  = 1: The micro-plant does not produce during period i and the vehicle decides to refuel during period i. It requires: Then we shift from (i, j) to (i + 1, j + 1) and resulting state is s'= (0, p.

Remark 5:
In the specific case when the energy amount V* the vehicle needs to fuel in order to achieve its trip until depot with a load at least equal to initial load E0 is less than Inf(V Tank , C Veh -V Veh + j) then the resulting s' is (0, T, V Tank -V*, V Veh -j + V* -*j + 1). In such a case, we see that this loading transaction will be the last one. We denote by DPS-SMEPC the dynamic network algorithm designed this way, whose full description may be summarized as follows:

DPS-SMEPC
The solution is the sequence of decisions induced by the father field of Current-Sol.
-As shall be discussed in next section IV, instruction (I1) has to be strengthened by filtering devices, which are going to allow us to kill (s', W', Father') in case some kind of infeasibility may be forecasted, or in case we may check that keeping on with the trajectory defined by (i', j') and (s', W', Father') will not yield a better solution than an already existing one. -Instruction (I2) tells us about the way we scan the time space , in a way which is consistent with its standard partial ordering. Several functions Succ may be involved, which scan  according to rows i, to columns j, or to diagonal layers.

III.2. A Polynomial Time Approximation Scheme.
DPS-SMEPC may be in trouble as soon as M and N are large: every state is a 4-uple, with T, V Veh and V Tank with potentially large values. Still, we may notice that if we consider TMax, C MP and C Veh as bounded by polynomial functions of N and M, then DPS-SMEPC becomes time-polynomial since the number of states it involves also becomes bounded by a polynomial function of N and M.
As a matter of fact, we may go further and introduce a FPTAS like rounding scheme as follows: Rounding States inside the DPS-SMEPC Scheme: L being some integer, then, for any integer n with binary decomposition n = a0 + a1.2 + … + aq.2 q , we set: - This clearly means that we want to take into account here the fact (see Remark 4) that relative positioning of T and i through relations <<, >> and == acts as an implicit state variable. Because of the rounding effects, which are likely to perturb those relations, we introduce variable , whose purpose is to explicit the information provided by those relative positioning. -We do in such a way that, at any iteration of the main loop of our algorithm, the set S(i, j) does not contain two 3-uples (s1, W1, Father1) and (s2, W2, Father2) such that respectively s1 and s2, as well as W1 and W2, are equivalent modulo the K + 2.L(N, M) largest bits; We do it while giving priority to the 3-uple (sq, Wq, Fatherq), q = 1, 2, related to the smallest Wq value. -We replace initial values H0 and E0 by respectively Round*(H0, K + 1) and Round*(E0, K + 1); -We relax the time capacity constraint by replacing TMax by Round*(TMax, K); By the same way we relax the H 2 capacity constraints related to both the micro-plant and the vehicle by replacing capacities C MP and C Veh respectively by Round*(C MP , K) and Round*(C Veh , K): this means that (i, j) being current time pair, and s being current state, we compute Decision Set Dec((i, j), s) (Instruction (I3) below) in such a way that the feasibility of any decision D = (z, x, ) is checked with respect to the time capacity Round*(TMax, K) and that H 2 capacity constraints are checked with respect to H 2 capacities Round*(C MP , K) and Round*(C Veh , K). Insert (s', W + CT, (s, W, Father)) into S(i', j');

DPS-SMEPC(K)
Then we may state:

Theorem 2 (Polynomial Time Approximation Scheme): K being fixed, DPS-SMEPC(K) is time-
polynomial. Besides, for any value  > 0, we may choose K large enough in such a way that in case SMEPC admits an optimal solution with value W Opt , then DPS-SMEPC(K) yields a solution which is feasible with regards to initial values (1 + / 2).H0 and (1 + / 2).E0, threshold values (1 + ).C MP , (1 + ).C Veh and (1 + ).TMax and whose cost value Current-Value is no larger than W Opt .

Proof.
K being fixed, the fact that algorithm DPS-SMEPC(K) is time-polynomial comes the same way as when TMax, C MP and C Veh are supposed to be bounded by polynomial functions of N and M: the number of possible 3-uples (s, W, Father) in the list S(i, j) which we deal with at every iteration of the main loop is bounded by a polynomial function of N, M and the encoding size of TMax, C MP and C Veh . By the same way,  being given, we see that if K is large enough, then the relative error (Round*(TMax, K) -TMax) / TMax induced by replacing TMax by Round* (TMax, K) does not exceed . The same thing holds for capacities C Veh and C MP .
In order to achieve the proof of Theorem 2, we need now to consider an optimal solution Sol Opt , given together with its value W Opt . The solution Sol Opt may be associated with a sequence of time values (ih, jh), h = 0, …, H ≤ N + M, a sequence of states s0, s1, …, sH, with related value W0, …, WH, and a sequence of decisions D0, …, DH -1, which induces transitions ((ih, jh) , sh) → ((ih + 1, jh + 1) , sh + 1), h = 0, …, H -1, and to check that if K is large enough DPS-SMEPC(K) computes a solution Current-Sol which is feasible with regards to threshold values (1 + ).C MP , (1 + ).C Veh and (1 + ).TMax and whose cost value Current-Value is no larger than W Opt + . In order to do it, we fix K and prove by induction on h that, for any h = 0,…, H: We suppose it is true for a given h and try to apply the decision Dh to state sh. We derive from (E3-1) that applying Dh will not contradict the Round*(TMax, K) threshold. Because of (E3-2), Dh will not make the load in the micro-plant become negative or exceed Round*(C MP , K). By the same way, (E3-3) implies that Dh will not make the load in the vehicle be negative or exceed Round*(C Veh , K). It comes that this decision is going to be feasible. Then we see that state s' = (Z', T', V Tank ', V Veh ') resulting at time value (ih+1, jh+1) from application of Dh to state s and related value W' are going to be such that:

IV. Filtering Devices
In spite of above result, we face an issue related to the large number of states when N and M increase, and so we must think into filtering devices. Clearly, the following strong dominance rule may be applied: o At least one among above inequalities is strict; then s1 strongly dominates s2, and we kill s2 (i.e. we remove it from the list S(i, j)).
This Strong Dominance rule has little filtering power, since it is too restrictive. Still, other filtering devices may be implemented, close to the ones which were introduced in [23] by Lozano and Medeglia in their Pulse algorithm for the Constrained Shortest Path problem, and which are: Heuristic Weak Dominance rule, Logical filtering rules and Upper/lower Bound based filtering rule.

IV.1. The Weak Dominance Rule
The idea here is to design a flexible device which will help us in keeping the number of states under some threshold NS. Let A, B, C, D, E be some positive parameters. We proceed in such a way that for any time pair

IV.2. Logical Filtering Rules
Given a time pair (i, j), and a related state s = (Z, T, V Tank , V Veh ), the idea here is to anticipate the non feasibility of a trajectory starting from state s at time (i, j). This non feasibility might be related either to a lack of time (impossible to achieve the vehicle tour before deadline TMax) or to energy production (impossible to get enough fuel in order to achieve the tour). More precisely, for any j = 0, …, M, we get a rough estimation of both energy and time required in order to allow the vehicle to return from j to Depot by proceeding as follows:  Apply the following process: The vehicle will have to load at least H -V Veh in order to achieve its journey from j and perform at least Refuel refueling transactions.
 Denote by 0, ..., Mj, the quantities (k + *k + 1ek), M ≥ k ≥ j, labeled in increasing order;  Denote by 0, ..., Mj, the quantities (dk + d*k + 1tk), M ≥ k ≥ j, labeled in increasing order: for any station j, we denote j =  k≥ j tk +  q = 0, …, Refuel -1 q: the vehicle needs at least j time units in order to achieve its trip from j to Depot;  Denote by Prod-Max(i) the quantity  k ≥ i Rk: the micro-plant cannot produce more than Prod-Max(i) from time p.i.
This allows us to state the following filtering rules: Though logical filtering devices as those applied in (II_1) will have a strong impact on the running times of the algorithm, it is not clear that they have also an impact on its theoretical worst case complexity. In order to prove such an impact, we should be able to quantify the number of states s which are going to survive in state set S(i, j) for every time pair (i, j), and this is really a difficult issue. It comes that, at the current time, we must consider that the theoretical worst case complexity of DPS-SMEPC strengthened by instruction (I1_1) is the same as the complexity which we got in Section III.1. The same remark will hold in the case of upper/lower bound based filtering rules.

IV.3. Upper/lower Bound based Filtering Rule
We easily characterize, for any energy amount V and any period number i, the minimal cost Cost-Min(i, V, Z) required from the micro-plant to produce V energy units between time p.i and time TMax, Z denoting the state of the micro-plant at the end of period i -1: Cost-Min(N, V, Z) = 0 if V = 0 and undefined else; We implement it through backward driven DPS, and keep in memory the values Cost-Min(i, V, Z), provided that the scope of values for V is not too large (else we perform some rounding of V values).
Given now a time pair (i, j) and some related state s = (Z, T, V Tank , V Veh ) with value W. Then we get a lower bound LB of the best SMEPC value which may be derived from (i, j) and s by setting: LB((i, j), s) =  j+ Cost-Min(i, (H -V Tank ) + , Z).
If we suppose now that we are provided with some SMEPC feasible value Current-Value , then we may implement the following filtering rule: 3) Upper/Lower Bound Based filtering rule: If W + LB((i, j), s) ≥ Current-Value, then kill state s = (Z, T, V Tank , V Veh ), related to time pair (i, j).
If we refer now to the the DPS-SMEPC Algorithm of previous Section III.A, then we replace instruction: (I1): 'Insert (s', W + CT, (s, W, Father)) into S(i', j'), by the following instruction (I1_2):

IV.4. Computing an upper bound through a Greedy Adaptation of DPS-SMEPC
Turning DPS-SMEPC into a greedy algorithm can be done by using our dynamic programming scheme and performing a forward driven route in the SMEPC state network according to the best LB value. As a matter of fact, we only need to apply the weak dominance filtering mechanism of previous Section IV.1 with NS = 1. Once it has been done, we shall modify the Initialization step of the DPS-SMEPC Algorithm of previous Section III.1, by setting: Current-Value ← Val-Greedy; Current-Sol ← Sol-Greedy; As usual, it will be possible, by introducing some random sorting device inside the procedure in charge of choosing an ad hoc decision D, to make this greedy algorithm work in a non deterministic way and perform it according to the multi-start paradigm.
Still, while adapting DPS-SMEPC in order to restrict the weak dominance rule of Section IV.1 to the case when NS = 1,we must take care of avoiding deadlock strategies which would make both the micro-plant and the vehicle wait for better prices Cost V i and production values Ri. Since our ability to anticipate inconsistencies related to a lack of time or a lack of energy refueling capacity only derives from approximation devices, there is a risk that such waiting strategies may make our process fail in computing some feasible solution. So, in order to make this risk of failure decrease, we forbid: decision (z = 0, x = 1) related to situations when T == i and the micro-plant tank is not loaded enough in order to allow to fully refuel the vehicle; decision (z = 0, x = 0) related to situations when T << i and (p.i -T) ≤ dj, and the micro-plant tank is not loaded enough in order to allow to fully refuel the vehicle; decision (z = 0, x = 0,  = 0) related to situations when T << i and (p.i -T) ≥ dj.
Then the Greedy-SMEPC greedy algorithm comes as follows: Clearly, we may make this greedy algorithm work in a non deterministic way and perform it according to the multi-start paradigm.

V. Numerical Experiments
Purpose: What we want here is to get an evaluation of the impact of the filtering devices described in Section IV and the quality of the greedy procedure described in IV.2. This impact is two-sided: 1) It makes decrease the number of states which are handled throughout the process; 2) It may induce a gap to optimality.
Technical Context: Algorithms were implemented in C++, on a computer running Windows 10 Operating system with an IntelCore i5-6500@3.20 GHz CPU, 16 Go RAM and Visual Studio 2017 compiler.
Instances: Instances are generated as follows:  We fix N, M and p, and randomly generate stations j and Depot and the Micro-Plant as point of the R 2 space.
 Then dj, d*j and tj, ej, j*j respectively corresponds to Euclidean distance and Manhattan distance roundings, in such a way they take integral values.  Then we fix C MP , C Veh and TMAX, in such a way it ensures the existence of a feasible solution.  Finally, we fix the cost coefficients, in such a way that the fixed cost Cost F is at least equal to the largest coefficient Cost V i, i = 0,…, N -1.
In the case of this experimental section, we use a package of 15 instances, while making vary N, M, p, as well as the economic costs Cost F and Cost V i, i = 0,…, N -1, and fixing C Veh = 20, and C MP = 50, TMax = 200. Main characteristics of the 15 instances package, where Mean_Cost V denotes the average value Cost V i for i = 0,…, N -1, may be summarized inside the following Outputs: For any instance, optimality is achieved through exact DPS-SMEPC and so: A. We first run a multi-start version of Greedy-SMEPC with 50 replications, and compute related gap G-G to optimality (in %), together with related CPU time G_T (in seconds); B. Next we run heuristic DPS-SMEPC(NS), with NS as described in section IV.1 and ranging from 20 to 100, in such a way that no more than NS states are allowed for every time pair (i, j). DPS-SMEPC(NS) involves the filtering rules described in sections IV.2 and IV.3. For every instance, we compute gap W_G(NS) to optimality (in %), together with related CPU time W_G(NS) (in seconds); C. Finally we run exact DPS-SMEPC according to the prospect of evaluating the impact of the filtering rules described in sections IV.2 and IV. 3. It comes that we successively run DPS-SMEPC: 1) while only using the Strong Dominance rule: then, ST(1) denotes the maximal number of states which have been generated this way for a given time pair (i, j) and T(1) denotes related CPU time (in seconds) ; 2) while using the Strong Dominance Rule as well Logical Filtering rules : then, ST(2) denotes the maximal number of states which have been generated this way for a given time pair (i, j) and T(2) denotes related CPU time (in seconds) ; 3) while using the Strong Dominance Rule as well Logical Filtering rules and Upper/Lower Bound based filtering rule, implemented while using the best value (according to Table 2 Results: Results related to A) and B) are summarized in Table 4. Results related to C) are summarized in Table 5.