Data-driven robust optimization for optimal scheduling of power to methanol

Power-to-Methanol is a newly emerging technology to decarbonize hard-to-abate sectors. However, little research on its flexible and optimal operation has been proposed. In this paper, a grid-connected Power-to-Methanol system is introduced, modeled, simulated and optimized for its daily operation by considering its participation in day-ahead electricity markets. The system builds on a real-life application in Denmark. We first predict the electricity prices and then strategically schedule the involved components taking advantage of the potential flexibilities. The uncertainty of electricity price prediction is handled by introducing a Wasserstein metric-based data-driven robust optimization. We further compare the proposed approach with widely-used stochastic and robust optimization. The results show that, for the selected case, the proposed data-driven method could reduce the operational cost by 4.5% compared to the imperfect prediction, and it moderately outperforms stochastic and robust optimization. Using the optimal operation strategy, we find that the levelized cost of methanol ranges from 584 to 1146 € /t. Both CO 2 price and the renewable electricity proportion signifi- cantly affect the cost.


Introduction
Power-to-X has been viewed as a promising pathway to store excessive renewable energy and to cope with increasing challenges from climate change via decarbonizing different sectors. The 'X' hereby refers to various chemical products incorporating hydrogen, methane, methanol and ammonia etc. Methanol, also known as liquid sunshine, is a promising alternative among others [1]. This liquid fuel bears following benefits [2]: 1). It has a comparatively high energy density (16.9 MJ/L) and thus could serve as storage for renewable energy. 2). It could contribute to CO 2 valorisation. Methanol could be produced from the catalytic reduction of CO 2 and offer an opportunity for a sustainable carbon cycle. As a value-added commodity, it also makes the CO 2 more valuable and creates revenues from CO 2 [3]. 3). Methanol is widely used in different sectors, e.g. formaldehyde and acetic acid production, a fuel for fuel cell and residential heating. 4). The storage and transport of methanol are much easier and less costly than hydrogen.
Although power-to-methanol (PtMeOH) is attractive in the sense of sustainability, a critical barrier for its advancement is the poor costcompetitiveness. Owing to the relatively low prices of natural gas, the costs of traditional methanol production are rather low where natural gas is reformed to syngas and further converted to methanol. This fuelbased methanol (fMeOH) has accounted for most of the methanol market and makes the adoption and market entry for electricity-based methanol (eMeOH) difficult. The typical price of fMeOH is 300-500€/ t [4] while it is 800€/t or more for eMeOH [5]. Consequently, reducing the cost of eMeOH production is a central pathway to facilitate the deployment of PtMeOH.
For a grid-connected PtMeOH system, electricity consumption cost is the major operational cost and thus should be minimized during operation. For example, this could be achieved by energy arbitrage in the day-ahead electricity market. The key problem here is how to make optimal operation with imperfect forecast and limited flexibility of such a system.
Recent studies have claimed that flexibility in chemical production can benefit its economies [6][7][8]. It is well-known that electrolysers hold high flexibility implying that the owners could reduce costs by strategically regulating their power consumption. In a PtMeOH scheme without very large hydrogen storage (which is quite expensive), however, the methanol production plant should also follow a flexible schedule. In this regard, little research has focused on the identification of flexibility from methanol synthesis. The only available knowledge is that methanol synthesis seemingly possesses certain flexibility in terms of partial load properties, ramping up/down and on/off properties. Lurgi AG, a German chemical company, has claimed that its steam raising methanol converters could be operated with a partial load as low as 10%-15% and a load change from zero to full within a few minutes [9]. Dieterich etc. reported a liquid phase methanol converter allowing variable feed flows (5% of nominal flow per minute) [10]. These data argue that PtMeOH has a certain flexibility.
While researchers have been focusing on the flexible operation of hydrogen production, little attention has been paid to the optimal operation of a PtMeOH system, nor to its modeling which enables the consideration of flexible operation and steady characterization. It is plausible that the ideas for hydrogen production are also applicable to PtMeOH systems. For example, Pengfei Xiao showed that an integrated grid-connected wind-hydrogen system is profitable given that the electrolysers can be flexibly operated and hydrogen price is relative high [11]. However, the cost of storing and transporting hydrogen is not involved, which may reduce the profitability. Replacing the hydrogen with methanol is a good alternative while few discussion is available now. The most attention paid to PtMeOH is the techno-economical assessment [12], system design [13] and simulation [14]. It is important to find out the potential of flexible operation of methanol production especially in the context of on-grid methanol production.
Another challenge for optimal scheduling lies in the uncertainties caused by electricity markets. Uncertainties have been widely investigated by stochastic optimization (SO) [15,16] and robust optimization (RO) [17] as well as methods derived from them, such as distributed robust optimization (DRO) and data-driven robust optimization (DDRO). SO relies on assumed probability distributions of uncertain parameters to generate scenarios or evaluate the probability of chance constraints [18]. As the true distribution is usually unknown, the SO models may have poor statistical performance. RO does not depend on a pre-specified probability distribution but an uncertain set, thereby offering more reliable results. The drawback of RO is its conservativeness in most cases and the data are not fully utilized because only the worstcase is considered. Both the DRO and DDRO are proposed to build a bridge between SO and RO and are based on the conception of ambiguity set. Their difference lies in how the ambiguity set is established. These two notations are not well distinguished sometimes. As suggested by [19], in this paper, the DRO is referred to as the ones using momentbased ambiguity sets (also called Chebyshev ambiguity sets) while DDRO is regarding the Φ-divergence or Wasserstein metric (WM) based ambiguity sets. In DRO, however, [20] argues that Chebyshev ambiguity sets remain conservative as they do not shrink to a singleton even though the number of samples approaches infinity. This problem can be alleviated by DDRO. Owing to its superiority, DDRO is widely used in unit commitment [21][22][23][24], reactive power optimization [25] and integrated energy systems [26,27]. This work focuses on modeling, analyzing and optimizing a PtMeOH system and handling the uncertainty by solving a WM-based worst-case expectation (WCE) problem. It contributes to the state-of-arts as follows:

Parameters
1. We build an optimization model for the operation of a typical on-grid PtMeOH system including a detailed electrolyser model, methanol converter model, energy flow balance and material flow balance. 2. We improve the performance of day-ahead scheduling by addressing the uncertainties. The uncertainty from price fluctuation is handled by optimizing worst-case expectations. 3. We investigate the influence of key parameters on the levelized cost of methanol to reveal the trade-off between cost-effectiveness and low carbon emissions.
The remainder of this paper is organized as follows: An overview of the concepts in power to methanol system is provided in Section 2. Section 3 introduces the modeling of key components. In Section 4, we consider a day-ahead optimal operation problem and reformulate it into tractable mixed-integer programming in the presence of uncertainty. Section 5 presents the main results and discusses the system economics. Section 6 concludes this work.

Concept overview
A PtMeOH system typically incorporates an electricity source, an electrolysis unit, a carbon source, a methanol synthesis plant and a distillation plant. In accordance with whether connected to a utility grid, PtMeOH systems could be classified into off-grid and grid-connected. Under the former configuration, the large volatility from renewable generation entails large hydrogen storage to ensure energy and material balance. The grid-connected configuration enables a PtMeOH system to purchase or sell electricity to a utility grid, reducing the need for large storage and bringing energy arbitrage opportunities in the electricity spot market.
CO 2 is a raw material for methanol synthesis. In fact, one of the contributions of PtMeOH is the reuse and valorisation of CO 2 . The sources of CO 2 are various, including biogas treatment plants, ammonia plants, air, flue gas, lime and cement furnaces, acid neutralisation plants, ethylene oxide plants, natural gas purification plants etc [28]. As per the composition, temperature and pressure of different CO 2 -containing gas streams, several technologies, such as absorption, adsorption and membrane are utilized to separate CO 2 . For example, in the process of biogas upgrade, CO 2 is separated from biogas that is mainly composed of CH 4 and CO 2 to get pure CH 4 . In the case of ammonia production, CO 2 is an intermediate product from steam reforming and subsequent water-gas shift reactor and it has to be removed to obtain hydrogen. Direct removal of CO 2 from air is another CO 2 source that bears the advantage of closing the carbon cycle [29]. However, it is still costintensive and thus not a good choice for the cost reduction of PtMeOH system. The costs of the CO 2 used for methanol production also varies by its sources and by the policies towards CO 2 reduction support.
Electrolysis is used to produce hydrogen. Three electrolysis technologies are available at present: alkaline electrolysis(AEL), proton exchange membrane electrolysis (PEM) and solid oxide electrolysis (SOEC). While SOEC is still at the R&D stage, the former two have been commercially available. AEL bears high technological readiness and relatively low costs while PEM can provide higher operating flexibility and current density. Despite that PEM shows better dynamic operating properties such as following fast load changes and a wider range of variable loads, we believe that the flexible operation of a PtMeOH system is limited by other components. Therefore, the less costly AEL may be preferable.
The produced hydrogen and CO 2 from the carbon source are further converted to methanol in a methanol synthesis plant (MSP). This process is highly exothermic implying that lower temperature is preferred for equilibrium conversion while the kinetics require high temperature.
Consequently, cooling is needed and recycling of unconverted feed gas is important to raise the conversion rate. The generated crude methanol has to go through a distillation plant to meet purity demands.

Problem description and system modeling
In this research, we particularly focus on a conceptional PtMeOH system based on a real-life implementation that is under construction in Denmark, as shown in Fig. 1a and 1b. This is a typical grid-connected system with renewable energy involved. The stable power supply from the utility grid enables continuous operation but the power is not necessarily green and such e-MeOH is actually accompanied by carbon emission. It is also noted that in this configuration the electrolysers and wind turbines share the same point of interconnection thereby reducing the grid interconnection charge, which is more favorable than the case with wind turbines and electrolysers located in different areas.
The CO 2 derived from a biogas upgrading plant is compressed to prespecified pressure and temporarily stored in a buffer tank. The required hydrogen is obtained from an alkaline electrolyser (AEL). Part of the hydrogen is directly compressed and supplied to methanol synthesis while another part is stored in a buffer tank. Finally, methanol is obtained in the synthesis reactor and treated in the distillation plant.
Since operation optimization is the main concern and the critical problem addressed by this work is how to model and schedule such a system to reduce operational costs considering the fluctuating prices, we inherit the sizing parameters from the real-life project. The key parameters are summarized in Table 1.

Wind turbine
The available data is the time series of wind speed in the area in Denmark. Therefore, we use the most classical wind turbine model for simulation and convert wind speed to power output. More details of the model can be found in [30]. The technical specifications of the adopted wind turbines are given in Table 2. One could also find these parameters in [31] but the hub height and rated speed are not available. Here we assume the hub height to be 50 m and estimate the rated speed based on the reported rated power.

Electrolysis
An AEL model is introduced to quantify the conversion from electricity to hydrogen. To link hydrogen production rate ṁ H2,Ele with power consumption of an electrolyser P Ele , the simplest way is to assume that they have a linear relationship. This assumption however is not able to capture the efficiency increase with relatively lower current. Therefore, in this work we choose a piece-wise linear function to describe the relation between ṁ H2,Ele and P Ele .
In addition, to enable the start-up and shutdown of an electrolyser, we define three operating states: production state, standby state and off state. Production state is referred to the normal working state where piece-wise linear relation between ṁ H2,Ele and P Ele applies. In the standby state, hydrogen is not produced while a small amount of power is consumed to maintain both the pressure and temperature of an electrolyser. The standby state is typically followed by a hot start soon. No power consumption is needed in the off state which generally implies a long-time shutdown where pressure and temperature would decline [32]. Both the mutual transitions among different states and the piecewise linear approximation are surmised in Fig. 2. In parallel, Table 3 provides detailed information of the utilized alkaline electrolyser.

Methanol synthesis
From the point of system operation, it is important to reveal the input and output of a methanol synthesis plant, i.e. how much methanol could be obtained given a specific H 2 and CO 2 input. A trivial estimation can be established based on the stoichiometry of the main reaction, but this may not be accurate due to the existence of side reactions and limitations from chemical equilibrium. In this work, we tackle the problem by considering a first-principle model for methanol production. Two main reactions are happening in the reactor: (1) Reaction (1) is the hydrogenation of CO 2 , the main process of producing methanol while the competing reaction (2) is the reversed water-gas shift reaction. The existence of reaction (2) means CO 2 cannot be completely converted to methanol. In this paper, we build a onedimension plug flow model (see Fig. 1d) d) to represent the methanol synthesis process, as follows:  Table 1 The key parameters of the studied PtMeOH system.
Eq. (3) describes mass conservation where ρ mix is the density of mixed gases, u is the fluid velocity and A is the sectional area of the reactor; Eq. (4) is the one-dimension representation of the Euler equation of ideal fluids with P denoting pressure. Energy conservation is included in Eq. (5) where h is the enthalpy of mixed gases. Note that we assume the reactor is adiabatic thus there is no heat loss. Eq. (6) describes the specie conservation where the first term is the mass transport incurred by flow and the second term is the mass change caused by chemical reactions.
Here, ṁ is the mass flow rate; Y i the mass fraction of component i; ρ cat the density of catalyst; MW i the molar mass of mixed gases; r ′ 1 , r ′ 2 the reaction rates: in which k (⋅) is the kinetic constant and P i is the partial pressure of each component. The calculation of k (⋅) could be found in [36]. It is worth noting that pressure is given in Pa and temperature in K. Two more equations are required to close the group of Eqs. (3)- (8). Based on the assumption of ideal gases, we have the ideal gas state equation and enthalpy of mixed ideal gas as: where h 0 i is the standard enthalpy of formation for component i and C p,i is the heat capacity at constant pressure. To verify the effectiveness of this model, we compare it with the results in Aspen Plus provided by [36], where the reactor design parameters and catalyst properties are also available. It is shown in Fig. 3 that our model could successfully describe the change of molar fraction for each component along the reactor.   The one-dimensional plug flow model is able to quantify the relationship between inlet and outlet flow of methanol synthesis reactor. However, due to the limitation of chemical equilibrium, the conversion rate of reactants is quite low thus a reaction-separation-recycle scheme is required to raise methanol production [37]. The reactor outlet flow consists of products as well as unreacted CO x and H 2 . The condensable products are then separated in flash vessel and the CO x and H 2 stream is recycled back to the reactor. Note that a small percentage (e.g. 1%) of the recycled stream is purged to avoid the accumulation of inert gas. The recycle stream is then mixed with fresh feed for further reaction.
For the recycle plug-flow model, the recycle ratio is an important parameter affecting reactor performance, which is defined as [38]: Recycle ratio = recycle flow rate fresh feed flow rate (11) Recycle ratio being zero means that no gas is recycled and a very large recycle ratio means nearly all reactor feed is recycled gas. Given a recycle ratio, we could obtain the final methanol production and reactant conversion rates. In this paper, the parameter of the methanol synthesis process is inherited from [39]. The catalytic reactor is operated at 210 • C and 76 bar, with the fresh feed flow having H 2 : CO 2 ratio as 3 : 1. However, the recycle ratio can not be directly found in [39] and we estimated the value according to the table B7 and B8 in the reference, which is calculated to be 4.115. In Table 4 we compare the results from our model with that from the literature. These results are adopted in the following optimization scheme to quantify the relation among hydrogen, carbon dioxide and methanol.
To identify the flexibility of such a methanol converter, a key problem is to compare the chemical reaction timescale and transport rate. Provided that the reaction rate is much faster than transport, then the change of inlet feedstock will be balanced very soon. Damköhler number is the dimensionless number to quantify the relationship between reaction rate and transport rate, which is defined as [40]: with L the length of the reactor. Da is calculated to be as high as 45 at the inlet of the reactor implying that the flexibility is not limited by the chemical kinetics of MSP. However, as other ancillary devices are not taken into account, we assume moderate ramping limits for MSP, i.e. ± 20%/h of its nominal production. In fact, there is currently no literature reporting the ramping limits of an MSP, this value is estimated as per work for ammonia reactors [41].

Optimization problem formulation
In this paper, it is assumed that the daily methanol demand is fixed and the key problem here is how to minimize operational costs by utilizing flexibility. The system operator participates in the day-ahead market, predicts spot prices and makes decisions based on the price forecast.

Objective function
The operator aims at minimizing daily operational costs: The time resolution Δt takes one hour and the time horizon is 24 h in this work. C u t is the cost of purchasing electricity, negative for the case of net export; C HS and C CS are the hot start cost and cold start cost for the electrolyser, respectively; y b t and z b t are binary variables indicating hot start and cold start. C CO2 is the cost of the consumed CO 2 , depending on the carbon source and related policies. It could be either negative, positive or zero under different scenarios.
To be specific, C u t could be expressed as: with P u t denoting the power purchased from utility grid; π DA t the predicted day-ahead spot prices; Δπ DA t the forecast error, which is regarded as an uncertain parameter. The decision variables for such a problem are collectively written as: and will be explained afterwards.

Constraints
Problem (13) is subject to a series of constraints, mainly energy and material balance, process flexibility limitations and physical properties of each component. The active power balance is involved as follows: where P w t is the wind power. P Con t , P MSP t , P Dis t are the power consumption of the converter, methanol synthesis plant and distillation plant respectively. Note that according to [42], P Dis t is a fixed value regardless of methanol mass flow rate. P c1 t , P c2 t , P c3 t are the power consumed by the corresponding compressors in Fig. 1. The purchased power P u t is restricted by the transmission capacity of the line linking the PtMeOH system and external grid, which is written as: P Con t depends on the electrolyser power and they are linked as follows: with η Con the efficiency of the converter, a fixed value for simplicity; P Ele t the power consumption of the AEL.
By assuming an adiabatic process in the compressors, we have the relation between power consumption and mass flow rate of the compressors as: where α 1 , α 2 , α 3 are constant values. ṁ (⋅) t is the mass flow rate related to each compressor. The electrolyser is a key component in a PtMeOH system and thus requires a more detailed model. Hereafter we refer to p b t , s b t ,i b t , three binary variables, as indicators of production, standby and off state; I t as the current and ṁ H2,Ele t as the mass flow rate of the generated hydrogen. We first introduce a piece-wise representation of the relation between P Ele t and I t : where P Ele b and I b are the power and current at breakpoint b. w b,t is the weight of breakpoint b at time t, belonging to a special ordered set of type two (SOS 2 ), where at most two adjacent components are non-zero. SOS 2 could be expressed by introducing binary variables. It should be noted that Eq. (22) and Eq. (23) are only applicable when the electrolyser is operated at production state and should be relaxed in stand-by or off state, as could be achieved by adding slack variables s P t and s I t : Eq. (28) enables s (⋅) t to be sufficiently large or small with either s b t or i b t taken one, where M is a sufficiently large constant. Other constraints for the electrolyser are surmised as follows: Eq. (29) indicates that the three states are not compatible. The electrolyser current is limited to feasible range under production state, which is included in Eq. (30). The next equation enforces that electrolyser power is zero in off state, and is P s in standby state and limited to an upper bound in production state. Eq. (32) grantees the transition from off state to standby state is not permitted. Constraint (33) is the Faraday's law linking hydrogen production with electrolyser current. The last two equations are the definitions of hot start (indicated by y b t ) and cold start (z b t ), which could be reformulated into linear forms with extra binaries.
For the hydrogen storage and CO 2 buffer tank, we have: For simplicity, it is assumed that material loss during storage is quite small and could be neglected. Here, M H2 and M CO2 are the tank sizes. Eq. (36)-(37) quantify the SOC change of the tanks while Eq. (40) and (41) are related to continuous operation. The next group of equations describe the power consumption of MSP and related material balance.

Data-driven robust optimization
To fully understand the structure of the above-mentioned optimization problem, the primal mixed integer programming is surmised in a compact form as follows: Here without loss of generality, the equality constraints are replaced by their inequality representations and we consider a maximization problem, i.e. maximizing profits. The symbol . indicates the vector involving an uncertain parameter. To highlight the uncertainty, the whole optimization problem is hereby rewritten by separating the deterministic and uncertain parameters, and for notational convenience The fluctuating electricity prices (to be precise, price prediction errors) occur in the objective function, highly influencing the strategy of exchanging power with utility grid. This paper handles it by considering a worst-case expectation problem (WCP) that builds upon Wasserstein metrics. Let Q be the possible probability distribution of ζ and P N the observed empirical distribution. Problem (50) is considered as: This reformulation argues that the decision variable x is chosen to maximize the expectation of objective function with respect to all possible distributions Q of parameter ζ, in which Q falls in a Wasserstein ball centered at P N with radius ∊. The inner minimization problem is equivalent to: In the spirit of the paper [43], we demonstrate that this problem could be represented as a mixed integer programming. Detailed derivation is provided in Appendix A. The final form of problem (50) becomes: where N is the number of observed data and [N] = {1, 2, ⋯, N}; λ, v l , s l , θ 1 , θ 2 are dual variables obtained during the theoretical derivation.
Optimization problem (53) is finally a tractable reformulation of primal problem (49), and could be solved using the off-the -shelf MILP solvers, such as Gurobi and Cplex.

System parameters
In addition to the parameters mentioned in Section 2, some fixed parameters that are used in the optimization are provided in Table 5. The hourly electricity price is also required for the day-ahead scheduling decision. Herein, we use a multi-layer perception neural network (MLPNN) to complete this task. Since this is not the main focus of this paper, the details may not be discussed here. On the whole, the MLPNN model takes the previous seven-day historical hourly data as the input (a vector in R 168 ) and the hourly data for the next day as output (a vector in R 24 ) to perform a time series forecast. The model is first trained using a dataset generated from recent historical data and then utilized for prediction. The training dataset changes with a moving time window to get better performance. The structure of MLPNN and prediction process are briefly described in Fig. 4a. The predicted electricity prices as known parameters are presented in Fig. 4b, where the observed data is also presented.

Base scenarios
In this paper, we try to mimic the operator of such a PtMeOH system who aims to minimize operational costs by optimally bidding in the dayahead market and scheduling the owned components. The available data for the operator are only electricity prices predicted by the MLPNN, upon which the day-ahead decisions have to be made. To show the influence of the forecast errors, the case based on a perfect forecast is also provided. Fig. 5a presents the predicted prices along with the real observed prices (or perfect forecast) for a specific day. It is clearly shown that the prediction is rather precise during the first half-day, which is especially true in the sense of price fluctuating trends. After 13:00, however, the predicted prices exhibit a second price peak at 20:00, which does not exist in a reality where the prices hold very low in the remained halfday. It is found that such forecast errors are not very common when we span the results during the whole year while the comparatively large errors on this specific day do highlight the uncertainty of the forecast.  The imperfect prediction leads to non-optimal scheduling strategies, as is shown in Fig. 5b where the trajectory of the electrolyser is presented.
In the case of perfect prediction, to avoid the single price peak, the electrolyser works in partial load from 7:00 to 12:00 and gradually increases its production as the prices go down. With imperfect prediction, the electrolyser strategically reduces its load during the two price peaks. The resulted scheduling strategy is of course non-optimal since the second price peak would not happen. The operational cost is found to be 75€ given the perfect strategy while it goes up to 319€ when adopting the strategy based on an imperfect forecast. In parallel, it is worth noting that although the electrolyser has unlimited ramping rates, the step-wise adjustment is a result of less flexible methanol synthesis. We then look into more details regarding the buffer tanks for hydrogen and CO 2 as well as the methanol synthesis plant in the case of imperfect prediction. The buffer tanks offer the possibility to decouple the production of raw material (H 2 and CO 2 ) and production of methanol, enabling the maximal utilization of components with distinct flexibility. It is observed from Fig. 6a that the hydrogen tank is charged before the two price peaks and discharged when the prices are high. CO 2 tank exhibits an opposite trend of total hydrogen production. When the electrolyser works at a high load, it is discharged and when the electrolyser produces less hydrogen, it is charged. It is worth noting that every time the tanks reach lower or upper bound, it means that the raw material provision and methanol production are strongly coupled. It could be inferred that sufficiently large buffer tanks would completely decouple these components. However, it is rather costly. This trade-off should be discussed at the planning stage while it is outside the scope of this paper.

WCE scenario
Here we show how the decisions based on imperfect prediction could be improved by considering uncertainties. Fig. 7 provides the daily costs given different Wasserstein ball radii and the number of historical samples. Notably, the radius is plotted on a logarithmic coordinate. The first observation is that the daily operational cost has reduced to nearly 171€ with best-tuned parameters, higher than the perfect result but moderately lower than that based on imperfect prediction, i.e. 319€. The reduction of costs is achieved by obtaining more knowledge of the MLPNN model from its historical performance.
Further insights into the proposed DDRO approach are provided by examining the influence of DDRO-related parameters. While insufficient samples result in higher daily costs, a large enough sample set has better  performance. Nevertheless, the increase of the performance halts when the number of samples reaches a certain level, e.g. nearly 200 in this figure. Such trends have been observed in a range of research on DDRO and imply that given limited computational resources, a larger sample set is not always better [24,23]. The radius ∊ also imposes great effects on the performance. It is illustrated that within a certain range, a larger radius is beneficial to approach the real distribution. For instance given a specific sample number, the optimum cost declines as the radius rises. Too large radius (more than 100), however, appears to be less favorable and leads to a considerably higher cost. This could be caused by the involvement of pathological distributions in the Wasserstein ball [43]. On the whole, it is suggested the radius should be chosen around 10 and the sample size less than 300.

Comparison with SO and RO
To further evaluate the effectiveness of the DDRO approach, we compare it with SO and RO. SO is built as a scenario-based two-stage stochastic programming, where the scenarios are generated from the same historical data set. RO builds on the assumption that the price prediction errors fall in a polyhedron uncertainty set. Table 6 gives the minimal daily costs during ten consecutive days obtained from different approaches. For reference, the costs based on perfect and imperfect forecasts (indicated as PF and IPF) are also available in this table. It is found that: 1. The MLPNN model offers a precise prediction for such day-ahead scheduling decisions. In most cases, the resulting minimal costs of PF and IPF are very close, and the IPF are only hundreds of euros higher than that for PF or even less. 2. In most cases, DDRO results in fewer costs than SO, RO and IPF, and the overall costs of DDRO are also the fewest. However, it is not always true. For example, SO outperforms DDRO on the fifth day; RO gives better results on the second day. Therefore, to be precise, DDRO is superior to SO and RO concerning the expectation of minimal costs. 3. RO frequently holds the worst performance, as the price of robustness.
In the main, DDRO results in the optimal costs that are closest to the best results and on average outperform imperfect forecast. Notably, the accurate prediction eclipses the methods handling uncertainty. Decisions based on imperfect prediction is also acceptable. One reason is that the decisions are mainly made based on the trend of price trajectory rather than the specific value, which reduces the need for high accuracy prediction. Nevertheless, the proposed DDRO method further reduces costs and brings benefits to the system. The operational costs over the ten days are reduced by 4.5% using the DDRO methods compared to imperfect prediction. It is also expected that the DDRO method would be more significant provided that the prediction is less accurate or other uncertainties are also handled.

Discussion on economics
In this section, we investigate the production cost of methanol and examine the influence from critical system parameters.
The levelized cost of methanol (LCOMe) is defined as [2]: Since the wind turbines are considered to be owned by the same system operator, the annuity calculation should also contain the CAPEX of wind turbines. The operational costs are comprised of those from trading electricity in spot market (which can be both positive and negative), CO 2 sources and maintenance of components. By investigating the everyday operation in 2016, we obtained the yearly electricity costs as the sum of daily electricity costs. The less important operational costs are not accounted in this work, such as the cost of water, employees, land, building etc. Potential revenues from selling the by-products of electrolysis, i.e oxygen or heat, are also not involved. One could refer to [2] where the credit for O 2 is assumed to be 50€/t. To sum up, the financial parameters are presented in Table 7. Fig. 8 illustrates the LCOMe given a different number of wind turbines and CO 2 prices. It is shown that with more wind turbines employed, methanol becomes more costly. For more wind turbines, although operational costs fall as less electricity is required from the day-ahead market, the annuity grows because of the increasing CAPEX. The former reduces LCOMe while the latter raises it. As the results suggested, the increase of CAPEX dominates and more wind turbines are not preferable for better economics. The CO 2 prices, from another aspect, exert influence on the LCOMe. Higher CO 2 prices significantly make methanol more costly. These results show that the methanol would not be cost-competitive with fossil-based methanol, which is   Although more wind turbines are not favorable in terms of economics, they contribute a lot to CO 2 avoidance. The electricity from wind turbines, as shown in Table 2, is associated with CO 2 footprint being 7.3 g/kWh. The utility electricity, however, is accompanied by a much higher CO 2 footprint, as is on average 200 g/kWh [50]. The methanol becomes greener using more renewable energy. The annual CO 2 emission for every ton of produced methanol is shown in Fig. 9, as a function of the number of wind turbines. These discussions indicate that there is a trade-off between economics and sustainability in such an e-MeOH production system.

Conclusion
PtMeOH is believed to be a promising technology for deep decar-bonization. This paper investigates the optimal operation of a typical grid-connected PtMeOH system. For such configuration, a key problem is the reduction of operational costs, which are mostly derived from electricity costs. To this end, we build up a first-principle model to quantify the methanol production given specific composition of inlet feedstocks, i.e. H 2 and CO 2 . We further validate the model using data from the literature, which shows that our model is reliable.
Based on the model we then consider the participation of the system in the day-ahead electricity market. The spot prices are firstly predicted using an MLPNN model. Furthermore, an optimal scheduling problem is built, solved and the solution is further improved by handling the uncertainty from price prediction via DDRO. The results show that the proposed model could capture the behaviours of such a system and the day-ahead scheduling works well in cost reduction. The introduced DDRO method is demonstrated to be able to further improve the performance of the day-ahead scheduling optimization. Comparing DDRO with SO and RO also indicates the superiority of DDRO.
Considering the operational strategies, we then discussed the production cost of methanol. We find that the LCOMe varies from 584€/t to 1146€/t given different CO 2 prices and the number of wind turbines. This suggests that the produced eMeOH is not yet cost-efficient. It is also observed that more renewable electricity leads to higher LCOMe but fewer CO 2 emissions. The operators need to make a trade-off between economics and sustainability.
For futher work, from the perspective of operational research, more uncertainties, e.g. wind power output, could be considered by using the proposed DDRO method. It is also possible to simplify and discretize the differential equations describing the methanol production process for a more flexible operation. On the other hand, the optimal design of such a PtMeOH system could be of great value since methanol is gaining more and more attention. Research on the methanol or hydrogen supply chains would also be of interest where the benefits from methanol transportation and storage would be valued.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Reformulation of data-driven robust worst expectation problem
The primal problem is repeated as: If the inner optimization could be converted to a maximization problem, the whole problem would be simplified. To do this, we focus on the inner minimization problem. The Wasserstein distance between Q and empirical distribution P n in discrete form is: Here, Π is a joint distribution of Q and P N . The Wasserstein distance is the minimum cost of shifting distribution P N and Q by finding the best possible bi-variate distribution. The constraints mean that the two marginal distributions of Π are exactly Q and P N . Since P N is the empirical distribution which has same probability on all N samples, the Wasserstein distance is reformulated as: In Eq. (55), we have d w (Q,P N )⩽∊. The minimum value of a function being less than a constant is equivalent to the fact that there exists a point on this function that is less than the constant. Hence, the minimization symbol could be simply dropped and by replacing Q with the second constraint in Eq. (57), we rewrite the inner problem (55) as: with s l and λ denoting dual variables. There are infinite decision variables and limited constraints in this problem. To cope with this, we have its dual problem as: The constraint is transformed to: The norm in this formulation could be replaced by its equivalent dual form according to: Hence This is equivalent to: , ∃‖v l ‖ * ⩽λ (63) Problem (59) could be re-expressed as: Still, there is a sub-maximization problem in the constraint, which could in fact be expressed as a minimization problem via its dual problem. Assuming ζ falls in a polyhedral, we have the sub-problem as: s.t. ζ⪯ζ : θ 1 ζ⪯ζ : θ 2 The dual counterpart of this linear programming is: Again, we can move θ 1 and θ 2 to variables of outer layer to remove the minimization problem. Optimization (64) becomes: max 0λ⩾0, ‖v l ‖ * ⩽λ, s l , ‖⋅‖ * is referred to ‖⋅‖ ∞ so that this optimization can be solved by linear programming. To elaborate this, problem (67) is rewritten as: max λ⩾0,vl ,sl,θ1,θ2⪰0