The effect of price-based demand response on carbon emissions in European electricity markets: The importance of adequate carbon prices

Price-based demand response (PBDR) has recently been attributed great economic but also environmental potential. However, the determination of its short-term effects on carbon emissions requires the knowledge of marginal emission factors (MEFs), which, compared to grid mix emission factors (XEFs), are cumbersome to calculate due to the complex characteristics of national electricity markets. This study, therefore, proposes two merit order-based methods to approximate hourly MEFs and applies them to readily available datasets from 20 European countries for the years 2017-2019. Based on the resulting electricity prices, MEFs, and XEFs, standardized daily load shifts were simulated to quantify their effects on marginal costs and carbon emissions. Finally, by repeating the load shift simulations for different carbon price levels, the impact of the carbon price on the resulting carbon emissions was analyzed. Interestingly, the simulated price-based load shifts led to increases in operational carbon emissions for 8 of the 20 countries and to an average increase of 2.1% across all 20 countries. Switching from price-based to MEF-based load shifts changed the effect on carbon emissions to a decrease of 35%, albeit with 56% lower monetary cost savings compared to the price-based load shifts. Under specific circumstances, PBDR leads to an increase in carbon emissions, mainly due to the economic advantage that fuel sources such as lignite and coal have in the merit order. However, as the price of carbon is increased, the correlation between the carbon intensity and the marginal cost of the fuels substantially increases. Therefore, with adequate carbon prices, PBDR can be an effective tool for both economic and environmental improvement.


Motivation
High penetrations of variable renewable energy sources (vRES) in national electricity grids are essential to achieve the global aims of the Paris Agreement. Price-based demand response (PBDR) is a promising approach, especially in smart grids, to provide the operational flexibility that is needed for the integration of vRES. Due to the characteristics of the electricity market, available approaches used for carbon emissions accounting lead to misleading results when applied in the context of demand response (DR). Specifically, they overestimate the environmental potential of PBDR by ignoring the nature of the electricity markets and their special phenomenon called the merit order dilemma of emissions. This phenomenon refers to the fact that, for electricity generation, certain emission-intensive fuel types are, due to their low marginal costs, preferred to relatively lower emission-intensive technologies. At the time of writing, for example, highly efficient combined-cycle gas power plants are, according to the merit-order dispatch principle, only considered after more carbon-intensive lignite-fired power plants. Adequate approaches based on the idea of marginal emissions (MEs), however, require detailed data, which are usually not available. As a consequence, as long as the merit order of an energy market does not correlate with the carbon emission factors, PBDR cannot exploit the full carbon reduction potential of load shifting. Increasing the carbon price to an appropriate level is crucial to exploit this potential by internalizing the external cost of climate change and creating financial incentives for sustainable development [1].

* Corresponding author: MichaelD.Murphy@cit.ie

Background
Figure 1. Price-based demand response may increase MEF-based emissions, even if XEF-based emissions decrease. Exemplary case where price-based or XEF-based load shifting leads to increased total emissions: We assume two time points t_1 and t_2. t_2 differs from t_1 only by a decreased wind supply (1). Since the total load is identical in both time points, the wind decrease leads to a higher residual load in t_2 and therefore to a shift of the operating point in the merit order curve from the area of coal-fired power plants to the area of gas-fired power plants (2). The different marginal prices of the two marginal power plant types lead to a spot market price spread between t_1 and t_2 (3). This price spread incentivizes a load shift of 1 kWh from t_2 to t_1 (4). However, the shifted load causes an additional production of 1 kWh at a carbon-intensive coal-fired power plant in t_1, whereas in the case without load shift, the load causes an additional production of 1 kWh at a less carbon-intensive gas-fired power plant (5). Since the marginal emissions of the coal power plant (MEF(t_1)) are higher than the marginal emissions of the gas power plant (MEF(t_2)), the load shift causes increased MEF-based emissions (6). In contrast, XEF(t_1) is lower than XEF(t_2) since in t_2, the reduced wind power is compensated by coal and gas. If effects on emissions are evaluated using average electricity mix emission factors (XEFs), a load shift from t_2 to t_1 suggests a decrease in emissions, since the XEF at t_1 is lower than at t_2 (7).

The integration of additional vRES requires an increase in the operational flexibility of the electricity grids to compensate for the increasing fluctuation of the residual load, which is defined as the total load minus the production from vRES. Operational flexibility in this context means that it is not based on structural changes in the electricity system such as power station commissioning / decommissioning or fuel price changes [2].
This flexibility can be provided by flexible generation, interconnection, energy storage, and demand-side resources [3]. PBDR is seen as an essential and promising approach for unlocking the flexibility potential of demand-side resources in a cost-efficient way [4]. Thus, PBDR has been the subject of many recent studies across the residential, commercial, and industrial sectors, as shown in the review articles [5][6][7][8]. Also, PBDR can be combined with incentive-based DR as demonstrated in [9].
For the quantification of electricity-related carbon emissions, the temporal granularity of carbon emission factors (CEFs) is important. Annual average CEFs, which are still commonly used, lead to inaccurate results because of the high variance of emission factors of different fuel types. This holds all the more for the evaluation of emissions due to load changes such as those caused by PBDR. Several studies, therefore, suggest the use of time-varying CEFs instead of yearly average CEFs for short- and long-term decision making [10][11][12].
Dynamic XEFs are useful for calculating carbon emission balances of energy consumers. Since it is usually not possible to trace electricity from a specific producer to a consumer, the average carbon intensity of the entire generation system is attributed to each customer [13]. However, if XEFs are used to assess or reduce the real effect of DR on carbon emissions, the results may be misleading [14,15]. The reason for this is that not all power plants react proportionally to a change in demand [14]. In theory, the electricity requested will come from the power plant with the lowest marginal cost and spare capacity, the so-called marginal power plant [16]. Dynamic marginal (power plant) emission factors (MEFs) estimate the carbon intensity of demand changes as the carbon intensity of the marginal power plant for each time step [14]. This is why, if the real impact of DR on operational carbon emissions is to be determined or even minimized, MEFs should be used where possible. However, the calculation of hourly MEFs requires a very detailed database, which is why hourly MEFs are not available for most areas [15]. As a consequence, and despite a growing body of literature that recognizes the necessity of MEFs for assessing the environmental effects of PBDR [11,14,15,17-22], XEFs are still used in this context.
Through PBDR, price incentives and potential savings that arise in an energy spot market by varying supply and demand of electricity are passed on to the energy consumer. However, energy spot markets lead, under perfect competition, to a cost-minimizing dispatch, which, depending on the correlation between prices and emissions along the merit order, may lead to a suboptimal dispatch in an environmental sense. This phenomenon is known as the merit order emission dilemma, as illustrated in [17].
The fluctuating feed-in of vRES significantly affects the carbon emissions of the electricity supplied to consumers. In times of high vRES shares, the carbon emissions per produced unit of electricity are usually lower than in times of lower vRES shares since fewer fossil-based power plants per produced unit of electricity are in operation. This phenomenon leads to the hypothesis that a shift of energy consumption from an hour with a low vRES share to an hour with a high vRES share leads to a decrease in carbon emissions. In fact, a calculation according to XEFs, which describe the current generation mix of the electricity system, supports this hypothesis in many cases, e.g., [12,23,24]. However, price-based or XEF-based load shifting might lead to increased emissions, as illustrated in Figure 1.
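The case of Figure 1 can be made concrete with a small numerical sketch. All prices and emission factors below are illustrative assumptions chosen to mimic the situation described (coal at the margin in t1, gas at the margin in t2); they are not results of this study.

```python
# Hypothetical values for two time points; purely illustrative.
price = {"t1": 30.0, "t2": 45.0}   # spot price in EUR/MWh
xef = {"t1": 350.0, "t2": 450.0}   # grid mix emission factor in g CO2/kWh
mef = {"t1": 900.0, "t2": 400.0}   # marginal emission factor in g CO2/kWh


def shift_effect(factor, source, sink, energy_kwh=1.0):
    """Change of a quantity when energy_kwh is moved from source to sink."""
    return (factor[sink] - factor[source]) * energy_kwh


# Shift 1 kWh from the expensive time point t2 to the cheap time point t1.
delta_cost_eur = shift_effect(price, "t2", "t1") / 1000.0  # EUR/MWh -> EUR/kWh
delta_xef_g = shift_effect(xef, "t2", "t1")
delta_mef_g = shift_effect(mef, "t2", "t1")

print(f"cost change:           {delta_cost_eur:+.3f} EUR")
print(f"XEF-based CO2 change:  {delta_xef_g:+.0f} g")
print(f"MEF-based CO2 change:  {delta_mef_g:+.0f} g")
```

With these assumed numbers, the shift saves money and appears to save emissions under XEF accounting, while the MEF-based evaluation shows an increase.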
One option to minimize carbon emissions through load shifting is the usage of MEFs instead of prices as a primary DR incentive signal, as Leerbeck et al. [25] suggest. Real-time MEFs with resolutions between 5 and 15 minutes are already commercially available via application programming interfaces from companies such as WattTime [26] or Tomorrow [27]. However, since prices and emissions along the merit order are not fully correlated, cost minimization and emission minimization are conflicting. As a second option, this cost-emission conflict can be resolved by internalizing the external costs of climate change by placing an adequate price on carbon emissions. Both options will be analyzed in this study.
In summary, fuel type specific, temporally resolved national generation data (e.g., from the ENTSO-E Transparency Platform (ETP) [28]) allows for the calculation of temporally resolved XEFs which are necessary for carbon emissions accounting of electric consumers. However, for the environmental evaluation of PBDR activities like load shifting, dynamic MEFs are needed, since they reflect the effects of system changes, which XEFs do not. Due to the lack of power plant specific electricity generation data, the purely empirical identification of the marginal power plant is not straightforward [29]. As a consequence, dynamic historic MEFs are not available free of charge for most European countries.

Literature review
The literature classifies electricity grid CEFs along essentially two dimensions. The methodology dimension classifies the methods behind the CEFs into Empirical Data & Relationship Models (EDRM) such as regression analyses and Power System Optimization Models (PSOM) such as economic dispatch models [30]. The other dimension divides the CEFs according to their CEF-type into XEFs and MEFs. XEFs describe the current generation mix of the electricity system and account for renewable energy sources (RES) shares, while MEFs quantify marginal system effects [17].
The methodology behind XEFs is an attributional approach where all emissions of the electricity grid in a particular time frame are shared across all electricity consumers in proportion to their demand [11]. This prevents double counting of emissions, which is why the approach is often used in the context of carbon accounting [32]. The XEFs are determined by weighting the power plant type specific carbon emissions with the respective share of total electricity consumption during each hour. Therefore, they reflect the current state of the electricity mix, e.g., the share of renewable and conventional electricity production [17].
The marginal power plant method to calculate MEFs focuses on the element of the power plant mix that will actually be affected by changes on the demand side and reflects the consequences on carbon emissions of the electricity system [41]. In uniform-pricing markets, power plant owners are incentivized to bid at their individual marginal cost. The merit order sorts all power plants according to their marginal costs. A load reduction or increase within a specific hour is compensated by a corresponding change in the power output of the marginal power plant. The MEF is therefore equal to the specific emission factor of the marginal power plant.
Ryan et al. [30] provide a useful guideline for selecting the most appropriate method. For a review of carbon emission accounting approaches in electricity generation systems, the reader is referred to the work of Khan [22].
Within the literature, different approaches for the determination of CEFs can be found. Table 1 shows an overview of studies clustering them by type, methodology, temporal and geographic scope, and temporal resolution into three groups.
Studies in Group A calculated historic, hourly or half-hourly XEFs with EDRMs for different countries. Most of these studies [10,12,23,24,31-35] focus on calculating simple emission factors for either individual or a few countries by weighting the historic generation data with the corresponding carbon emission factors per fuel type.
Tranberg et al. [18] proposed a more sophisticated approach. Using flow tracing, which allows the tracking of power flows on the transmission network from the region of generation to the region of consumption, they calculated consumption-based XEFs for 27 European countries. The results show significant deviations between the consumption-based and the generation-based XEFs and suggest including cross-border flows in the carbon emission accounting of electricity. This method is demonstrated in the electricityMap [42], which is a real-time visualization of the carbon emission footprint of electricity consumption. However, they do not focus on MEFs as we do.
For Group B, Hawkes [14] set the foundation in 2010. Using half-hourly data from the UK for 2002-2009, he calculated the first difference of system carbon emissions and system load, respectively, and determined the average MEF by the slope of the regression line of the two difference vectors. Compared to purely merit-order based methods, this has the advantage that it implicitly takes into account the trading decisions of the players in the market, the logistical constraints of power plant operation, and transmission and distribution restrictions. However, the nature of statistical relationship models restricts these methods in the temporal resolution of the resulting MEFs, which is significant for assessing DR measures. The regression can be performed repeatedly on subsets of the data to obtain, e.g., time-of-day or time-of-year MEFs, as also shown by subsequent studies that applied Hawkes' approach to different geographic regions [15,20,37,38], but this still neglects important information such as the fluctuating RES share. Through the comparison of XEFs and MEFs, Siler-Evans et al. [15] found that XEFs may misestimate the emissions that can be avoided by a demand-side intervention.
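The first-difference regression described above can be sketched as follows. This is a minimal illustration, assuming an ordinary least squares fit with intercept; the function name and the synthetic data are hypothetical and not taken from [14].

```python
def average_mef(load_mwh, emissions_t):
    """Slope of the least-squares line through (delta load, delta emissions)."""
    # First differences of consecutive time steps.
    d_load = [b - a for a, b in zip(load_mwh, load_mwh[1:])]
    d_emis = [b - a for a, b in zip(emissions_t, emissions_t[1:])]
    n = len(d_load)
    mean_x = sum(d_load) / n
    mean_y = sum(d_emis) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(d_load, d_emis))
    sxx = sum((x - mean_x) ** 2 for x in d_load)
    return sxy / sxx  # t CO2 per MWh


# Synthetic system in which every extra MWh causes 0.5 t CO2 (assumed).
load = [100.0, 120.0, 110.0, 150.0, 140.0]
emissions = [0.5 * x + 10.0 for x in load]
print(average_mef(load, emissions))
```

Because the result is a single slope per data subset, the sketch also shows why the temporal resolution of regression-based MEFs is limited by how finely the data can be partitioned.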
Pean et al. [21] extended Hawkes' approach by clustering the data of Spain in the year 2016 according to system load and RES share and then performing a linear regression on every cluster. By fitting a quadratic function of RES share and system load to the results of the regression, they were able to compute hourly MEFs. Although this approach seems promising for our aim of comparing the environmental effect of DR for different countries, it involves constant emission factors per fuel type, which disregards the efficiency differences of power plants within a fuel type. These differences are of high significance for national energy systems that are based on only one or two fuel types, as, e.g., in Lithuania or Serbia.
Greensfelder et al. [36] studied the relationship between load and emissions for the four US regions IL, NY, TX, and CA. The availability of power plant-specific emission and load time series allowed for the utilization of the flexibility weighted hourly average emissions rate method, which determines the power plants that are most likely to be operating at the margin. However, due to the lack of comparable data, this method is not applicable to Europe.
Finally, studies in Group C computed historic or future MEFs in hourly or higher resolution using PSOMs, more precisely Economic Dispatch Models. Bettle et al. [39] used historic generation data per power plant of the UK for the year 2000 to calculate half-hourly MEFs that indicated up to 50% higher carbon emission savings than the XEFs. However, this method is not applicable to most of the European countries where power plant specific generation data are not readily available. Furthermore, they used fixed power plant efficiencies per fuel type. Regett et al. [17] computed future, hourly XEFs and MEFs for Germany and found that they are negatively correlated with each other, meaning that they can lead to opposing results, which highlights the importance of the choice of CEF type. Based on [17], Böing & Regett [40] proposed an emission accounting method that determined dynamic XEFs and MEFs for different energy carriers in multi-energy systems. However, their focus lay on the future energy system of an individual country and not on the comparison of multiple countries based on historic data. Baumgärtner et al. [11] ran an economic dispatch model on German data of the year 2016 that resulted in hourly XEFs and MEFs, which fed into a subsequent multi-objective synthesis problem of a low-carbon utility system. They used power plant specific efficiencies that were approximated by a logarithmic size-dependent regression of real power plants. However, the power plant sizes are not available for all countries and efficiencies also depend on the year of construction.
In summary, the literature contains multiple studies that calculate CEFs for different countries and years with most studies focusing on XEFs (see Group A of Table 1). Only a few studies calculate MEFs with hourly resolution and none of them compares hourly MEFs between different countries.

Contribution
Within this work, we quantitatively compare the environmental effects of PBDR between 20 European countries. Since MEFs are currently not readily available and XEFs are not appropriate for measuring the effects of PBDR activities, we propose two approximating methods (Power plant (PP) and PieceWise Linear (PWL)) to calculate time-dependent CEFs with readily available data for European countries.
The aims of this paper are:
• Propose and validate a method that approximates MEFs and XEFs from readily available datasets.
• Apply the method to 20 European countries for the years 2017-2019 to compare resulting CEFs.
• Simulate load shifting based on prices, XEFs, and MEFs to assess its impact on carbon emissions.
• Evaluate the impact of carbon prices on the marginal cost-emission correlation, the merit order, and on the effects of load shifting.
The resulting hourly CEFs, marginal costs, and marginal fuel types for 2017-2019 can be downloaded as CSV files from a GitHub repository (see footnote 1).

Paper organization
The remainder of this paper is structured as follows. First, all datasets used for the calculation of the CEFs are described in Section 2. Then, the methods for the CEF calculation, PP and PWL, are detailed in Section 3. In Section 4, the PP method is applied to the power plant resolved dataset for Germany in 2019 to evaluate the approximation error of the PWL method. In Section 5, the PWL method is applied to the data of 20 European countries. First, the resulting merit order and CEFs are described in Section 5.1. In Section 5.2, cost-emission correlations are analyzed. In Section 5.3 and Section 5.4, load shifts based on prices, MEFs, and XEFs are simulated and the results discussed. In Section 5.5, the phenomenon of increasing carbon emissions as an effect of PBDR is explained using the results of six exemplary countries. In Section 5.6, a sensitivity analysis of carbon emissions on the merit order is conducted that demonstrates the impact of the carbon price on the merit order and the mitigation of the merit order dilemma of emissions. Section 5.7 summarizes the results and discussions, and in Section 6, we conclude the findings of this paper.

1 https://github.com/mfleschutz/marginal-emission-factors

Data sources for this study
This section describes the datasets that are used in this study grouped by data source.

ENTSO-E Transparency Platform (ETP)

2.1.1. Net generation
For the calculation of the temporally resolved residual and total load of a national electricity system, historic electricity generation time series data ("Aggregated Generation per Type") for each fuel type f and country c from the ETP [28] were used and accessed via the ENTSOE-py client [43]. For the years 2017-2019 and the 20 countries used in this paper, 10.2 million data points were processed. Across all European countries, 20 different fuel types are used for electricity generation. The temporal resolution of the raw data varies depending on the country (15, 30, and 60 minute intervals). A comprehensive review of the data composition of the ETP can be found in [44]. Within the data preprocessing for the simulation, all time series were downsampled to hourly resolution. Missing values and outliers were replaced by the last available data points. The percentage of missing values that were filled was 1.9% for 2017, 1.3% for 2018, and 0.4% for 2019. However, 67% of the missing values appear in only two fuel types ('Fossil Hard coal' and 'Other') for the Netherlands for the years 2017 and 2018. Nine outliers were detected by a combination of Z-score analysis and treated as missing values, see Supplementary Material D. To save space, the following renaming was conducted: Biomass → biomass, Fossil Brown coal/Lignite → lignite, Fossil Oil → oil, Fossil Gas → gas, Fossil Hard coal → coal, Hydro Run-of-river and poundage → hydro, Nuclear → nuclear. Figure 2a shows the share of annual generation per fuel type and country in relation to the total generation for 2019 after preprocessing. Supplementary Material A and B contain basic analyses of available generation data from the ETP [28] for Europe and Germany, respectively.
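The two preprocessing steps described above (filling gaps with the last available value and downsampling to hourly resolution) can be sketched as follows. This is a library-free illustration under stated assumptions: gaps are filled before averaging, and hourly values are the mean of the sub-hourly power values. Neither the order of operations nor the aggregation function is specified in the text, and the function names are hypothetical.

```python
def forward_fill(values):
    """Replace None by the last available value (leading gaps stay None)."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def downsample_to_hourly(values, steps_per_hour):
    """Average consecutive groups of steps_per_hour power values (MW)."""
    hourly = []
    for start in range(0, len(values) - steps_per_hour + 1, steps_per_hour):
        chunk = values[start:start + steps_per_hour]
        hourly.append(sum(chunk) / steps_per_hour)
    return hourly


# Two hours of 15-minute generation data in MW with missing entries (assumed).
quarter_hourly_mw = [100.0, 104.0, None, 108.0, 120.0, None, None, 116.0]
filled = forward_fill(quarter_hourly_mw)
hourly_mw = downsample_to_hourly(filled, steps_per_hour=4)
print(hourly_mw)
```

For 30-minute data the same sketch applies with steps_per_hour=2; hourly data passes through unchanged with steps_per_hour=1.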

Installed generation capacity
Installed electricity generation capacities per fuel type f were also obtained from the ETP [28]. The same renaming of fuel types as in Section 2.1.1 was conducted. Figure 2b shows all preprocessed data for 2019 in a concise way. Supplementary Material A contains an overview of available installed generation capacity data on the ETP [28].

OPSD Power plant list
For the PP method and to verify the PWL method with a higher granularity of generation capacity, the power plant list from the Open Power System Data [45] was used. This data consists of 893 power plants for the electricity market region Germany, Austria and Luxembourg with 38 features each. The most relevant features were country, capacity, date of commissioning and shutdown, type, as well as efficiency estimate. In this paper, for the sake of simplicity, individually controllable power plant units are referred to as power plants. In Figure 6, e.g., the nine light brown horizontally distributed dots near 20 GW represent the nine generation units of the Jänschwalde power plant, which have similar characteristics. For the construction of the merit order, only active power plants were considered, i.e., those for which the simulated year lies after the date of commissioning and before the date of shutdown. The distribution of efficiency and capacity can be seen in Supplementary Material C.

GEO Power plant list
Due to the particularly strong influence of power plant efficiency, gas power plants are further distinguished according to their technology type into open-cycle gas turbines (denoted gas) and combined-cycle gas turbines (denoted gas cc). The division into gas and gas cc was based on $k^{cc}$, which is the proportion of combined-cycle gas turbines in all gas power plants for each analyzed country. These shares were calculated from the GEO power plant list [46] with two exceptions: For Germany, the share was calculated from the OPSD list because it offers more detailed data. For Serbia, no gas power plants were listed in [46] at the time of data retrieval; independent investigation identified its share to be $k^{cc}$ = 0%. All values used for $k^{cc}$ are listed in Table 2.

[...] [49]). The data was downsampled from weekly to annual resolution by averaging and used in mean annual form as $c^{GHG}$, see Figure 3.

Table 3 shows an overview of the fuel type f specific input parameters used in this study. The fuel type specific costs were used depending on the year and, in the case of natural gas, also depending on the country. As fuel type specific CO2 intensity $\varepsilon_f$, operational net emission factors from [50] were used which, in contrast to life cycle assessment approaches, do not include emissions embodied in infrastructure. This is consistent with the short-term nature of PBDR.

Transmission efficiency
Following the methods employed in [33] and [11], we applied a constant transmission efficiency $\eta^{T}$ to account for all transmission and distribution losses. Table 4 shows the average value used for each country over the last four published years (2010-2014) [54].

Figure 4. [...] [51], DESTATIS [55]. Light grey boxes indicate subsets or modified data. The dimension of each dataset is denoted by the characters t, y, f, and p in brackets. Dark grey boxes indicate special mathematical operations.

Table 3
Overview of the fuel type f specific input parameters for the merit order and their data sources.

Methods
In this study, we propose two approximating methods, the Power Plant (PP) method and the PieceWise-Linear (PWL) method, to calculate dynamic MEFs and XEFs from the available data. Both methods apply a country and year specific merit order to historic residual load data to simulate the time-dependent dispatch of conventional power plants and to identify the marginal power plant per time step t. In other words, the prices, XEFs, and MEFs from the simulation depend on the available conventional power plants, the time-dependent RES shares, and the total load, which was proxied by the national total generation. The PP method uses a power plant list with given efficiencies. Since power plant specific data is not readily available for most European countries, we also propose the PWL method, which does not rely on given power plant specific efficiency and capacity. Instead, national installed generation capacities per fuel type were used to discretize the fuel type specific generation capacity into virtual power plants. Figure 4 shows a schematic depiction of the data sources and calculation steps. Besides the two main methods (PP and PWL), it also shows the PWL validation mode (PWLv), which is detailed in Section 4.
For the sake of simplicity, from here onwards, the term fuel type also includes gas cc, which technically is a special combination of a fuel type and a technology. The remainder of this section describes the calculation of XEFs and MEFs using the PP and the PWL method.

Calculation of MEFs
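Regardless of the merit order calculation method, the power plant specific emissions $\varepsilon_p$ are given by

$$\varepsilon_p = \frac{\varepsilon_f}{\eta^{el}_p} \qquad (1)$$

where $\varepsilon_f$ is the carbon emission intensity per fuel type $f$ and $\eta^{el}_p$ is the electrical efficiency of power plant $p$. The marginal emission factor $MEF_t$ is given by the power plant specific emission intensity $\varepsilon_p$ of the marginal power plant in the given time step $t$ divided by the transmission efficiency $\eta^{T}$:

$$MEF_t = \frac{\sum_{p} \gamma^{m}_{t,p} \, \varepsilon_p}{\eta^{T}} \qquad (2)$$

where $\gamma^{m}_{t,p}$ is a binary variable:

$$\gamma^{m}_{t,p} = \begin{cases} 1 & \text{if } p \text{ is the marginal power plant in time step } t \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$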
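The MEF calculation can be sketched in a few lines. The sketch assumes that the marginal power plant is the first plant in the cost-sorted merit order whose cumulative capacity covers the residual load; the class, the function names, and all plant parameters are hypothetical illustrations rather than data from this study.

```python
from dataclasses import dataclass


@dataclass
class Plant:
    name: str
    capacity_mw: float
    marginal_cost: float     # EUR per MWh of electricity
    fuel_intensity: float    # epsilon_f in t CO2 per MWh of fuel
    efficiency: float        # eta_el_p, electrical efficiency

    @property
    def emission_intensity(self) -> float:
        # epsilon_p = epsilon_f / eta_el_p
        return self.fuel_intensity / self.efficiency


def marginal_plant(plants, residual_load_mw):
    """First plant in the merit order whose cumulative capacity covers the load."""
    cumulative_mw = 0.0
    for plant in sorted(plants, key=lambda p: p.marginal_cost):
        cumulative_mw += plant.capacity_mw
        if residual_load_mw <= cumulative_mw:
            return plant
    raise ValueError("residual load exceeds the installed capacity")


def mef(plants, residual_load_mw, eta_t):
    """Emission intensity of the marginal plant divided by eta_T."""
    return marginal_plant(plants, residual_load_mw).emission_intensity / eta_t


plants = [
    Plant("gas_cc", 1000.0, 40.0, 0.2, 0.5),
    Plant("lignite", 1000.0, 20.0, 0.4, 0.4),
    Plant("coal", 1000.0, 30.0, 0.34, 0.425),
]
for load_mw in (500.0, 1500.0, 2500.0):
    print(load_mw, marginal_plant(plants, load_mw).name, mef(plants, load_mw, 0.95))
```

The binary variable of the text corresponds to the selection made by marginal_plant: exactly one plant per time step contributes its emission intensity.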

Calculation of XEFs
In all methods (PP, PWL, PWLv), the grid mix emission factor $XEF_t$ is calculated with Equation (4), which we simplify to:

$$XEF_t = \frac{\sum_{p} \varepsilon_p \, P^{inst}_p \, \gamma^{x}_{t,p} \, \Delta t}{\eta^{T} \sum_{f \in F} E^{gen}_{t,f}} \qquad (5)$$

where $\varepsilon_p$ is the carbon emission intensity per power plant $p$, $P^{inst}_p$ is the installed power plant capacity, $\gamma^{x}_{t,p}$ is the capacity utilization rate defined in Equation (6), $\Delta t$ is the time-step width, $\eta^{T}$ is the constant transmission efficiency considering all transmission and distribution losses, $E^{gen}_{t,f}$ is the generated energy per fuel type $f$ and time step $t$, and $F$ is the set of all generation fuel types.
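A minimal sketch of the XEF calculation from the quantities listed above follows. The definition of the capacity utilization rate is not reproduced in this section, so the sketch assumes a strict merit order dispatch: plants below the margin run at full capacity, the marginal plant covers the remainder, and all others are idle. All numbers are illustrative.

```python
def utilization_rates(capacities_mw, residual_load_mw):
    """Utilization per plant; capacities must be given in merit order."""
    rates, remaining_mw = [], residual_load_mw
    for capacity_mw in capacities_mw:
        used_mw = min(capacity_mw, max(remaining_mw, 0.0))
        rates.append(used_mw / capacity_mw)
        remaining_mw -= used_mw
    return rates


def xef(intensities, capacities_mw, rates, total_generation_mwh, eta_t, dt_h=1.0):
    """Emissions of all dispatched plants per unit of delivered electricity."""
    emissions_t = sum(
        eps * cap * rate * dt_h
        for eps, cap, rate in zip(intensities, capacities_mw, rates)
    )
    return emissions_t / (eta_t * total_generation_mwh)


capacities = [1000.0, 1000.0, 1000.0]   # MW, sorted by marginal cost
intensities = [1.0, 0.8, 0.4]           # t CO2 per MWh of electricity
rates = utilization_rates(capacities, residual_load_mw=1500.0)
# Total generation includes 1000 MWh from renewables in this hour (assumed).
xef_t = xef(intensities, capacities, rates, total_generation_mwh=2500.0, eta_t=0.95)
print(rates, xef_t)
```

In contrast to the MEF, every dispatched plant contributes to the result, which is why the XEF decreases when renewables raise the total generation in the denominator.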

Residual load
Since fuel type specific generation data $P^{gen}_{t,f}$ is available in most European countries, the residual load $P^{resi}_t$ is approximated by the sum of the generation of the conventional power plants for all methods (PP, PWL, PWLv) and both CEF types (XEF and MEF):

$$P^{resi}_t = \sum_{f \in F^{CONV}} P^{gen}_{t,f} \qquad (7)$$

where $F^{CONV}$ is the set of all available conventional fuel types (first 11 fuel types in Figure 2a). Figure 5 shows the distributions and average values of the time-dependent CONV share and RES share of the total national generation for the year 2019 for different countries. The average RES share varies from 7% in the Netherlands (NL) to 77% in Denmark (DK). Given Equation (7) and the limitation to national electricity supply, in this analysis, the CONV share in Figure 5 is also the residual load share of the total national load.
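As a small illustration of the residual load approximation, the sketch below sums the generation of the conventional fuel types for one time step and derives the CONV and RES shares. The fuel type names, the membership of the conventional set, and the generation values are assumptions made for the example.

```python
def residual_load(generation_mw, conventional_fuels):
    """Sum of generation over all conventional fuel types for one time step."""
    return sum(
        power for fuel, power in generation_mw.items() if fuel in conventional_fuels
    )


# Hypothetical generation of one hour in MW.
generation_mw = {
    "lignite": 8000.0,
    "coal": 4000.0,
    "gas": 3000.0,
    "wind": 12000.0,
    "solar": 3000.0,
}
conventional = {"lignite", "coal", "gas"}   # assumed subset of F_CONV

p_resi_mw = residual_load(generation_mw, conventional)
total_mw = sum(generation_mw.values())
conv_share = p_resi_mw / total_mw
res_share = 1.0 - conv_share
print(p_resi_mw, conv_share, res_share)
```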

Merit order calculation with the PP method
The PP method can be used to calculate the merit order if power plant specific efficiencies are available. The merit order results from sorting all active power plants $p$ in ascending order of their marginal costs $c^{m}_p$, which are given by:

$$c^{m}_p = \frac{x_f + c^{GHG} \, \varepsilon_f}{\eta^{el}_p} \qquad (8)$$

where $\eta^{el}_p$ is the efficiency per power plant $p$ (data from Section 2.2.1), $c^{GHG}$ is the carbon emission price (data from Section 2.4.1), and $\varepsilon_f$ and $x_f$ are the fuel type specific emission intensities and prices, respectively (data from Section 2.4.2). Figure 8a shows an example of the resulting merit order with power plant specific marginal costs $c^{m}_p$ and emissions $\varepsilon_p$ for Germany, 2019.
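The following sketch computes marginal costs as the sum of fuel cost and carbon cost divided by the plant efficiency and sorts the plants into a merit order. The fuel prices, emission intensities, efficiencies, and carbon prices are illustrative assumptions, chosen only to show how the carbon price can change the order.

```python
def marginal_cost(fuel_price, fuel_intensity, carbon_price, efficiency):
    """(x_f + c_GHG * epsilon_f) / eta_el_p in EUR per MWh of electricity."""
    return (fuel_price + carbon_price * fuel_intensity) / efficiency


# name: (fuel price in EUR/MWh_th, epsilon_f in t CO2/MWh_th, efficiency)
plants = {
    "lignite": (4.0, 0.4, 0.4),
    "gas_cc": (20.0, 0.2, 0.6),
}


def merit_order(carbon_price):
    """Plant names sorted by ascending marginal cost."""
    costs = {
        name: marginal_cost(price, intensity, carbon_price, eta)
        for name, (price, intensity, eta) in plants.items()
    }
    return sorted(costs, key=costs.get)


print(merit_order(25.0))    # low carbon price
print(merit_order(100.0))   # high carbon price
```

With the assumed parameters, lignite is dispatched before combined-cycle gas at the low carbon price, and the order reverses at the high carbon price.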

Merit order calculation with the PWL method
The PWL method is a piecewise-linear approximation approach that can be used to approximate the country and year specific merit order if power plant specific capacity and efficiency data are unavailable. For most European countries, only fuel type specific data are provided; hence, the PP method, which is based on a power plant list including power plant efficiencies, is not applicable in these countries. A very simple approach to approximate MEFs is the use of fuel type specific efficiencies. In the PWL method, instead, we assume that all countries have the same maximum and minimum efficiency per fuel type and that within each fuel type the capacity is uniformly distributed over the range of occurring efficiencies. These assumptions allow us to approximate the efficiencies of the power plants by a piecewise-linear function. For each fuel type, the minimum and maximum efficiencies were determined by the minimum and maximum values of an ordinary least squares regression of the power plant efficiencies of the German power plant list, see Figure 6. To enable the non-disjunct structure of the merit order, which can be observed in reality, the total generation capacities per fuel type were discretized into equally sized virtual power plants. In our case, country and fuel type specific values shown in Figure 7 were used. They were computed from the GEO power plant list [46], which contains capacity data but no efficiency data.
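The discretization into virtual power plants can be sketched as follows. The sketch assumes that the efficiencies of the virtual plants are evenly spaced between the maximum and minimum efficiency of the fuel type, which is one way to realize a uniform capacity distribution over the efficiency range; the exact placement of the efficiency values, the function name, and the numbers are assumptions.

```python
def virtual_plants(total_capacity_mw, n_plants, eta_min, eta_max):
    """Split a fuel type's capacity into equally sized virtual plants.

    Returns (capacity_mw, efficiency) tuples, most efficient plant first.
    """
    size_mw = total_capacity_mw / n_plants
    if n_plants == 1:
        # A single virtual plant gets the mean efficiency (assumption).
        return [(size_mw, (eta_min + eta_max) / 2.0)]
    step = (eta_max - eta_min) / (n_plants - 1)
    return [(size_mw, eta_max - i * step) for i in range(n_plants)]


# Hypothetical fuel type with 3000 MW and efficiencies between 30% and 40%.
for capacity_mw, efficiency in virtual_plants(3000.0, 3, 0.30, 0.40):
    print(f"{capacity_mw:.0f} MW at efficiency {efficiency:.3f}")
```

Because each virtual plant has its own efficiency, plants of different fuel types can interleave in the merit order once marginal costs are computed, which reproduces the non-disjunct structure mentioned above.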

Limitations
The resulting MEFs depend on the determination of marginal power plants, which is subject to some uncertainty since the merit order is calculated based on an approximation of marginal costs and installed generation capacity. Because the MEF is determined by a single marginal power plant whereas the XEF averages over all dispatched power plants, an error in the calculated merit order has a substantially stronger effect on MEFs than on XEFs.
As did all previous studies presented in Table 1, we do not consider transnational power flows in the calculation of MEFs since there is no existing method to do so. Commercial entities like Tomorrow [27,56] provide estimated MEFs considering cross-border flows; however, their method is not published.
Due to these two limitations, the MEFs calculated in this study are subject to an unquantified level of uncertainty. However, the authors deem this to be reasonably low due to the high number of empirically derived parameters factored into this study.

Validation of the PWL method
To evaluate the approximation error of the PWL method, the PP method was applied to the detailed dataset for Germany described in Section 2.2.1 for the years 2015-2019. For this, we introduce the PWLv method as a validation variant of the PWL method, which ensures consistency of data sources between the two main methods (PP and PWL). As shown in Figure 4, in contrast to the PWL method, the PWLv method uses installed generation capacity data from the German OPSD power plant list so that it shares the same data basis as the PP method. The PWLv-based merit order for Germany in 2019 is shown in Figure 8b; for better comparison, Figure 9 shows the same merit order together with the PP-based merit order. Figure 10 compares the two CEF types and the two calculation methods for Germany for the first week of May 2019. It can be seen that the PWL method succeeds in representing the MEFs well. A whole-year comparison between the resulting MEF PP t and MEF PWLv t can be found in Supplementary Material F. Figure 11 shows the values of the three different relative error types that were calculated to evaluate the approximation quality of the PWL method compared to the PP method:
1. The error of power plant specific marginal costs c m p and emission intensities ε p along the merit order (δ MO,c , δ MO,ε ). δ MO,c and δ MO,ε were calculated as averaged relative errors of marginal costs and emission intensities, respectively, between the PWLv method and the PP method along the merit order. To deal with the different cumulative power values in the two merit orders, each merit order was discretized into 10 MW elements, which are less than one tenth of the size of the average power plant.
2. The error of time-dependent prices, MEFs, and XEFs (δ P , δ MEF , δ XEF ).
3. The error of yearly aggregated prices, MEFs, and XEFs (δ P , δ MEF , δ XEF ).
From Figure 11, one can see that while all other error types are below 2.5%, δ MO,ε and δ MEF exceed this value. However, due to the high number of lignite and coal power plants with similar marginal costs, Germany has by far the most fuel type changes along the merit order, see Figure 12. Since δ MO,ε and δ MEF correlate with the number of fuel type changes, they are expected to be smaller for all other countries.
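The merit order errors δ MO,c and δ MO,ε can be sketched as follows. This is an illustrative reading of the description above, not the code used in the study: the representation of a merit order as a list of (capacity, value) pairs, the function names, the evaluation at element midpoints, and the skipping of elements with a zero reference value are all assumptions.

```python
def discretize(merit_order, step_mw=10.0):
    """Expand [(capacity_mw, value), ...] into one value per step_mw element.

    The list must already be sorted in merit order; each element takes the
    value of the plant that covers its midpoint.
    """
    values = []
    position = step_mw / 2
    cumulative = 0.0
    for capacity_mw, value in merit_order:
        cumulative += capacity_mw
        while position < cumulative:
            values.append(value)
            position += step_mw
    return values


def mean_relative_error(approx_order, reference_order, step_mw=10.0):
    """Average relative error of approx vs. reference along the merit order."""
    approx = discretize(approx_order, step_mw)
    reference = discretize(reference_order, step_mw)
    # zip() stops at the shorter list, so only the shared capacity range is
    # compared. Elements with a zero reference value (e.g. an emission
    # intensity of zero) are skipped here; this handling is an assumption.
    pairs = [(a, r) for a, r in zip(approx, reference) if r != 0]
    if not pairs:
        raise ValueError("no comparable elements")
    return sum(abs(a - r) / abs(r) for a, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical marginal costs in EUR/MWh for two tiny merit orders.
    reference = [(30.0, 10.0), (20.0, 20.0)]   # stands in for the PP method
    approx = [(20.0, 10.0), (30.0, 22.0)]      # stands in for the PWLv method
    print(round(mean_relative_error(approx, reference), 2))
```

Passing marginal costs as the values yields an analogue of δ MO,c; passing emission intensities yields an analogue of δ MO,ε.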

Analyses and discussions for European countries
In the following analysis, we apply the previously developed methods (PP and PWL) presented in Section 3 to data described in Section 2, see also Figure 4. More specifically, we apply the PP method to the German power plant list described in Section 2.2.1 and the PWL method to data of 20 European countries described in Section 2.1.2. The remaining European countries were removed from the analysis due to poor data quality or insufficient electricity demand (small countries). The rejection criteria are detailed in Supplementary Material F.

Merit orders and CEFs for European countries
Of the analyzed countries, Figure 14 shows the distribution of the residual load (the load that has to be provided by conventional power plants) described in Section 3. In Figure 13, the distributions of XEFs, MEFs, and marginal prices resulting from the simulations are depicted. One can see that the MEFs tend to be higher than the XEFs. Only in

Correlation analysis of emissions and prices
For the correlation analysis of the CEFs, we follow [24] in using the Spearman rank correlation coefficient r as the relationship between CEFs and marginal costs is not expected to be linear [24]. This coefficient r quantifies how well the relationship between two variables can be described using a monotonic function.
In Supplementary Data H, the correlations between electricity prices (simulated and historic) and the calculated CEFs (XEFs PP , XEFs PWLv , MEFs PP , MEFs PWLv ) for Germany for the years 2017-2019 are presented in scatter plots. It can be seen that the XEFs correlate positively with the simulated and the historic prices for all three years 2017-2019 and for both methods, PP and PWLv (r-values range between 0.62 and 0.88). In contrast, the MEFs have negative r-values (between -0.16 and -0.42) for all mentioned combinations, which is a first indicator of the phenomenon where PBDR leads to a carbon emission increase. Figure 15 shows r-values for all analyzed countries using the PWL method. The correlation values are positive for 16 and negative for 4 of the 20 countries.
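For readers unfamiliar with the Spearman coefficient, the following minimal sketch computes r as the Pearson correlation of the ranks, with tied values receiving their average rank. The function names are chosen for illustration and the input series are hypothetical; the sketch does not reproduce the values reported above.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical hourly prices and emission factors, monotonic but nonlinear.
    prices = [20.0, 25.0, 30.0, 60.0]
    cefs = [0.30, 0.31, 0.35, 0.90]
    print(spearman(prices, cefs))
```

Because only the ranks enter the calculation, any strictly increasing relationship yields r = 1 regardless of its shape, which is why the coefficient suits the nonlinear relationship between CEFs and marginal costs.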

Price-based load shift analysis
For environmental evaluation of PBDR, only the source and sink hours are relevant, i.e., the hours where energy is shifted from and to. The analysis of the emissions without the consideration of the electricity price, which is the driving signal for PBDR, might be misleading. Thus, we quantify and compare the incentives and effects of PBDR for the years 2017-2019 and the 20 analyzed countries with a simulated load shift: Every day, an hourly load of 1 kWh is shifted from the most expensive to the cheapest hour of that day. Annual simulations were conducted for the years 2017-2019 for the 20 analyzed countries. The electricity prices relate to the marginal costs of the previous simulation.
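The simulated load shift can be sketched as below. The function names, the dictionary layout, and the choice to express relative changes with respect to the cost and emissions of the shifted energy in the source hours are assumptions for illustration; the four-hour input series are hypothetical.

```python
def daily_load_shift(prices, mefs, xefs, energy_kwh=1.0):
    """Shift energy_kwh from the most expensive to the cheapest hour of a day.

    prices in EUR/kWh, mefs and xefs in kg CO2/kWh, one value per hour.
    Returns the changes (sink minus source) in cost and emissions.
    """
    hours = range(len(prices))
    source = max(hours, key=lambda h: prices[h])
    sink = min(hours, key=lambda h: prices[h])
    return {
        "source_hour": source,
        "sink_hour": sink,
        "cost": energy_kwh * (prices[sink] - prices[source]),
        "me": energy_kwh * (mefs[sink] - mefs[source]),
        "xe": energy_kwh * (xefs[sink] - xefs[source]),
    }


def relative_changes(days, energy_kwh=1.0):
    """Aggregate daily shifts; days is an iterable of (prices, mefs, xefs).

    Changes are related to the cost and emissions of the shifted energy in
    the source hours (an assumed baseline).
    """
    change = {"cost": 0.0, "me": 0.0, "xe": 0.0}
    baseline = {"cost": 0.0, "me": 0.0, "xe": 0.0}
    for prices, mefs, xefs in days:
        shift = daily_load_shift(prices, mefs, xefs, energy_kwh)
        src = shift["source_hour"]
        for key, series in (("cost", prices), ("me", mefs), ("xe", xefs)):
            change[key] += shift[key]
            baseline[key] += energy_kwh * series[src]
    return {k: change[k] / baseline[k] if baseline[k] else 0.0 for k in change}


if __name__ == "__main__":
    # Hypothetical day with four hours: the cheapest hour has the highest MEF.
    day = ([0.03, 0.05, 0.04, 0.02], [0.9, 0.4, 0.5, 1.0], [0.3, 0.5, 0.4, 0.2])
    print(relative_changes([day]))
```

In this toy day, the shift lowers costs and XEs but raises MEs, because the cheapest hour is served by the most carbon-intensive marginal plant. On a day with zero price spread, `max` and `min` both return the first hour, so the shift has no effect.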
Since MEFs quantify the marginal system effects, the ME changes are the total emission changes, i.e., if MEs increase by 1 t, total emissions also increase by 1 t. In the remainder of the paper, we refer to the total emission changes as ME changes only to distinguish them from grid mix emission (XE) changes, which are the changes in emissions calculated from the XEF. In Figure 16, the relative changes of cost and carbon emissions of the shifted energy due to load shifts are shown. Since we consider PBDR, the cost reductions are the incentives and the changes in carbon emissions are the effects. It can be seen that, in contrast to the costs, which decreased for all countries as expected, the carbon emissions increased for some countries. XEs are usually reduced since PBDR leads to load shifts from high-XEF hours to low-XEF hours. The only exception among the 20 countries is Serbia (RS), where the load shifts increased the total XEs by 3%. The reason might stem from the fact that Serbia's power supply system is dominated by hydro (run-of-river and pondage) and lignite. In the simulation, the marginal power plant for Serbia is always a lignite power plant (see also Figure 19 and the merit order in Supplementary Material G). With the aid of the pondage, hydro power plants have storage capacities enabling them to shift electricity generation from low-load hours (0:00-6:00) to peak load hours (8:00, 19:00), see Figure 17. While this also happens in other countries, in Serbia the pondage effect has more weight as there are no significant shares of vRES such as wind or solar. A similar case, where PBDR led to an increase in XEs, is reported by [34] for Norway in the year 2015.
Fig. 17. Historic data for the year 2019 for Serbia: Hydro run-of-river and pondage (left) and total load (right). Data source: [28].
Figure 18 shows the source and sink times of the load shifts. These are the times with the highest and lowest prices of the day.
For most countries, the distribution of source time has two humps: one for 5:00-10:00 and another for 16:00-20:00. This result was expected since it reflects the double-hump pattern of historic price curves. The load shift sink is mainly at night (21:00-07:00). Only in Germany and Italy does the sink occasionally occur around midday (11:00-15:00). In France, source and sink hours are both just after midnight. This is because France's sole conventional energy source is usually nuclear, which is evaluated with a constant efficiency by the data source [45]. Therefore, the daily price spreads were zero on most days, and load shifts were only carried out in cold winter months where coal and gas cc power plants additionally stepped in to cover the increased load. Figure 19 gives insights into the marginal fuel type combinations of the conducted load shifts. For the small country of Lithuania (LT), it can be seen that the national energy supply, with gas as the only energy source and a low ratio between load and average power plant size, provides only little incentive for load shifting.

CEF-based load shift analysis
For PBDR, the time-dependent electricity price is the incentive signal that determines the source and sink hours. However, conducting DR based on XEFs or MEFs could be an option for consumers who want to minimize their carbon emissions. In this analysis, we rerun the previous load shift simulation with XEFs or MEFs as incentive signals, so 1 kWh is shifted every day from the hour with the highest XEF or MEF of that day to the hour with the lowest, respectively. In Figure 20, the relative changes of cost and carbon emissions of the shifted energy due to load shifts are shown; note that the results from Section 5.3 are repeated for better comparison. The resulting effects of the XEF-based load shifts are similar to those of the cost-based load shifts: both lead to increased MEs in 8 countries. In contrast, the MEF-based load shifts lead to ME savings between 3 and 84% with an average of 35%, albeit reducing the average cost-saving potential by 56% compared to cost-based load shifts.
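The comparison of incentive signals can be sketched by letting each signal choose its own source and sink hours and then evaluating every shift with the same prices and MEFs. The function names and the hypothetical four-hour series are assumptions for illustration only.

```python
def shift_hours(signal):
    """Return (source, sink): the hours with the highest and lowest signal."""
    hours = range(len(signal))
    source = max(hours, key=lambda h: signal[h])
    sink = min(hours, key=lambda h: signal[h])
    return source, sink


def compare_signals(prices, xefs, mefs, energy_kwh=1.0):
    """Evaluate cost and ME changes of price-, XEF- and MEF-based shifts."""
    results = {}
    for name, signal in (("price", prices), ("XEF", xefs), ("MEF", mefs)):
        source, sink = shift_hours(signal)
        results[name] = {
            "cost_change": energy_kwh * (prices[sink] - prices[source]),
            "me_change": energy_kwh * (mefs[sink] - mefs[source]),
        }
    return results


if __name__ == "__main__":
    # Hypothetical day: prices in EUR/kWh, emission factors in kg CO2/kWh.
    prices = [0.03, 0.05, 0.04, 0.02]
    xefs = [0.3, 0.5, 0.4, 0.2]
    mefs = [0.4, 0.6, 1.0, 0.7]
    for name, result in compare_signals(prices, xefs, mefs).items():
        print(name, result)
```

In this toy day, the price-based and XEF-based shifts pick the same hours and raise MEs, while the MEF-based shift reduces MEs at a smaller cost saving, which mirrors the qualitative pattern described above without reproducing its numbers.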

Detailed discussion of six sample countries
In the following, the reasons and conditions under which PBDR leads to increased carbon emissions are discussed using detailed results of the load shift analyses of six exemplary countries for the year 2019, shown in Figure 21. For corresponding analyses of the remaining 14 countries, see Supplementary Material J. The six countries were selected because, first, they are representative in terms of country size and the range of relative annual changes from Figure 16, including its maximum (GR) and minimum (FR) of ME changes, and second, they have different interesting attributes, e.g., the dominance of nuclear (FR), high wind share (DK), the dominance of lignite/coal (PL), or being a small country very reliant on wind and gas cc (IE). For each country, Figure 21 shows a) the merit order, b) the histogram of P resi t as the range of possible load shifts, c) the load shifts with their sources and sinks, and d) their inverted load duration curves. For corresponding plots for all 20 countries for the years 2017-2019, see Supplementary Material I. In Figure 21, the three countries in the first row (DE, GR, IE) have increased MEs in Figure 16 types are lignite, coal, and gas cc, which are also the main causes of the merit order dilemma of emissions. The results show 159 load shifts in which the marginal fuel type does not change. In these cases, the load is generated by a generation unit with the same fuel type but with a higher efficiency, which leads to reductions in cost and MEs. With 192 cases, more than half of the German load shifts display an increase in MEs (148 coal-to-lignite, 22 gas cc-to-coal, and 22 gas cc-to-lignite). Only 14 energy units were shifted towards a situation with a greener marginal fuel type (lignite-to-coal).
• Greece (GR): In Greece, the increase of MEs due to load shifting is particularly strong with 53%. The reason is that the residual load oscillates most days around the capacity limit between lignite and gas cc at around 4 GW leading to 241 gas cc-to-lignite load shifts, which are the most disadvantageous occurring load shifts regarding the unwanted effect of increasing MEs. Due to Greece's still moderately developed expansion of renewable energies with 30% RES share, the potential for reducing XEs is only 3%.
• Ireland (IE): With a wind share of 40%, Ireland's RES share of 43% is similar to that of Germany even with hardly any photovoltaic power. This drives the XE savings up to 30%. However, ME changes are slightly positive with 3% due to 63 gas cc-to-coal load shifts. Only the 36 coal-to-gas cc load shifts imply changing to a greener marginal fuel type, while the vast majority of load shifts (266) stay within the dominant fuel type gas cc.
• Denmark (DK): In Denmark, the load shifts lead to 40% XE reductions. This is mainly caused by the high RES share of 76%, of which onshore wind contributes almost half. The national residual load P resi t is subject to strong seasonal fluctuations. Contributing factors could be heating power demand in winter and reduced imports of German excess solar power. The MEs are moderately reduced by 6%. More than 99% of the load shifts are within the fuel type of coal and thus yield emission reductions only through higher power plant efficiencies.
• France (FR): In France, 71% of electricity is produced by nuclear power plants making the country's national power supply system heavily conventional-based, yet low in carbon emissions. The national power supply gives only little economic incentive for load shifting since the residual load is predominantly in the range of marginal nuclear power. Only in the cold winter months, when the residual load exceeds the nuclear power capacity limits, incentives for intraday load shifting are created. There are 31 load shifts into hours with nuclear as marginal fuel type: 9 shifted from gas cc and 22 from coal. These few cases do not lead to particularly high CO 2 savings in absolute terms. In relative terms, however, due to the low emission baseline level of France's power system, they contribute to the highest savings of all analyzed countries: 92% XE and 64% ME savings.
• Poland (PL): Poland has a low RES share. Predominant fuel types are coal (52%) and lignite (26%). In the merit order, they are intertwined due to similar marginal cost levels, in contrast to Greece, where the fuel types lignite and gas cc form continuous blocks in the merit order. This leads to all combinations of load shifts between the two fuel types: 35 lignite-to-coal shifts, 77 coal-to-lignite shifts, 7 lignite-to-lignite shifts, and 246 coal-to-coal shifts. Together these result in 7% ME savings, mainly stemming from power plant efficiency gains of the coal-to-coal load shifts.
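The counts of marginal fuel type combinations quoted for the sample countries can be obtained by labelling each daily load shift with the marginal fuel of its source and sink hour. The sketch below assumes that the marginal fuel per hour and the daily source and sink hours are already known; the function names and the three-day toy input are hypothetical. The final check only confirms that the Polish counts quoted above add up to one shift per day of 2019.

```python
from collections import Counter


def shift_label(source_fuel, sink_fuel):
    """Label a shift in the 'source-to-sink' convention used in the text."""
    return f"{source_fuel}-to-{sink_fuel}"


def count_shift_types(marginal_fuels_per_day, shifts):
    """Count fuel type combinations over all daily load shifts.

    marginal_fuels_per_day: one list of hourly marginal fuel names per day.
    shifts: one (source_hour, sink_hour) tuple per day.
    """
    counts = Counter()
    for fuels, (source, sink) in zip(marginal_fuels_per_day, shifts):
        counts[shift_label(fuels[source], fuels[sink])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical three days with four hours each.
    fuels = [
        ["lignite", "coal", "coal", "lignite"],
        ["coal", "coal", "coal", "coal"],
        ["lignite", "coal", "coal", "lignite"],
    ]
    shifts = [(1, 3), (2, 0), (1, 0)]
    print(count_shift_types(fuels, shifts))

    # Consistency check of the counts quoted for Poland in 2019.
    poland = {"lignite-to-coal": 35, "coal-to-lignite": 77,
              "lignite-to-lignite": 7, "coal-to-coal": 246}
    print(sum(poland.values()))  # 365, one load shift per day
```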

Impact of carbon price
A promising measure to solve the merit order emission dilemma is to set a price for carbon emissions in the form of a carbon price or carbon tax to an appropriate level. This increases the correlation between marginal costs and carbon intensities in the merit order.

Impact on marginal costs-emissions correlation
To demonstrate this, a sensitivity analysis was carried out in which the merit order of the German power plants for the year 2019 was determined for different hypothetical carbon prices. To quantify the effect on the merit order, the Spearman correlation coefficient r between the marginal prices c m p and the power plant-specific carbon intensities ε p along the merit order was calculated for each scenario. The quantitative results can be seen in Figure [57]. From about r = 0.8 or c GHG = 235.6 €/t onward, the marginal gains of r decrease. In order to reach r = 0.99, a carbon price of 1269.5 €/t is needed. Figure 23 shows the corresponding effects on the merit order. One can see that with increasing carbon prices, the low-emission gas cc power plants gain a comparative economic advantage over lignite and coal and shift to the left side of the merit order. With c GHG = 100 €/t, almost all gas cc power plants are directly behind nuclear. The same happens to the gas power plants, coal power plants, and, for c GHG somewhere above 236 €/t, even to oil power plants. These values mostly align with the simulation results in [58]. For c GHG = 10 000 €/t, where r reaches the value 1.00, the carbon emission intensity ε p (black line) is, except for a few small power plants, monotonically increasing and fully correlated with the marginal price c m p (blue line).
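The mechanism behind this sensitivity analysis can be illustrated with a small sketch that recomputes marginal costs for different carbon prices and evaluates the rank correlation between marginal cost and emission intensity. The six plants, their parameters, and the cost model (fuel price plus carbon cost, divided by efficiency) are hypothetical assumptions, so the resulting r-values do not correspond to those reported for Germany. The rank correlation uses the closed-form expression that is valid only when no values are tied.

```python
# Hypothetical plants:
# (fuel, efficiency, fuel price in EUR/MWh_th, emission factor in t CO2/MWh_th)
PLANTS = [
    ("lignite", 0.38, 4.0, 0.40),
    ("lignite", 0.43, 4.0, 0.40),
    ("coal", 0.40, 9.0, 0.34),
    ("coal", 0.46, 9.0, 0.34),
    ("gas cc", 0.52, 20.0, 0.20),
    ("gas cc", 0.60, 20.0, 0.20),
]


def marginal_cost(plant, carbon_price):
    """Assumed cost model: (fuel price + carbon cost) / efficiency."""
    _, eta, fuel_price, emission_factor = plant
    return (fuel_price + carbon_price * emission_factor) / eta


def emission_intensity(plant):
    """Emissions per unit of electricity: emission factor / efficiency."""
    _, eta, _, emission_factor = plant
    return emission_factor / eta


def rank_positions(values):
    """1-based ranks; assumes all values are distinct (no tie handling)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    positions = [0] * len(values)
    for position, index in enumerate(order):
        positions[index] = position + 1
    return positions


def spearman_no_ties(x, y):
    """Spearman r via 1 - 6*sum(d^2)/(n*(n^2-1)), valid without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_positions(x), rank_positions(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def cost_emission_correlation(carbon_price):
    costs = [marginal_cost(p, carbon_price) for p in PLANTS]
    intensities = [emission_intensity(p) for p in PLANTS]
    return spearman_no_ties(costs, intensities)


if __name__ == "__main__":
    for price in (0.0, 25.0, 50.0, 100.0, 1000.0):
        print(price, round(cost_emission_correlation(price), 3))
```

In this toy fleet, r is negative without a carbon price because the most carbon-intensive plants are the cheapest, and it approaches 1 once the carbon-related cost dominates the fuel cost.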

Impact on load shifting
In a final analysis, we calculated the load shift effects on MEs, XEs, and costs for the different carbon prices for the case of Germany in 2019. The results in Figure 24 are counterintuitive. For example, while the ME change decreases to 6.5% when the carbon price is raised to 24.9 €/t, it increases again by 6.2 percentage points to 12.7% when the carbon price is further increased to 42.6 €/t. This is because with c GHG = 42.6 €/t, gas cc power plants move into the 35 GW area, around which the residual load fluctuates (see Figure 21b for Germany), increasing the number of gas cc-to-lignite and gas cc-to-coal load shifts, see Figure S20 in Supplementary Material I. At c GHG = 65.3 €/t, gas cc power plants have already arrived in the lower half of the residual load range and therefore act more frequently as the load shift sink. c GHG = 65.3 €/t is already sufficient to achieve 58% of the achievable ME savings, and c GHG = 100.0 €/t yields 93% of the achievable ME reduction of 21.4%. For carbon prices around 235.6 €/t, the aforementioned counterintuitive increase occurs again, however with lower impact and through increased gas-to-coal load shifts.
The effects on XEs are similar to the effects on MEs. Counterintuitive increases occur for the same reasons but with lower amplitude, due to the damping influence of averaging.
The effect curve on costs in Figure 24 is concave. For c GHG ≤ 65.3 €/t, the cost savings decrease with increasing carbon prices. The shrinking daily price spreads are caused by the convergence of marginal costs of power plants in the relevant residual load range (around 35 GW) as the high carbon intensities of formerly cheap coal and lignite power plants are effectively penalized by the increasing carbon price. For c GHG > 65.3 €/t, the carbon-related marginal costs (cf. Equation (8)) increasingly outweigh the fuel costs and ultimately make the fuel costs insignificant. Through the formation of coherent technology blocks, now in the new ascending order of carbon intensity, steps in the marginal cost curve are shaped (e.g., at 23 GW), which in turn lead to higher price spreads.

Discussion summary
While MEFs are essential for quantifying the carbon differential of load shifts, XEFs are more suitable for calculating the carbon emissions of a static electricity load profile. Also, XEFs can be determined with less uncertainty and in a straightforward approach which is reflected in their high availability.
European national electricity supply systems differ widely in both size (0.13-49 GW residual load) and composition (7-77% RES share), which is reflected in the varying prices and CEFs that resulted from the simulation, see Figure 13. The differences between the European countries became even clearer when running yearly simulations of daily load shifts on the basis of the calculated prices and CEFs. The electricity cost-saving potentials ranged between 3% for Lithuania (LT) and Serbia (RS) and 24% for Austria (AT) and Greece (GR). The resulting changes in MEs varied between 64% decrease for France (FR) and 53% increase for Greece (GR), with increases in eight countries (AT, DE, ES, GR, HU, IE, PT, RO). The changes of XEs varied between -92% for France (FR) and +3% for Serbia (RS), with Serbia being the only country where XEs increase. Averaged over all countries, the costs decreased by 10.4%, the XEs decreased by 26.9%, but the MEs increased by 2.1%. While XEF-based load shifts, like the price-based load shifts, led to ME increases in eight countries, MEF-based load shifts resulted in average emission savings of 35%, albeit with 56% lower cost savings. A final sensitivity analysis regarding the carbon price brought the following salient findings: (1) For Germany, a carbon price of 42.6 €/t was necessary to decouple emissions from prices, i.e., where r, the Spearman correlation coefficient of emissions and prices along the merit order, is zero.
(2) A carbon price increase from 0 to 156.1 €/t led to a switch between gas cc and lignite/coal in the German merit order and flipped effects of the corresponding price-based load shifts on carbon emissions from a 10% increase to a 21% decrease. (3) Increases of the carbon price beyond 156.1 €/t led to insignificant changes.

Conclusions
The aim of this paper was the quantification and discussion of the effects of PBDR on operational carbon emissions for European countries. Straightforward approaches based on the calculation of XEFs are not suitable for this purpose due to the characteristics of electricity markets. More adequate methods based on the knowledge of marginal power plants require detailed data, thus MEF values are not readily available for European countries. In this paper, we therefore proposed a method (PWL) to approximate MEFs with readily available datasets and validated it with another method (PP) using power plant-specific efficiency data from Germany. We then applied the PWL method to 20 European countries for the years 2017-2019 to calculate prices, MEFs, and, for comparison purposes, XEFs. The resulting prices and CEFs served as basis for subsequently conducted load shift simulations, to evaluate its effects on carbon emissions. Starting from the so-called merit order dilemma of emissions, the results were discussed for six representative countries. In a final analysis, the impact of carbon pricing was analyzed by calculating the Spearman correlation coefficient between prices and emissions along the merit order for different carbon prices.
The key findings of the paper are:
1. The great diversity of European countries in terms of the composition and the size of their national electricity supply systems is reflected in the XEFs and MEFs.
2. Price-based and XEF-based load shifts led to increases in operational carbon emissions in 8 of 20 European countries.
3. MEF-based load shifts led to average carbon emission savings of 35%, however decreasing the cost-saving potential by 56% compared to price-based load shifts.
4. Emissions and prices along the German merit order for the year 2019 decoupled with a carbon price of 42.6 €/t.
5. A carbon price increase from 0 to 156.1 €/t led to a switch between gas cc and lignite/coal in the German merit order of 2019 and improved the carbon emission effect of the corresponding price-based load shifts for European countries from a 10% increase to a 21% decrease.
Despite the limitations outlined in Section 3.6, the following main conclusions can be drawn: While PBDR leads to negative environmental effects under specific circumstances, it is a very promising method of reducing operating cost and carbon emissions when adequate carbon prices are implemented. To exploit the full positive environmental potential of PBDR, high correlations between carbon intensity and marginal cost in the merit order need to be ensured either by adequate carbon prices or other market interventions.
In future research, the interconnectivity of individual countries to form a large interconnected network should be investigated, as this is an increasingly important aspect, which was however outside the scope of this paper. Furthermore, dynamic MEFs may be used for the assessment of the environmental potential of PBDR in real case studies considering technical, organizational, and process-related constraints in a realistic way.