A novel multiobjective generation and transmission investment framework for implementing 100% renewable energy sources

: Over the last decade, it has been considerable attempts to replace thermal power plants by renewable energy sources (RESs), mainly to reduce harmful gaseous emissions. Ubiquitous nature of these sources in the emerging smart grid, demands a dominated RES power system for a long-term horizon. This study proposed a model for 100% RES-based system which is tractable and flexible enough to be used for any mixture of generation unit types with any level of uncertainty needless of the correlation between their random and unpredictable behaviours. To overcome the complex nature of RESs, an efficient stochastic multiobjective mixed-integer linear programming framework is proposed. The efficacy of the proposed model is evaluated via numerical simulation.


Introduction
Nowadays, electrical power industry tries to minimise the harmful environmental effect of power generation while respond to increasing customer demands. Fossil fuels are not usable due to high extraction costs and environmental emissions, and substituting thermal units by renewable energy sources (RESs) is a step forward towards the emission minimisation targets. Employing RESs can reduce the transmission loss along with the emission produced by the power generation cycle in a power plant. This can be easily achieved by accurate planning and smart RES operation. European Union (EU) reached a global deal in mid-December 2008, famously known as climate and energy package, to reduce energy consumption and tackle climate change effects. Three most important targets of the package, known as 20-20-20 targets [1], are IET Gener. Transm EU countries have committed to renewable power generation targets including the capacity of 10% in Malta and 49% in Sweden [2]. United States also has plans to increase the potential capacity of renewable power generation from 30 to 90% of US electricity demand by 2050 according to the report by National Renewable Energy Laboratory (NREL) [3].
To fulfil such targets, different methods have been proposed to model the complexity of RES expansion and the parameters within. Each of these models considers a specific set of parameters. Mixed-integer non-linear programming (MINLP) model proposed by López et al. [4] considers demands, availability status of generating units, and capacity factor of transmission lines as random events. In [5] a probabilistic mixed-integer linear programming (MILP) model considers the outage probability of generating units and transmission lines. The MILP model proposed by Samarakoon et al. [6] focuses on vital contingencies and N − 1 formulation. The multi-stage MINLP model proposed by Sepasian et al. [7] considers both the security constraints and fuel limits and employs backward approach to solve the constraint problem. The expansion planning models proposed in [7][8][9] used fuel constraint and the transportation limits as parameters. Market-based model in [10,11] is proposed to coordinate generation and transmission expansion with a mechanism of incentives and payments for transmission and capacity agents. Market-based expansion model in [12] employs cooperative game theory while in [13] a model has been proposed for coordinated generation and transmission expansion in the restructured electricity market, using game theory and Cournot model. Models in [14][15][16] simulated the behaviour of market players via multi-level mathematical programming where in [14] a bi-level programming model for simultaneous generation and transmission expansion planning (GTEP) within a competitive electricity market is implemented. In [15] a three-level equilibrium expansion model in a pool-based market has been presented, considering uncertainties of wind and hydro power generations and demands. A four-level structure, bidding strategies of generation companies (GENCOs), market clearing, and generation and transmission expansion of GENCOs, is proposed in [16]. In [17] a multi-stage model has been presented for GTEP, considering congestion of the transmission lines and the effect of transmission investment cost on planning horizon. In [18] a deterministic multiobjective GTEP model is suggested and solved using epsilonconstraint method. In [19], the suggested multiobjective GTEP model considers intermittent behaviours in load and fuel prices, which is solved using normal boundary intersection (NBI) method. A stochastic MILP model is proposed in [20] to achieve a minimum-cost structure for the transition analysis from today's thermal-based system to a fully renewable-based system (in 2050) by allowing investment in both generation and transmission facilities.
In this paper, a novel mathematical framework is proposed to analyse an economical transitional model from a thermal-based power system towards a 100% RES-based power system. In the proposed model, investment in generation and transmission are both considered. The main contribution of this paper can be briefly described as follows: • A novel multiobjective framework for generation and transmission capacity investment is proposed. Two distinctive objective functions are investment/operation costs and transmission loss. • K-means clustering technique to reduce possible scenarios is applied to achieve computational tractability. • NBI optimisation approach is employed to minimise the total cost and transmission loss objective functions and obtain evenly distributed Pareto optimal solutions. • A fuzzy decision making approach is proposed to softly select the most preferred compromise solution among the Pareto optimal solutions.
The subsequent sections of this paper are structured as follows. Section 2 describes the proposed multiobjective model. The scenario generation method is discussed in Section 3. The proposed NBI optimisation is described in Section 4. Section 5 presents the case studies and discussed the results in detail. The paper concluded in Section 6.

Multiobjective modelling and formulation of the problem
The multiobjective function of the proposed model is described as follows: Multiobjective function = f cost , investment and operation costs f loss , transmission loss where the detailed descriptions of the objective functions f cost and f loss will be given in Sections 2.1 and 2.2. The linearisation method of the loss function is further described in detail in Section 2.2.

Investment and operation costs minimisation
The main objective function of the proposed investment problem is cost minimisation, described as follows: where f cost is the total investment and operation costs to be minimised. The first term in (2) is the total investment cost of thermal, wind, concentrated solar power (CSP), biomass units and the transmission lines, respectively. The second term in (2) is the total operation cost of the thermal, wind, CSP, biomass units, and the unserved demand cost over the study scenarios, respectively.

Transmission loss minimisation
Assume that all load tap changer ratios are set to their nominal values. The transmission loss of line l can then be approximately written as [21] P l loss = G l X l P l where the loss of active power in a specified line is approximately proportional to the square of its passing active power. The piecewise linearisation of (3), which is adopted in the literature [21][22][23][24][25], is shown in Fig. 1, in which P ls 2 is approximated by linear segments.
The mathematical formulation of the proposed loss modelling approach is given as p ls L = p ls L + − p ls L − , ∀l, ∀s where the two non-negative variables p ls L + and p ls L − are used to represent the free variable of the line flow p ls L and its absolute value as shown in (5) and (6). Inequalities (7) and (8) guaranty that only one of the variables p ls L + and p ls L − can take a non-zero value [21]. The upper and lower bounds of each segment are defined in (9) for both existing and candidate transmission lines. The slope of each segment can be calculated using (10). Thus, the second objective function is described as  (11) where the f loss is the total loss. This objective function describes the total energy loss in the transmission system.

Power balance
The power balance for each individual bus in each individual scenario can be written as where the first term in (12) is power injection into each individual bus in each individual scenario and the second term is the power ejection from each individual bus at each individual scenario. The first-stage power balance constraint (13) is written as inequality equation. This is the first to avoid increasing causeless of the total power balance in the case of minimising only the second objective function, and the second to obtain a softer formulation which is feasible in all cases of our study.

Power generation limits
The power generations are limited to the specified boundaries and this limitation may be written as where the power generation limits of thermal and biomass generation units are given in constraints (14) and (15) implying that the generated power in each scenario should be non-negative and less than their installed capacity. Constraints (16) and (17) set the limit for wind and CSP units, respectively, with a varying capacity factor depending on different scenarios. In fact, the capacity factors in these constraints show the available power capacity which can be extracted from each generation unit and in other words, they limit power generations to some specific levels that are different for different scenarios.

Maximum installed capacity
The total capacity can be installed, which depends on various conditions including size, location, geographic and disposal budget. These boundaries are described as where constraints (18)-(21) limit the maximum capacity to be installed for each RES unit according to the resource availability at each zone.

Served and unserved demands
We assume that each load can be shed by the independent system operator and comprised of two parts; a served part and an unserved part. These two parts are defined by the following constraints: where constraints (22) and (23) determine the jth load demand level in zone z and scenario s and maximum value of the unserved demand.

Power flow
In the power system expansion studies, a DC power flow model is usually used based on the assumption that without loss of generality the transmission losses can be ignored. Thus, to reduce the complexity of the formulation, we used DC power flow described by the following constraints: −P l L y l ≤ p ls L ≤ P l L y l , ∀l ∈ Ω CL , ∀s where (24) and (25) determine DC power flow through transmission lines and constraints (26) and (27) stand for maximum line flow limits, for both existing and prospective lines, respectively. It should be noted that in order to minimise the complexity of the model, transmission line investment binaries (y l ) are just defined for the prospective lines.

RESs penetration level
The aim of this paper is to propose an analytical method for constructing an economic 100% renewable-based power system. However, in order to analyse and show the efficiency of the proposed framework, a constraint condition is defined to force the optimisation process to install RES units in a specified penetration level On the other hand, the decision maker can use (28) to transform the proposed static model to a rolling-window model which is a good approximate for accessing to a more optimal solution. For example, it is possible to set target of the proposed static model to the end point of first investment decision stage (5 years) and fix the value of η to 20%. Afterwards, this window is moved forward to the next period and the value of η is set to 40%. Thus, approximately simulate the dynamic model with five static models (for a 20 years study horizon) that are simpler to model and analyse.

Scenario generation
In most electric power systems, demand, CSP and wind-power generation are not statistically independent parameters in terms of magnitude. Low values of demand usually occur during the night, when wind-power generation is comparatively higher and high values of demand usually occur during the day, when CSP-power generation is also high. Therefore, considering demands, CSP and wind-power as independent phenomena may render suboptimal and inefficient investment decisions. This necessitates to properly represent the statistical correlation between these parameters based on the historical data of the demand in the electrical power system. These historical data are adequately scaled to account for demand growth. Once the demand is given, we can model the CSP and wind-power generations. Historical data corresponding to each day constitutes a single scenario, each one comprising a value for the demand, CSP and wind-power generation in each location of the system. The scenario set is reduced to a tractable data set using a clustering algorithm to group data according to those similarities afore-mentioned correlations. Here, the data are the observations of the demand, CSP and wind-power generation for a given location of an existing electric power system. The K-means method is used for clustering as it showed good performance [26].

K-means clustering method
This method is in fact an unsupervised learning algorithms used as a clustering tool. All various types of K-means method share similar routine which is based on • Selecting a few points as the centre of the clusters.
• Assigning each point to a cluster whose centre has the shortest distance to the point.
In the standard form of K-means algorithm, at first N clu points are randomly selected such that N clu represents the number of demanded clusters. Then more number of points will be assigned to the clusters, based on the shortest distance to the centre of the clusters (similarity), and in this way N clu will be finally determined. The first step (early grouping) is completed when all points are assigned to clusters. New binding would take place between the same data and the nearest new centroid repeatedly till N clu centroids find their desired location by minimising the objective function described as follows: where ∥ x a (e) − c e ∥ 2 is a distance between a sample data point x a (e) and the centre of a cluster c e , J is the distance of the N data data points from their respective cluster centres. Fig. 2 illustrates the flowchart of this clustering algorithm.

NBI multiobjective optimisation method
The NBI method [27] is an algorithm for solving linear and nonlinear multiobjective optimisation problems. Compared to the widely-used algorithms, such as weighting method and epsilon constraint method, NBI has two main advantages: • It is independent of the scales and dimensions of the objective functions [27]. • It produces a Pareto optimal frontier with a regularly distributed set of points [28].
In NBI method, a geometrically intuitive parameterisation is used to obtain a set of Pareto solutions. To achieve this, a pay-off matrix Φ should be formed as where the non-diagonal element f i (x¯j * ) indicates the value of f i (x¯) which calculated using the optimal solution x¯j * obtained by minimising f j (x¯). Thus the diagonal element f i * (x¯i * ) represents the optimal value of f i (x¯) obtained from the solution x¯i * .
After constructing the pay-off matrix, the objective functions are normalised, as they have different physical meanings. The Utopia and the Pseudo-Nadir points obtained from the pay-off matrix are used to normalise the objective functions as follows: where f ξ * is the diagonal element of the pay-off matrix corresponding to the ξth objective function (among N obj objective functions of the problem) which called a dimension of the Utopia point and f ξ max is a point in the feasible region with the worst design of ξth objective value which is called a dimension of the Pseudo-Nadir point and may be calculated through f ξ max = max f ξ x¯1 * , …, f ξ x¯N obj * . In the next step, the multiobjective optimisation problem is solved in a non-dimensional criterion space.
Note: From this point on, the normalised values are denoted with a hat.
The Convex Hull of Individual Minima (CHIM) is defined as a set of convex combination of points of each row of the pay-off matrix. Any point in the normalised description ℜ β 1 , β 2 , …, β r , on the CHIM can be expressed as follows [27]: (32) where β is a positive value and ∑ ϑ = 1 r β ϑ = 1. These terms are shown in Fig. 3 for an arbitrary two-objective problem. In this figure, it is considered nine values for and thus nine points (N β ) will be introduced between two objective individual optimum points.
Therefore, the original multiobjective optimisation problem is divided to r single-objective sub-problems. Each sub-problem aims to maximise the distance between the CHIM and the Pareto frontier (t) subject to main problem constraints. The formulation of the NBI sub-problems is described as follows: where n is a normal vector at the point on the simplex to the CHIM, β ζ is a r × 1 vector, Φ is the normalised pay-off matrix and F x¯= f^1 x¯, …, f^N obj x¯T. Optimisation problem (33) should be solved for different vectors of β ζ to generate the Pareto frontier.
The presented structure of NBI method for solving a multiobjective problem is briefly summarised in the flowchart of Fig. 4.

Fuzzy decision making
Although there are many efficient solutions obtained by NBI method called Pareto optimal solutions, one of these solutions is the most suitable due to the focus on its objective functions. The fuzzy decision making is used to softly select the most preferred compromise solution among the Pareto optimal solutions [29][30][31]. Here a linear membership function (μ ξ ) is defined for each objective function f ξ In (34), the membership function represents the degree of achievement of the ξth objective function. Using (35), the normalised membership value (μ ζ ) is then calculated for the ζth Pareto optimal solution using the relative importance of the objective functions (ω ξ ) which are determined by the decision maker according to his/her priorities of the objective functions Finally, a solution with the highest value of μ ζ will be selected as a best final solution.

Numerical results
The proposed model is validated by two different case studies. The scenario generation and reduction processes are simulated in MATLAB [32] and the proposed model is simulated using the CPLEX solver of GAMS software [33].

Scenarios
The wind speed data are obtained from the System Advisor Model (SAM) 2015.1.30 [34] for different locations in the US, i.e. Alabama (AL)-Birmingham, Arizona (AZ)-Tucson, Massachusetts (MA)-Boston (Fig. 5a) and applied the power curves of two wind turbines (Vestas V80-2000 onshore and Vestas V66-1650 offshore) to generate the hourly wind-power generation of each location for a year.
To obtain normalised wind-power generation during a year, we use feature scaling method to scale its range in [0, 1]. The general formula is given as where x is the considered data vector, x¯ and x are the maximum and minimum values of the data vector, respectively, and x is the rescaled or normalised data vector. Eight thousand seven hundred sixty normalised wind-power generation values are then obtained for each location. These normalised wind-power generation values are used to characterise the wind availability at the buses which wind units are built. We use PJM's hourly demand data profile [35] to delineate the demand behaviour. The efficiency of the CSP units is used to describe their probabilistic behaviour, using historical hourly irradiance data collected from the same three locations in the US power grid [34]. The data represents the efficiency factor of the CSP units, i.e. availability, capacity factors and system efficiency. Eight thousand seven hundred sixty normalised CSP-power generation values for each location is then obtained using (36). As a result, normalised hourly demand, CSP and wind-power generation data for a year are available. We now have 8760 operating points correlating demand, CSP and wind-power generation in each bus of the system. Since each operating point is considered as a scenario. To attain computational tractability, we reduce the 8760 scenarios to a set of 100 scenarios using K-means clustering technique (ref. Section 3.1). The weight of each final scenario is the number of the original operating points (hours) comprising that scenario (H s ). Fig. 5b depicts 8760 samples of demand/CSP/wind scenarios (blue points), which grouped in 10 clusters to represent 10 scenarios (red points) and the summation of weighting factors corresponding to these 10 scenarios is 8760, i.e. ∑ s = 1 10 H s = 8760.

First case study
The first case study is a test power system based on Garver's six bus system [36].
To supply the demand and assume no current existing units would be in operation in the next 20 years, we consider the possibility of installing up to 8 various generating units (see Table 1). The operation costs consist of fuel, emission, and the operating/maintenance costs.
The data related to investment and generation costs of units are obtained from the International Energy Agency reports [37]. The time horizon of our data is 20 years (according to our study). To annualise investment costs, we apply the capital recovery factor (CRF) to the investment cost data in [37] It is noteworthy that the offshore wind units are isolated and far from the network. Submarine high-voltage DC transmission lines connect these units to the network. The transmission system includes nine candidate lines (see Table 2).

Second case study
The second case study is a modified power system based on the IEEE 118-bus system [20]. This system is made out of 3 supposing zones and 56 generating units. The units will supply the demands in a 20-year horizon, whose investment costs, variable costs and locations are given in Table 3. According to the IEEE 118-bus standard, the system can be topologically divided into three zones: • zone 1: buses 1-32 plus buses 113-115, and 117,  The demand factor of zone 1 is considered to be 2.5, 1.75 for zone 2 and 3.25 for zone 3. Most of the biomass units are located in zone 1, most of the onshore wind units in zone 2, and the CSP units are located in zone 3. Thermal units are distributed throughout the network, but they are located around three zones and the centre of the zones are freed from any generation and probably loss is in the highest level (no RES unit in the system).
The transmission system consists of 174 existing lines (line capacity of 700 MW, buses 69-70 and 68-116 with line capacity of 800 MW. There are 16 other candidate lines whose characteristics are given in Table 4.

Results
Five different rates are specified for the rate of return g as 5, 7, 9, 11 and 13%. To analyse the solutions while only CRF changes the constraint (28) is relaxed. The Pareto optimal solutions obtained by NBI method for both test systems are shown in Figs. 6a and b. The final optimal solutions are obtained with weight values of 0.65 and 0.35 for total cost and total loss objectives, respectively. As shown in Fig. 3, we consider nine values so nine points are introduced between two objective individual optimum points. As we expect, Pareto fronts are gone far away from the centre, while CRF increases. At low levels of power losses, the total loss comes down to zero in the 6-bus test system as it has one or more generation units in each bus supplying the local demand. Furthermore, there are not any flows in the transmission lines and as a result no power losses. At minimum cost points, the total loss is increasing and then decreasing with the CRF increasing. The increasing CRF changes the path of the optimal decisions towards choosing a combination of cheaper units and as a result the total loss increasing. However, when CRF increases and the total costs reach a high level, choosing a new combination of units (which expanded all over the system) will be cheaper as witnessed from the decreasing total loss.
Unless we use the maximum number of the thermal units, changes in the types of the generation units and the level of utilisation do not follow a specific rule (see Figs. 7a and b). This is mainly because the first objective function chooses the cheaper generation units (the most available units with a lowest total cost) and the second objective function makes a decision with the most speeded generation units -those generation units which respond faster due to their short distance to the load -to supply local demands, hence the combination of these two objectives push the decisions towards selecting a set of cheap generating units around the system and this event sorely depends on the topology of the system.
There is another note in Figs. 7a and b. Maximum total demands are 760 and 9266.75 MW for two-test system, respectively. There is always a higher level of total power capacity (compared to the real demand) due to the availability level of wind and CSP units (F ws W and F ws W ). We can reduce this extra capacity by increasing CRF. In higher values of CRF, it is more affordable to select more dispatchable units rather than the ones with a lower availability level. The units, however should be built in a high capacity level to supply the same demand. This change depends on the system topology. In the first test case, the CSP capacity gives way to the biomass and thermal generation units, but in the second, only the combined cycle gas units are the suitable candidates.
If the impact of RES penetration level (η) is represented by 11 values from 0 to 100% and CRF value is set as 10%, Figs. 7c and d show the increasing usage of the RES units as their penetration level increases. As seen in the figures, the onshore wind capacity is always fully utilised, then CSP and in the step after that the biomass and offshore wind units will be built. The results are very useful for construction order of the generation units; no matter what time is set for implementation. Another advantage of this approach is that we can always move toward a 100% RES-based system in an economical way without considering the time domain in the formulation.
Figs. 8a and b show the investment costs of thermal, RES units and the transmission system. Investment in the transmission system is small when compared to the investment cost of the generation units. The investment cost of the RES units is mainly higher than thermal. Therefore, whichever size of the system, the intersection of the investment cost of thermal and RES occurs in lower penetration level of RES units. This fact is shown through the comparison of two points of 0 and 100% penetration of RES units.
The CSP units normally need a large investment. This is evident when the reduced usage of CSPs (see Fig. 7d) causes a quick decrease at 70% RES penetration level (see Fig. 8b), and the increase at 20% RES penetration level (see Fig. 8a) is driven by implementing the CSPs (see Fig. 7c).
Figs. 8c and d illustrate the operation costs. Unlike the investment cost, the operation cost of the RES units is quite lower  than thermal, so no matter the system size, the intersection of the operation cost of thermal and RES occurs in higher penetration level of RES units. A comparison between two points of 0 and 10% penetration of RES units represents this fact. Adding the biomass units drives an increase in the operation costs at RES penetration level of 40 and 70% for the first and second test systems, respectively. Connecting the offshore units to the system with smaller operation costs drives a gradual increase in the operation cost of the first case study in RES penetration level of 90% (see Fig. 8c) and decreasing in the operation cost of the first case study with the same RES penetration level (see Fig. 8d). Increment in the transmission investment costs in Figs. 8a and b represents this fact.

Conclusion
In this paper, a novel stochastic multiobjective approach based on MILP mathematical formulation and NBI method has been successfully developed to study the capacity investment problem. The proposed approach overcomes the multicriteria decision making problem showing a pathway for a soft and proper migration from today's thermal-based power system towards a fully renewable one. To better represent the Pareto curve, fuzzy-logicbased multiobjective MILP algorithm is proposed, able to simultaneously minimise the investment/operation costs and transmission losses. Employing the NBI method provides a simple relation between different objectives in finding appropriate weighting factors with different order of magnitude for two objectives with different dimensions and/or sizes.
An effective method is used for creating realistic investment scenarios. The method can be used for scenarios in which the weather conditions and the correlation between different parameters, wind speed, sun irradiance and demand level, do not have a serious effect in the study horizon. Two case studies demonstrate the final topology of the system and generation unit investments will be affected by transmission losses. Since the proposed approach does not impose any limitation on the number of objective functions, introducing more objectives, such as components reliability, is straightforward.