Efficient algorithm for locating and sizing series compensation devices in large power transmission grids: I. Model implementation

We explore optimization methods for planning the placement, sizing and operations of flexible alternating current transmission system (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to series compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l1 norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this non-convex optimization to a succession of linear programs (LP) that are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polish Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop—a speed that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically sized networks that suffer congestion from a range of causes, including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small test system subject to congestion from uniform load growth, and demonstrate computational efficiency on a realistically sized network.

We explore optimization methods for planning the placement, sizing and operations of flexible alternating current transmission system (FACTS) devices installed to relieve transmission grid congestion. We limit our selection of FACTS devices to series compensation (SC) devices that can be represented by modification of the inductance of transmission lines. Our master optimization problem minimizes the l 1 norm of the inductance modification subject to the usual line thermal-limit constraints. We develop heuristics that reduce this nonconvex optimization to a succession of linear programs (LP) that are accelerated further using cutting plane methods. The algorithm solves an instance of the MatPower Polish Grid model (3299 lines and 2746 nodes) in 40 seconds per iteration on a standard laptop-a speed that allows the sizing and placement of a family of SC devices to correct a large set of anticipated congestions. We observe that our algorithm finds feasible solutions that are always sparse, i.e., SC devices are placed on only a few lines. In a companion manuscript, we demonstrate our approach on realistically sized networks that suffer congestion from a range of causes, including generator retirement. In this manuscript, we focus on the development of our approach, investigate its structure on a small 1. Introduction Power grids are undergoing significant evolution that often results in increased stress from sources including fluctuating renewable generation, generation retirement, and the continual growth and evolution of electrical loads. However, the basic operational functions of the grid remain the same, i.e., to balance the generation and load in interconnected transmission grids while respecting the transmission line constraints (among several other constraints and stability limits). The potential congestion created by these stress mechanisms combined with the increasing difficulty of building new generation and transmission close to loads is contributing to a growing interest (see [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]) in utilizing a class of transmission-grid devices known commonly as flexible alternating current transmission system (FACTS) devices (see [18][19][20][21][22][23]). Although FACTS devices provide new degrees of flexibility to relieve congestion without the emissions or right-of-way issues of new generation and transmission, they are still expensive (especially fast solid-state devices) and must be deployed in a manner that maximizes benefit to the overall system.
To fully extract the benefit of FACTS devices, one ought to build in the operations and control of FACTS devices into algorithms for optimal placement and sizing that consider the entire network, not just a small section of the network. There are two significant challenges in this approach. The first is the combination of planning and operations. This type of integration is not entirely new (see, e.g., [4,5,9]), but a system-level approach to the problem is deemed difficult. The second is the difficulty of a system-wide approach that suffers, in general, from the 'curse of dimensionality'-the effort to exactly compute results scales exponentially with the system size and the number of possible FACTS locations. Our approach to placement and sizing of FACTS devices attacks both these challenges.
The problem of placement and sizing is normally considered separately from the problem of operations and control problems, and vice versa. Planning problems seek to place FACTS devices and determine the required operational ranges. The planning problem is a strategic decision that takes into account multiple system conditions extending over a period of years. On the contrary, the control problem seeks the best response to current conditions or those expected within a relatively short period of time. However different the two types of problems may seem, they may be combined in a unified principled manner, similar to an approach developed in [24] for placement of storage. In [24], planning and operations were merged in an iterated two-step process. First, combined grid-and-storage operations were simulated over a range of generation and load conditions (with appropriate costs on storage charging/discharging), and statistics were kept on storage utilization on a node-by-node basis. In the second step, underutilized storage nodes were pruned, leaving only highly utilized nodes. Iteration of these steps algorithmically forces a sparse storage deployment-a generally desirable result for a large-scale infrastructure. However, the computational burden of this approach is rather high. Explicitly incorporating sparsity into the optimization formulation itself, e.g. by using a fixed cost for placing even a small amount of storage, turns the problem into an even more difficult mixed integer formulation with both discrete and continuous variables. However, recent results in compressed sensing [25] 5 suggest that using the ℓ 1 -norm instead of a fixed cost may allow sparsity to be enforced implicitly while keeping the optimization variables in the preferred and computationally efficient continuous domain. We take this approach in this manuscript.
A second challenge to the optimal placement and sizing of FACTS devices is the non-local nature of power flows over transmission networks and the need to consider system-wide impacts of FACTS placement. While it is impractical to implement direct approaches that scale exponentially with the system size for large transmission systems consisting of thousands of components, some new techniques adopted from statistical physics [27][28][29] and optimization [30][31][32][33] suggest that even seemingly difficult optimization problems can be modified into formulations that are probably polynomial (or even linear). Such simplifications are not achievable in all cases, and one often must search for computationally efficient yet empirically accurate heuristics. In our approach, we adopt two methods-linearization of constraints combined with sequential linear programming (SLP) [34] and cutting plane methods [35,36].
By combining the methods discussed above, we seek to formulate and solve a comprehensive FACTS placement and sizing problem that considers the transmission system as a whole while combining aspects of planning and operations. In a companion manuscript, we apply our method to realistically sized networks with sources of network stress such as generator retirement. In this manuscript, we focus on the development of the approach, and we use uniform load growth as an exemplary case. In uniform load growth, all of the loads are scaled by the same multiplicative factor. At some level of load scaling, grid congestion appears, and we demonstrate the effectiveness of our approach by developing system-wide solutions to FACTS placement and sizing to relieve that congestion.
Our model considers the effect of a FACTS device [20] as a continuous modification of the inductance of a transmission line-a model that is directly applicable to thyristor-controlled series compensation (TCSC) devices that use a continuously controllable reactor in parallel with a bank of switchable capacitors [9]. Phase-shifting transformers (PST) with appropriate local controls may also be considered within this model. In the development of the approach presented in this manuscript, we seek to resolve the following questions: • Can a particular infeasible configuration(s) of generation and load, i.e., a configuration that violates one or more transmission network constraints, be corrected with series compensation (SC) devices? • When such a correction is possible, what is the optimal (least 'expensive') set of SC devices that achieves this goal?
These two questions are formulated as one non-convex optimization problem with the l 1 norm of the inductance deviation vector as an objective and the thermal limits for all the transmission lines as constraints. We construct efficient heuristic algorithms that resolve these questions in a computationally efficient manner, and moreover, we empirically observe that the optimal solutions are always sparse, i.e., only a few SC devices are actually needed.
We note that we only consider the implications of quasi-static control. For the thyristorcontrolled SC devices considered here, quasi-static implies a fixed thyristor firing angle and a static modification of the transmission line impedance [2]. For certain transmission lines, this modification could have a significant impact on grid dynamical stability, which needs to be studied on a case-by-case basis. More sophisticated local control algorithms could also be implemented to affect grid dynamics in more profound ways. However, we do not take this step in the current manuscript. Rather, we focus on the implications for congestion.
Layout of material in the rest of the paper is as follows. Section 2 formulates the problem of the optimal placement and sizing of SC devices. Section 3 gives a critical discussion of the approximations and simplifications used in the formulation. Section 4 describes the algorithms used to solve the optimal placement and sizing problem. Section 5 provides a discussion of the properties of the optimization problem and results of its application to a small network. Section 6 demonstrates the computational efficiency of the approach on a realistically sized network. Additional applications to this network are given in a companion manuscript [37]. Section 7 provides conclusions and a discussion of potential future research directions.

Optimization model
Before considering the optimal placement and sizing of FACTS devices, we first discuss a few preliminaries.  into controllable nodes (traditional generators participating in control or perhaps controllable loads) and uncontrollable nodes (traditional loads and renewable generators). In our notation, the number of vertexes and edges is N | | =  and M | | =  , respectively. The transmission grid power flows are computed using the DC approximation, i.e., (a) the resistance of transmission lines can be ignored in comparison with the line inductance, (b) the normalized voltage magnitude is unity at all nodes, and (c) phase difference between neighboring nodes is small, although we will use an iterative approximation to relax this approximation. The relationship between phases θ a and powers p a consumed/injected at any node a becomes linear: In the rest of this manuscript, we find it convenient to express L as L ( ) The pseudo-inverse is well-defined because, by construction, N 1 − eigenvalues of L are positive with the final zero eigenvalue corresponding a uniform shift of θ. This ambiguity in the phase vector θ is resolved by fixing the phase at the reference node (bus). This is often the largest generator in the system or can be just one fixed generator.
Our objective in this manuscript is to optimally use SC devices to relieve transmission congestion caused by additional stress applied to a base grid configuration. The reason for that can be, for example, load growth. To ensure that our results are not biased by a poor choice of the base configuration, we select this configuration by performing an optimal power flow (OPF) calculation. Starting from a known or forecasted set of uncontrolled power injections p p a ( | ) u a u = ∈  , we solve the following DC-OPF problem: where the cost of generation c p ( ) a a is a non-decreasing function of p a . In the expression inside the absolute value on the left hand side of the first set of constraints, p is understood to be the N 1 * column vector of the p a , and this is given by equation (1), i.e., the DC approximation of the power flow on transmission line (a, b). The slightly more complicated form in equation (3) will prove useful in the discussion of optimization over the SC devices. When considering stress applied by uniform load growth, a DC-OPF could be run to re-optimize the controllable generation at each level of load growth. Instead, we uniformly grow all generation by the same scale factor α that we use to grow loads. α is increased until it reaches α c , where one or more lines are first overloaded. We then try to resolve the overloads by searching for optimal susceptances with our algorithm.

Problem formulation
We seek to improve the grid robustness to stress, i.e., increasing the value of α c , by optimally placing SC devices. We have chosen to stress the base grid configuration p * by uniform load growth to create an infeasible p in ( ) . However, the problem formulation that follows is independent on how the stress is applied.
In this initial work, we only consider SC devices whose effect can be approximated as modifying the susceptance In spite of some useful features of the dependence of the graph-Laplacian on β (see e.g., [38]), the nonlinear conditions in equation (4) are generally non-convex in β. The non-convexity is explored in detail on a three-node example in section 5. We resolve this complication in section 4 by designing a greedy but efficient heuristic algorithm that enables the solution of (4) over large practical instances.
Equation (4) is not guaranteed to have a feasible solution, i.e., SC corrections of line inductance may not be sufficient to correct all constraint violations when the system is severely stressed. It is interesting to discover the amount of stress needed to reach this second critical boundary. For the case of uniform scaling, we seek to find the c results in the infeasibility of equation 4. This analysis will be discussed in examples in section 5 and Part II [37].
Finally, we note that it does not make sense to apply (4) to radial distribution grids because, when  is a tree, fixing p leads to an unambiguous, β-independent set of power flows. However, even in the simplest case of a single loop, the optimization problem (4) becomes interesting and non-trivial (see also sections 5.1, 5.2).

Critical discussion
Before discussing implementation details in section 4, we first comment on the assumptions and limitations of the SC placement problem stated in equation (4), and also on the advantages of our approach and its natural generalizations and uses. The following basic assumptions/caveats were used in the formulation of equation (4) (1) The power flow equations are linearized, i.e., we use the DC approximation.
(2) The ℓ 1 -norm cost function in equation (4) is over-simplified as it does not include fixed costs for installing SC that do not scale with (3) Sparsity of SC placement is a desirable property of any solution of equation (4), but sparsity is not explicitly encouraged by the formulation. (4) Generation constraints g a and g a are ignored.
(5) Only a single configuration p in ( ) is considered as guidance for SC installation.
(6) The SC-corrected network is not guaranteed to be N 1 − reliable (or reliable with respect to a more general list of contingencies). (7) Optimization over the SC transformers can be generalized also to include optimization over the PST 6 .
We comment on these items in order.
(1) The DC approximation is often used in power systems where congestion is a result of thermal limits (e.g. the U.S. Eastern Grid or the Central European Grid) rather than a result of dynamic stability (e.g. the U.S. Western Grid or the Russian Grid). The DC approximation allows many computations to be carried over analytically. For example, in section 4, we will linearize the power-flow constraints in equation (4), and the DC approximation allows these linearizations to be expressed via an explicit formula. (2) The choice of the l 1 norm is illustrative. Many other cost functions may be incorporated without creating significant complications, including those with graph-inhomogeneity that encourage building new SC at specific locations. (3) Although we do not explicitly encourage sparsity, our experiments with large networks (see section 5) naturally result in sparse solutions. We conjecture that the l 1 norm may promote sparsity in ways similar those in compressed sensing [25]. (4) Generalization of equation (4) accounting for generation limits is straightforward. (5) We have developed efficient algorithms for solving equation (4) so that we may simultaneously consider many different stressed configurations by simply replicating equation (4). Specifically, we create n replicas of the constraints in equation (4) for n different stressed configurations, p p , The optimization problem (4), and its generalizations briefly discussed above, can be used both for the planning of new SC installations into an existing grid and for controlling existing SC devices. In the case of planning, the list of contingencies p p , could include a number of different configurations of a future-stressed grid (including N 1 − contingencies) for both the system peak and minimum. The solution of (4), if it exists, would then provide the lowest-cost placement and sizing of SC that would resolve all of the network constraint violations for the stressed configurations considered. In the case of operations and control, contingencies may be considered one by one, e.g. a single N 1 − contingency or one of the worst-case instantons discussed in [27,28]. Equation (4) then enables a control that quickly computes the SC set points to maintain a feasible solution.

Optimization algorithm
Equation (4) presents two challenges. First, the inequality constraints are nonlinear. Second, there are too many constraints-thousands for a large transmission grid like that discussed in section 5. In sections 4.1 and 4.2, we will describe ways to overcome these challenges independently, and in section 4.3 we present a synthesis strategy that combines the two approaches into one iterative procedure.

Linearization of constraints
Our use of the DC approximation allows for explicit expression of the constraints in equation (4). We take advantage of this by linearizing around the current value of * β ,  (4) can now be approximated by the following SLP [34] Direct Algorithm Step 1: Linearize conditions in equation (4)  Note that in the actual experiments discussed below, the algorithm converges after a small number of five to maximum seven iterations.

Cutting plane
The large number of constraints in (4) and its SLP reformulation in equation (6) increase the complexity significantly. However, in practical cases, very few of these constraints are actually violated, suggesting that the complexity of the brute-force implementation can be drastically reduced with standard cutting-plane algorithms (see e.g., [35,36]). A modification of equation (6) consists of cutting the constraints in two groups, 'included' and 'excluded', and updating the two groups until convergence is reached. Initially, the included group consists of the constraints that are violated for k ( ) β β = , while all other constraints are placed in the excluded group. Equation (6) is solved using only the included constraints. The excluded constraints are then checked for violations in the new solution. If no excluded constraints are newly violated, we conclude that an optimal solution of the full problem has been found. Otherwise, we pick all the currently violated constraints and move them from the excluded group to the included group and repeat. This simple and straightforward procedure works very effectively in all the practical cases we tested, normally stopping in less than five steps. Figure 1 illustrates our improved algorithm. The cutting plane is nominally used in an 'inner loop' to solve an LP defined by the current constraint linearization given by equation 6. Instead of waiting for convergence of the inner cutting plane loop, our numerical experiments suggest that computational performance can be boosted significantly by alternating between an outerloop LP solution and inner-loop cutting plane method without waiting for the convergence of the inner-loop cutting plane step:

Synthesis
Improved Algorithm . Split the list of M 2 inequality constraints in equation (4) where σ ab is 1 + or −1 depending on the sign of the respective directed (one-sided) inequality. It is straightforward to verify that the fixed point of this improved algorithm procedure will also be a fixed point of the direct algorithm where the inner loop is iterated until the convergence of the cutting-plane method before making the next outer loop step (both procedures may have more than one fixed point if at least one exists). Given the global nonconvexity of equation (4), one obviously cannot guarantee that the fixed point of the improved procedure will always coincide with a fixed point of the direct procedure. However, the improved procedure is as good as the direct one for finding a local minimum (it will find some if it exists)-which is exactly the problem we are aiming to solve.

Non-convexity of the feasible space
To gain some intuition about the structure of the optimization problem equation (4), we explore the non-convexity on the simple three-node example shown in figure 2(a). The three power flows, which are a function of the three susceptances , , 12 13 β β and β 23 , are solved explicitly. The domain of feasible susceptances, as determined by the six constraints [see equation (4)], is shown in figure 2(b). The non-convexity of the domain can be clearly seen near the bottom of figure 2(b). Figure 2(c) illustrates our linearized algorithm in equation (5). Here, the exact optimization domain (equivalent to the one shown in figure 2(b), but rotated for a better view) is shown in red, and the domain linearized around the base case β 0 is shown in blue. Two red points mark the initial state (outside of the domains) and the final optimal state (inside of the domains) found in one iteration.

Sensitivity of line flow to local change of susceptance
It is instructive to analyze sensitivities of the line flows to changes of the line susceptances on this simple three-node example. Power flow from node 1 to node 2 is ( )   (4)) for the three-node example. The blue area corresponds to its linearized version as in equation (6). Red dots mark the initial β 0 configuration (seen outside of the domains) and the final solution (inside of the domains) achieved in one iteration of the direct algorithm of sections 3.1, 4.1.
For the 30-node network, figure 4 clearly shows a jump in the cost of inductance correction at 1.51 α = signaling the emergence of multiple (at least two, but possibly more) local minima. Investigation of the details of the solutions above and below the jump shows that the corrections to susceptances of lines #33 and #40 are significantly different. Above the jump, the net susceptance of line #33 becomes approximately zero, indicating that this line has effectively been removed from the network.
The results in figure 4 lead us to suspect that, by always initiating with the base solution, our algorithm has become trapped in a local minima for 1.51 α < . To verify this suspicion, we initiated the algorithm with a configuration equivalent to the base configuration for all lines but with line #33 removed. Dependence of optimal cost on α for this case is illustrated in figure 5. Assuming that turning the line off costs zero, we significantly decreased the cost of corrections. But the jump of the cost still exists, meaning that we still have another minima of the cost function.  To prove the fact that there is more than one minimum (more than one final solution for one initial configuration of the susceptances), we compare the initial solution (figure 4) with the solution with line #33 turned off (now we will calculate the cost of such operations in terms of the cost function defined). Costs of two different solutions for the same initial grid (30-bus grid) are illustrated in the figure 6. Three regions can be seen. In the 3rd region, the solutions are the same. In the 1st region, there is extra cost associated with the forced reduction of the susceptance of line #33. This effective removal of line #33 does not result in lower cost. But in the 2nd region it can be seen that this removal does decrease the cost compared to a solution where line #33 was not removed. There are two different solutions with different costs implying that at least two minima exist.

Demonstration/computational efficiency on large networks
In addition to the already discussed 30-bus model we also tested the algorithm efficiency on the Polish grid where we consider two versions: a 2737-bus summer grid and a 2746-bus winter  model (also available in MATPOWER). Tests of the improved algorithm of section 4.3 revealed that the number of iterations required for convergence is unexpectedly small not only for the relatively small 30-node grid, but also in the case of the realistically sized Polish gridapproximately six iterations in figure 7 and less than a dozen for all the cases we experimented with. For the case of the Polish grid, each iteration of our algorithm took 30 s on a standard quad-core processor.
The computational efficiency of our approach enables multiple analyses and applications. We discuss some possibilities in detail in the companion manuscript [37]. Briefly, we continue to explore the example of the Polish grid using our approach to resolve (a) stressing via uniform load growth, (b) robust optimal placement-correcting multiple configurations, and (c) gird stressing via generator retirement.

Conclusions and future works
In this manuscript, we developed an approach to the placement and sizing of SC devices to relieve congestions in transmission networks. Through the structure of the cost function (i.e., the use of the ℓ 1 norm), our approach implicitly combines planning and operations. In addition, we utilize physical approximations and optimization methods to enable our approach to be computationally efficient, even on realistically sized networks such as the Polish grid. The cost minimization is done under the condition that thermal overloads are relieved for a specially selected bad configuration(s) of load and generation.
Our main technical conclusions and results achieved are as follows: • The key element of our heuristics is linearization of the constraints around the current state, which allows the placement and sizing optimization to be solved as an SLP. • The l 1 norm induces sparsity. We find that the resulting optimal SC deployments are sparse, i.e., necessary at only a handful of lines. However, the lines to be corrected are not necessarily lines with the most severe overload (without the correction applied). • The sparsity enables efficient solution of the optimization problem via cutting plane methods. • The algorithm performance is illustrated on moderate and large scale models. See part II [37] for an extended discussion.
There are number of directions for future work in this area including: • Generalizing the approach beyond the DC approximation to a partially nonlinear or fully nonlinear AC case. This extension would require more complex derivations and more involved algorithms; however, it is possible in principle. • Extending the approach to include other devices such as PSTs. This extension will require introducing a continuous phase-shifting parameter for any line and then using cutting plane methods to make the algorithm scalable. • Adding to the operational scheme multiple dangerous configurations (such as from analysis of fluctuation wind generation) and possibly including an outer step to search for these configurations. • Considering SC control in combination with other devices and controls, e.g. energy storage and generation re-dispatch. • Extension of the planning paradigm to a stochastic setting that takes into account the statistics of multiple stressed configurations and considers the investment problem of placing and sizing SC devices to minimize risks. • Generalize the model presented here to account for more general (and accurate) AC modeling of power flows, thus modeling risks of loss of synchrony and voltage collapse (in addition to the risk of the thermal overload discussed in this manuscript). • Extension of the algorithms to include the effects of device placement and control on grid dynamical stability.