Enhancing Time Series Aggregation For Power System Optimization Models: Incorporating Network and Ramping Constraints

Power system optimization models are large mathematical models used by researchers and policymakers that pose tractability issues when representing real-world systems. Several aggregation techniques have been proposed to address these computational challenges, and aggregation remains a relevant topic in power systems research. In this paper, we extend a recently developed Basis-Oriented time series aggregation approach for power system optimization models, which aggregates time steps within their Simplex basis. This approach has proven to be an exact aggregation for simple economic dispatch problems. We extend the methodology to include network and ramping constraints; for the latter (and to handle temporal linking), we develop a heuristic algorithm that finds an exact partition of the input data, which is then aggregated. Our numerical results for a simple 3-bus system indicate that, with network constraints only, we can achieve a computational reduction by a factor of 1747 (measured in the number of variables of the optimization model), and by a factor of 12 with ramping constraints. Moreover, our findings indicate that with temporal linking constraints, aggregations of variable length must be employed to obtain an exact result (the same objective function value in the aggregated model) while maintaining computational tractability. This implies that the duration of the aggregations does not necessarily correspond to commonly used lengths like days or weeks. Finally, our results support previous research concerning the importance of extreme periods on model results.

Power System Optimization Models (PSOMs) are widely used for planning and policy-making toward sustainable and clean energy systems. However, due to real power systems' spatio-temporal size and technical complexity, PSOMs can result in computationally intractable problems. Computational intractability in PSOMs stems from the multiple dimensions they try to represent, e.g., the technical details, the uncertainty concerning the system's demand and generators' availability, the spatial representation, or the granularity and length of the time horizon. To overcome this, modelers apply aggregation techniques to approximate full PSOMs with PSOMs of reduced size to derive practical results within reasonable CPU times. One subset of such techniques is time series aggregation (TSA), which aims to replace a full hourly or even sub-hourly PSOM with a smaller model using a simplified time dimension, allowing for faster model runs and maintaining, at least to some degree, the accuracy of the results. There are many methods to achieve this, as reviewed in [1] and [2], and each approach comes with its advantages and shortcomings.
In the literature [1], TSA techniques are subdivided into a-priori and a-posteriori methods. A-priori methods rely only on the input space, e.g., demand time series or capacity factors of renewables, because most models cannot be run without first performing an aggregation procedure. As a result, a-priori techniques have little regard for the performance or structure of the optimization model; even so, they remain the state-of-the-art TSA techniques for PSOMs [3] [4] [5].
Two of the most common a-priori methods are downsampling and representative periods. Downsampling consists of increasing the coarseness of the time steps used in PSOMs to reduce their size; for example, in SDDP [6], instead of using the available hourly data, weekly averages of demand and inflows are deployed. However, when using coarse time steps, researchers lose sight of short-term dynamics; for example, with weekly time steps, the daily and hourly patterns of a power system are lost. A case study evaluating this situation is found in [7]: a model with weekly or daily time steps is not able to represent the short-term dynamics of the system.
arXiv:2310.19369v1 [math.OC] 30 Oct 2023
On the other hand, representative periods aim to represent the complete time horizon (e.g., 8760 individual hours of one year) by a weighted set of shorter periods while keeping the coarseness of the time steps (e.g., 7 representative days with 24 individual hours, i.e., 7 × 24 different time steps in total). For example, in [8], the authors run 168-hour-long models (one week) and use four of these to approximate a complete year; in this case, four representative periods are used to aggregate the 52 weeks of a year. The main advantage of using representative periods over downsampling is that researchers do not overlook short-term temporal dynamics of the power system. However, the difficulty lies in determining, first, whether the chosen periods are representative enough of the original data, and second, when employed in a PSOM, how they translate into accurate results [2], i.e., how well they approximate full model results.
However, despite their past usefulness, new trends such as the increasing share of variable renewable energy sources (VRES) in power systems pose a challenge to temporal aggregation techniques. These techniques rely on combining multiple time steps, e.g., daily or weekly averages from hourly data [9] [10] [11], or on breaking the inter-temporal linking of time [12] [13], whereas VRES' technical constraints require highly detailed temporal modeling [14]. The consequences of these aggregation procedures have been researched [15] [16] [17], and results show that they lead to inaccurate results because they go against these fundamental technical constraints and even those of short-term storage technologies.
Until now, the inaccuracy of a-priori techniques has not been a matter of concern. However, the increasing share of VRES needed to achieve net-zero power systems has highlighted the limitations of the current temporal aggregation techniques used in PSOMs for planning and policy-making. The limitations of a-priori methods are especially highlighted and quantified in [18], where the author illustrates the substantial impact of applying a standard a-priori TSA method (representative periods with k-Means clustering) to approximate a full PSOM. The case study of a simple Economic Dispatch problem shows a 91% error when the objective function value of the k-Means aggregated model is compared with the full model, while other (a-posteriori) alternatives achieve a 0% error; this accentuates the need for new and adequate aggregation techniques.
In contrast to a-priori methods, a-posteriori methods do not exclusively rely on PSOM input data but also include information from the PSOM's structure, e.g., the results from partial model runs or relaxations, in the TSA process. A-posteriori methods can be classified as either non-iterative or iterative. Non-iterative methods aim to obtain a fast solution, although with a trade-off between robustness and optimality. They rely on a relaxed version of the entire model, obtained, for example, by reducing or fixing the number of binary variables; they then cluster the input data and also identify extreme periods (e.g., those with a high cost in the relaxed model). Non-iterative methods therefore do not guarantee achieving the optimum but quickly provide good enough solutions. Iterative methods aim to obtain an aggregated model as close as possible to the complete one with less regard for the computational burden; they work by decomposing the problem, for example, by splitting design and operation or by treating each type of constraint independently and then merging the results. In this way, a-posteriori TSA methods have the potential for significant efficiency gains in terms of solution times while simultaneously maintaining the quality of model results close to the full PSOM (without TSA) [18], [2], [19]. This potential has led to increased research interest in a-posteriori aggregation techniques for PSOMs which take into account the structure of the optimization model [20] [19] [21].
In advancing a-posteriori methods, [18] proposes a new approach to temporal aggregation called Basis-Oriented TSA. The Basis-Oriented approach takes a full-hourly solution and splits the input data based on the Simplex basis each time step belongs to; then, for each basis, it computes a centroid of the input data and reruns the model using the centroids. This procedure gives an exact approximation to the complete model while retaining computational tractability. Therefore, this approach allows for an aggregated model akin to representative periods, which preserves the objective function value and the solution of the complete model, so it has zero error. Another advantage of the Basis-Oriented approach is that, as we demonstrate later in this paper for the case presented in [18], it finds an aggregation of minimal cardinality, which means no smaller set of representative periods exists that exactly approximates the model's objective function value. This result is supported by the result in [22], which states that every LP model has a row-aggregation which is equivalent to the complete problem.
In this paper, we extend the research on the Basis-Oriented approach and apply it to significantly more complex instances of PSOMs, including additional constraints with both spatial and inter-temporal dependence, thereby bringing the approach closer to real-world applications. In particular, we extend the methodology to network constraints and show that we can directly apply the Basis-Oriented approach to find a zero-error aggregation. Moreover, we extend the approach to constraints with temporal linking, i.e., ramping of thermal generators. However, inter-temporal constraints are more challenging than network constraints, as obtaining a basis is not as straightforward. This complexity arises because the basis duration¹ or length (always one hour in [18]) varies depending on the system's state (in particular, on inter-temporal dependencies) and cannot be determined a-priori.
To address this challenge, we develop an algorithm that explores the dual space hour by hour to identify these bases, which allows us to formulate an aggregated PSOM that is still exact (zero error). In summary, the original contributions of this paper are as follows:
• We show empirically that the Basis-Oriented aggregation is of minimal cardinality and unique for the case analyzed in [18].
• We extend the Basis-Oriented TSA methodology to include network constraints.
• We further extend the approach to models with temporal linking (in particular, ramping constraints) and develop an algorithm to find the time-dependent bases to obtain an exact aggregation under such situations.
The remainder of this paper is organized as follows: in Section II, we briefly introduce the original Basis-Oriented TSA approach and show how the aggregation is minimal and unique; then, we extend it to include network and inter-temporal ramping constraints in Section III; in Section IV, we present a case study to validate the TSA methodology; and finally, Section V presents the conclusions and future research.

II. BASIS-ORIENTED TIME SERIES AGGREGATION
In this section, we first introduce the reader to Basis-Oriented TSA in section II-A by summarizing the main aspects and initial results from [18], as it provides the groundwork for our extension to more realistic models. Then, as a novel contribution, in section II-B we show empirically that the set of representative periods determined by Basis-Oriented TSA not only obtains an exact aggregation but also corresponds to the set of minimal cardinality that achieves zero error, and that this set is unique. Furthermore, we show that all sets that achieve an exact aggregation with a higher number of representative periods are actually a partition of the original Basis-Oriented result.

A. Previous Results: Economic Dispatch
Consider a stylized PSOM for the Economic Dispatch (ED) problem where the objective is to minimize the total system operation cost while satisfying demand and scheduling the generation units within their operational bounds. The set of constraints in (1) corresponds to a full model run of the ED problem for every hour k (k = 1, 2, ..., 8760). Full model results are optimal production decisions $p_{g,k}$ for each hour k and each generator g. In contrast, the set of constraints in (2) represents an aggregated PSOM. Instead of k, this formulation uses representative periods r, a subset of k. To avoid the computational complexity or potential intractability of running the full model, the modeler must choose r such that |r| ≪ |k| while obtaining results similar to those of the full model. In order to determine r, TSA methods are applied. Common results of a TSA method for a specific number of r are: representative values for the demand $D_r$; a factor $W_r$, which corresponds to the weight (or number of occurrences) of each r so that the aggregated model represents the ED problem for a full year; and other data $P_{g,r}$.
In [18], the author shows that the full hourly ED model, which uses the whole time series of 8760 individual hours and a simple generation mix (one thermal plant, one wind plant, and the option of non-supplied energy), can be approximated exactly using only three representative hours. Instead of finding these three representative hours by the proximity of the data points (such as by k-means), they are determined by the Basis-Oriented approach, i.e., by identifying the Simplex basis to which each hour belongs². However, [18] does not address the following questions: Are there other aggregations with three clusters that also yield zero error, and if so, what characterizes them? In the following section, we attempt to answer these questions and further characterize the results of Basis-Oriented TSA.
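The idea summarized above can be illustrated with a minimal sketch. It assumes a single-node system dispatched in merit order (wind, then thermal, then non-supplied energy); all capacities, costs, and hourly values below are hypothetical and are not taken from [18]. Each hour is labeled by its marginal unit as a stand-in for its Simplex basis, the hours of each label are averaged into a centroid, and the weighted aggregated cost is compared with the full hourly cost.

```python
from collections import defaultdict

# Hypothetical parameters (assumptions for illustration, not the data of [18])
VC_WIND, VC_THERMAL, VC_NSE = 3.0, 24.0, 5000.0   # variable costs per MWh
P_WIND_MAX, P_THERMAL_MAX = 100.0, 80.0           # installed capacities in MW


def dispatch(demand, cf):
    """Merit-order dispatch of one hour; returns (cost, basis label)."""
    wind_avail = cf * P_WIND_MAX
    if demand <= wind_avail:                      # wind is the marginal unit
        return VC_WIND * demand, "wind"
    residual = demand - wind_avail
    if residual <= P_THERMAL_MAX:                 # thermal is the marginal unit
        return VC_WIND * wind_avail + VC_THERMAL * residual, "thermal"
    nse = residual - P_THERMAL_MAX                # non-supplied energy is marginal
    cost = VC_WIND * wind_avail + VC_THERMAL * P_THERMAL_MAX + VC_NSE * nse
    return cost, "nse"


def basis_oriented_aggregation(hours):
    """Group hours by basis label; return {label: (mean demand, mean cf, weight)}."""
    groups = defaultdict(list)
    for demand, cf in hours:
        groups[dispatch(demand, cf)[1]].append((demand, cf))
    reps = {}
    for label, members in groups.items():
        n = len(members)
        reps[label] = (sum(d for d, _ in members) / n,
                       sum(c for _, c in members) / n,
                       n)
    return reps


# Hypothetical (demand in MW, capacity factor) pairs
hours = [(40, 0.9), (60, 0.8), (90, 0.5), (120, 0.6),
         (150, 0.2), (170, 0.1), (200, 0.3), (30, 0.5)]

full_cost = sum(dispatch(d, c)[0] for d, c in hours)
reps = basis_oriented_aggregation(hours)
agg_cost = sum(w * dispatch(d, c)[0] for d, c, w in reps.values())

# Naive alternative: a single representative hour built from the overall mean
mean_d = sum(d for d, _ in hours) / len(hours)
mean_cf = sum(c for _, c in hours) / len(hours)
naive_cost = len(hours) * dispatch(mean_d, mean_cf)[0]
```

In this sketch the aggregation is exact because the hourly cost is an affine function of demand and capacity factor within each basis, so evaluating it at the centroid and multiplying by the weight reproduces the sum over the member hours. The single mean hour, by contrast, hides the hours with non-supplied energy and underestimates the cost.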

B. Minimal Number of Bases and Uniqueness
As previously mentioned, Basis-Oriented TSA achieves an exact approximation of a full hourly ED using only three representative hours. Although [18] ensures zero error, it gives no insight concerning the existence and number of exact aggregations when using a different number of clusters. This section therefore aims to assess whether this is the only exact aggregation with zero error using three clusters and how other exact aggregations, with a different number of clusters³, relate to the one determined by Basis-Oriented TSA. We do so by exhaustive numerical enumeration.
For this experiment and for the sake of simplicity, we limit the total time horizon of the ED problem to 12 hours. Hence, the hourly ED problem in (1) has |k| = 12 time steps. This yields the exact solution to the ED problem and a reference against which to compare the quality of the approximation under the aggregated model in (2). For the aggregated model, there exists a plethora of combinations to group the original data into 1 ≤ |r| ≤ 12 clusters. For example, there is only one possible way to group the data into one cluster. However, according to the Stirling numbers of the second kind [23], there are 2047 ways of organizing the 12 data points into two clusters, and so on until 12 clusters. Table I presents the relation between the number of clusters and the number of possible combinations; in total there are 4 213 597 different ways of clustering the 12 hourly data points. We run the aggregated model for every possible combination and study the quality of the solution in terms of the error in the objective function. Table I shows the results of those runs. Therein, the column Clusterings with No Error corresponds to the number of combinations that exactly approximate the full hourly model. For example, of the 2047 possible ways to separate 12 hours into two clusters, none yields a 0% error when used in an aggregated model. Of the 86 526 ways to organize the original data into three clusters, only one yields an exact approximation of the original hourly model, i.e., the one obtained through Basis-Oriented TSA. This shows that Basis-Oriented TSA yields the best possible aggregation for this data and the economic dispatch problem (as there is no exact aggregation with lower cardinality), and that it is unique for this number of clusters.
Now, let us analyze all the other aggregations with a cardinality higher than three that also lead to an exact aggregation. Fig. 1a presents the Basis-Oriented aggregation, and Fig. 1b shows a four-cluster aggregation which also yields zero error; however, clusters one and four in Fig. 1b are just a division of cluster two from Fig. 1a. This situation occurs for all other aggregations with zero error and cardinality higher than three: all of these aggregations correspond to subdivisions of the Basis-Oriented three-cluster separation. This result is quite remarkable because it shows that not only does the Basis-Oriented approach yield the best possible approximation but also that other exact aggregations derive from the Basis-Oriented one.
Based on these results, which underline the potential of Basis-Oriented TSA, we set out to extend the methodology to more realistic models. We expand this work by adding time-linking ramping constraints and network constraints, which are commonly found in energy system models, and discuss the arising challenges.
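The size of the enumeration can be checked independently of the ED model. The following short sketch computes the Stirling numbers of the second kind with the standard recurrence S(n, k) = k·S(n−1, k) + S(n−1, k−1); the function and variable names are illustrative only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition n labeled items into k non-empty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


# Number of clusterings of 12 hourly data points for each cluster count
counts = {k: stirling2(12, k) for k in range(1, 13)}
total = sum(counts.values())  # Bell number B(12)

print(counts[2], counts[3], total)  # 2047 86526 4213597
```

The total over all cluster counts is the Bell number of 12, which matches the 4 213 597 clusterings evaluated in the enumeration.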

III. EXTENSION OF BASIS-ORIENTED TSA FRAMEWORK
Considering that PSOMs vary significantly in their formulation depending on the aspects being analyzed, our goal in this section is to frame the varieties of constraints and technical aspects we consider in our extension to the Basis-Oriented approach. Commonplace formulations of PSOMs include network aspects, e.g., the DC Optimal Power Flow (OPF) [24] or the AC OPF or its convexification [25], which are widely used by operators for day-ahead scheduling and market clearing; or security-constrained unit commitment (SCUC) [26] including inter-temporal constraints. Therefore, this paper focuses on extending the ED formulation and the Basis-Oriented TSA to consider, first, network flow constraints in section III-A, and then ramping constraints in section III-B.

A. Basis-Oriented TSA: Network Constraints
In this section, we extend the Basis-Oriented aggregation to include network flow constraints and show that it still finds an exact aggregation of the input data. The full hourly model for the ED problem with network flows is presented in (3). The additional term in the objective function compared to (1) corresponds to transmission costs, while constraints (3d)-(3e) represent the nodal equations, the import limits and the export limits, respectively. To find a Basis-Oriented aggregated model, we group the hours that have the same hourly basis and then form the centroid of each of these groups; the only difference this time is the total number of bases under the additional constraints. Because of the proof in [18], and in the absence of temporal linking constraints, this aggregation has zero error when compared to the full model.
$$\min \; \dots + \sum_{k,j,i} C^{N} f_{k,j,i} \qquad \text{(3a)}$$
$$\text{s.t.} \quad \underline{P}_w \le p_{w,k} \le CF_k \, \overline{P}_w \quad \forall k, w \qquad \text{(3b)}$$

B. Basis-Oriented TSA: Ramping Constraints
Especially in PSOMs, chronology matters, which is why it is so important to account for it in TSA methods. A common mathematical constraint that is found in PSOMs and that links together multiple temporal periods is the ramping constraint, which is used to represent a technical characteristic of thermal generators. In the case of ramping constraints, the temporal linking arises because of the maximum change in production a thermal generator t can tolerate between two consecutive periods. For example, if $p_{t,k}$ corresponds to the power produced by generator t during hour k, a ramping-up constraint would look like $p_{t,k} - p_{t,k-1} \le RU$, while a ramping-down constraint would be $p_{t,k-1} - p_{t,k} \le RD$, where RU and RD are parameters of the generator.
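As a small illustration of the two ramping inequalities, the sketch below checks a given production profile against the limits RU and RD. The function name and the sample profile are hypothetical and serve only to make the inequalities concrete.

```python
def ramp_violations(production, ramp_up, ramp_down):
    """Return the indices k where the step from hour k-1 to hour k breaks a ramp limit."""
    violations = []
    for k in range(1, len(production)):
        delta = production[k] - production[k - 1]
        # Ramp-up:   p[k] - p[k-1] <= RU;  ramp-down: p[k-1] - p[k] <= RD
        if delta > ramp_up or -delta > ramp_down:
            violations.append(k)
    return violations


# Hypothetical thermal production in MW over four hours, with RU = RD = 100 MW
profile = [0.0, 0.0, 50.0, 200.0]
print(ramp_violations(profile, ramp_up=100.0, ramp_down=100.0))  # [3]
```

A profile that violates a limit in the unconstrained dispatch is exactly the situation in which the ramping constraint becomes binding in the optimization model and couples consecutive hours.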
To analyze the impact temporal linking has on the Basis-Oriented approach, we use the model in (4), which corresponds to the model used in [18] with added ramping constraints for the thermal generator t.
Temporal linking constraints pose a challenge to the application of Basis-Oriented TSA, as it is no longer possible to assess the data of different hours separately: due to active ramping constraints in the optimization model, the hourly periods are no longer independent. In Basis-Oriented TSA, we therefore need to consider that a basis b might consist of multiple consecutive periods; in other words, some representative periods might be longer than one hour (because of active ramping constraints). Identifying those periods and their corresponding duration a-priori is not trivial.
To give an example, if a ramping constraint is active in a given time step, it binds variables from two different time steps. Another way of thinking about this is to consider the structure of the ramping constraint. Without it, the thermal generation in one time step is independent of the others, which corresponds to a line with slope zero in the ($p_{t,k}$, $p_{t,k-1}$) plane; with an active ramping constraint, the two productions instead lie on a line with a slope of 1. This extends to as many consecutive periods as the ramping constraint is active. From this, we obtain a basis duration, which corresponds to the set of consecutive periods that depend on each other at the optimum and therefore cannot be split.
This leads to another contribution of this paper -how to identify these joined bases and their corresponding basis durations, which we describe in the following section.

C. Basis-Oriented TSA: Basis Identification and Heuristic Algorithm
In this section, we discuss how to identify bases for PSOMs with temporal linking constraints, such as ramping, and propose a heuristic algorithm for that purpose. This heuristic algorithm uses the dual solution of the full model to identify bases. We are aware that in real-life examples, this dual solution is not available. However, our goal here is to demonstrate that even for temporal linking and flow limit constraints, the Basis-Oriented approach still finds an aggregation with zero error with respect to the objective function value. Moreover, dual information such as marginal costs⁴ is easy to understand and provides useful intuition. In future research, we plan to develop a methodology that allows us to identify these bases using input data only.
To account for temporal linking, we use the marginal cost as a proxy to identify bases.In the simple case of [18], where no temporal interlinking constraints were present, only three different marginal costs could be observed, i.e., the variable cost of the wind generator, the thermal generator, and the cost of non-supplied energy.However, introducing ramping constraints complicates this and causes a plethora of additional marginal costs to appear, which do not correspond to any of the variable costs of the system's generation units.This is because, with active ramping constraints, the marginal cost of one hour is affected by the system operation of preceding or posterior hours.
For example, consider a four-hour dispatch of a single-node system with a wind turbine and a thermal generator with variable costs of 3 €/MWh and 24 €/MWh, respectively. Table II shows system demand, the cost-optimal productions of the generators, and the arising marginal costs. Note that between hours 3 and 4, the thermal plant has a production increase that exceeds 100 MW. Let us repeat this example but with an additional ramp-up constraint with a limit of 100 MW for the thermal generator. In Table III, we now observe that production and marginal costs have changed. Since the ramping constraint prohibits the large jump from hour 3 to 4, the thermal plant has to increase its production earlier on, replacing the cheaper wind production in hours 2 and 3. The marginal cost of 66 €/MWh occurs because the thermal generator has to increase its production two hours before (indicated in green in Table III), plus the cost of serving the additional demand (in blue), and minus the cost of the displaced production from wind (in red), so 2 · 24 + 1 · 24 − 2 · 3 = 66.
This example illustrates how temporal interlinking directly affects the marginal cost of the system not only in the period where it occurs but also in the production in periods linked to it.Generalizing the previous case, we find that for our model, the marginal cost of any given hour is a linear combination of the variable costs of the generating units with integer coefficients a, b, c because of the absence of losses.
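The arithmetic of the four-hour example can be written as a linear combination of variable costs with integer coefficients. In the sketch below, the coefficient names a, b, c follow the text, while the closed form in `ramp_marginal_cost` is only an extrapolation of the example's pattern (an assumption, not a result stated in the paper), and the non-supplied energy cost is a placeholder that does not enter the example.

```python
VC_WIND, VC_THERMAL = 3.0, 24.0   # variable costs per MWh from the four-hour example
VC_NSP = 5000.0                    # placeholder penalty; unused in the example (c = 0)


def marginal_cost(a, b, c):
    """Marginal cost as an integer combination of the variable costs."""
    return a * VC_WIND + b * VC_THERMAL + c * VC_NSP


def ramp_marginal_cost(n_linked_hours):
    """Assumed pattern: thermal ramps up n hours early and displaces wind in each of them."""
    return marginal_cost(a=-n_linked_hours, b=n_linked_hours + 1, c=0)


# Example from the text: 2 * 24 + 1 * 24 - 2 * 3 = 66
print(marginal_cost(a=-2, b=3, c=0))  # 66.0
print(ramp_marginal_cost(2))          # 66.0
```

With no linked hours the expression reduces to the thermal variable cost, which is the marginal cost one would expect when the thermal unit is marginal and no ramping constraint is active.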
With this observation about the marginal cost in mind, we develop a heuristic algorithm, i.e., Algorithm 1, detailed in the Appendix. Algorithm 1 partitions the time horizon into groups of hours that cannot be separated (because of active ramping constraints) using information from the duals (as demonstrated in the example). The arising partition of the input data space, based on these groups of hours, is then used to build an aggregated model using dual variables. Fig. 2 describes the complete data-aggregation process that we employ with temporal linking constraints.
The First Step of the data-aggregation process consists of loading the input data (capacity factors and demand) and running Algorithm 1⁵. From this, we obtain a temporal partition of the inputs and the dual variables into subsets of different lengths⁶ (measured in the number of consecutive time periods, i.e., hours). Next, we identify the bases: for each length (e.g., for all 3-hour subsets), we check which subsets have exactly the same dual variables and count how many different bases there are, where a basis collects the subsets that Algorithm 1 placed in different parts of the partition but that have the same values in the dual space. For example, in the 8760-hour time horizon, there might be 189 subsets in total with a 3-hour duration. However, when comparing their dual variables, there are only 7 different sets of dual values. Hence, there are only 7 bases with a length of 3 hours.
After that, in the Second Step, we aggregate the data belonging to the same basis. To this end, we group the input data using the unique subsets of dual variables found in the first step and compute a centroid for each of them, with a weight corresponding to the number of elements in the subset. Each of these centroids is a representative period in our aggregated model.
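The two steps described above can be sketched as follows, assuming the temporal partition has already been produced (here it is simply given as input rather than computed by Algorithm 1). The data layout, the function name, and the numerical values are hypothetical; in practice, dual values would likely need rounding before being compared for equality.

```python
from collections import defaultdict


def aggregate_by_duals(subsets):
    """Group subsets with identical duals and average them hour by hour.

    subsets: list of (duals, data) pairs, where duals is a tuple of hourly dual
    values and data is a list of hourly (demand, capacity_factor) tuples of the
    same length. Returns {duals: (centroid, weight)}.
    """
    groups = defaultdict(list)
    for duals, data in subsets:
        groups[tuple(duals)].append(data)

    representatives = {}
    for duals, members in groups.items():
        weight = len(members)  # number of subsets that share this basis
        centroid = [
            tuple(sum(m[h][j] for m in members) / weight for j in range(2))
            for h in range(len(duals))
        ]
        representatives[duals] = (centroid, weight)
    return representatives


# Hypothetical partition: two 1-hour subsets and two 3-hour subsets
subsets = [
    ((24.0,), [(100.0, 0.2)]),
    ((24.0,), [(120.0, 0.4)]),
    ((3.0, 3.0, 66.0), [(50.0, 0.5), (60.0, 0.5), (300.0, 0.1)]),
    ((3.0, 3.0, 66.0), [(70.0, 0.3), (80.0, 0.7), (320.0, 0.3)]),
]
reps = aggregate_by_duals(subsets)
```

Because subsets of different lengths have dual tuples of different lengths, they can never fall into the same group, which mirrors the per-length comparison described in the First Step.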
In this subsection, we have presented a data aggregation process that decreases the computational complexity of the model with time-linking constraints and performs a temporal aggregation while maintaining a zero error when compared to the complete model.

D. Basis-Oriented TSA: Network and Ramping Constraints
In this section, we consider the model in (6), which includes flow-limit and ramping constraints. As previously illustrated, flow limits are easier to handle than ramping because there is no temporal interlinking, and they operate in a way analogous to hourly available capacity constraints. We can therefore apply Algorithm 1 with a small tweak: we now have to look at the marginal costs of every bus when looking for the periods that must go together. The aggregation procedure is the same as in Fig. 2.
⁵ Algorithm 1 looks for a time period where the system's marginal cost does not correspond to the variable cost of any of the generation resources. Looking at the dual variables of the ramping constraints in the periods after and before, we identify whether the marginal cost comes from an upward or downward ramping constraint. The length of the ramp is determined by the closest integer multiple of the variable cost of the thermal generator; this is based on the empirical work in [27], where the authors found that, in the case of marginal costs higher than $VC_t$ but lower than $VC_{nsp}$, the integer multiple of $VC_t$ corresponded to the number of periods in which one of the ramping constraints was binding.
⁶ During our experiments, we found that the more constrained a model is (for example, with a low ramping capacity), the higher the number of hours in each element.
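The two checks that the footnote on Algorithm 1 describes can be sketched in a few lines. This is a simplified illustration of those two checks only, not the algorithm from the Appendix; the function names, the tolerance, and the cost values are assumptions.

```python
def is_ramp_affected(marginal_cost, variable_costs, tol=1e-6):
    """True if the marginal cost matches none of the units' variable costs."""
    return all(abs(marginal_cost - vc) > tol for vc in variable_costs)


def estimated_ramp_length(marginal_cost, vc_thermal):
    """Closest integer multiple of the thermal variable cost.

    Note: Python's round() resolves exact .5 ties to the nearest even integer.
    """
    return round(marginal_cost / vc_thermal)


# Variable costs per MWh: wind, thermal, non-supplied power (illustrative values)
costs = (3.0, 24.0, 5000.0)

print(is_ramp_affected(24.0, costs))      # False: thermal unit is marginal
print(is_ramp_affected(66.0, costs))      # True: no unit has this variable cost
print(estimated_ramp_length(66.0, 24.0))  # 3, since 66 / 24 = 2.75
```

For the four-hour example, the estimate of 3 coincides with the coefficient of the thermal variable cost in 66 = 3 · 24 − 2 · 3. Per the footnote, this estimate is meant for marginal costs between the thermal variable cost and the non-supplied power cost.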

IV. NUMERICAL RESULTS
This section contains experimental results obtained from applying the extended Basis-Oriented approach to the 3-bus system described in Fig. 3. The input data and parametrization can be found in [28]; the system has a maximum demand of 1000 MW, the generation units have the parametrization presented in Table IV, and non-supplied power has a cost of 5000 €/MWh. The data are such that the demand could be supplied entirely with the thermal generator; also, the limits in the lines represent the difficulty of getting renewable energy from the wind turbine into the grid, as is usually the case in real-world systems.

A. Results with Network Constraints
We applied the Basis-Oriented TSA to the system presented in Fig. 3, whose data and parameter values are in [28]. For simplicity, we consider that load is present only in one node of the system while the other two are net exporters. The model is solved for one year with hourly time steps to obtain a benchmark against which the temporally aggregated model is compared⁷. After applying the Basis-Oriented procedure to the network constraints case, we obtained the same results from the temporally aggregated model and the full-hourly one. This means that Basis-Oriented TSA yields an exact aggregation and reduces the number of required hours by a factor of about 1747 (8736/5), because we only required five representative hours to exactly represent the whole year, as shown in Table V. This corresponds to a reduction of the number of variables of the aggregated model of 3 orders of magnitude. The relationship between the computation time and the number of hours is not linear; however, the number of hours is one of the determining factors in solving such models in a feasible amount of time.
To analyze the results obtained from the Basis-Oriented approach, we plot the resulting bases in the input-data space in Fig. 4 (renewable energy availability on the x-axis and demand on the y-axis). Comparing the results with those in [18] (where we only had 3 linearly separable bases), we see that in the three-bus network case, the bases are not linearly separable; this stresses the point that a-priori approaches for the selection of representative periods are insufficient in PSOMs. These results also strengthen the argument that a partitioning of the inputs from a minimum distance perspective (e.g., k-Means or k-Medoids clustering) is insufficient to approximate the complete model effectively.
In the network case, we obtain five bases because of the additional⁸ constraints in the model; in this case, the additional bases come from the hours when lines one or two are congested, which means that one or two of the constraints in (6g) (6h) are binding. For this case, it is worth noting that each of the bases corresponds to an operational situation in the power system: basis 1 and basis 5 are hours where the demand is higher than the available renewable energy; for basis 1 those hours are the ones where no line is congested, while basis 5 are hours where L1 is at its limit, so a part of the renewable production is routed through L2 and L3; an analogous situation happens with basis 2 and basis 3, but in this case, the demand is lower than the available renewable energy, so (6b) is not binding; finally, basis 5 corresponds to hours where both L1 and L2 are at their limit. In this situation, it makes no difference whether or not (6b) is binding for the wind turbine, and that is why basis 5 includes hours with demand higher and lower than the available renewable energy. Of course, depending on the input data, some constraints might never be binding at all (e.g., lines that have much capacity when compared to the generators they serve, like L3 and the thermal generator in our test case); that is why, to precisely approximate these models, both the input data and the model's structure must be considered.

B. Results with Ramping Constraints
Running the full hourly model (4) (single-node problem with ramping constraints) with the parametrization in [28], results in the marginal costs shown in Table VI, which also includes their integer coefficients (a, b, c) corresponding to each of the generation units' variable cost and the non-supplied power penalty.Now we want to generate an aggregated model and present two different cases: Case A, where we ignore the fact that there is temporal linking in the TSA process; and, Case B, where we account for temporal linking constraints employing the heuristic data-aggregation process presented in Section III-C.
1) Case A (ignoring temporal linking in TSA): We apply the original Basis-Oriented TSA approach from [18] (which implicitly assumes hourly bases) to the single-node system in (4) and obtain 53 unique one-hour-long combinations in the dual space, i.e., 53 different hourly bases. We then partition and aggregate the input data within these 53 hourly bases, which corresponds to an aggregation of 99.39% with respect to the number of variables, and run the aggregated model⁹ (using cluster centroids and weights), obtaining an objective function value that is 22% lower than that of the full model. A summary of the results is presented in Table VII. This shows that assuming a fixed one-hour basis length (as done in [18]) is insufficient to obtain an exact solution under temporal linking constraints.
2) Case B (accounting for temporal linking in TSA): Given the outcome of Case A, we now present an extension of the original Basis-Oriented TSA, which is an original contribution of this paper. The general idea is to drop the assumption of a fixed one-hour basis length and allow for a variable number of hours in each basis. The number of hours that we must group together comes from the heuristic process presented in Section III-C, where we showed that the system's marginal cost carries information about how many, and which, hours should be kept together in an aggregated model. Column b¹⁰ in Table VI indicates for how many hours a ramp-up (ramp-down) constraint was active before (after) a given hour. This means that the bases will not all have the same length, since the length depends on the heuristic aggregation carried out by Algorithm 1. After applying Algorithm 1 (1st step of the aggregation process presented in Fig. 2), we obtain a partition of the input data into 489 subsets with lengths ranging from 1 to 53 hours. The results are summarized in Table VIII. For example, in the 8736-hour time horizon, there are 189 subsets with a 3-hour length, 6390 subsets with a 1-hour length, etc. The 1-hour-long subsets represent situations where only the balance constraint is binding (and ramping constraints are inactive). The column Obj. Func. Avg. refers to the average objective function value of all the subsets of a given length. For example, on average a 3-hour subset yields an objective function value of 22 624 €.
We include the results in Table VIII because they allow for interesting observations. For example, they indicate the impact of extreme periods on the objective function value, as some of the subsets with the highest objective function values have the lowest frequency (e.g., the 43-hour length, which appears only once). Also, note that the lengths do not correspond to the typically used aggregations of days or weeks (i.e., 24 and 168 hours, respectively), which might indicate that using typical representative days in aggregated models is not the most efficient way of aggregating. Moreover, a greater length (number of hours within a subset) does not necessarily imply a higher objective function value (e.g., the 43-hour subsets have a higher average cost than the 53-hour subsets). We now apply the 2nd step of the aggregation procedure from Fig. 2, where we identify bases from the subsets; these are also indicated in Table VIII. As an example, among the 189 subsets of length 3, only 7 have different dual variables. Hence, there are 7 bases of length 3. We aggregate the 3-hour subsets that belong to the same basis, i.e., that have exactly the same dual variables. Following the same procedure for each length in Table VIII, we find that there are only 70 bases in total, with lengths ranging from 1 to 53 hours. In total, those 70 bases represent 681 hours out of the full 8736. When the bases are used in an aggregated model of reduced size¹¹, they reproduce the objective function value of the complete model exactly. This significantly reduces the computational burden¹² of the model.
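The 2nd step described above (merging subsets of equal length that share exactly the same dual variables) can be sketched as follows. This is a minimal illustration under stated assumptions: each subset is given as a hypothetical pair of a start hour and its tuple of hourly dual values, duals are compared after rounding to `decimals` places, and the representative profile is taken as the position-wise mean of the input series with the number of merged subsets as its weight. The names and toy numbers are not from the paper.

```python
from collections import defaultdict


def identify_bases(subsets, decimals=6):
    """Group subsets whose dual-value sequences coincide.

    subsets: list of (start_hour, duals) pairs, with duals a tuple holding
    one value per hour of the subset. Subsets of different length can never
    share a key because the key tuples differ in length.
    Returns a dict mapping each rounded dual tuple to a list of start hours.
    """
    bases = defaultdict(list)
    for start, duals in subsets:
        key = tuple(round(d, decimals) for d in duals)
        bases[key].append(start)
    return dict(bases)


def aggregate_basis(starts, length, series):
    """Position-wise mean of `series` over all member subsets, plus the weight."""
    n = len(starts)
    profile = [sum(series[s + i] for s in starts) / n for i in range(length)]
    return profile, n


# Toy data (hypothetical): two 3-hour subsets with equal duals, one 1-hour subset.
demand = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0, 50.0]
subsets = [
    (0, (30.0, 30.0, 90.0)),
    (3, (30.0, 30.0, 90.0)),
    (6, (30.0,)),
]

bases = identify_bases(subsets)
for duals, starts in bases.items():
    profile, weight = aggregate_basis(starts, len(duals), demand)
    print(len(duals), "hours:", profile, "weight", weight)
```

In this toy case the three subsets collapse into two bases, so only 4 of the 7 hours need to be represented explicitly.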

C. Results with Ramping and Network Constraints
We carry out the same data-aggregation procedure for the case with ramping and network constraints. The supporting tables are omitted for space reasons but can be found in [28]; most of the subsets lie in the 3- to 6-hour range, and these are the ones that show the most significant aggregation in the 2nd step. Moreover, the results also show that some extreme periods, even if they occur only once in the partition, have a significant impact on the objective function value. In summary, to completely represent this model, we only require 740 hours, 59 more than the 681 required with ramping constraints alone.
A comparison of the three test cases (Network, Ramping, and Network & Ramping) is presented in Table IX. Number of Bases refers to how many different bases were needed to reproduce the full model exactly. Max. Subset Length indicates the maximum basis length (i.e., number of hours) in the aggregated model. Number Necessary Representative Hours corresponds to how many hours, in total out of the 8736, are required by the Basis-Oriented approach to achieve zero error¹³ in the aggregated model. Finally, Size Reduction: Variables compares the size of each aggregated model with the full one.
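The reduction figures quoted in the text can be checked with simple arithmetic. The sketch below assumes, as a simplification, that the number of variables scales proportionally with the number of represented hours; the hour counts are the ones stated in the text (5 one-hour bases for the network case, 681 and 740 hours for the two ramping cases, 53 hourly bases in Case A), while the function names are hypothetical.

```python
FULL_HOURS = 8736  # length of the full hourly time horizon


def reduction_factor(full_hours, represented_hours):
    """How many times smaller the aggregated model is than the full one."""
    return full_hours / represented_hours


def reduction_percent(full_hours, represented_hours):
    """Share of the full model that is removed by the aggregation, in percent."""
    return 100.0 * (1.0 - represented_hours / full_hours)


# Representative hours per case, as stated in the text.
cases = {
    "Network": 5,
    "Ramping": 681,
    "Network & Ramping": 740,
}

for name, hours in cases.items():
    print(f"{name}: factor {reduction_factor(FULL_HOURS, hours):.1f}")

# Case A (53 hourly bases, inexact under ramping constraints):
print(f"Case A: {reduction_percent(FULL_HOURS, 53):.2f}% reduction")
```

Under this proportionality assumption the factors are about 1747 for the network case and about 12.8 for the ramping case, in line with the factors of 1747 and 12 reported in the abstract, and Case A gives the 99.39% quoted earlier.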

D. Discussion
Our results highlight the complexity of temporal interlinking and show how the right-hand side value in one period, i.e., the ramping limit, affects the optimal solution both forward and backward in time. Our research also shows that fixed-size representative periods might not be the best choice for aggregated models, and researchers must be ready to partition complete models into chunks of variable temporal size, where each might represent a peculiarity of the system being analyzed; moreover, the length of each subset does not necessarily correspond to the commonly used 24- or 168-hour periods. As our results show, wind production already challenges this convention, but a similar situation might arise if batteries relying on efficient dynamic control algorithms become widespread in the industrial and/or household sectors, as their consumption-charging cycles will depend not only on traditional variables like the time of day but also on signals from the environment (e.g., price). Finally, our results also support previous research about the importance of extreme values in an aggregated PSOM, as some of the periods that could not be merged happen only once or twice and have the highest objective function values (e.g., subsets of length 13, 35 or 43 hours); hence, the Basis-Oriented approach goes beyond the common assumptions of extreme days or weeks and tries to find these inseparable periods without imposing ex-ante conditions.

V. CONCLUSION
In this work, we addressed TSA applied to optimization models, and in particular to PSOMs. We extended Basis-Oriented TSA, which considers the structure of the optimization model in the aggregation process without losing accuracy, to include both network and ramping constraints. One of the contributions was to demonstrate that when the structure of temporal constraints is taken into account in the Basis-Oriented approach, an exact aggregation can still be obtained, and to develop a methodology to achieve this. To illustrate this, we developed a heuristic algorithm that explores the dual space of the full PSOM solution. The algorithm identifies which hours should be grouped together, based on the constraints that are binding at any given time. In this manner, the length of the representative periods of the aggregated model is variable and arises naturally, in an incremental way, from the exploration procedure.
¹³ Taking into account both the number of bases and their length.
Another conclusion of our work is that if the goal is to perform temporal aggregation while keeping the results as close as possible to the complete solution, researchers must rethink the use of a uniform length for their representative periods (e.g., days) and start to consider aggregated models with periods of different lengths, which may or may not correspond to the intuitive time intervals commonly used in PSOMs.
Finally, for future research, we plan to extend the Basis-Oriented approach to other types of constraints (e.g., those involving integrality, such as unit commitment constraints), which were not addressed in this paper. Another topic of future research is inferring an aggregation directly from the input data and the optimization model's structure without already having a solution (primal and dual) of the full model.

APPENDIX A PARTITIONING ALGORITHM
The algorithm is divided into two main parts, one for each of its loops. In the first loop, the ramping lengths of the model run are identified by associating each marginal cost with the closest integer multiple of the thermal unit's variable cost; this multiple is b, the coefficient of VC_t in (5). The second loop partitions the model's input data based on each period's marginal cost.
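The two-loop structure described above can be sketched as follows. This is only an illustration and not Algorithm 1 itself: the first function follows the text (nearest integer multiple of the thermal unit's variable cost), whereas the linking rule in the second function is an assumption made for this sketch, namely that a multiple k > 1 at hour t ties that hour to the k - 1 hours before it. All names and toy numbers are hypothetical.

```python
def nearest_multiples(marginal_costs, vc_thermal):
    """First loop: closest integer multiple of the thermal variable cost per hour.

    Note that Python's round() resolves exact .5 ties to the even integer.
    """
    return [round(mc / vc_thermal) for mc in marginal_costs]


def partition_hours(multiples):
    """Second loop (simplified, assumed rule): build contiguous subsets of hours.

    Assumption: a multiple k > 1 at hour t means hours t-(k-1), ..., t must
    stay together; overlapping groups are merged, all other hours stay alone.
    """
    n = len(multiples)
    joined = [False] * n  # joined[t] is True if hour t stays with hour t-1
    for t, k in enumerate(multiples):
        for s in range(max(1, t - k + 2), t + 1):
            joined[s] = True

    subsets = []
    for t in range(n):
        if joined[t] and subsets:
            subsets[-1].append(t)  # extend the current subset
        else:
            subsets.append([t])    # start a new subset
    return subsets


# Toy data (hypothetical): thermal variable cost and hourly marginal costs.
vc_thermal = 30.0
marginal_costs = [30.0, 29.9, 30.1, 90.2, 30.0, 60.0]

multiples = nearest_multiples(marginal_costs, vc_thermal)
print(multiples)
print(partition_hours(multiples))
```

With these toy numbers the multiples are 1, 1, 1, 3, 1, 2, so under the assumed rule hours 1 to 3 form one subset, hours 4 and 5 another, and hour 0 remains a one-hour subset.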

Fig. 1: Two examples of clusterings with no error: (a) with 3 clusters and (b) with 4 clusters

Fig. 3: Three-Bus system diagram with line limits (MW) and location of generators

ACKNOWLEDGMENT
This work is part of the NetZero-Opt project, which has been funded via a Starting Grant of the European Research Council (ERC) (Grant agreement No. [101116212]).

TABLE I: Combinatorial Aggregation Results

TABLE II: Example: No Ramping Constraint

TABLE III: Example: With Ramping (blue/green/red indicate production change and relation to marginal cost)

TABLE IV: Parameters of generating units

TABLE V: Three-Bus System

TABLE VI: Possible Marginal Costs, frequency of occurrence and integer coefficients (a, b, c) of variable costs

TABLE VII: Case A: Comparison full versus aggregated model

TABLE IX: Case Study Aggregation Comparison