A long-term intelligent operation and management model of cascade hydropower stations based on chance constrained programming under multi-market coupling

Under the medium- and long-term electric markets, cascaded hydropower stations face a series of practical challenges due to the uncertainty of inflow and market price. For long-term dispatch scheduling, the allocation of power generation in multimarkets is critical, including clean energy priority consumption market, inter provincial market and intra provincial market in order to maximize the operator’s expected revenue and reduce the market operation risks. Based on the hydro-dominant electricity market structure and settlement rules, we propose a long-term optimal operation method for cascade hydropower stations considering the uncertainty of multiple variables. First, a seasonal autoregressive integrated moving average model is used to handle the time-varying and seasonal characteristics of inflow series by using a copula connect function to fit the joint distribution of the monthly inflow, the clearing price of the intra provincial market and the delivery volume of the inter provincial market. Then, uncertain chance-constrained programming is established. Finally, a developed particle swarm optimization algorithm embedded in a Monte Carlo simulation is solved for hydropower operation policies, and the maximum revenue, resource allocation and scheduling strategy are obtained under the corresponding risk tolerance. Taking the actual data of cascaded hydropower stations in Yunnan Province, China, as an example, a simulation analysis is carried out. The results show that the proposed method can reasonably describe the uncertainty and correlation between the variables, realizing the optimal allocation of resources among multimarkets, and provide references for the long-term optimal operation of cascade hydropower stations in a multimarket environment. The results also show that the decision strategies should be determined considering the decision-maker’s risk preference.


Introduction
Long-term optimal operations of cascaded hydropower stations, using a monthly scheduling period throughout one year, play a key role in the water level control of reservoir operation, energy transfer between the wet season and dry season, compensation for other power sources, etc (Allen and Bridgeman 1986, Barros et al 2003, Labadie 2004, Li et al 2010, Gu et al 2017, especially in hydro-dominated areas. With a new round of electricity marketization reform in China, however, the traditional operation methods of cascade hydropower plants have been greatly impacted. Under the medium-and long-term electricity markets, the monthly generating capacity in multiple markets needs to be reasonably allocated , Liu et al 2019. This includes the clean energy priority consumption market, inter provincial market and intra provincial market, according to the influence of the market clearing price and transaction settlement rules, to maximize the revenue and reduce marketization risk. When participating in the above markets, cascade hydropower stations need to formulate a long-term operation plan including a monthly operation schedule and trading capacity. Within this framework, the coupling effect of the monthly trading capacity (generation capacity) and settlement rules should be considered, and reasonable allocation of electricity resources should be based on the settlement priority of each market. If the method of power allocation is unreasonable, this may lead to the failure of part of the contract, which will lead to the loss of a hydropower station's interests. On the other hand, there are several uncertainties in hydrological (reservoir inflow, water quality, etc) and market (clearing price, volume, etc) factors in operation planning and management. For example, the inflow distribution is uneven and random in the year. Hydropower stations need to reallocate them in time and space through long-term dispatching to achieve the goal of efficient utilization. Meanwhile, in order to obtain a greater benefit of electricity sales, hydropower stations need to adjust the operation scheme to track changes in the clearing price, and the uncertainty of price and inflow may lead to the failure of the contract.
The following sections present a thorough review of literature on relevant issues including different sources of uncertainty and their dependencies, decision-makers' risk preferences and optimization approaches, particularly with regard to chanceconstrained programming (CCP). Regarding uncertainty in hydrological factors, many works choose the reservoir inflow as the primary and only source of uncertainty (Oliveira et al n.d., Ventosa et al n.d., Barforoushi et al 2006, Hansen 2010. Regarding the market sources of uncertainties, a review of the literature reveals the importance of the clearing price (Conejo et al 2003, Kang et al 2005, Zareipour et al 2010, Nojavan et al 2015 and volume uncertainty (Gianfreda 2010. Some studies believe there is a correlation between the two kinds of uncertainties (Faria and Fleten 2011, Saadaoui 2013, Saâdaoui and Mrad 2017. In addition to the aforementioned sources of uncertainty, the risk attitude at different levels of decision-makers plays a key role in reservoir operation. This has been analysed in different applications including optimal reservoir operation (García-González et al 2007, Bourry et al 2009, Moghaddam et al 2013, Liu et al 2015. Regarding the application of systems analysis and optimization techniques, a review of the literature reveals that both hydrological and market sources of uncertainty have been examined by similar uncertainty-based optimization models and solution approaches. In this line, stochastic linear programming (Grygier and Stedinger 1985), stochastic nonlinear programming (Shrestha et al 2005), stochastic dynamic programming (Thanos and Yeh 2010) and CCP (Liu et al 2018) are some of the approaches suitable for tackling largescale optimal reservoir operation problems. Accounting for the interdependency of random parameters, copula-based CCP has recently been used in the subject areas of waste management (Chen et al 2016) and power systems optimization (Chen et al 2017).
In this study, we propose a cascade hydropower station optimization model for optimal design and long-term operation considering multiple interdependent sources of uncertainty under multimarket coupling. Based on the presented review of the literature on the uncertain factors and their origins in these systems, the uncertainty sources considered are in the processes of reservoir inflow, clearing price of the intra provincial market and delivery volume of the inter provincial market. The seasonal autoregressive integrated moving average (SARIMA) model is adopted to consider the time-varying and seasonal characteristics of inflow series to ensure that the simulation results are more reasonable. The objective is to obtain the maximum revenue, resource allocation and scheduling strategy under the corresponding risk tolerance. In addition to ordinary constraints including characterizing the dynamics and the physical processes taking place in the system such as water balance equations over time and space, the model includes probabilistic constraints on meeting the maximum output requirement. The model is therefore a chanceconstrained program in which the dependence structure of the mentioned random factors is considered using copula functions and covariances to consider dependencies in various random variables. The complex deterministic equivalent of the program, which is highly nonlinear and nonconvex, is solved by the particle swarm optimization (PSO) algorithm of a nested Monte Carlo simulation. The model framework is shown in figure 1.
Taking a cascade hydropower station in Southwest China as an example, it is verified that the proposed method can reasonably describe the uncertainty and correlation between the inflow, price of the intra provincial market and volume of inter provincial market. At the same time, it achieves optimal allocation of the resources among multiple markets and can better coordinate the relationship between profit and risk, to provide a reference for the long-term operation of cascade hydropower stations in a multicoupling market environment.

Correlation analysis and modelling of multivariate variables
The reservoir inflow, market clearing price and delivery volume of the inter provincial market are all time series, which have a series of typical characteristics such as time-varying volatility and volatility aggregation. These cannot be handled by ordinary simulation methods. Meanwhile, the mentioned random factors are interdependent, and their correlations need to be considered before the simulation. The SARIMA model (Guin 2006) has the advantages of considering the series autocorrelation, trend and seasonality at the same time. The copula function (Li 1999  measure the dependence mechanism between variables which contains almost all the information of random variables. Therefore, combining the SAR-IMA model with a copula function can reasonably describe the dependence between multivariate time series without losing important details. In summary, this article uses the SARIMA model to describe the marginal distribution of the time series, and then connects the series after a probability integral transformation using a copula function to describe the correlation structure between the multivariate variables. Different time series have different conditional marginal distributions. The resulting conditional marginal distribution functions can be connected by a copula function, which can completely describe the interdependence between the series. The Copula-SARIMA model has different marginal distributions, which is better than the traditional linear model, so it can effectively analyse the nonlinear relationship between multiple variables.

Basic test of sample
To ensure that the fitting results are correct and reasonable when modelling the volatility of the time series, a series of tests need to be performed on the samples. The specific steps are as follows: (a) Normality test. If the time series does not obey the normal distribution, it is possible to use the SARIMA model to fit it. Therefore, using the skewness and kurtosis joint test method (Jarque-Bera) to test whether the series obeys a normal distribution, the expression of the JB statistic is as follows: where T s represents the sample size, S represents the skewness coefficient, and K represents the kurtosis coefficient. If the sample follows a normal distribution, then the JB statistic approximately follows a chisquare distribution with 2 degrees of freedom.
(b) Stationarity test. Stationarity means that the unconditional expectation of the random variable y t is constant, the variance is constant, and the covariance does not change over time. The specific expression is as follows: In an autoregressive process, y t = by t−1 + α + ε t . If the lag coefficient b is 1, it is called the unit root. When the unit root exists, the relationship between the independent variable and the dependent variable is deceptive because any error in the residual series will not decay as the sample size increases, which means that the influence of the residual in the model is permanent. Use the ADF (Augmented Dickey-Fuller test) unit root to test the stationarity of the time series. If the result of the ADF unit root test b does not equal 1, then reject the null hypothesis that the series has unit roots. If the sample has stationarity, then the series is stationary if there is no unit root, and vice versa.
(c) Autocorrelation test. Draw the autocorrelation graph and partial autocorrelation graph of the sample. If the test data does not all fall within the acceptance area, then it gradually converges to the acceptance area, indicating that there is a tail in the series data. Then, the series has an autocorrelation and partial correlation. (d) White noise inspection. If a series is stationary, then it is necessary to judge whether the data are white noise. Use Ljung-Box statistics to test whether the series is a pure random series; if not, then the series is not white noise.

Identification and fitting of SARIMA model parameters
(a) Estimation of SARIMA model. SARIMA(p, d, q) (P, D, Q) S is a widely used method of time-series analysis and simulation that adds seasonal and periodic analysis to the original ARIMA model. p is the order of nonseasonal autoregression, d is the order of nonseasonal difference, q is the order of nonseasonal moving average, P is the order of seasonal autoregression, D is the order of seasonal difference, Q is the order of seasonal moving average and S is the order of seasonal period. The structure is as follows: where Y t represents the time series to be fitted, ε t is the residual of the fitted model, is the seasonal autoregressive coefficient polynomial, and ∇ d and ∇ D S are nonseasonal and seasonal difference items, respectively. The seasonal difference of the time series is eliminated by the method of series difference and seasonal difference.
The values of p, d, q, P, D and Q were preliminarily determined by the auto correlation function (ACF) and partial autocorrelation function (PACF). According to the literature, the values of P and Q should not exceed the second order. Therefore, the Akaike information criterion (AIC) values of different models are calculated by fitting the models one by one in the range of 0-2.
(b) Fitting of SARIMA model. The AIC values obtained from the above steps are optimized.
The lower the AIC, the better the model fitting effect. After the values of p, d, q, P, D and Q are determined, the maximum likelihood estimation (MLE) is used to estimate the parameters of the model to obtain the marginal distribution function of time-series variables, and then the nonlinear correlation structure between the variables is connected by a copula function.

Copula function of variable selection 2.2.1. Marginal distribution transformation
Probability integral transformation assumes that the cumulative distribution function of variable

Copula function selection
Since the samples fitted in this paper are ternary variables, t-copula and Gaussian-copula can be used for fitting. In this paper, t-copula function is selected to establish the joint distribution of variables. As the follow reasons: (a) t-copula function belongs to elliptic distribution family, which is easy to be extended to high dimension, and it can describe the correlation between variables; (b) the runoff of cascade reservoirs involves small probability floods such as 1000 year return period or 100 year return period, t-copula can describe the correlation of extremum. T-copula can describe a large range of upper tail correlation characteristics, which is suitable for extreme value analysis (Nikoloulopoulos et al 2009). The case study shows that t-copula is superior to Gaussian copula in the description of extremum correlation (Zong-Run et al 2010). The distribution function C(·,·,...,·) and its density function c(·,·,...,·) of the multivariate t-copula model is where ρ is a symmetric positive definite matrix with a diagonal element of 1, |ρ| is the determinant value corresponding to the matrix and T p,v (u 1 , u 2 , u 3 ) is the correlation coefficient matrix with ρ and the degreeof-freedom v multivariate standard t distribution.
According to the marginal distribution parameters of variables, the parameter values of the t-copula function are calculated, and then the distribution function of the t-copula function can be obtained by formulas (4) and (5). Finally, a Kendall rank correlation coefficient and Spearman correlation coefficient are calculated by t-copula random number points with the same sample number and compared with the statistics of the sample number. The advantages and disadvantages of fitting are tested in turn.

Generation method of simulation data
Similar to the simulation method of the binary copula function, the multivariate copula function simulation can generate a random number series (u 1 , …, u n ) that obeys the specified n-valued copula function C (·,·,...,·) through the conditional distribution of random variables. In fact, we need to simulate the interdependent three-dimensional random variables, we can only simulate the three components one-toone, and the split pair of variables constitute interdependent two-dimensional random variables. The reason why it can be carried out in this way is that the ternary joint distribution can be obtained according to the binary conditional distribution and copula function. The specific steps are: (6) where C is copula function; F(x 1 |x 3 ) and F(x 2 |x 3 ) are conditional probability distribution functions of variables X 1 and X 2 given X 3 .
Because the market development time of this paper is short, and the number of electricity price and electricity quantity in the sample is too small, only SARIMA fitting is carried out for the marginal distribution of runoff series (the runoff sample is from 1953 to 2019), and the kernel density estimation method is used to fit the marginal distribution of other variables.
(a) According to F(x 1 , x 2 , x 3 ) = C(u 1 , u 2 , u 3 ) which obtained from the direct statistics of data, the binary conditional Copula functions C(u 1 |u 3 ) = ∂C(u 1 , u 3 )/∂u 3 and C(u 2 |u 3 ) = ∂C(u 2 , u 3 )/∂u 3 are calculated; (b) A random sequence (v 1 , v 2 , v 3 ) is generated, which contains three variables that obey [0, 1] independent distribution, and then a random sequence that obeys the specified 3-element t-copula function T p,v is generated. (c) According to SARIMA model, m groups of runoff series are generated, and the corresponding u 3 is solved according to x = F −1 (u); (d) According to the binary conditional copula function obtained in step 1, and Nelson theorem, the corresponding variables are obtained through the inverse functions of C(u 1 |u 3 ) and C(u 2 |u 3 ), that is, the random sequence that obeys the distribution of three variable t-copula function is obtained. (e) According to the distribution function F(x n ) of each variable, the variable value corresponding to u n is calculated.

Long-term optimization model based on CCP in multimarket
It is assumed that the cascade hydropower stations belong to the same stakeholder and that each station is a price taker. To make the model closer to the actual project, refer to the actual dispatch mode and introduce a dynamic assessment of the delivery default in the model. That is, dynamically adjust the assessment electricity price according to the proportion of the default power energy. Considering that the cost of cascade hydropower generation is mainly composed of fixed costs, it does not affect model optimization, which is ignored in this paper. Take the year as the dispatch cycle and one month as a period.

Objective function
The income of cascade hydropower stations mainly comes from clean energy priority consumption market, interprovincial market, intra provincial market and Penalty loss model for breach of contract. Among them, the priority of clean energy priority consumption market settlement is the highest and the price is set to a fixed value according to the flood and dry seasons. The second is the inter provincial market market, according to the statistical data of market operation, the volume of this market can be considered that the price in a year is a fixed value. Finally, the intra provincial market, the price is cleared according to the bidding price. According to the settlement rules, because of the uncertainty of inflow and intra provincial market clearing price, the allocated volume in the inter provincial market of the tap power station may not be completed. Therefore, the loss of default should be considered in the calculation of revenue. That the mathematical model of the long-term

Input basic information of power station, joint distribution variables and the Calculation parameters of PSO
Taking the water level of each station in each period as the decision variabl and generating the initial solution of PSO According to the joint distribution, Monte Carlo simulation sampling for s times is used to calculate the corresponding objective function value in each sampling.

Check the deterministic constraints
According objective function, the result is sorted and the series {R1,R2,…,RN} is obtained. Set S' as the integer part of αS, and assign the S' th element of the series toR.
The particle fitness is calculated by the penalty function method for the particles violating the chance constraint and the deterministic constraint.
According to the fitness, the individual and global extremum of PSO are determined, and then the individual population is iteratively calculated to get a new generation population The global optimal particle is the last cascade hydropower station scheduling scheme optimal operation of cascade hydropower stations based on chance constraints under multiple market coupling is as follows: r 1,i,t = e 1,i,t × p 1,t (8) r 3,i,t = e 3,i,t × p 3,t where R is the total revenue of the cascade power station, r j,i,t represents the profit or loss of station i in the market in the tth month, j = 1 is the revenue of the clean energy priority consumption market, j = 2 is the revenue of the inter provincial market, j = 3 is the revenue of the intra provincial market, j = 4 is the penalty loss for breach of contract, and N is the number of cascade power stations. T is the total dispatching period, and r 1,i,t , e 1,i,t and p 1,t are the priority consumption market revenue and settlement volume and price of station i in the tth month, respectively. r 2,i,t , e 2,i,t and p 2,t are the inter provincial market revenue, settlement volume and price of station i in the tth month, respectively, and i = 1. r 3,i,t , e 3,i,t and p 3,t are the intra provincial market revenue, settlement volume and price of station i in the tth month, respectively. where r 4,i,t , e 4,i,t , e ' 2,i,t and p 4,t are the penalty loss, default amount and transaction volume in the inter provincial market and penalty price of the tap station in the tth month, i = 1. e i,t , Q i,t , H i,t, , Z i,t , Z d i,t and H d i,t are the average generating capacity, average generating flow (m 3 s −1 ), head, decision level, downstream tail water level and head loss of the hydropower station i in the tth month, respectively; η i is the output coefficient of hydropower station i; and ∆t is the time step.

Constraints
In addition to the conventional hydraulic constraints, the constraints in the model also include revenue constraints, market constraints and opportunity constraints. Due to the limitation of paper space, see the appendix B for details.

Model solving method
The traditional method to solve a chance-constrained problem transforms the CCP model into its equivalent deterministic model according to the given confidence level in advance, and then solves it. However, in the mathematical model of the long-term optimal operation of cascade hydropower stations in the multimarket, there are many random variables, and the opportunity constraints are too complex. This is usually difficult to achieve by traditional methods. Therefore, we use a Monte Carlo simulation method to deal with the random variables, and then use the PSO method to solve them. The specific steps are shown in figure 2.

Results
Based on the electricity market dominated by hydropower in a province of China, see appendix A1 for the main operation parameters of power stations A, B and C involved in the calculation. A is the tap station, and the output process directly affects the operation of the downstream stations. The long-term scheduling period is one year, the time period is one month, and the number of Monte Carlo sampling is 1000. The number of population is 100, the maximum number of iterations is 5000, the inertia factor is 0.9, and the learning factor is 2.

Multivariate correlation analysis
The random variables studied in this paper are the monthly inflow of the tap station, the clearing price of the intra provincial market and volume of the inter provincial market. The samples used in this paper involve 48 months from January 2016 to December 2019. The details of the samples are shown in figure 3. The three variables show obvious seasonality and volatility. At the same time, it can be seen from (d) that there are correlations among the variables. Red represents a negative correlation, blue represents a positive correlation; and the deeper the colour, the stronger the correlation (the Sperson coefficient of price and volume is −0.55296, that of the volume and inflow is 0.39326 and that of the inflow and price is −0.84908).
For the multivariate variables with strong correlation in this paper, a copula function can be used to analyse the correlation between variables and contain almost all the details without determining whether the traditional linear correlation coefficient can correctly measure the correlation between variables. For marginal distributions, the traditional method assumes that the variance of time-series variables is fixed, which is not in line with reality. For example, inflow series show obvious seasonality. The ARIMA model can only capture the linear relationship of a time series but cannot capture the nonlinear relationship, which leads to an error in the simulation results and deviations from reality.
However, the SARIMA model takes into account the seasonal factors and carries out a 12-step difference simultaneously with the first-order difference of time series, thus eliminating the seasonal effect and solving the above problems. Due to the short development time of the market studied in this paper, the numbers of price and volume in the sample are too small, so only the marginal distribution of the inflow series is fitted (inflow samples from 1953 to 2019). The inflow details are shown in figure 4. The marginal distribution of other variables was fitted by the kernel density estimation method.
Before fitting the SARIMA model, it is necessary to conduct a basic test of the inflow data. The result of the JB test is 1. The original hypothesis that the series has a normal distribution is rejected, which means that the distribution is not a normal distribution. The result of the ADF unit root test does not equal 1, the original hypothesis of the unit root exists in the rejection series, and the sample is stationary. ACF and PACF were drawn to test the autocorrelation. The results of the Ljung box test were 1, which rejected the original hypothesis of white noise, and the series was not white noise. The specific test results are shown in figure 5. The basic test shows that the SARIMA model can be used to fit the inflow series. The values of model p, d, q, P, D and Q are identified and processed by the R language, and the optimal model is SARIMA (1, 0, 1) (2, 1, 1) [12]. Then, the parameters of the model are estimated by the MLE method. The fitting results are listed in table 1. Then, 100 sets of inflow processes are generated, and the results are shown in figure 6. The model SARIMA (1, 0, 1) (2, 1, 1) [12] was used to simulate the runoff series from 1953 to 2019, the root mean square error is 11.945%, which indicates that the fitting runoff process is in good agreement with the actual runoff.
According to the method described in section 2.2.1, the marginal distribution of each variable is transformed into a uniform distribution subject to [0, 1] through probability integral  transformation, in which the inflow series uses some data after marketization from 2016 to 2019, and uses the marginal distribution of the whole sample for probability integration to obtain a uniform distribution. Then, the t-copula function proposed in section 2.2.2 is obtained by the two-stage maximum likelihood method of nonparametric estimation. One hundred sets of scenarios were randomly generated, and the correlation coefficients between variables were calculated for verification (the Sperson coefficient of price and electricity quantity is −0.45609, electricity quantity and inflow are 0.346654, and inflow and price are −0.87613). Finally, a comparison is made with the t-copula function without the SAR-IMA model, and the results are shown in figure 7. It can be seen from the figure that inflow and price have negative value (red box) in the scenarios generated by the conventional method, while the method proposed in this paper can generate more reasonable scenarios. The reason for this situation is that the fluctuation range of runoff is large, when only using t-coupla to generate simulation data, we need to use the marginal distribution function obtained by kernel density estimation to transform. In the face of extreme cases, the stochastic runoff process obtained by transformation is prone to negative values. The runoff stochastic process generated by SARIMA model can avoid the above situation, and t-copula can better fit the tail dependence, so t-copula with SARIMA adjustment can work better. does not follow this law, and the total power is lower than in the conventional model.

Analysis of optimal operation results of cascade hydropower stations under multimarket coupling
The resource allocation results in the above multimarket coupling model are shown in figure 9 (since B and C do not participate in the delivery market, only B is shown). It can be seen from the figure that when participating in the coupling market, stations tend to generate more electricity when the price is high and less when the price is low, and allocate more volume in the market with high price and less volume in the market with low price, or break even on the contract. Given the factors of price, head and inflow, the month with the largest generating capacity does not appear when the price is the highest. The influence of settlement price and assessment price is taken into account when A allocates resources. In May, for example, the settlement price in the clean energy priority consumption market is the highest. To increase revenue, all power resources should be allocated to this market. However, due to the default assessment mechanism of the inter provincial market, some of the resources need to be completed to avoid additional losses. B allocates power resources in the inter provincial market and clean energy priority consumption market according to the price.

Analysis of dispatching results under specified risk tolerance
According to the model described in section 3, change the risk α, β 1 and β 2 that the power station is willing to bear, and then calculate and analyse the scheduling schemes of situation 1 (considering the uncertainty of all variables), situation 2 (considering the uncertainty of price and inflow), situation 3 (considering the uncertainty of price and volume) and situation 4 (considering the uncertainty of inflow and volume). The results are listed in table 2. It can be seen from table 2 that with an increase in the willingness of stations to bear risk, the possible generation power and target profit also increase. Further analysis shows that the change in α has an obvious effect on the generation revenue. To analyse the impact of risk tolerance α in the above four situations objective functions on the long-term operation income of cascade hydropower stations, results corresponding to different α in different situations are plotted in figure 10. When the risk tolerance is high (risk preference) and medium (risk neutral), situation 4 has the highest income. Situation 1 comes next. Situation 2 and situation 3 are similar. When the risk tolerance is low (risk aversion), situation 3 has the highest income. Situation 4 comes next, and situation 2 and situation 1 are similar. When the risk is high, compared with fluctuations when the electricity price, inflow and power are more volatile,  large flows bring more power generation and can allocate more delivery resources to obtain more revenue. When the risk is low, the inflow tends to dry up, and the allocated amount of volume in the inter provincial market is less. Although the intra provincial market price increases, the generating capacity greatly reduces, leading to an overall decrease of income. Situation 3 does not consider the uncertainty of inflow, which has little impact on the increase in power generation, so the revenue is the highest. Situation 4 does not consider the uncertainty of the intra provincial market price, which has little impact on the price fluctuation in the flood season, so the revenue comes next. Situation 1 and situation 2 take into account the variables that play a major role in revenue, resulting in lower price during the flood season and reduced water supply, so they have the lowest revenue. The results of the model are consistent with the actual situation, so it can be seen that the copula-SARIMA model proposed in this paper can be used to reasonably evaluate the uncertainty factors brought  by cascade stations participating in multimarket coupling.

Conclusions
(a) There are many uncertain factors in the market operation of cascade hydropower: inflow, price, etc. These are all time series. By using a Copula-SARIMA model, we combined the research of the correlation degree and correlation mode of a multivariate situation, and considered the volatility and time-varying nature of the time series at the same time. Compared with the constant correlation model, which considers the uncertainty of variables alone, the fitting data generated by this method maintains the correlation between samples, making the simulation experiment more reasonable. (b) Large-scale cascade hydropower stations play a role in regulating and storing natural inflow due to their large storage capacity. In the medium-and long-term market environment, water resources can be redistributed in time and space according to the situation of incoming water and market clearing, to achieve the efficient utilization of water resources, improve income and reduce the default assessment caused by the randomness of inflow. (c) As a signal to guide the resource allocation of cascade hydropower stations in a variety of coupled markets, considering the uncertainty of inflow, volume and other factors at the same time, the price can encourage the market participants to evaluate income more reasonably and can provide an effective reference for the cascade hydropower stations to arrange the operation scheme of each time scale, formulate the transaction plan, optimize the resource allocation and obtain more benefits.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.