Distribution‐based PV module degradation model

The degradation of photovoltaic (PV) modules affects many of their parameters. Ignoring this degradation, or estimating it inaccurately, leads to wrong power dispatching strategies and to economic losses. For PV module life estimation or reliability estimation, it is first necessary to establish an accurate statistical degradation model of the PV module. The main goal of this paper is to analyze a selection of explicit, distribution-based PV module degradation models. Since degradation is related to time, the study assumes that the parameters of the Gamma or Gaussian distributions are functions of time. Five models are fitted using maximum likelihood estimation and particle swarm optimization. Through verification and comparison on measured PV module degradation data, the performance of these models is evaluated in four cases: long‐term data fitting, long‐term data prediction, single‐module short‐term data fitting, and multimodule short‐term data fitting. The results show that the models proposed in this paper improve considerably on the original models, and that the constant‐σ Gaussian distribution degradation model achieves the best performance.


| INTRODUCTION
As a low-cost and clean energy source, solar photovoltaics has been widely used. The investment in photovoltaic power plant facilities is huge, and the life of photovoltaic modules is limited, usually to 25 years. 1 As usage time increases, the parameters of the PV module model change with the degradation of the PV modules. 2 Ignoring the degradation of PV modules or using an inaccurate degradation model will lead to wrong power scheduling strategies and cause economic losses. Therefore, accurate quantification of the degradation rate of PV modules is crucial for both company investors and researchers. 3 Establishing a PV module degradation model makes it possible to obtain the current status of PV modules more accurately, and provides parameter corrections for failure risk assessment, PV module life prediction, and PV module power prediction. Malik et al. 4 correct the PV module power prediction based on a degradation model.
Photovoltaic module degradation modes can be divided into three types 5 : early loss, stable loss, and wear-out loss. Early loss is generally caused by problems during installation and can be easily detected early in the module's life. Wear-out loss arises when the device is used beyond its service life, and is generally caused by internal damage. Stable loss is the degradation during normal use, which generally conforms to the natural degradation law and is a hotspot in the study of degradation models.
The degradation of photovoltaic modules is related to their structure and to various environmental factors, so the causes of degradation are complicated. In Lindig et al., 6 degradation models are divided into two categories: analytical models and statistical models. Analytical models usually analyze the degradation of PV modules qualitatively, through its causes. In Ndiaye et al., 7 the degradation of photovoltaic modules is divided into four main types: corrosion, delamination, discoloration, and glass breakage and cracks. In Aghaei et al., 8 the external and internal factors of photovoltaic module degradation and the possible degradation types of different modules are summarized. The external factors are caused by the environment and include irradiance, temperature, humidity, and mechanical load. The internal factors include processing and architecture. These models cannot support quantitative research on the power of photovoltaic modules, so statistical models based on data have received more attention. Such models use data to approximate the degradation of photovoltaic modules, and are divided into approximations based on a physical model and approximations based on a statistical model. The most typical approximation based on a physical model is the approximation of the I-V curve. Piliougine et al. 9 approximate the degraded photovoltaic modules with a single-diode model for parameter extraction. Their results show, among other changes, that the series resistance of the degraded photovoltaic modules increases and the photo-generated current decreases. Statistical models use statistical distributions to approximate the relationship between the power degradation of photovoltaic modules and time, and are widely used for reliability assessment and lifetime estimation.
From a statistical point of view, degradation generally conforms to a specific distribution, which can describe the process of product degradation and quantify the relevant parameters after degradation. Nelson 10 was the first to carry out this type of research, modeling the degradation of multiple insulating-material products at the same time based on a log-normal distribution. Considering the characteristics of photovoltaic modules, Vázquez and Rey-Stolle 11 propose a degradation model based on the Gaussian distribution and output power, which assumes that the output power of photovoltaic modules and the standard deviation of the Gaussian distribution change linearly with time. The parameters related to the Gaussian distribution are not estimated. When the rated output power changes, the parameters change accordingly, which leads to poor transferability. Yu et al. 12 estimated the performance degradation reliability of photovoltaic modules based on the β-distribution method. He 13 approximated the degradation process of photovoltaic modules based on the Lambda distribution statistics method. Both methods require power data from multiple PV modules installed at the same location at the same time to approximate the corresponding distribution, and must be remodeled for different PV module groups. Park and Kim 14 estimate the lifetime of photovoltaic modules by simulating their degradation process based on the Gamma process. They propose that the degradation rate may have a high-order relationship with time, but in the calculation it is still treated as a first-order linear relationship with a constant rate.
In Liu et al., 15 the degradation of the power generation efficiency of photovoltaic modules in different time periods is simulated based on the Gamma process. This approach requires real-time data to update the status of the photovoltaic modules, and it uses different distribution functions for different stages, which does not conform to the continuous degradation of photovoltaic modules.
The main contribution of this work is to analyze the estimation of the probability distribution of photovoltaic module degradation at different times, that is, to find the distribution that best conforms to the degradation law of the output power of photovoltaic modules. In other words, a selection of explicit, distribution-based PV module degradation models is applied in this paper. The parameters of the distributions are related to time, so a relationship between each parameter and time is assumed when the PV module degradation models are established.
The degradation models of photovoltaic modules based on the Gaussian distribution and the Gamma distribution are evaluated, and three improved models are proposed on this basis: the high-order Gaussian distribution degradation model, the constant-σ Gaussian distribution degradation model, and the variable-β Gamma distribution degradation model. Maximum likelihood estimation (MLE) and particle swarm optimization (PSO) are used to estimate the parameters of the models.
Predicting the output power of photovoltaic modules is a research hotspot. To compare the pros and cons of the different models, the fitting effect and the prediction effect on long-term data are evaluated with the root mean square error (RMSE). Because the collected photovoltaic module data are usually incomplete, and often only the data of the last few years are available, the fitting effect on the short-term data of a single module and on the short-term data of multiple modules is also used to evaluate the models in this work. In all four evaluations, the proposed distribution-based PV module degradation models improve on the original models, which leads to a more accurate estimate of the degradation process of photovoltaic modules. This paper is organized as follows: Section 2 introduces the five distribution-based PV module degradation models, including the use of MLE and PSO to obtain the maximum likelihood function value and the related model parameters. Section 3 explains the data sets and the evaluation metrics. Section 4 presents the results and discussion for the five models. Finally, Section 5 summarizes the main conclusions of this work.

| PHOTOVOLTAIC MODULE DEGRADATION MODEL
This section first introduces the PV module degradation model based on the Gaussian distribution, 11 and then proposes a high-order Gaussian distribution degradation model and a constant-σ Gaussian distribution degradation model on this basis. It then introduces a PV module degradation model based on the Gamma distribution, 15 on which a variable-β Gamma distribution degradation model is proposed.

| Basic Gaussian distribution degradation model
Vázquez and Rey-Stolle 11 propose a PV module degradation model based on Gaussian distribution, and the basic Gaussian distribution degradation model can be obtained by converting the output power into the degradation rate.
Assuming that the degradation rate r follows a Gaussian distribution, the probability density function is:
$$f(r) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(r-\mu)^2}{2\sigma^2}\right) \tag{1}$$
where μ is the expectation of the Gaussian distribution and σ is the standard deviation of the Gaussian distribution.
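The basic Gaussian distribution degradation model assumes that the expectation and the standard deviation have a linear relationship with time, so the expectation and standard deviation of the degradation rate can be estimated as:
$$\mu(t) = A t \tag{2}$$
$$\sigma(t) = B t \tag{3}$$
where A and B are constant parameters, and t is the usage time of the photovoltaic modules. Substituting the expectation and standard deviation into the probability density function of Equation (1) gives:
$$f(r \mid t) = \frac{1}{\sqrt{2\pi}\,B t}\exp\left(-\frac{(r - A t)^2}{2 B^2 t^2}\right) \tag{4}$$
Assuming that there are n data points $(t_i, r_i)$ in total, according to MLE:
$$L = \sum_{i=1}^{n} \ln f(r_i \mid t_i) \tag{5}$$
where L is the maximum likelihood principle objective function value, written here in logarithmic form. When a continuous function takes an extremum, its partial derivatives are zero:
$$\frac{\partial L}{\partial A} = 0 \tag{6}$$
$$\frac{\partial L}{\partial B} = 0 \tag{7}$$
which gives:
$$A = \frac{1}{n}\sum_{i=1}^{n}\frac{r_i}{t_i} \tag{8}$$
$$B = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\frac{(r_i - A t_i)^2}{t_i^2}} \tag{9}$$
where A can be directly calculated by Equation (8), and B can be calculated by Equation (9) and the value of A.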
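As an illustration of the closed-form estimation described above, the following is a minimal sketch that assumes the linear forms μ(t) = A·t and σ(t) = B·t. The function names and the sample data are hypothetical and are not taken from the datasheets used in this paper.

```python
import math

def fit_basic_gaussian(times, rates):
    """Closed-form MLE for r_i ~ N(A*t_i, (B*t_i)^2); requires every t_i > 0."""
    n = len(times)
    # A is the average of the ratios r_i / t_i.
    a = sum(r / t for t, r in zip(times, rates)) / n
    # B^2 is the mean squared residual after dividing each residual by t_i.
    b = math.sqrt(sum(((r - a * t) / t) ** 2 for t, r in zip(times, rates)) / n)
    return a, b

def log_likelihood(times, rates, a, b):
    """Gaussian log-likelihood under mu = A*t and sigma = B*t."""
    total = 0.0
    for t, r in zip(times, rates):
        sigma = b * t
        total += -math.log(sigma * math.sqrt(2 * math.pi))
        total -= (r - a * t) ** 2 / (2 * sigma ** 2)
    return total

# Hypothetical data: years in service and measured degradation rate (%).
times = [1.0, 2.0, 3.0, 4.0, 5.0]
rates = [0.6, 1.1, 1.4, 2.2, 2.4]

a_hat, b_hat = fit_basic_gaussian(times, rates)
print(a_hat, b_hat, log_likelihood(times, rates, a_hat, b_hat))
```

Because both estimators are available in closed form, no iterative optimizer is needed for this model; the log-likelihood is only evaluated to report the objective value.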

| High-order Gaussian distribution degradation model
In the basic Gaussian distribution model, the expectation is assumed to have a linear relationship with time. In fact, the annual change in the degradation rate of photovoltaic modules is not constant, and shows an upward trend year by year. Based on the basic Gaussian distribution degradation model, it is assumed that the expected degradation rate has a high-order relationship with time, which is more in line with the trend of the degradation rate, and a high-order Gaussian distribution degradation model is obtained. Equation (2) is modified as:
$$\mu(t) = A t^{C} \tag{10}$$
where C is a constant parameter. The probability density function is:
$$f(r \mid t) = \frac{1}{\sqrt{2\pi}\,B t}\exp\left(-\frac{(r - A t^{C})^2}{2 B^2 t^2}\right) \tag{11}$$
For a given C, A and B satisfy the following equations:
$$A = \frac{\sum_{i=1}^{n} r_i\, t_i^{C-2}}{\sum_{i=1}^{n} t_i^{2C-2}} \tag{12}$$
$$B = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\frac{(r_i - A t_i^{C})^2}{t_i^2}} \tag{13}$$
In this model, C is set as the particle in PSO, A and B are calculated by Equations (12) and (13), and the objective function is L in Equation (5).
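The split between PSO and closed-form estimation can be sketched as follows. This is a minimal illustration that assumes μ(t) = A·t^C and σ(t) = B·t; the PSO coefficients (inertia 0.7, acceleration 1.5), the search bounds for C, the function names, and the data are all hypothetical choices rather than the settings used in this paper.

```python
import math
import random

def profile_fit(times, rates, c):
    """For a fixed C, return closed-form A, B and the log-likelihood."""
    num = sum(r * t ** (c - 2) for t, r in zip(times, rates))
    den = sum(t ** (2 * c - 2) for t in times)
    a = num / den
    b = math.sqrt(sum(((r - a * t ** c) / t) ** 2
                      for t, r in zip(times, rates)) / len(times))
    ll = sum(-math.log(b * t * math.sqrt(2 * math.pi))
             - (r - a * t ** c) ** 2 / (2 * (b * t) ** 2)
             for t, r in zip(times, rates))
    return a, b, ll

def pso_maximize(objective, lo, hi, particles=20, iters=60, seed=0):
    """Minimal one-dimensional particle swarm maximizer on [lo, hi]."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]
    best_val = [objective(p) for p in pos]
    g = max(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]
    for _ in range(iters):
        for i in range(particles):
            # Inertia plus pulls toward the personal and global bests.
            vel[i] = (0.7 * vel[i]
                      + 1.5 * rng.random() * (best_pos[i] - pos[i])
                      + 1.5 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val > g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val

# Hypothetical data with an accelerating degradation rate.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rates = [0.52, 1.20, 2.10, 3.00, 4.10, 5.10]

c_hat, ll_hat = pso_maximize(lambda c: profile_fit(times, rates, c)[2], 0.5, 3.0)
a_hat, b_hat, _ = profile_fit(times, rates, c_hat)
print(a_hat, b_hat, c_hat, ll_hat)
```

Searching only over C keeps the swarm one-dimensional, since A and B follow directly from each candidate C. Setting C = 1 recovers the basic model, so the fitted objective should not be lower than that of the basic model on the same data.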

| Constant-σ Gaussian distribution degradation model
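Both the basic Gaussian distribution degradation model and the high-order Gaussian distribution degradation model assume that the standard deviation increases with time, which makes the estimates of the degradation rate of PV modules less and less accurate as time increases. On the basis of the high-order Gaussian distribution model, the standard deviation is assumed to be constant, and Equation (3) is modified as:
$$\sigma(t) = B \tag{14}$$
The probability density function is:
$$f(r \mid t) = \frac{1}{\sqrt{2\pi}\,B}\exp\left(-\frac{(r - A t^{C})^2}{2 B^2}\right) \tag{15}$$
For a given C, A and B satisfy the following equations:
$$A = \frac{\sum_{i=1}^{n} r_i\, t_i^{C}}{\sum_{i=1}^{n} t_i^{2C}} \tag{16}$$
$$B = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(r_i - A t_i^{C})^2} \tag{17}$$
The solution process is the same as for the high-order Gaussian distribution degradation model, using Equations (5), (16), and (17).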
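A compact sketch of the constant-σ fit is given below, assuming μ(t) = A·t^C and σ(t) = B. For brevity, a coarse grid scan over C stands in for PSO here; that substitution, the grid bounds, the function name, and the data are illustrative assumptions only.

```python
import math

def fit_constant_sigma(times, rates, c):
    """For a fixed C, return closed-form A, B and the log-likelihood."""
    n = len(times)
    a = (sum(r * t ** c for t, r in zip(times, rates))
         / sum(t ** (2 * c) for t in times))
    b = math.sqrt(sum((r - a * t ** c) ** 2 for t, r in zip(times, rates)) / n)
    # At the closed-form B, the quadratic term of the log-likelihood is n / 2.
    ll = -n * (math.log(b * math.sqrt(2 * math.pi)) + 0.5)
    return a, b, ll

# Hypothetical data with an accelerating degradation rate.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rates = [0.52, 1.20, 2.10, 3.00, 4.10, 5.10]

# Grid scan over C in [0.5, 3.0] with step 0.01 (stand-in for PSO).
grid = [0.5 + 0.01 * k for k in range(251)]
c_hat = max(grid, key=lambda c: fit_constant_sigma(times, rates, c)[2])
a_hat, b_hat, ll_hat = fit_constant_sigma(times, rates, c_hat)
print(a_hat, b_hat, c_hat, ll_hat)
```

With a constant σ, maximizing the likelihood over A and C is equivalent to an ordinary least-squares fit of r against A·t^C, which is why every data point carries the same weight regardless of age.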

| Gamma distribution degradation model
Liu et al. 15 propose simulating the degradation of the power generation efficiency of photovoltaic modules based on the Gamma process, which assumes that the degradation rate of photovoltaic modules increases linearly with time.
Assuming that the degradation rate r follows a Gamma distribution, the probability density function is:
$$f(r) = \frac{r^{\lambda-1}\, e^{-r/\beta}}{\beta^{\lambda}\,\Gamma(\lambda)} \tag{19}$$
In the above equation, λ is the shape parameter of the Gamma distribution, β is the scale parameter of the Gamma distribution, and Γ(x) is the Gamma function.
It is assumed that the expected degradation rate has a high-order relationship with time and that the scale parameter is constant, with both expressed through the constant parameters A, B, and C. These expressions are substituted into the probability density function of Equation (19), and the objective function is Equation (5). When a continuous function takes an extremum, its partial derivatives are zero. To simplify the operation, PSO is applied to maximize the objective function L in Equation (5): A and C are set as the particles in PSO, and B is then calculated from them by Equation (25).
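One parameterization that is consistent with B having a closed form once A and C are fixed is a shape parameter λ(t) = A·t^C with a constant scale β = B. This choice is an assumption made for illustration; under it, the expected degradation rate is λβ = A·B·t^C. The corresponding log-likelihood and the stationarity condition for B can be written as:

```latex
\ln L = \sum_{i=1}^{n}\left[\left(A t_i^{C} - 1\right)\ln r_i
        - \frac{r_i}{B} - A t_i^{C}\ln B - \ln\Gamma\!\left(A t_i^{C}\right)\right]

\frac{\partial \ln L}{\partial B}
  = \sum_{i=1}^{n}\left[\frac{r_i}{B^{2}} - \frac{A t_i^{C}}{B}\right] = 0
\quad\Longrightarrow\quad
B = \frac{\sum_{i=1}^{n} r_i}{A\sum_{i=1}^{n} t_i^{C}}
```

The derivatives with respect to A and C involve the digamma function and have no closed-form solution, which is one reason to leave those two parameters to a numerical search such as PSO.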

| Variable-β Gamma distribution degradation model
The variation of the scale parameter is ignored in the Gamma distribution model. Based on the Gamma distribution model, the variable-β Gamma distribution model assumes that the scale parameter is proportional to time. Equations (21) and (22) are modified accordingly, and A, B, and C satisfy the resulting likelihood equations. The solution process is the same as for the Gamma distribution degradation model, using Equations (5) and (28).
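The closed-form step for B in the two Gamma models can be sketched as follows, assuming a shape parameter λ(t) = A·t^C and a scale of either β = B (constant) or β = B·t (variable-β). This parameterization, the function names, and the values of A and C are hypothetical; in the procedure described above, A and C would be supplied by PSO.

```python
import math

def closed_form_b(times, rates, a, c, variable_beta):
    """B that zeroes dL/dB for fixed A and C under the assumed forms."""
    shape_sum = a * sum(t ** c for t in times)
    if variable_beta:
        return sum(r / t for t, r in zip(times, rates)) / shape_sum
    return sum(rates) / shape_sum

def gamma_loglik(times, rates, a, b, c, variable_beta):
    """Gamma log-likelihood with shape A*t^C and scale B or B*t."""
    total = 0.0
    for t, r in zip(times, rates):
        shape = a * t ** c
        scale = b * t if variable_beta else b
        total += ((shape - 1) * math.log(r) - r / scale
                  - shape * math.log(scale) - math.lgamma(shape))
    return total

# Hypothetical data and hypothetical particle position (A, C).
times = [1.0, 2.0, 3.0, 4.0, 5.0]
rates = [0.6, 1.1, 1.4, 2.2, 2.4]
a, c = 2.0, 0.5

for variable_beta in (False, True):
    b = closed_form_b(times, rates, a, c, variable_beta)
    print(variable_beta, b, gamma_loglik(times, rates, a, b, c, variable_beta))
```

In a full fit, `gamma_loglik` evaluated at the closed-form B would serve as the objective that the swarm maximizes over A and C.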

| Datasheet introduction
To verify the fitting effect and the prediction effect of the models, this paper uses the long-term data given in Park and Kim 14 as datasheet 1. Figure 1 presents the annual measured degradation rates of monocrystalline PV modules between 1998 and 2009. All 11 points are fitted to verify the fitting effect. To verify the trend prediction effect, the first eight points are used to fit the models and the last three points are used for verification.
To verify the fitting effect of the models on the short-term data of a single module and on the short-term data of multiple modules, this paper uses the data given in Vázquez and Rey-Stolle 11 as datasheet 2.

| Evaluation indicators
Common evaluation indicators include MSE, RMSE, MAE, MAPE, and SMAPE. The expected value of the distribution represents the expected drop rate of the output power of the photovoltaic module, and the drop rate can be converted into the maximum output power, so the expected value of the distribution is often used in the prediction of the output power. For the median of the distribution, there is a 50% probability that the degradation rate of the PV module is greater than it, so it can be used to measure the reliability and lifetime of the PV module, as is done in Park and Kim. 14 In this paper, the RMSE of the mean and the RMSE of the median are selected as the evaluation indicators. The RMSE is defined as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{r}_i - r_i\right)^2}$$
where $\hat{r}_i$ is the mean (or median) of the fitted distribution at time $t_i$, $r_i$ is the measured degradation rate, and n is the number of data points.
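The indicator can be computed in a few lines. The sketch below uses hypothetical model outputs and measurements; for a Gaussian model the mean and the median coincide, so a single prediction series serves both indicators.

```python
import math

def rmse(predicted, measured):
    """Root mean square error between model values and measurements."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(measured))

# Hypothetical example: model means (or medians) versus measured rates (%).
model_values = [0.50, 1.23, 2.09, 3.03]
measured = [0.52, 1.20, 2.10, 3.00]
print(rmse(model_values, measured))
```

For the Gamma models the median differs from the mean, so the two indicators have to be evaluated with separate prediction series.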

| COMPARISON AND VERIFICATION
To compare the models, this section verifies the fitting effect and the prediction effect on long-term data on the basis of datasheet 1, and verifies the single-module and multimodule fitting effects on short-term data on the basis of datasheet 2.

| Fitting effect of the models based on long-term data
This study uses the 11 points in datasheet 1, as mentioned in Section 3.1, as a training set to fit the five models and evaluate their fitting effect on long-term data. As shown in Table 1, datasheet 1 is fitted by MLE, and the three constant parameters A, B, and C of each model are obtained. The objective function values L show that all five models have converged to their maximum likelihood. Substituting the fitted parameters into the models at different times gives the expected values. The 90% confidence intervals of the distributions of the different models are calculated at each time to display the results, as shown in Figures 3-7.
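For a Gaussian model, a 90% interval follows directly from the 5% and 95% quantiles of the fitted distribution at each time. The sketch below assumes the constant-σ form with mean A·t^C and standard deviation B; the parameter values are hypothetical and are not the fitted values of Table 1.

```python
from statistics import NormalDist

def gaussian_interval(a, b, c, t, level=0.90):
    """Return (lower, mean, upper) of the central interval at time t."""
    mean = a * t ** c
    dist = NormalDist(mu=mean, sigma=b)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), mean, dist.inv_cdf(1.0 - tail)

# Hypothetical parameters A, B, C evaluated at t = 10 years.
low, mean, high = gaussian_interval(a=0.5, b=0.2, c=1.3, t=10.0)
print(low, mean, high)
```

For the models whose standard deviation grows with time, the same quantile calculation applies with `sigma=b * t`, which is what makes their intervals widen year by year.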
As shown in Table 2, the fitting effect of the basic Gaussian distribution degradation model is the worst: its RMSE is much larger than that of the other four models, and its maximum likelihood function value is the smallest. The fitting effect of the high-order Gaussian distribution degradation model is better than that of the basic Gaussian distribution degradation model, but its RMSE is still much larger than that of the other three models. The constant-σ Gaussian distribution degradation model obtains the largest maximum likelihood function value and the best fitting effect on long-term data. Its RMSE is the lowest among the five models, only 0.38 times that of the basic Gaussian distribution degradation model and 0.46 times that of the high-order Gaussian distribution degradation model. The Gamma distribution degradation model is slightly worse than the constant-σ Gaussian distribution degradation model, with an RMSE 5.8% higher. The variable-β Gamma distribution degradation model has a better fitting effect than the Gamma distribution degradation model: its RMSE is 4.95% lower, which leaves it only 0.56% higher than the RMSE of the constant-σ Gaussian distribution degradation model, indicating that its fitting effect is very close to that of the constant-σ Gaussian distribution degradation model. In the Gaussian distribution, the median is the same as the expected value, so for the three Gaussian models the evaluation indicators of the median are the same as those of the expected value. For the Gamma distribution model and the variable-β Gamma distribution model, the RMSE of the median is higher than the RMSE of the expected value, but by less than 1.15%.
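The three relative RMSE figures for the Gamma models can be cross-checked against each other. If the Gamma model is 5.8% above the constant-σ Gaussian model, and the variable-β Gamma model is 4.95% below the Gamma model, then the variable-β Gamma model relative to the constant-σ Gaussian model is:

```latex
\frac{\mathrm{RMSE}_{\text{variable-}\beta}}{\mathrm{RMSE}_{\text{constant-}\sigma}}
  = (1 + 0.058)\,(1 - 0.0495)
  = 1.058 \times 0.9505
  \approx 1.0056
```

This is the 0.56% gap quoted in the text, so the 4.95% reduction is measured against the Gamma model and the 0.56% excess against the constant-σ Gaussian model.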

| Prediction effect of the models based on long-term data
The degradation model has a corrective effect on power prediction. For example, in Kuitche, 17 the statistical distribution method is used to predict the power of photovoltaic modules, which indicates that the trend prediction effect of a degradation model is also an important indicator. This study uses the eight data points from 1998 to 2006 in datasheet 1 as the training set and the three data points from 2007 to 2009 as the testing set, as mentioned in Section 3.1. The three testing points are used to evaluate the prediction effect of the five models on long-term data. As shown in Table 3, the training set is fitted by MLE, and the three constant parameters A, B, and C of each model are obtained. The objective function values L show that all five models converge to their maximum likelihood. Table 4 shows the prediction effect of the five models. The constant-σ Gaussian distribution degradation model far outperforms the other four models: the RMSE of its expected value is 0.227 times that of the basic Gaussian distribution degradation model, 0.166 times that of the high-order Gaussian distribution degradation model, 0.355 times that of the Gamma distribution degradation model, and 0.583 times that of the variable-β Gamma distribution degradation model. The constant-σ Gaussian distribution degradation model also obtains the largest maximum likelihood function value in this evaluation. The variable-β Gamma distribution degradation model predicts significantly better than the Gamma distribution degradation model, with an RMSE 0.609 times that of the Gamma distribution degradation model.
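The quoted ratios are mutually consistent. Since the RMSE of the constant-σ Gaussian model equals 0.355 times that of the Gamma model and 0.583 times that of the variable-β Gamma model, the ratio between the two Gamma models follows as:

```latex
\frac{\mathrm{RMSE}_{\text{variable-}\beta}}{\mathrm{RMSE}_{\text{Gamma}}}
  = \frac{0.355}{0.583}
  \approx 0.609
```

which matches the factor of 0.609 reported for the improvement of the variable-β Gamma model over the Gamma model.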

| Fitting effect of the models based on short-term data of single module
This study uses all 126 data points in datasheet 2, as mentioned in Section 3.1, as the training set, and evaluates the fitting effect of the five models on the short-term data of a single module. Datasheet 2 is fitted by MLE separately for each photovoltaic module, which gives the three constant parameters A, B, and C of each of the five models for each module, and the fitting results of the 21 photovoltaic modules are obtained. Averaging each evaluation indicator over the modules gives the following results.
For the fitting effect on short-term data, the RMSE values differ very little, as shown in Table 5. The fluctuation is less than 0.6%, indicating that this task is relatively easy and that multiple models can achieve good fitting results. In the short term, the degradation rate of photovoltaic modules does not change significantly.
For the Gamma and variable-β Gamma models, the RMSE of the median is lower than the RMSE of the mean, which may indicate that the median performs better than the mean on short-term data.

| Fitting effect of the models based on short-term data of multiple modules
This study uses all 126 data points in datasheet 2, as mentioned in Section 3.1, as the training set, and evaluates the fitting effect of the five models on the short-term data of multiple modules. Datasheet 2 is fitted by MLE, and the three constant parameters A, B, and C of each model are obtained. The objective function values L show that the five models converge to their maximum likelihood. The results are shown in Figures 8-12 and Tables 6 and 7.
On the short-term multimodule data, the constant-σ Gaussian distribution degradation model performs best. Its RMSE of the mean and of the median is 0.6263, which is more than 15% lower than that of the basic Gaussian distribution degradation model.
The RMSE of the mean and of the median of the variable-β Gamma distribution degradation model is slightly lower than that of the Gamma distribution degradation model, which indicates that performance improves from the Gamma distribution degradation model to the variable-β Gamma distribution degradation model.

| Comparison and discussion
The deviation between the mean of the distribution and the measured value shows the effect of a model intuitively. The absolute deviations of the degradation rate for the long-term data fitting are calculated and shown in Figure 13.
For the fitting effect on long-term data, the high-order Gaussian distribution degradation model obtains a small improvement over the basic Gaussian distribution degradation model, whereas the constant-σ Gaussian degradation model obtains a great improvement over both. The variable-β Gamma distribution degradation model is slightly better than the Gamma distribution degradation model, and the constant-σ Gaussian degradation model performs about as well as the variable-β Gamma distribution degradation model. For the prediction effect on long-term data, however, the constant-σ Gaussian degradation model clearly performs better than the variable-β Gamma distribution degradation model. The new models achieve a better prediction effect and a better fitting effect, so they estimate degradation more accurately and can be used to correct the power prediction of PV modules or to predict the lifetime of PV modules.
For the fitting effect on the short-term data of a single module, the Gamma distribution degradation model performs slightly better than the other models. This may suggest that the parameter β of the Gamma distribution does not change with time over a short period. For the fitting effect on the short-term data of multiple modules, it is assumed that the modules obey the same law of degradation. The models perform better here than on the short-term data of a single module, which may indicate that photovoltaic modules in the same environment follow the same degradation distribution.
Comparing the models based on the Gaussian distribution with those based on the Gamma distribution, the constant-σ Gaussian degradation model, which is the best of the Gaussian models, performs better than the models based on the Gamma distribution. This suggests that the Gaussian distribution is more in line with the degradation law of photovoltaic modules.

| Advantages and disadvantages
In this paper, the relationship between the parameters of the distribution and time in the degradation model of photovoltaic modules is discussed by establishing different models. Three new models are established, and the solution of the maximum likelihood problem is simplified by using particle swarm optimization. All three new models improve on the original models. This experiment does not fully prove that photovoltaic modules in the same environment obey the same distribution, and more data and experiments are needed to test whether this assumption is true.

| CONCLUSION
This paper introduces the PV module degradation models based on the Gaussian distribution and the Gamma distribution, and establishes three new PV module degradation models on this basis: the high-order Gaussian distribution degradation model, the constant-σ Gaussian distribution degradation model, and the variable-β Gamma distribution degradation model. The five models are verified in several respects, and the constant-σ Gaussian distribution degradation model achieves the best performance overall. This suggests that the parameter σ in the Gaussian distribution degradation model may not change with time.
The high-order Gaussian distribution degradation model, the Gamma distribution degradation model, and the variable-β Gamma distribution degradation model do not perform well in prediction. In single-module short-term data fitting, the median is more accurate than the mean. For PV degradation modeling, the Gaussian distribution performs better than the Gamma distribution.