Energy yields of small grid connected photovoltaic system: effects of component reliability and maintenance

The likelihood of system failure of small systems is investigated in order to establish the risk associated with investment in a photovoltaic (PV) system for small domestic applications. This is achieved by reviewing existing literature on PV system failure rates and using these as input for a statistical PV system yield simulation tool that considers failure and repair. It is typically assumed that these systems do not require any maintenance, but it is shown that this would have a near-catastrophic impact on the energy production of PV systems. The no-maintenance case is not a likely scenario, as small systems have to register their generation to receive a feed-in-tariff. At a later stage, when PV is used for self-consumption only, this may change, but in the present market most users are forced to carry out a quarterly check, and thus this catastrophic failure is avoided by the need to apply for the feed-in-tariff. Minimum maintenance strategies for ensuring profitable system operation are investigated and their cost-effectiveness is discussed. It is shown that the present situation, where many systems are neither monitored nor maintained, results in a high probability of unsuccessful system operation, as failure detection may take a very long time. Unsuccessful system operation is defined here as not recovering the financial investment. It would be advisable to carry out at least monthly performance checks, as otherwise more than 10% of the energy is likely to be lost because of system downtime. This requires, however, availability of irradiance data, as otherwise it is not possible to identify whether low yields are due to resource issues or to genuine system issues.


Introduction
Photovoltaic (PV) installations have been growing over recent years in the UK, as well as in other markets. In the UK, the number of installed systems reached half a million in December 2013 and surpassed 600 000 installations in September 2014. This is an increase from about 400 installed systems in January 2010. This significant increase in installation numbers has resulted in a number of systems not quite achieving predicted performance values, as can be seen, for example, in the tail of performance ratios discussed in the presentation of [1]. This is due to issues in system design, including meteorological datasets, installation quality and reliability. System design issues are not always fixable as, for example, heavily shaded systems will be difficult to improve. However, installation quality and reliability issues will affect the system yield just as much, if not more.
The root cause for this is that the rapid growth of installations has put a lot of pressure on installers to increase staffing and capacity. For example, this rapid increase has resulted in many systems being sold by less than ideally qualified staff, as evidenced by obvious design flaws (as discussed in terms of shaded systems, e.g. in [2]).
PV systems are seen as very reliable and maintenance free. A very significant proportion of these systems has been installed without any monitoring and inverters are often not in places enabling easy inspection, particularly in the context of small domestic systems. The consequence of this is that many systems run with no or minimal maintenance. This may result in failures going unnoticed.
Failure in technical systems is normally separated into three distinctly different effects, summing up to the overall observed failure rate, as illustrated in Fig. 1. These are:
• Infant mortality: These are initial failures caused by inappropriate handling or issues in the manufacturing. They typically show in the very early stages of life, say the first couple of months, and decline exponentially thereafter.
• Random failures: These are unpredicted events and cannot be avoided. These may be caused by unrelated events, for example, a house fire or unpredicted hail. However, these are still failures that need to be considered when investigating the reliability of systems.
• Wear out: These are the typical end-of-life failures. The time and shape of onset depend on the reliability engineering and specific mechanisms. It is expected that these are more common in the second half of a system's life.
The shape given in Fig. 1 is illustrative only; the precise shape of each component is dependent on a variety of factors, such as manufacturing quality, component specifications and system layout. Site-dependent stresses such as ramp rate of module temperature and irradiance and shading will also contribute to the system ageing.
The only regular data check done on the majority of systems in the UK is a quarterly reading of kWh produced, which is submitted to the feed-in-tariff (FIT) provider who will then reimburse the system owner. The variability of solar resource makes it difficult for the non-expert to interpret these readings, and partial failures may go unnoticed for a long time. Failures affecting the entire system would be identified on average 3 months after occurrence (assuming that the first reading is not conclusive yet). There are examples of better monitored smaller PV systems, for example, the field trials [3], but in many cases there are issues in terms of reliability of particular data. The sensors may actually cause a number of unnecessary maintenance issues, resulting in critical readings being ignored. In general, small systems will not have detailed monitoring, which limits the possibilities for detailed assessment, and thus the situation is different to larger systems as discussed in [4]. Larger systems should have dedicated operators investigating and reporting failures as described in [5]. It is critical to distinguish between reliability and durability. The issue of durability, i.e. the slow degradation or wear out of performance without catastrophic failure, is not considered here but investigated by other studies [6]. This paper focusses on reliability issues, that is, failures. This paper also assumes identical components, that is, no component-to-component variation in the performance.

IET Renewable Power Generation
Special Issue on 10th Photovoltaic Science, Application and Technology Conference (PVSAT-10)
PV systems are ensembles of a large number of components, such as PV modules, electrical diodes, cables, connectors, inverters and many more. Each of these components can fail, with variable impact on the energy yield of the PV system. This paper predicts the likelihood of component failure based on literature reports on faults. A stochastic model is used to estimate the energy loss as a function of the maintenance scheme. It will be shown that a simple maintenance scheme of monthly checks is sufficient for ensuring profitable operation of a PV system. Profitable operation is defined here as not having a performance loss higher than 5% compared with the predicted output, which is of the order of the financial return people would expect from their system.

Failure rates
It is unlikely that any technology will be fault free and PV is no exception. There are a number of reports of variable performance [3, 7-10] of smaller systems. It is difficult to obtain reliability data, especially as small systems are typically not monitored and there is no systematic analysis of existing systems beyond field trials and voluntarily submitted data. A summary of data used in this study and its sources is shown in Table 1. The shape of the wear out depends on the shape of the failure curves, which is typically either exponential (as indicated in Fig. 1) or Weibull type of probability density functions (PDFs).
This paper considers only the mixture of random and wear-out failures; infant mortality is not considered. This is justified because these initial failures should normally be picked up during commissioning. They may occur a bit later, but it is impossible to obtain any reliable data on this. Excluding them leads to a slight underestimation of real failure rates, but this cannot be addressed with the data presently available. The aim of this paper is to look at longer term impact and thus the use of Weibull and exponential curve shapes is appropriate. The paper is not really concerned with the underlying causes, as discussed in [14,15], but with the absolute number and timing of failures.
The uptime of a system depends on a number of factors, which are described by the mean-time-between-failures (MTBF) and mean-time-to-repair (MTTR). MTTR is the sum of mean-time-to-detect (MTTD) and mean-time-to-fix (MTTF). Thus, there is a strong dependence on system instrumentation, as MTTD depends strongly on the quality of monitoring, and maintenance strategy, as MTTF depends, for example, on the ordering time for components and stock availability.
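As a rough illustration of how these quantities combine, the sketch below adds MTTD and MTTF to give MTTR and converts the result into a steady-state availability using the textbook relation MTBF / (MTBF + MTTR). This relation and the function names are assumptions made for illustration and are not taken from the simulation tool. The example numbers (an inverter MTBF of 24 820 h, detection after 1.5 quarterly readings and six weeks to fix) are indicative only.

```python
HOURS_PER_YEAR = 8760.0
HOURS_PER_MONTH = HOURS_PER_YEAR / 12.0
HOURS_PER_WEEK = 7.0 * 24.0


def mean_time_to_repair(mttd_h: float, mttf_h: float) -> float:
    """MTTR as the sum of mean-time-to-detect and mean-time-to-fix (hours)."""
    return mttd_h + mttf_h


def availability(mtbf_h: float, mttr_h: float) -> float:
    """Steady-state availability, assuming MTBF is the mean up time."""
    return mtbf_h / (mtbf_h + mttr_h)


# Illustrative inputs (assumptions, not simulation outputs).
inverter_mtbf_h = 24820.0              # inverter MTBF in hours
mttd_h = 1.5 * 3.0 * HOURS_PER_MONTH   # 1.5 quarterly reporting cycles
mttf_h = 6.0 * HOURS_PER_WEEK          # six weeks to obtain parts and fix

mttr_h = mean_time_to_repair(mttd_h, mttf_h)
print(f"MTTR: {mttr_h:.0f} h, availability: {availability(inverter_mtbf_h, mttr_h):.3f}")
```

The example makes the dependence on monitoring explicit: shortening MTTD is the only term a system owner can influence cheaply, and it dominates MTTR in this setting.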
System failure is modelled in this study by Weibull and exponential PDFs, which are commonly used in reliability simulations as they are relatively simple to model and are often a good approximation to normal service life. The bulk of the lifetime simulated will represent the normal operating period of a system. Considering this and the lack of published data, the Weibull and exponential PDFs are considered valid for this application.
The aim of this paper is to investigate reliability issues in a statistical approach and thus a statistical simulation was developed. Tests were run to confirm that the software was calculating the correct number of failures for each component. A number of different failure distribution profiles were tested over a varying number of simulation runs to determine the appropriate number of runs for the Monte-Carlo simulation model that was developed to assess the probabilities of failure. All the simulated failure profiles provided a good match with the equivalent calculated failure profiles. However, there is a slight tendency for Weibull distributions to underestimate failure rates. This improved with an increasing number of simulation runs, at the cost of increased processing time. After some experimenting, 10 000 runs was found to be the optimum balance of speed and precision, as illustrated in Fig. 2.

Simulation
A simulation tool has been developed that calculates times of failure for failure rates represented by exponential, normal and Weibull PDFs in the framework outlined in Fig. 3. Provided there is a suitable equation for time of failure, minimal modifications would be required to account for other types of PDF. The equations for failure are given in Appendix.

Table 1 Failure data used in this study and their sources
Component | Distribution | Shape | Scale | Location | Failure rate (10^-6 /h) | MTBF (h) | Source
[...] | [...] | [...] | [...] | [...] | n/a | n/a | [12]
breaker - DC | none found (a)
connector/couple | exponential | n/a | n/a | n/a | 0.00024 | 4 166 666 667 | [11]
differential breaker | exponential | n/a | n/a | n/a | 5.712 | 175 070 | [11]
diode - DC | exponential | n/a | n/a | n/a | 0.313 | 3 194 888 | [11]
fuses - AC | none found (a)
fuses - DC | Weibull 2 | 1.5 | 175 000 | n/a | n/a | n/a | [12]
grid connection | exponential | n/a | n/a | n/a | 114.2 | 8757 | [13]
inverter | exponential | n/a | n/a | n/a | 40.29 | 24 820 | [11]
junction box/row box | Weibull 2 | 0.51 | 28 800 000 | n/a | n/a | n/a | [12]
mountings | none found (a)
PV module | exponential | n/a | n/a | n/a | 0.0152 | 65 789 474 | [11]
PV module | Weibull 3 | 0.28 | 1.248 × 10^14 | 408 | n/a | n/a | [12]
switches - AC | exponential | n/a | n/a | n/a | 0.034 | 29 411 765 | [11]
switches - DC | exponential | n/a | n/a | n/a | 0.2 | 5 000 000 | [12]
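One common way to turn such distributions into times of failure is inverse-transform sampling, sketched below for the exponential and Weibull cases. This is an assumption about how such a tool could work, not a reproduction of the Appendix equations. The function names are hypothetical, and the failure rates are taken to be in failures per 10^6 hours, which is consistent with the listed MTBF values.

```python
import math
import random


def exponential_failure_time(rate_per_hour: float, u: float) -> float:
    """Time of failure (h) for a constant failure rate; u is uniform in [0, 1)."""
    return -math.log(1.0 - u) / rate_per_hour


def weibull_failure_time(shape: float, scale_h: float, u: float,
                         location_h: float = 0.0) -> float:
    """Time of failure (h) for a two- or three-parameter Weibull distribution."""
    return location_h + scale_h * (-math.log(1.0 - u)) ** (1.0 / shape)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    inverter_rate = 40.29e-6  # failures per hour (exponential entry, inverter)
    draws = [exponential_failure_time(inverter_rate, rng.random())
             for _ in range(10_000)]
    print("mean simulated inverter time of failure (h):", sum(draws) / len(draws))
    # DC fuses: two-parameter Weibull with shape 1.5 and scale 175 000 h.
    print("one DC fuse draw (h):", weibull_failure_time(1.5, 175_000.0, rng.random()))
```

For the exponential case the sample mean should approach the MTBF (the reciprocal of the rate) as the number of draws grows, which gives a simple check of the sampling code.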
Components are maintained or failures detected and repaired according to scheduled maintenance, MTTD and MTTR parameters set from the main window.
A simulation based on [16] was written to calculate the energy yield of a PV system over its lifetime, as illustrated in Fig. 4. A Monte-Carlo simulation is carried out in which the failure and repair likelihood is calculated for each time step. The simulations were carried out for small systems, typically with strings of ten modules in length and a small number of parallel strings. These are typical system configurations seen in today's domestic market for the dominating system size (in terms of numbers) of 2-4 kWp. The occurrence of a failure was determined according to the time of failure equations in Appendix and performance was simulated according to the state of the system and the meteorological data taken from our reference dataset, which was an hourly dataset generated with Meteonorm [17] and then repeated for each year. The yield was accumulated over all time steps and stored for the entire simulation period. Failure and maintenance parameters were adjusted to represent the following scenarios:
• No failures.
• No maintenance.
• Various scheduled maintenance.
• Regular simple system output check.
• Selected simulations repeated with module degradation.
• Selected simulations repeated with improved component reliability.
The assumption of a fixed time to repair for all components is a simplification. Some failures may only be transient in nature or have relatively small performance effects, which would result in these not being detected during normal, non-monitored operation. Similarly, the use of inappropriate sensors may result in slow mean times to detect a failure, which then also will result in an increased MTTR and thus increased avoidable losses.

Energy loss
It was checked that the simulation of various systems resulted in realistic energy production in terms of kWh/kWp to confirm the appropriateness of the simulation methodology. All simulation cases were tested and resulted in consistent yield figures, that is, the interconnection strategies were working well in the case of no failures being introduced. The different scenarios were then run repeatedly in a Monte-Carlo fashion and the energy generation trends were shown in the form of PDFs, as this would be required for a risk or cost-benefit analysis.
The energy output trends show the expected patterns:
• With no maintenance, annual energy output is rather low and the vast majority of systems will not deliver 25% of the energy yield, as demonstrated in Fig. 5. This is largely driven by inverter failures. Inverters are expected to last about half the system design time of 25 years, which means that they should be replaced once during the system's operation. Not replacing them will on average halve the energy output. Other failures, typically less frequent, will increase the likelihood of systems being non-operational or having production losses.
• With annual maintenance, the performance over the system lifetime is around 80% of the failure-free performance (see Fig. 6). This is not an unusual result in systems with expected failures (e.g. the inverter replacement mentioned above).
That a system without maintenance performs worse than one with maintenance appears self-evident. However, the energy losses in a maintenance-free system as shown in Fig. 6 are surprisingly drastic, which is due to the fact that any failure in our simulation will take out the entire string or indeed the entire system, that is, each failure has catastrophic effects on the performance. This is an overestimation, as not all failures will cause a complete system outage. Unfortunately, this is the best that can be done with the failure data published. This apparent overestimation highlights the need for more up-to-date and more comprehensive data to be published. The main failure mode in the simulations discussed here was the grid connection, which depends on the quality of the grid and the way the grid failure is managed. The source of the failure estimate is not as current as it could be, but not entirely unreasonable. The resulting MTBF of about 1 year is well in excess of what was observed in monitoring work carried out in the field. The issues seen then, however, were largely down to the grid voltage overshooting. Inverters and protection equipment used in these trials were optimised for 220 V operation while the UK grid is closer to 240 V. Inverters cut out whenever the grid voltage reached 250 V, which happened rather frequently. This has been taken into account and it is the author's expectation that the grid connection side is much more robust nowadays. Unfortunately, no newer reference could be identified and thus rather high failure rates are predicted. Ignoring this failure mode would still result in high failure rates, as the power conditioning block of a system as a whole still contributes a number of failures. The second biggest contribution here is the inverter. The failure rate of this was also estimated by Zini et al. [11]. It is heavily dependent on their underlying dataset and data published to date seem to be rather pessimistic.
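The effect of never replacing a failed inverter can be approximated in closed form. The sketch below assumes, as a simplification that is not part of the simulation tool, that the inverter is the only failing component, that its time of failure is exponentially distributed and that a failure stops all generation until end of life. Under these assumptions the expected fraction of the lifetime L spent in operation is (1 - exp(-L/MTBF)) / (L/MTBF). The function name is hypothetical.

```python
import math

HOURS_PER_YEAR = 8760.0


def expected_uptime_fraction_no_repair(mtbf_h: float, lifetime_h: float) -> float:
    """E[min(time of failure, lifetime)] / lifetime for an exponential failure time."""
    ratio = lifetime_h / mtbf_h
    return (1.0 - math.exp(-ratio)) / ratio


lifetime_h = 25.0 * HOURS_PER_YEAR

# Inverter expected to last about half the 25 year design life.
print(expected_uptime_fraction_no_repair(12.5 * HOURS_PER_YEAR, lifetime_h))

# Inverter MTBF of 24 820 h as used in the failure data of this study.
print(expected_uptime_fraction_no_repair(24820.0, lifetime_h))
```

Energy yield is taken here to scale with uptime, which ignores seasonal variation in the resource; the point is only to show how strongly an unrepaired inverter failure limits lifetime yield.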
Modern inverters will be more reliable which will shift the entire distribution towards the higher lifetime energy production. The suggestion of MTBF in the order of 3 years appears to be rather low, but no contradicting data could be identified. Major inverter manufacturers mentioned that MTBF should be in the range of 30+ years nowadays, but no reference of observed field reliability has been found on this. Present guidance in the UK is that the inverter needs to be replaced once during the system lifetime, which would cut the energy yield of the unmonitored approach in half.
Component failure rates are fairly low, that is, the reliability of individual components is good. The combination as well as the number of components in a system makes a failure during the lifetime rather likely, but the main issues here will be sub-standard components being fitted or incorrect fitting, which is not easy to assess, and all present information is too anecdotal to use in a scientific paper.
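The effect of component count can be illustrated with a simple series-system calculation: for independent components with constant failure rates, the rates add, and the probability of at least one failure within a time T is 1 - exp(-T × sum of rates). The sketch below uses the exponential failure rates from Table 1 (taken to be in failures per 10^6 hours) and leaves out the inverter and grid connection. The component counts for a single string of ten modules are hypothetical and chosen only for illustration.

```python
import math

HOURS_PER_YEAR = 8760.0

# Failure rates in failures per hour (exponential entries of Table 1).
RATES_PER_HOUR = {
    "pv_module": 0.0152e-6,
    "connector": 0.00024e-6,
    "diode_dc": 0.313e-6,
    "switch_dc": 0.2e-6,
    "switch_ac": 0.034e-6,
    "differential_breaker": 5.712e-6,
}


def prob_at_least_one_failure(component_counts, rates, lifetime_h):
    """Probability that any of the listed independent components fails."""
    total_rate = sum(n * rates[name] for name, n in component_counts.items())
    return 1.0 - math.exp(-total_rate * lifetime_h)


lifetime_h = 25.0 * HOURS_PER_YEAR

# Hypothetical bill of materials for one string of ten modules.
string_counts = {
    "pv_module": 10,
    "diode_dc": 30,       # assumes three bypass diodes per module
    "connector": 20,
    "switch_dc": 1,
    "switch_ac": 1,
    "differential_breaker": 1,
}

print("single module:", prob_at_least_one_failure({"pv_module": 1}, RATES_PER_HOUR, lifetime_h))
print("whole string :", prob_at_least_one_failure(string_counts, RATES_PER_HOUR, lifetime_h))
```

Under these assumptions a single module is very unlikely to fail within 25 years, while the assembled string is very likely to see at least one failure, which is the point made above.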
The no maintenance approach clearly is an extreme case and in times of a FIT not very likely. However, the calculations above clearly demonstrate the importance of at least some sort of ongoing quality assurance. The question to be investigated in the following is which granularity in the monitoring strategy is appropriate.
The model is then used to predict the energy losses due to different maintenance regimes. Maintenance is assumed to also include a repair of any fault identified. This clearly is an oversimplification because the key figure in this context is the MTTR, which is the sum of MTTD, mean-time-to-component-availability and mean-time-to-service; that is, it also depends on the likelihood to spot the failure which is not always instantaneous, the time it takes to order and to obtain a replacement part and the time it then requires to send out a service engineer to fix the problem. However, it is virtually impossible to obtain reliable data on these and thus it is assumed that maintenance also means spotting the problem and then fixing it in the same instant.
The maintenance intervals chosen were no maintenance, 5 yearly, annual, 6 monthly, monthly, daily (close monitoring) and no failures as a reference case. Systems with up to 30 modules were simulated with the combination of 5 × 1, 5 × 2, 5 × 5, 10 × 1, 10 × 2 and 10 × 3. The calculated lifetime energy yields are plotted in Fig. 7. The system lifetime was set at 25 years.
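The maintenance assumption described above (a fault persists until the next scheduled inspection, where it is found and repaired instantly) can be sketched as a small Monte-Carlo loop. This is a strongly reduced illustration and not the simulation tool used in this paper: it considers a single component with exponentially distributed failures (the inverter rate from Table 1), treats every failure as a complete outage and uses downtime as a proxy for lost energy. All function names are hypothetical.

```python
import math
import random

HOURS_PER_YEAR = 8760.0


def lost_fraction_one_run(rate_per_hour, inspection_interval_h, lifetime_h, rng):
    """Return the fraction of the lifetime one simulated system spends down."""
    t = 0.0
    downtime = 0.0
    while True:
        # Time of the next failure, counted from the last repair (or the start).
        t += rng.expovariate(rate_per_hour)
        if t >= lifetime_h:
            break
        if inspection_interval_h is None:
            # No maintenance: the fault persists until end of life.
            repaired_at = lifetime_h
        else:
            # The fault is found and fixed at the next scheduled inspection.
            next_inspection = math.ceil(t / inspection_interval_h) * inspection_interval_h
            repaired_at = min(max(next_inspection, t), lifetime_h)
        downtime += repaired_at - t
        t = repaired_at
    return downtime / lifetime_h


def mean_lost_fraction(rate_per_hour, inspection_interval_h,
                       lifetime_h=25.0 * HOURS_PER_YEAR, runs=10_000, seed=1):
    """Average lost fraction over many independent simulated lifetimes."""
    rng = random.Random(seed)
    total = sum(lost_fraction_one_run(rate_per_hour, inspection_interval_h,
                                      lifetime_h, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    inverter_rate = 40.29e-6  # failures per hour
    scenarios = [("no maintenance", None),
                 ("5 yearly", 5.0 * HOURS_PER_YEAR),
                 ("annual", HOURS_PER_YEAR),
                 ("6 monthly", HOURS_PER_YEAR / 2.0),
                 ("monthly", HOURS_PER_YEAR / 12.0),
                 ("daily", 24.0)]
    for label, interval in scenarios:
        print(f"{label:15s} mean lost fraction: {mean_lost_fraction(inverter_rate, interval):.3f}")
```

Collecting the per-run values from `lost_fraction_one_run` instead of only their mean would give the distribution of outcomes, which is what a risk assessment needs.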
The simulations confirm that some basic maintenance is required for PV systems, which is not what is presently being sold to customers. Relying on a simple annual inspection only could result in a 20% energy loss on average. The loss can be reduced if the inspection interval is shortened to monthly, in which case <5% of the energy would be lost. The close monitoring scenario delivers a near perfect yield, but that is to be expected given the simplifications in MTTR mentioned above.
The second indicator of the relevance of maintenance intervals is the width of the distribution. In terms of investment it is also relevant how wide the distributions in the performance profiles are, as this determines the risk margins. These are shown in Fig. 8 for a 10 × 3 system, but the result is similar for other configurations. The distribution is extremely wide for the no-maintenance option, that is, there is a low probability of successful operation for the expected system lifetime. In the case of annual inspections, the full-width-half-maximum is <10%, which indicates that normally only a single failure occurs for the data used in these simulations. It also indicates that the performance risk is acceptable. Monthly maintenance would guarantee successful system operation.
A preliminary investigation into the effects of improved component reliability was made by modestly increasing the inverter, transformer, AC and DC protection mean-time-between-failure (MTBF) and scale parameters. The results of this estimation are given in Fig. 9. Note that this was done subjectively and the improvements may not be achievable or affordable in reality. There was an increase in average yield of ∼10% for both regimes tested compared with the original reliability values. As with the previous tests, annual maintenance significantly improved the average yield, in this case to over 90% of the maximum possible compared with around 20% for a system with no maintenance.
The increase in performance quality is somewhat arbitrary but it serves to demonstrate the importance of the overall reliability on the outcome of the system operation. It appears that a significant percentage of the lifetime energy yield will be lost either way if no close monitoring or monthly maintenance is being applied.

Conclusions
The system performance is shown to depend critically on the maintenance regime. In the context of the common UK operation, where systems are observed only at quarterly intervals, there is a very significant risk (>80%) of not operating a profitable system, based on the underlying reliability data. Profitable is defined here as losing no more than 5% of the energy that would be produced if there were no reliability issues. This corresponds to 5% of overall income potential and is of the order of the interest one would expect to earn with a PV system. The underlying reliability data may be slightly pessimistic and the present generation of technologies should be more reliable. However, there have also been a large number of new entrants to the PV market, which could actually result in higher failure rates, as many new products will have an initial period of rapid reliability improvements. This demonstrates that the energy lost because of reliability issues is potentially larger than the mere performance difference between different technologies.
The most cost-effective time span for monitoring is in the range of monthly investigations. To make this meaningful for small domestic systems, one would need tools to place these systems into the context of the weather patterns seen for the month in question. Small domestic systems, which are the most likely to suffer from poor quality assurance during their time of operation, are also the least likely to have any monitoring installed. In the UK, about 580 000 out of 600 000 systems in October 2014 fall into this category. The prediction of 20% energy loss for 6-monthly maintenance puts a lot of these at risk of not achieving their investment potential. The only slight measure of reassurance is that this paper uses some overestimations for failure rates of grid connection faults and inverter faults. The assumption of 6-monthly maintenance is appropriate as the MTTR would be in the range of 6 months: MTTD would be on average 1.5 FIT reporting cycles (about 4.5 months) and an MTTF of 6 weeks would also appear reasonable. This would indicate an urgent need for quality assurance support for domestic system owners.

Acknowledgments
This work has been funded in part through the research project 'PV2025 - Potential Costs and Benefits of Photovoltaic for UK Infrastructure and Society', which is funded by the RCUK's Energy Programme (contract no: EP/K02227X/1).
Cumulative distribution function