Reliability, availability, maintainability data review for the identification of trends in offshore wind energy applications
Renewable and Sustainable Energy Reviews

This work presents a comprehensive review and discussion of the identification of critical components of the currently installed and next generation of offshore wind turbines. A systematic review of the reliability, availability, and maintainability data of both offshore and onshore wind turbines is initially performed, collecting the results from 24 initiatives, at system and subsystem level. Due to the scarcity of data from the offshore wind industry, the analysis is complemented with the extensive experience from onshore structures. Trends based on the deployment parameters for the influence of design characteristics and environmental conditions on the onshore wind turbines' reliability and availability are first investigated. The estimation of the operational availability for a set of offshore wind farm scenarios allowed a comparison with the recently published performance statistics and the discussion of the integrity of the data available to date. The failure statistics of the systems deployed offshore are then discussed and compared to the onshore ones, with regard to their normalised results. The availability calculations supported the hypothesis of the negative impact of the offshore environmental conditions on the reliability figures. Nonetheless, similarities in the reliability figures of the blade adjustment system and the maintainability of the power generation and the control systems are outlined. Finally, to improve the performance prediction of future offshore projects, recommendations on the effort worth putting into research and data collection are provided.


Introduction
Despite the efforts to achieve a through-life reliable design and attempts to control the failures of wind turbines, some system failures are inevitable. The inherent requirement for either cost, or material and weight optimisation, together with the extreme operating conditions, can lead to unexpected failures. This is true for land-based turbines and has an even greater impact on offshore wind systems, where the harsh environment and the high cost of the assets and logistics increase the importance of a proactive approach to the system's maintenance. Since the early-stage wind farms, a considerable effort has been made in collecting indicators for their reliability, availability and maintainability (RAM) statistics and putting them into databases [1]. National and international initiatives have been mainly directed at creating repositories for onshore wind turbines [2]. Only recently have some initiatives focused on the collection of RAM data from modern [3,4] and/or offshore systems [5,6]. Although large and heterogeneous, the populations of some of the most well-known campaigns (e.g. Refs. [7,8]) generally include statistics of outdated configurations and small rated wind turbines, compared to modern installations and offshore trends. Nonetheless, the collection of historical data has proven useful for benchmarking critical components to support monitoring concept development and a systematic service life performance analysis.

Previous review works
The onshore data and the few available offshore data have already been analysed and cross-compared by several authors. In 2011, Sheng and Wang [9] compiled the first extensive survey of the various databases available until that year. Three more recent studies have significantly contributed to gathering and comparing the data available until the year 2018. Pfaffel et al. [2] presented a comprehensive collection of the RAM statistics available to date (a total of 23, of which 20 are onshore), updating the historical data comparison initiatives with the results from offshore wind farms, and including datasets from outside the European continent. The failure frequencies and downtime are presented in an all-in-one comparison according to standardised key performance indicators (KPIs), in their normalised and non-normalised form. Artigao et al. [10] cross-compared some of these reliability statistics (13 initiatives, of which two are offshore) with the purpose of identifying the critical components of wind energy converters across all technologies, sizes and locations, in order to suggest condition monitoring strategies, techniques and technologies. Similarly, Dao et al. [11] used the averaged statistics of [2] to visualise the trends in reliability and maintainability figures of offshore wind turbines, as opposed to the onshore ones, and assess their impact on operational cost, to assist operators in identifying the optimal degree of reliability improvement to minimise the levelised cost of energy.

Issues with data collection and comparison
Previous studies identified some trends in the averaged data, depending on the survey location (on- and offshore, and governing country), population size and mean power rating (e.g. Ref. [11]). However, two main issues related to the use of cumulative statistics still exist:
1) As highlighted by Leahy et al. [12], and previously by Sheng and O'Connor [13], the currently accessible RAM data lack a harmonised practice for their collection, processing and publication. The absence of standardisation in the type of data, and methods for their collection, leads to different levels of data quality. Furthermore, it is challenging to compare data among studies if project-specific and/or undocumented taxonomies are employed for the systems and subsystems characterisation.
2) As noticed in Refs. [1,14], technologies of different maturity, in different operating years, are expected to fail differently. The deployment location and the varying environmental conditions can play an important role in the lifespan reliability of the wind turbine systems and subsystems [15][16][17]. When comparing the statistics in terms of averages among the heterogeneous population [2,10,11], this level of detail is not considered.
To tackle the first issue, Leimeister et al. [14] recognised that fuzzy set and/or evidence theories can help deal with the uncertainties of vague data. Nonetheless, these methods cannot provide the same level of detail and information as a RAM database.
In 2011, the Continuous Reliability Enhancements for Wind (CREW) Database and Analysis Program, supported by the US government, introduced a consistent approach for the collection of high-resolution supervisory control and data acquisition (SCADA) data with the aim of characterising the reliability and performance of the country's fleet. Motivated by the standardisation intent, the members of IEA Wind Task 33 created, in 2013, the "Reliability Data Standardization of Data Collection for Wind Turbine Reliability and Operation & Maintenance Analyses: Initiatives Concerning Reliability Data". Similarly, industrially-led repositories are currently collecting data with a higher level of detail, adopting structured and harmonised procedures to accommodate the different types of data from wind farms [18,19], while providing sufficient information for a consistent comparison of the structure per typology, age and location.
With regard to the second point, some research effort has been put into finding a correlation between the turbines' failure rates and the associated environmental conditions [15,20]. In this regard, Barabadi et al. [21] suggested a methodology for RAM data collection of engineering structures in Arctic conditions, showing its applicability to the offshore and marine industries.

Scope of the analysis and methodology
Having established the need for more representative RAM databases, this work aims to perform a comprehensive review of existing published data related to the reliability, availability and maintainability of onshore and offshore wind turbines, with a view to critically discussing commonalities and distinguishing correlation aspects between modern and earlier-stage assets. This paper adds to the existing body of knowledge on the identification of the trends at the turbine, system and subsystem level, based on the specific design parameters and the deployment conditions (either onshore or offshore), as well as site-specific parameters. This can subsequently allow the development of a better understanding of the sensitivity of certain components and technologies to key design and environmental parameters, facilitating technology qualification of new alternatives and the reliability analysis of the units in a farm.
The structure of the paper is organised as shown in Fig. 1. An overview of the RAM statistics from historical repositories for onshore and offshore systems, as well as from the more structured industry-led initiatives, is given in Section 2. In Section 3, the methods for the uniform and consistent comparison of the statistics are presented. The RDS-PP® designation is adopted to establish a uniform and easy comparison of the results among the several datasets. The quality of the collected statistics is discussed in Section 4.1 based on the type of records used for assessing failure of the turbine. In Sections 4.2 and 4.3, the trends from the comparison of the RAM statistics are identified based on the design and deployment parameters of the wind turbines. Finally, in Section 5, RAM figures of offshore wind assets are discussed. The operational availability is evaluated for a set of reference offshore wind farms, to quantify the impact of statistical uncertainties and unveil the challenges for the offshore wind industry in the collection of representative data.

Review of onshore and offshore statistics
For the purpose of this paper, only fully operational data have been collected, rather than test data of single components (e.g. gearbox reliability collaboration [22] and blade reliability studies [23,24]). The RAM statistics found in the literature from single initiatives to summary reports are analysed in more detail. As the type of data collected is heterogeneous, a terminology is introduced for describing and classifying the findings from the several databases. Standardised definitions of reliability, maintainability, availability and performance indicators are summarised in the Appendix and are used to describe the data repositories throughout the paper.
Starting from a chronological overview of the main initiatives for onshore systems, the main findings are then outlined for the more recent offshore data collections. Despite including insights on offshore risks, the industry-driven databases (see Section 2.2) are generally not available to the public for confidentiality reasons. Consequently, publications from other independent authors are reviewed, to obtain an appreciation of the RAM experience of various offshore wind farms installed in European waters.

Overview of onshore data
One of the first reliability databases for onshore turbines was compiled by Lynette [25], who analysed the trend in availability and costs for the maintenance of various types of small-scale units installed in California until the end of the 1980s. Similarly, the Electric Power Research Institute (EPRI) collected, during 1986 and 1987, failure data for a portion of its Californian population [1]. These statistics were reported in Ref. [26] by the DOWEC (Dutch Offshore Wind Energy Converter) research program, which has been one of the pioneers in the documentation of wind turbine reliability figures. Due to the outdated technologies in the population, the Californian data were not integrated in their comparative study. Only the yearly statistics recorded around the end of the 1990s and the first years of the 2000s by the largest European data collection campaigns, namely the German and Danish Windstats newsletter and the German LWK (Land Wirtschafts-Kammer) and WMEP (Wissenschaftliches Mess-und Evaluierungsprogramm) databases, were cross-analysed. By plotting the results per size class (within the single initiatives) and across the databases, the DOWEC team simply observed a significant scatter in the trend, attributing the discrepancies among the statistics to the unknown age and the outdated type of some installations. In contrast, some years later, Ribrant et al. [27,28] outlined some similarities in the components' shares of the turbines' failure rate and downtime, when comparing the Swedish and Finnish databases (Vindstat and Felanalys, and VTT, respectively) with the current WMEP results.
In the same period, a substantial contribution to the collection and understanding of the turbine reliability statistics was made by Durham University (UK) and Fraunhofer IWES (Germany). For the first time, Tavner et al. [29,30] analysed and compared the time-trend of the reliability results from EPRI, Windstats, LWK and WMEP, plotting them against the historical data from other industrial turbines. Tavner et al. compared the components' failures of the LWK population, grouping them by layout and size [7,29,31]. Furthermore, Tavner et al. investigated the effect of weather and location on the turbines' failures, drawing some preliminary conclusions on their correlation. In Ref. [15,32] they observed a periodicity in the failure frequency for some of the Danish Windstat components, while in Ref. [20] they investigated the possible dependency of failures on the wind farm location for a population of Enercon E30-33 turbines. In parallel, and consistently with what is shown by Ref. [31][32][33][34], Faulstich et al. analysed the effect of the turbines' configuration and location (onshore, coastal, and offshore) on the reliability figures of the WMEP database [33,34]. In Ref. [35], they additionally explored the possible link between WMEP failures and wind speed, developing further the first observations of Hahn [36].
From the experience of the above-mentioned databases, and because of the increasing number of wind farm installations in Europe, more structured RAM data collections have been launched. Aimed at moving towards a design-for-reliability approach, and targeting improved condition monitoring techniques [8], the European ReliaWind project ran for three years from 2008. Starting from reviewing previous European projects (EUROWIN and EUSEFIA [2]) and national-level initiatives, this project collected and analysed heterogeneous data from the turbines operation and maintenance (O&M) activities, based on the joint effort of industry (e.g. Siemens Gamesa), technical experts (Garrad Hassan, now DNV GL) and academia (Durham University, among others). Contemporarily, in Germany, the Fraunhofer Institute continued the WMEP database activity in the "Increasing the availability of wind turbines" (EVW) project [37,38], which ended in 2015. Despite these projects representing two of the most recent and complete databases from the European experience, due to confidentiality issues, only their final reports and relative values of the total statistics are accessible.
From the first decade of the 21st century, other data collection initiatives from the rest of the world have contributed to documenting and tackling the reliability of onshore installed systems. Academic and industrial researchers in India, China and Japan published the first reliability and availability statistics reports for specific wind farms [39][40][41] and turbine manufacturers [3]. The CREW Database and Analysis Program [42], in the USA, which is an ongoing activity coordinated by Sandia Laboratories, has become more extensive and structured.

Overview of offshore data
Few failure data exist in the public domain for offshore wind systems. The performance of the UK's offshore round 1 wind farms, with evidence of the wind farms' availability indicators (see Appendix A) and capacity factors (CF), was first reported by Feng et al. [43]. In this work, as in Ref. [5], maintenance records and operational issues of four selected wind farms were analysed. Similarly, the reliability figures for the Egmond aan Zee wind farm were derived by Crabtree et al. [44] by accessing the operational report from Noordzee Wind [45], for the first three years from installation. In Ref. [44], they additionally updated the results from the early experience of round 1 wind farms, which were affected by technological-immaturity failure events. Besides, they collected the performance indicators of round 2 wind farms, showing a growth in the average CF for the more modern offshore wind turbines, in line with results presented by the SPARTA [46] and Offshore-WMEP [47] projects.
With regard to reliability and maintainability data, one of the most complete contributions is the dataset published by Carroll et al. [6], for a population of 350 offshore wind turbines. Although the results presented are from a single manufacturer, a detailed definition of failure is provided, together with further results on the repair time, material costs, and required technicians per subassembly.
In terms of availability, onshore wind turbines have been shown to reach values in a range of 95-97% for modern systems [2]. For offshore projects, however, the location and associated challenges (i.e. accessibility and exposure to extreme weather conditions) can considerably lower availability. As observed in Refs. [46,47], older farms (comprising turbines with relatively low nominal capacity, and relatively close to the coast) exhibit an availability in the range of the onshore average. Newer farms (bigger and generally located further from the coast) are characterised by an increase in maintenance efforts [48]. Despite the higher CFs [44], the technical availability (A_E, in Table A.2) of offshore wind farms can fluctuate across the years, depending on the distance from shore and hence the ease of performing the required maintenance operations [49]. Additionally, the fluctuation of A_E among several surveys can be related to the varying maintainability of components for the different wind turbines' concepts and designs [50,51].

Industry-led RAM databases
Following the compilation of the first databases (e.g. LWK, WMEP) and recognising the limits of these earlier data collection initiatives, several authors have suggested possible improvements to the techniques and processes used for gathering and analysing data (e.g. Ref. [42,52,53]). Among these, Hameed et al. [54] proposed an optimal RAM database to be applied to offshore wind turbines, identifying the shortcomings of the historical databases from both the onshore wind and the offshore oil and gas (Offshore Reliability Data, OREDA) industries. Observing how the lack of standards associated with reliability data adversely impacts industry progress in addressing the reliability issues, the IEA Wind Task 33 started compiling recommended practices [55].
In line with these conceptual examples are two recently launched RAM databases for onshore and offshore systems:
• The SPARTA (System Performance, Availability and Reliability Trend Analysis) initiative [19], started in 2013 by The Crown Estate (UK) under the supervision of the Offshore Renewable Energy (ORE) Catapult research centre. SPARTA is gathering KPIs (at wind farm level) and reliability figures (at subsystem level) from the participating operators, outputting a monthly benchmark.
• The German equivalent, WInD-Pool (Wind-Energy-Information-Data-Pool), with Fraunhofer IEE as trustee [18]. It can be seen as the successor to WMEP [33], where additional (but never published) information on the cost of the maintenance services was collected. It continues and merges the EVW (Erhöhung der Verfügbarkeit von Windenergieanlagen) [37,38] and Offshore-WMEP [56][57][58] research projects, gathering historic and recently collected data for both onshore and offshore wind turbines.

Methods
To consistently compare the RAM statistics and unveil potential trends, three methods are employed in this study. First, a cataloguing activity is used to gather the information available from the literature using a standardised terminology for the statistics, the type of data and the wind turbine typology. This process facilitates access to the statistics and allows the evaluation of the completeness and quality of the data collected by each initiative. Second, the adaptation of the RAM repositories to a unique taxonomy, based on the most widely adopted reference designation, permits comparing data fairly across different initiatives. Finally, calculations are carried out to uniformly compare the onshore and offshore reliability and maintainability data in terms of operational (time-based) availability for a hypothetical offshore wind farm scenario.

Cataloguing approach
To compile a comprehensive catalogue of the most important (and accessible) information for onshore and offshore RAM statistics, it is first necessary to identify what characteristics are worth being collected. The database and the population size are usually reported to provide an indication of the statistical significance of the data collected. However, consistency issues when discussing and comparing the results can arise from the lack of sufficient details on the population of wind turbines [26]. For this reason, the classification applied here differentiates the turbines by:
• Capacity (power rating). As several studies have already shown (e.g. Refs. [26,59,60]), the number of failures (per turbine and/or per component) can be dependent on the dimension of the turbines, making it necessary to present the results by power class grouping.
• Age of the installation. It is common knowledge that the failure rate is a time-dependent variable [32]. Therefore, the age of the system, whenever known, is reported to account for the possible influence on the population statistics (e.g. infant mortality and end-of-life wear-out failure events).
• Technical concepts and drivetrain configuration. A deep understanding of the results also comes from the knowledge of the type of structure and configuration analysed. It is proposed here to identify the drivetrain layouts according to the concept classes, as presented in Table 1. These configurations are similar to those of [49]; however, the sub-type acronyms are arranged to meet the concepts used by the WMEP [61] and LWK projects [31]. The generators' description and acronyms are retrievable from Refs. [62,63], respectively. Schematics of the configurations can be found in Ref. [64].
The terminology introduced in Appendix A is used to state which RAM indicators are provided in each reference. The level of detail reported is specified according to the conventions introduced in Table 2. Even though a constant (averaged) value of the failure rate over time is generally reported, based on the commonly assumed homogeneous Poisson process for the components' useful life, information on the time variance of the failure rate and the other RAM indicators is reported where possible. It must be noted that, although the best match between the provided definitions and the terminology of the single initiatives is pursued, the classification is completed based on the engineering judgement of the authors.
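As a minimal sketch of how a constant (averaged) failure rate is commonly obtained under the homogeneous Poisson process assumption, the snippet below divides the number of recorded failures by the cumulative exposure in turbine-years. The function names and the numbers in the usage note are illustrative assumptions and are not taken from any of the reviewed databases.

```python
import math


def average_failure_rate(n_failures, n_turbines, years):
    """Average failure rate in failures per turbine per year.

    Assumes a homogeneous Poisson process, i.e. a rate that is constant
    over the observation period (useful-life phase only).
    """
    exposure = n_turbines * years  # cumulative turbine-years observed
    if exposure <= 0:
        raise ValueError("exposure (turbine-years) must be positive")
    return n_failures / exposure


def prob_at_least_one_failure(rate, years):
    """Probability of one or more failures within `years` for a constant rate."""
    return 1.0 - math.exp(-rate * years)
```

For example, a hypothetical survey with 120 failures recorded on 50 turbines over 2 years gives `average_failure_rate(120, 50, 2)`, i.e. 1.2 failures per turbine per year. A rate obtained this way hides any infant-mortality or wear-out behaviour, which is why time-resolved information is reported separately where available.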
To implicitly suggest the definition of failure used, the typology of the O&M data sources is specified where possible. As reported by Kaidis et al. [65], the use of different methods for the data collection is associated with differing RAM information and data quality. While in the IEA Wind Task 33 [54] the four main groups of equipment, operating, failure and maintenance/inspection data are distinguished, a higher level of detail is given here by following the grouping of [65]. As in Ref. [12], the data sources are classified in Table 3. Furthermore, the collection scheme typologies (either raw data or results data approach, as proposed in Ref. [2]) are distinguished whenever possible.
To assist the understanding of incomplete and/or public-restricted analysis, an additional column is added for summarising the main findings. The list of references is then included by specifying the type of information given in the case of multiple citations. Other basic information (such as country, study period and database size) is also reported for completeness.

Synoptic tables
The characteristics of the 24 databases found in the literature were accessed either through the initiatives' original publications (if possible), or the review works mentioned in Section 3. Due to the low level of detail in some of these works, their cataloguing remained only partially completed, as the description available was not sufficient for the classification of some quantities. It should be noted that further analysis is required to integrate the results from the European EUROWIN and EUSEFIA projects [2], the Japanese NEDO initiative [2], and the Fraunhofer's EVW and recently launched WInD-Pool database (see Section 2.2). Therefore, only the initiatives that could be fully accessed are reported in the synoptic Tables 4-7.

Taxonomy adaption
As in Refs. [2,10], it was necessary to select a uniform and convenient language to identify the equipment in a wind turbine to coherently compare the several statistics. As outlined in Ref. [54], different lists of terms, referred to as "taxonomies", can be used for categorising components, failures, maintenance tasks, etc. Several equipment taxonomies have been developed in the past. Among these are the VTT components' breakdown presented by Stenberg in Ref. [69], the SANDIA Laboratories taxonomy in Ref. [86], and the GADS (Generating Availability Data System) used for North American wind plants. Another two, more recent, sophisticated and comprehensive classifications are the ReliaWind project taxonomy (see e.g. Refs. [4,8]) and the RDS-PP® (Reference Designation System for Power Plants) taxonomy, published by the VGB PowerTech e.V. in Ref. [87]. The first was created for an extensive failure data analysis and is internationally recognised, while the second one, also widely accepted, is currently employed for the offshore data collection schemes of SPARTA and WInD-Pool (see Section 2.1.2). However, the complete ReliaWind taxonomy is not publicly available, and there is no ongoing development to maintain it. In contrast, the RDS-PP® offers open access to the draft document [87], and a high level of detail for both system and subsystem identification and components' technical information. For these reasons, the RDS-PP® was adopted to unify the statistical reliability and maintainability data in this analysis. The information necessary for the adaptation of the taxonomies is accessed from the VGB PowerTech e.V. draft document [87], and summarised in Table 8.
The authors of this paper mapped, to the best of their knowledge, the initiative-specific taxonomies to RDS-PP®. When a proper mapping was not possible, a higher share was given to the introduced "Other" category. On the other hand, the generic "electrical systems" category, usually adopted in the earlier data collections, is integrated here into the transmission group.
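The mapping step described above can be illustrated with a short sketch. The dictionary below is purely hypothetical: its source labels and target group names are placeholders chosen for illustration and do not reproduce the entries of Table 8 or the authors' actual mapping. Only the "Other" fallback and the assignment of "electrical systems" to the transmission group follow the text.

```python
from collections import defaultdict

# Hypothetical initiative-specific labels -> unified groups (placeholders only).
EXAMPLE_MAPPING = {
    "gearbox": "Drive train system",
    "rotor blades": "Rotor system",
    "electrical systems": "Transmission",  # generic legacy category, as in the text
    "transformer": "Transmission",
}


def remap_failure_rates(failure_rates, mapping=EXAMPLE_MAPPING):
    """Aggregate failure rates by unified group.

    Labels without a known mapping are accumulated in "Other".
    """
    unified = defaultdict(float)
    for label, rate in failure_rates.items():
        group = mapping.get(label.strip().lower(), "Other")
        unified[group] += rate
    return dict(unified)
```

Calling `remap_failure_rates({"Gearbox": 0.125, "Electrical systems": 0.25, "Transformer": 0.25, "Sensors": 0.5})` groups the two electrical entries together and sends the unmapped "Sensors" entry to "Other". A large "Other" share in the output is therefore a direct indicator of how much of a dataset could not be mapped.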

Availability assessment tool
The variance in the failure rate of the wind turbines results in a variance in their estimated availability. This can affect decision making for the accurate planning of future offshore wind projects. The open-O&M assessment tool, developed by the authors in Ref. [88], is employed to estimate and investigate the impact of the different failure rates provided in the literature on the potential availability of an offshore wind farm. The tool was built with the aim of supporting the development of wind farms' maintenance strategies. Like other O&M management tools [89,90], it has a modular structure consisting of the following core modules: (1) reliability, (2) power, (3) weather forecasting, (4) maintenance and (5) cost. The flowchart of the processes and steps of the tool is shown in Fig. 2. The inputs are weather data, cost data, reliability data, turbine-specific data (power curve), wind farm layout data (distances from shore), and repair information, such as the number and type of vessels, and the crew required to restore the system. The wind farm lifecycle and maintenance activities are simulated, and a number of KPIs are produced, such as the time-based availability, power produced, and operating costs.

Table 3. Reliability and maintenance data sources and their characteristics. Adapted and extended from [54,65].

Wind farm availability estimation
For the calculation of the lifetime availability of a wind farm, both planned and unplanned maintenance activities are considered. On the one hand, scheduled maintenance happens at yearly intervals and is performed for each subsystem of the turbines in the farm following grouping and prioritisation. The downtimes are calculated based on the maintenance activity duration which is assumed to be fixed. On the other hand, unplanned maintenance downtime is sensitive to the availability of spare parts, vessels and personnel for the repair of damaged subsystems.
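A minimal sketch of the time-based availability bookkeeping described above is given below. It is not the implementation of the tool in Ref. [88]: the function name, the representation of each unplanned event as a (waiting time, repair time) pair, and the example figures are assumptions made for illustration.

```python
HOURS_PER_YEAR = 8760.0


def time_based_availability(planned_hours, unplanned_events, period_hours=HOURS_PER_YEAR):
    """Time-based availability of one turbine over a period.

    planned_hours: fixed downtime of scheduled maintenance in the period.
    unplanned_events: iterable of (waiting_hours, repair_hours) pairs, where the
        waiting time stands for delays due to spare parts, vessels and personnel.
    """
    unplanned_hours = sum(wait + repair for wait, repair in unplanned_events)
    # Downtime cannot exceed the length of the period itself.
    downtime = min(planned_hours + unplanned_hours, period_hours)
    return 1.0 - downtime / period_hours
```

For instance, `time_based_availability(24, [(48, 12), (0, 4)])` corresponds to 88 h of total downtime in one year. The waiting term is kept separate from the repair term because, offshore, it is the part of the downtime that depends on logistics and weather rather than on the failure itself.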
With regard to the distribution of unforeseen failures in time, this information is modelled by the reliability module based on the reliability data from the literature. The input failure rates are grouped into minor repair (mr), major repair (Mr) and major replacement (MR), according to the material costs indicated by Carroll et al. [6]. When this information is not provided, the downtime statistics are used for the classification of the failure rates, similarly to what is suggested in Ref. [91] (cf. Table 9). When a failure occurs, the turbine status varies depending on the failure type. For mr, the turbine is assumed to continue operating even after the failure detection, and the shutdown is only assumed during the repair time. For Mr and MR, the turbine is stopped after the detection of the malfunction, going back to service only after the system is restored. The time to failure associated with each failure mode, for a particular subsystem i, is assumed to be distributed according to an exponential probability density function f(t) (Eq. (1)) with parameter λ_{i,mode}, the failure rate for subsystem i under a particular failure mode (i.e. mr, Mr, or MR).
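For reference, the standard form of the exponential density in the notation of the text is written out below, together with the cumulative distribution that follows from it by integration. This is the textbook expression for a constant failure rate, written here under the assumption that the symbols match those used for Eqs. (1) and (2).

```latex
\begin{align}
  f(t) &= \lambda_{i,\mathrm{mode}} \, e^{-\lambda_{i,\mathrm{mode}}\, t},
  \qquad t \ge 0, \\
  F(t) &= \int_0^{t} f(\tau)\,\mathrm{d}\tau
        = 1 - e^{-\lambda_{i,\mathrm{mode}}\, t}.
\end{align}
```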
The cumulative distribution function is the probability of failure (PoF) of the subsystem according to the exponential reliability theory and is given in Eq. (2). The PoF of the whole wind turbine is the PoF of all subsystems considering all failure mode classifications, as explained further in Ref. [88].

Table 8. RDS-PP® taxonomy adopted for system and sub-system, with numbered labels added for the presentation of the results.

Fig. 2. Workflow of openO&M tool.

Table 9. Criteria for the classification of the reliability databases in minor and major repair and major replacements.
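The exponential PoF per subsystem and failure mode, and its combination at turbine level, can be sketched as follows. The turbine-level formula assumes independent failure modes acting as a series system (any failure counts as a turbine failure). This is an assumption of the sketch and the exact combination rule is the one given in Ref. [88]. The failure rates in the example are illustrative placeholders, not values from the reviewed databases.

```python
import math
import random


def pof(rate, t):
    """Exponential probability of failure by time t for a constant failure rate."""
    return 1.0 - math.exp(-rate * t)


def turbine_pof(rates, t):
    """PoF of the turbine by time t.

    rates: {subsystem: {mode: failure_rate}}, with modes such as "mr", "Mr", "MR".
    Assumes independent modes in series, so the rates simply add up.
    """
    total_rate = sum(rate for modes in rates.values() for rate in modes.values())
    return pof(total_rate, t)


def sample_time_to_failure(rate, rng):
    """Draw one exponentially distributed time to failure (same time unit as 1/rate)."""
    return rng.expovariate(rate)


if __name__ == "__main__":
    # Placeholder rates in failures per turbine per year (illustrative only).
    example_rates = {
        "gearbox": {"mr": 0.4, "Mr": 0.04, "MR": 0.1},
        "generator": {"mr": 0.5, "Mr": 0.3, "MR": 0.1},
    }
    rng = random.Random(42)
    print(turbine_pof(example_rates, t=1.0))
    print(sample_time_to_failure(example_rates["gearbox"]["MR"], rng))
```

Sampling one time to failure per subsystem and mode, and taking the earliest, is one simple way to drive an event-based lifetime simulation from such rates.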
Further inputs for the availability estimation are the farm layout and the forecast of the environmental conditions during its lifetime. For the purpose of this analysis, the wind farm reference layout is based on Bak et al. [92]. The weather data simulated throughout the lifetime of the farm for its operation (wind speed for power production) and accessibility (wave height for the mobilisation of the vessels) are based on the FINO3 database. The stochasticity of the weather module is obtained by the implementation of a Markov model trained on the historical wind speeds and wave heights. Finally, information on the times and logistics for the performance of the unplanned maintenance, including repair times and resources needed, is based on [6]. The further assumptions in the maintenance module and additional information can be retrieved from Ref. [88].
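To illustrate the idea of a Markov weather model trained on historical records, the sketch below fits a first-order, discrete-state chain to a single time series (e.g. wave height) and simulates a synthetic sequence. The binning, the univariate treatment and all names are assumptions of this sketch. The actual weather module is described in Ref. [88] and may differ, for instance by modelling wind speed and wave height jointly.

```python
import random
from bisect import bisect_right


def to_states(series, bin_edges):
    """Map each observation to the index of its bin (0 .. len(bin_edges))."""
    return [bisect_right(bin_edges, value) for value in series]


def fit_transition_matrix(states, n_states):
    """Estimate transition probabilities by counting consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # State never left in the data: keep the chain in place.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([count / total for count in row])
    return matrix


def simulate(matrix, start_state, n_steps, rng):
    """Generate a synthetic state sequence of length n_steps + 1."""
    path = [start_state]
    for _ in range(n_steps):
        row = matrix[path[-1]]
        path.append(rng.choices(range(len(row)), weights=row)[0])
    return path
```

With hypothetical wave heights `[0.5, 1.2, 2.5, 1.0, 0.4]` and bin edges `[1.0, 2.0]`, `to_states` yields three sea states that can be compared against a vessel's wave-height limit to decide whether a simulated time step is accessible.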

Identification of trends based on the review of the RAM statistics
A more in-depth view of the data repositories enables a cross-comparison of the statistics and critical discussion. Initially, the quality and consistency of the averaged reliability and maintainability figures are evaluated in an all-in-one comparison. A detailed discussion of the effect of the deployment parameters on the reliability and performance of onshore wind turbines is then provided, leading to either further supporting the trends already identified in historical repositories or updating them based on the experience of the more recent surveys.

Trends in the averaged reliability and maintainability statistics
In Fig. 3, the data from all the complete and accessible initiatives (mentioned in Section 2) are presented as dimensional quantities, in failure frequency against time lost to restore the system after failure. Due to poor documentation, confidentiality reasons, or the lack of a standardised approach, a significant spread across these averaged results is immediately observable. These statistics generally collate the data over broad populations, for varying characteristics of the units. Based on the detailed analysis of Tables 4-7, and the homogeneous taxonomy adaption activity, it is possible to discuss and draw the main conclusions from the plot.
It is noticeable that the results from the WMEP and Huadian databases are largely in the medium range of lost hours per failure of the components (from 5 to 10 h/turbine/year), whilst the data from the other sources are distributed over a wider range. For instance, the majority of the Spanish (CIRCE) results lie below the limit of 5 h per turbine per year, while the VTT population shows values largely outside the average range, as observed by Ref. [60]. A possible contributor to the inconsistencies between these initiatives is the definition of "failure". This can indeed vary from counting only the events that require a visit to the turbine for a maintenance activity (e.g. Refs. [6,73]), to counting only the events with a downtime over a certain threshold (e.g. Ref. [65]), or, conversely, to also accounting for alarm logs and remote resets (e.g. Refs. [4,39]). When remote resets are counted, the recorded failure rates are higher and the mean time lost to restore the turbine operation is lower, because of the presence of these short downtime events. This is the case with the data collected by the Southeast University Nanjing [39], whose outlier behaviour can be traced back to either the use of SCADA alarms or the very short period of the survey [11]. The unrealistically high failure rate and small downtime of the EPRI statistics can be associated with the infant mortality of its early-stage technologies. With regard to the recent CIRCE statistics, while the time lost per failure is generally comparable to that of the older data collected (Vindstat and Felanalys, VTT, LWK and WMEP), the lower frequencies of the malfunctions can be related to the higher maturity of the technology and to the fact that only component (internally caused) failures were considered, excluding all the other outage events from the analysis.
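The sensitivity of the averaged figures to the definition of failure can be illustrated with a small numerical sketch. The event log below is invented for illustration only, and the function name and the 1 h threshold are assumptions rather than criteria used by any of the cited surveys.

```python
def ram_figures(downtimes_h, n_turbines, years, min_downtime_h=0.0):
    """Return (failure rate [failures/turbine/year], mean downtime [h/failure])
    when only events lasting at least `min_downtime_h` count as failures."""
    events = [d for d in downtimes_h if d >= min_downtime_h]
    rate = len(events) / (n_turbines * years)
    mean_dt = sum(events) / len(events) if events else 0.0
    return rate, mean_dt

# Hypothetical log for 10 turbines over 1 year: many short remote resets
# plus a few long repairs (invented numbers)
log = [0.5] * 40 + [24.0] * 8 + [120.0] * 2

all_rate, all_mdt = ram_figures(log, 10, 1.0)                      # resets counted
thr_rate, thr_mdt = ram_figures(log, 10, 1.0, min_downtime_h=1.0)  # resets excluded

# Hours lost per turbine per year change far less than the two averages
lost_all = all_rate * all_mdt
lost_thr = thr_rate * thr_mdt
```

With these invented numbers, counting the resets multiplies the failure rate by five and cuts the mean downtime per failure to roughly a fifth. The hours lost per turbine per year barely change, which is why two surveys of similar populations can sit far apart in a frequency against downtime plot.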
Regarding the Strathclyde offshore (Strath-Off) statistics, their skewed behaviour is associated with the use of the mean active repair time (MART, see Appendix A) as an indication of the downtime. Nonetheless, the gap between these MARTs and the mean downtime (MDT, see Appendix A) of the other statistics (except for the Huadian and EPRI collections) already suggests the possible high impact of logistics and technical delays on the downtime for maintenance actions.
Looking then at the share of each system within the single initiatives, the drivetrain failures seem to be, in general, the highest contributor to the hours lost per turbine per year, due to the presence of a gearbox. While this is true for the European initiatives, the Chinese statistics of Huadian and Nanjing instead indicate a higher criticality for the control and electrical transmission systems. This is in line with what is reported in the CWEA study, where converters yield the highest failure rate among all the subsystems. As explained by the authors [3], this could be related to the harsh environment (very low temperatures reached) in the area where these wind farms are installed. However, Artigao et al. [10] suggest that this is a common trend among the Chinese statistics.

At a turbine and farm level
The authors from the DOWEC project [26] were the first to identify the need to separate the data also by power classes when plotting and comparing the reliability statistics. By using the data available at the time from the WMEP statistics, they observed that the turbines rated between 0.56-1.5 MW fail significantly more often than the smaller turbines; however, the population is 95% represented by lower rating units. Based on the German "250 MW Wind" program, more detailed and complete results for the WMEP project were collected and published. From the data of the first 15 years of the initiative, Hahn et al. [72] and Echavarria et al. [73] observed a time invariant increase of the turbine failure frequency with power rating.
In the LWK survey, the distribution of failure intensity among 12 different turbine models was sorted by turbine size [60]. The same authors who, in Ref. [30], already intuitively appreciated a lower reliability in the newer German turbines comparing them to their smaller predecessors, reaffirmed in Ref. [7] the general trend of failure rate to increase proportionally with the turbines' rating. This was shown to be particularly true for the type A1 turbines [60], while the direct-drive technologies (DDE type) seemed not to follow the rule, maintaining an almost constant overall failure rate of about 2.5 failures per turbine and year for larger units. Nothing can be stated for other concepts, such as type B, due to the lack of data for large units. Looking at the averaged failure rates for type C, the more recent CIRCE data collection reported results are in line with the discussed hypothesis: 0.46 and 0.52 failures per turbine and year, for the population below and above 1 MW, respectively [4].
The higher reliability of the smaller units would lead to generally higher technical availability. Reinforcing this argument, Harman et al. [58] observed that the operational availability of sub-MW units is higher than that of the larger units. However, at the array level, they additionally observed that the availability of larger farms (with more than 40 units) increased proportionally with the number of turbines, independently of the units' rating.

At a system and subsystem level
While this last deduction cannot be verified yet, as the wind farm specific data are limited and incomplete, the trend of the failure rate with the power rating is investigated here at the system (Fig. 4) and subsystem (Fig. 5) levels. Significant differences in the contribution of single components' failures to the total failure frequency were detected for small and large sized turbines. A smaller scatter for the share of each system to the failure rate is generally observed for medium rating turbines (above 500 kW and below 1.5 MW), reaffirming what was noticed by Dao et al. [11]. For the higher power class, the reduction in the percentage failure of the mechanical and structural components is balanced by an increase in the percentage of the electrical failures. As they are associated with an overall rise of the annual failure frequency [72], the electrical and control systems can be seen as the most critical components for the WMEP larger sized turbines. Although these results are in agreement with the project final average statistics [75], information on the distribution of the power rating in the final WMEP population is missing. Furthermore, it has to be noted that these results are mainly representative of technologies and layouts that are no longer adopted (type A0-A2) [33]. For these reasons, the results from other initiatives were analysed, seeking a match with the WMEP trends.
The WMEP statistics are cross-compared with those of the LWK and CIRCE surveys. To maintain the analysis as unbiased for the drivetrain configuration, a comparison is suggested among turbines of the same typology. The LWK results for the Enercon E40 (500 kW) and E66 (1.5 MW) gearless turbines [31,60] are presented in Fig. 5-bottom, to better understand the anomalous behaviour of LWK's DDE model's failures. Similarly, the failure rates derived by accessing CIRCE data [4] for the
• The number of shutdowns caused by the rotor system (1) and the power generator system (6), either the EESG of LWK or the DFIGs of CIRCE, decrease when the turbine rating increases, in agreement with WMEP results;
• An opposite trend characterises the control (5) and transmission (7) systems of CIRCE turbines, in agreement with the WMEP observation. In contrast, the transmission failure rates for small sized DDE turbines are higher than for larger systems, as verified by Ref. [60] as well;
• The speed conversion (2a) and drivetrain brake (2b) subsystems of the CIRCE population seem to fail less frequently for higher power ratings, in line with WMEP results for the drivetrain system (2) and with what is reported by Ref. [93] for the type A1 turbines of the LWK survey;
• The pitch system (1d), not identified as a separate subsystem in the WMEP results and likely to be integrated and averaged into the hydraulic system, follows the same trend for both LWK and CIRCE initiatives, increasing its number of failures with increasing turbine size.

Modern layouts statistics
With regard to the influence of drivetrain configurations on the statistics, the WMEP authors noticed a general decrease in the average reliability of the assets moving towards more advanced concepts, compared to the simple Danish concepts and standard variable speed [33]. Similarly, the analysis of van Bussel et al. [49,50] identified the robust design (consisting of a two-bladed turbine, with no pitch control, installed on a monopile foundation) as the best design solution for obtaining the highest availability in a large offshore wind farm project.
For the direct-drive configurations, the WMEP researchers observed that gearless layouts are not necessarily more reliable than geared ones [61]. Echavarria et al. [73] highlighted this aspect in more detail by analysing 10 years of WMEP time-trend results. They noticed that the direct-drive synchronous generator is not mainly responsible for the generally higher failure rate of these turbines compared to the geared alternative with an induction generator. In contrast, the failure events of power electronic components, in systems using synchronous generators, were significantly more frequent. This suggests that the statistics could possibly have been affected by the young age and novelty of the technology. Similarly, the LWK's authors [7,30,31] noticed that the aggregate failure of the generator and converters in direct-drive layouts (DDE) is greater than the aggregate failure rate of gearbox, generator and converters in indirect-drive ones. Indeed, the elimination of a gearbox resulted in a substantial increase in the failure rate of electrical-related subassemblies.
Some authors justified this tendency as being due to the immaturity of these technologies and the presence of new issues related to the new design [16] and larger dimensions [30] of the direct-drive generators. In agreement with this hypothesis, the more recent data collection from Reder et al. [4] registered an overall decrease in the failure rate for the Spanish direct-drive population compared to the type C one: 0.19 vs. 0.49 failures per turbine and year, respectively. The updated RAM statistics for the newer typology of DDP published by Lin et al. [3] highlighted an increase in the availability, during the second year of operation, of the direct-drive design compared to the C type. To complete the discussion investigating the variance in failure frequency and downtimes, per component, between type C and DD turbines, a comparison among CIRCE statistics [4] is suggested in Table 10. Analysing only the data for turbines above 1 MW, an unexpectedly higher failure rate of the rotor system is observed for the type C turbines. This behaviour could be affected by several factors. In contrast, the 50% reduction in the number of failures of the direct-drive generators could be explained by a profound technology improvement of EESG and/or a switch to the PMSGs' direct-drive concept [78]. Nonetheless, as mentioned by Ref. [16], the lack of detailed information about the typology of the generator only allows us to speculate about the cause of this higher failure rate.

Table 10
Comparison of CIRCE type C and type DD reliability and maintainability statistics.

As far as the medium-speed configurations are concerned, it has been shown [94,95] that they have the potential to offer a good compromise between reliable operations and cost optimisation. On the one side, Carroll et al. [78] compared the statistics of a population of 1800 type C turbines with those of a group of 400 DImP. They observed that the full-rated converters are the most critical components in terms of failure rate, while the PMSG fails almost 40% less than a DFIG and its failure modes are for minor repairs of the auxiliary (lubrication and cooling) system. Nonetheless, due to the presence of the gearbox, they noticed that the hybrid layout fails nearly three times more often than the traditional type C configuration. On the other side, Lin et al. [3] reported a significantly higher technical availability of the hybrid configuration compared to both type C and DDP. This latter study is likely to be skewed by early-stage failures. Because of this contrast and due to the recent and sporadic installations, it is not yet possible to draw any conclusions on their robustness. Thus, more information needs to be collected and compared for these kinds of systems.

Trends with the deployment parameters
Although site-specific information is not yet available for supporting the observation of Harman et al. [58] (see Section 4.2.1) on how the farm size influences its availability, some experience on the effect of the location and the environmental parameters on the turbine reliability figures can be found in the literature. Indeed, much research effort has been dedicated to identifying the critical meteorological parameters that influence the turbine failure behaviour negatively.
One of the first extensive analyses on the effects of weather on turbine reliability was presented by Hahn et al. [36] showing increased failure rates of certain components with rising average daily wind speeds. The electrical system subassemblies showed the strongest dependency on wind speed, followed by the control system, while a significantly weaker correlation was exhibited by the other main subassemblies. Tavner et al. [32] identified an annual periodicity in failure rates due to seasonal variation in weather conditions, by analysing the correlation between monthly averaged wind speed conditions and component failures. Following this first study, they extended their analysis in Ref. [15,20] by cross-correlating the component failures with average monthly maximum and mean wind speed, maximum and minimum air temperature, and average daily mean relative humidity. They concluded that other weather conditions, rather than just wind speeds, can be closely related to the turbine failures. Wilson et al. [96] used artificial neural networks to investigate if any relationship existed between maximum daily gust speed, average daily wind speed and temperature, and the turbine's failure rates. The gearbox, generator and hub were shown to be more likely to fail in variable wind conditions, with a high potential impact on the failure rates of these subassemblies offshore. They additionally noticed that gust speed is a key parameter of the number of failures. This observation reaffirms what was shown already by Ref. [32], who observed that malfunctions occur more frequently in the winter months where average daily wind speeds can be lower but maximum daily gust speeds are higher.
Regarding the data collected for offshore wind turbines, Carroll et al. [6] noted that offshore turbines sited in areas with higher wind speeds experienced higher failure rates. This observation is in line with what was shown by Wilson and McMillan in Ref. [97] for onshore systems. Nonetheless, while this correlation appears to be rather weak onshore (linear regression slope is 0.08), the higher offshore wind speeds seem to have a higher impact on the failure rates (with 1.77 slope).
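The slopes quoted above are those of a straight-line fit of failure rate against wind speed. The sketch below shows an ordinary least-squares fit of this kind. The data points are synthetic, constructed only so that the fit returns the two quoted slopes, and do not reproduce the measurements of Refs. [6,97]. The units assumed in the comments are also an assumption.

```python
def ols_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic site-averaged wind speeds [m/s] and failure rates
# [failures/turbine/year], built to lie exactly on a line
onshore_ws = [6.0, 7.0, 8.0, 9.0, 10.0]
onshore_fr = [1.48, 1.56, 1.64, 1.72, 1.80]
offshore_ws = [8.0, 9.0, 10.0]
offshore_fr = [4.16, 5.93, 7.70]

on_slope, on_intercept = ols_fit(onshore_ws, onshore_fr)
off_slope, off_intercept = ols_fit(offshore_ws, offshore_fr)
```

With real data the points scatter around the line, so the slope should be read together with a goodness-of-fit measure before the onshore and offshore sensitivities are compared.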
Finally, some studies on the possible effect of near-shore location were analysed. Examining the WMEP statistics, Faulstich et al. [33] observed that turbines located near the coast and in the highlands suffer higher failure rates. By analysing a more segregated population of turbines (type DImW, with sub-MW rating), Tavner et al. [15,20], showed similarities between the results from the Krummhörn and Fehmarn wind farms, presumably because of their near sea locations, compared with Ormont farm, which is located inland. While the turbines from the first two locations are subject to humid conditions, Ormont failures cross-correlate with wind speed standard deviation, suggesting the influence of turbulence on failure rate.

Critical discussion for offshore wind turbines
The failure of offshore and near-shore deployments can differ in number and typology compared to those of onshore systems, due to the potential effect of certain environmental conditions (such as humidity, gust events and turbulence intensity). For this reason, in this final discussion, the influence of the studies' specific parameters (i.e. reliability statistics and power ratings) is analysed by estimating the lifetime operational availability of hypothetical wind farms installed at a typical offshore location. The normalised reliability and maintainability figures of onshore and offshore studies are eventually compared, to identify the possible sources of the discrepancies in the results.

Lifetime operational availability estimates and trends
The derivation of the lifetime operational availability for a set of offshore wind farms establishes a common ground for integrating the offshore statistics into the analysis and consistently comparing them with the existing onshore ones. A similar analysis has been already carried out for the DOWEC project, where van Bussel et al. [49,50] estimated the availability and costs associated with the installation of different turbine technologies (drivetrain and foundations) for a fictional 500 MW offshore wind farm, erected at 35 km from the Dutch coast. Their findings concluded that there is a reduction in the availability of advanced layouts, as opposed to the traditional ones (see Section 4.2.3). Despite the fact that this observation supports the conclusions of Faulstich et al. in Ref. [33], the reliability figures used are extracted from a specific population of coastal wind turbines and adapted to several types of designs and the offshore application, based on the authors' best knowledge.
In contrast, the analysis presented here focuses only on the impact of the implementation of the several failure statistics collected from the literature. The reliability studies selected and adapted as required for the implementation on openO&M are reported in Appendix B. The systems and subsystems considered in the availability calculations are based on the taxonomy shown in Table 8 and are additionally subdivided into failure classes (mr, Mr and MR) as suggested in Section 3.3.2. For simplicity of comparison, the wind farm layout, the repair information and the weather data inputs are kept the same for all case studies. The wind turbine power curves for the estimation of the operational uptime are assumed, based on commercial models at the average power rating of the population. This information, together with the estimated time-based availability, is reported in Table 11. The results are presented in terms of averaged value and standard deviation, due to the random selection of the failure modes (as explained in Section 3.3.2).
Although the same characteristics and maintenance strategy for each of the offshore wind farms scenarios are selected, a spread in the estimation of the availability associated with the several reliability datasets can be noticed. This indicates that the frequency of the specific failure events largely affects the availability calculations and consequently the maintenance decision making. In contrast, the low standard deviation of the estimated availability (the highest of ±0.13% for the WMEP survey), implies a small impact of the randomness in using these failure data when supporting decisions for the operation of wind farms.
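The averaged value and standard deviation of the availability over repeated simulations can be computed as in the sketch below. It assumes that time-based availability is the fraction of the period in which the turbine is not down. The downtime values stand for the outcome of five hypothetical simulation runs and are not results from openO&M.

```python
import statistics

HOURS_PER_YEAR = 8760.0

def time_based_availability(downtime_h, period_h=HOURS_PER_YEAR):
    """Fraction of the period in which the turbine is available."""
    if not 0.0 <= downtime_h <= period_h:
        raise ValueError("downtime must lie within the period")
    return 1.0 - downtime_h / period_h

# Hypothetical downtime [h/turbine/year] from five runs with randomly
# drawn failure modes
downtime_runs = [300.0, 310.0, 295.0, 305.0, 290.0]

availability = [time_based_availability(d) for d in downtime_runs]
mean_availability = statistics.mean(availability)
std_availability = statistics.stdev(availability)   # sample standard deviation
```

A standard deviation that is small compared with the differences between datasets, as reported above, indicates that the choice of reliability dataset matters more than the random draw of failure modes.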
A generally higher availability is found for the onshore studies compared to offshore ones, independently from the turbine rating, supporting the hypothesis of a high correlation of the failures with environmental conditions (generally more unfavourable offshore). It is worth pointing out that the direct drive wind farm scenario achieves the highest lifetime availability, in line with the calculations of Carroll et al. in Refs. [95,98], and supporting further the observation of Section 4.2.3 on the high potential of this technology. Among the onshore failure-based scenarios, the WMEP and the Huadian datasets resulted in the smallest estimated availability, as low as 96%. Nonetheless, these results seem reasonable when compared against the operational availabilities collected by Ref. [2], on average around 95% for onshore turbines, with the exception of the sub-MW population, and around 93% for the offshore systems in the SPARTA and the WInD-Pool projects. Therefore, the generally high predicted values for the onshore-based results could suggest a general underestimation of the repair and logistic times as inputs of the maintenance module.
Independently from the sensitivity of availability to these parameters, some of the offshore-based results look surprising. On one side, the low availability, of about 80%, for the OWEZ wind farm can be related to the extraordinary maintenance activity during the time of the survey. This result agrees with the statistics for the UK offshore round I, affected as well by infant mortality events and the underdeveloped supply chain for O&M. On the other side, the 84% operational availability of the Strath-Off population is unexpected, considering that it is associated with the statistics for a population with similar characteristics (turbine configuration and power rating) to those of the recently published RAM database results [2]. The explanation of such behaviour can be associated with:
• the effect of site-specific environmental conditions, which affect significantly the failure behaviour (as introduced in Section 5.2) and enhance the statistical uncertainty of the reliability figures, considering that these data are built upon averages from various sites;
• the effect of failure distribution on the estimated availability levels, which has been shown to have a potentially high impact [99], and it has been considered here as exponential only;
• the potential impact of: 1) preventive maintenance activities and condition monitoring systems, implemented in the real application, which could potentially avoid or prevent the failure, as well as 2) maintenance activities performed via helicopter access or other service operation vessels, which could reduce the downtime for the logistics and inaccessibility windows (not yet included in the tool).

Table 11
Lifetime availability estimated for the selected surveys in Appendix B (in grey, from onshore failure statistics) and according to the derived averaged power ratings and assumed wind turbine power curves.

Fig. 6. Comparison of offshore and onshore reliability statistics for geared turbines with an induction generator. In the x-axis are systems and sub-systems (Table 8) and on the y-axis the normalised failure rates. The normalised downtime is represented by the bubble size.

Speculating on the uncertainty surrounding the recently published availability statistics, the WInD-Pool project only reported vague averaged results in Ref. [2], while the SPARTA data are provided for only 14 months of recording [46]. Another year of availability figures from the SPARTA project has recently been published [100]. However, these results are reported in terms of production-based availability, giving only an overview of how well the turbines perform compared to their power curves.

Influence of the offshore location on systems and subsystems
With the intent of understanding which systems and subsystems could mainly be affected by the offshore location, the reliability statistics from CIRCE [4], Strath-Off [6] and OWEZ offshore WF [45] (cf. Tables 4-7) are compared in the bubble plot of Fig. 6. While their populations are consistent (being representative of geared turbines with an induction generator installed and of comparable power rating), the definition of failure and downtime differs among these initiatives. Thus, normalised failure rates and downtimes (in terms of h/turbine/year) for each assembly (horizontal axis) are respectively represented on the vertical axis (indicated by the centre of the bubble) and by the bubble size. It is first worth stating that, although adding value to the analysis, the statistics of the OWEZ farm can be affected by early failure events and are skewed by their derivation from the number of stops [85]. This is also the reason for the low share in frequency and repair time of "others" (12) unforeseen failures, while a higher ranking is found in the Strath-Off and CIRCE statistics because of the use of a more detailed taxonomy for the collection and analysis of the data.
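A minimal sketch of the normalisation step follows, assuming that "normalised" means the share of each assembly in the survey total, so that surveys with different definitions of failure and downtime become comparable. The assembly names and figures are hypothetical and do not come from the CIRCE, Strath-Off or OWEZ statistics.

```python
def shares(values):
    """Normalise a mapping so that its entries sum to 1 (share of the total)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: v / total for name, v in values.items()}

# Hypothetical figures for three assemblies of a single survey
failure_rate = {"rotor": 0.6, "drivetrain": 0.3, "control": 0.9}   # failures/turbine/year
downtime = {"rotor": 30.0, "drivetrain": 60.0, "control": 10.0}    # h/turbine/year

rate_share = shares(failure_rate)     # vertical position of each bubble
downtime_share = shares(downtime)     # size of each bubble
```

In this invented case the control system fails most often but the drivetrain dominates the downtime, which is the kind of contrast a bubble plot is meant to expose.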
The highest share in the failure of the OWEZ farm is in the control system (5). Differently, the Strath-Off (and SPARTA [46]) surveys, recorded more frequent malfunctioning of the pitch (1d) and its hydraulic system (if present). These observations are in line with the results from the onshore surveys. Recently published statistics from the CREW data collection ranked the rotor as the assembly with the highest contribution to the turbine unavailability [16]. In the CIRCE survey, the rotor (mainly for the blade adjustment system and its hydraulics) and control system failures are second only to those of the drivetrain system.
With respect to gearbox (2a), the two offshore studies show a similar percentage share of failure rates, even though the high failure rate of Strath-Off is mainly caused by the inclusion of minor malfunctions, while that of the OWEZ statistics is mainly affected by catastrophic infant mortality failures. It is interesting to observe that, independently of the class of maintenance performed, the gearbox's overall share of the downtime for the offshore population is higher than for all the other components. Therefore, the observed potential high criticality of this component justifies the consistent research effort put into the direct drive designs [16] and advanced monitoring systems and techniques [101][102][103].
Regarding the power generation system (6), it is noticeable that its unplanned maintenance actions have a similar impact on the total corrective downtime of the several initiatives. Nonetheless, the Strath-Off induction generator has a considerably higher share of the frequency of the repair, compared to the CIRCE statistics, suggesting again a potential high influence of the offshore condition on this system. Furthermore, as already shown by Ref. [6], the repair rate of offshore generators is significantly higher than that of onshore ones, mainly due to the frequent minor maintenance actions required. Of minor impact for frequency and repair time required are the converters (7a). However, it is worth noticing that both offshore studies recorded a higher share of this subsystem to the failure frequency than in the onshore dataset. Similarly, the repair rate of the converters is ranked fourth by the SPARTA monthly averaged statistics, after issues related to the rotor system. The cause of their failure lies in the offshore environment and can be either associated with low temperature and/or thermal cycling, as intuitively proposed by Ref. [3], or related to other environmental factors (such as humidity) as shown by Ref. [104]. Because of the relatively high cost of repair and replacement, and the relatively long repair time [6], components in the transmission system (including the transformer) have a potentially high criticality.
With respect to the structural parts, the unpredicted failures of the rotor system (blades and hub) are shown (1) and known to be critical to the O&M of offshore wind projects [6,105]. On the other hand, it is worth pointing out the potentially high criticality of the offshore foundation systems' failures (11b) [91,105]. In spite of their small share in the unplanned maintenance activities [6], the performance of inspections and repairs of the structures can have a significant financial impact [106], especially for areas below the water level. Despite the increase in the number of structural-health monitoring systems normally installed on the offshore wind systems [107], the post-processing and the analysis of the signals for an effective data management, and for detecting anomalies, still face some challenges [108,109].

Conclusions
This paper has presented a comprehensive review and critical discussion for the identification of the most critical components based on studies of currently installed wind turbine statistics. To achieve this, both onshore and offshore RAM statistics from historical European and newer overseas initiatives are collected and catalogued. Because of the extensive onshore industry experience, the trends of reliability and availability statistics with design deployment parameters are first investigated for onshore systems. The analysis is then extended by including the results from available offshore databases, as normalised statistics, and, via the estimation of the operational availability of a typical offshore wind farm (implementing the onshore and offshore failure statistics), the effect of statistical uncertainty of the different datasets is evaluated.
The main findings from the literature on onshore systems are:
• at a turbine level, a generally lower reliability is observed for increasing turbine size, but a potentially inverse trend in the availability of the wind farms with the number of units installed, independently from their power rating;
• at a system level, a reduction in the shutdown events linked to the unforeseen failure of the rotor, the drivetrain and the power generator systems is observed, as opposed to an intensification in the corrective maintenance of the control, transmission and blade adjustment systems for an increase in the turbine size;
• at a turbine level, a higher number of failures is generally recorded for higher averaged wind speed, gust speed, and in the presence of other environmental conditions (such as humidity and potentially high turbulence level), with a sensitivity of the drivetrain, power generation and transmission subsystems to these phenomena.
With respect to offshore wind turbines, it is noticed that:
• although generally higher failure frequencies are observed compared to the onshore projects, the recently collected statistics from the industry-led RAM databases show a significant improvement in the operational availability compared to the first generation of offshore turbines;
• a high share of the blade adjustment, drivetrain and transmission systems to the overall failure rate is common to all offshore studies, with the drivetrain and rotor systems being potentially the most critical due to being associated with longer downtime and cost of the repair.
From these observations, and from the further comparison of the normalised reliability and maintainability figures for the onshore and offshore studies, it is possible to speculate on the potentially high criticality components for the next generation of offshore wind turbines. Moving towards larger systems and/or direct drive designs, there could be a reduction, or at least a steady trend, in the failure of the drivetrain and rotor systems, and of power generation systems, when switching to a synchronous (permanent magnet) generator type. On the other hand, higher costs for corrective maintenance should be expected for the failures of the transmission systems and the tower structures. This should push research in looking for improvements at the design stage and/or for the implementation of monitoring systems on these assets.
As far as the availability estimation of offshore wind farms is concerned, the discrepancy between the predicted results and reference values from the literature suggests that a higher level of detail is needed and should be fed into the tool for obtaining conclusive results. This analysis, complemented with a cost analysis, is fundamental to a wide range of stakeholders in the offshore wind industry to achieve improvements of the financial targets of current and future projects. To tackle the statistical uncertainty associated with the input failure statistics, information on the distribution in time and among the turbines in the array (subjected to varying environmental conditions) should be provided. Finally, reference availability statistics for a longer period should be collected and made available for validation.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

A. RAM terminology definitions and performance indicators
As the data collected are heterogeneous, a common terminology is introduced for describing and classifying the results across the databases. Thus, standardised definitions of reliability, maintainability and key performance indicators (KPIs) are summarised in the following section.

Reliability and maintainability terminology
In reliability analysis, an indication of the frequency of failure and/or the time elapsing until the system is restored is generally given. The frequency parameter is usually represented by the failure rate (λ), which expresses the likelihood of a system failing within a specific period. Unlike a probability, however, it can take values greater than 1. Focusing on the constant failure rate region of an asset, the indicator λ of a WT consisting of K components is averaged over the recording periods i = 1, ..., I according to Eqn A.1 [19,67], where I is the number of survey intervals, T_i is the length of the i-th interval and n_i is the number of failures recorded in it. This Power Law Process is commonly used in the reliability analysis of repairable systems [18]. As the data are collected from many turbines, they are generally normalised by the number of units in the population, N_i, thus providing, for instance, the number of failures per turbine per year (i.e. [failure/turbine/year]).
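As a minimal sketch of the averaging and normalisation described above, the snippet below computes a failure rate in failures per turbine per year from a set of survey intervals. It assumes a form often used with this kind of data, λ = Σ_i Σ_k (n_i,k / N_i) / Σ_i (T_i / 8760), with T_i expressed in hours; this form, the names `SurveyInterval` and `failure_rate`, and all the numbers are illustrative assumptions and are not taken from any of the surveys reviewed.

```python
from dataclasses import dataclass
from typing import Sequence

HOURS_PER_YEAR = 8760.0


@dataclass
class SurveyInterval:
    hours: float             # T_i, length of the survey interval in hours
    turbines: int            # N_i, number of turbines in the population
    failures: Sequence[int]  # n_i,k, failures of each of the K components


def failure_rate(intervals: Sequence[SurveyInterval]) -> float:
    """Failures per turbine per year, averaged over all survey intervals."""
    # Failures of all components, normalised by the population of each interval.
    normalised_failures = sum(sum(iv.failures) / iv.turbines for iv in intervals)
    # Total observation time expressed in years.
    years = sum(iv.hours / HOURS_PER_YEAR for iv in intervals)
    return normalised_failures / years


# Hypothetical data: two one-year surveys, three components each.
surveys = [
    SurveyInterval(hours=8760.0, turbines=100, failures=[40, 25, 85]),
    SurveyInterval(hours=8760.0, turbines=120, failures=[60, 30, 90]),
]
rate = failure_rate(surveys)
print(f"{rate:.2f} failures/turbine/year")
```

With these made-up inputs the rate is 1.5 failures per turbine per year, which also illustrates that, unlike a probability, λ can exceed 1.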
According to the relevant ISO standards [110,111], a distinction between a repairable and a non-repairable system is required, as well as a distinction between maintenance actions consisting of a repair and those consisting of a replacement. As only limited information was accessible, this quantity is treated in this paper as a general failure, or maintenance event, rate. Furthermore, λ depends on the definition of failure itself. A "failure" is the loss of ability to perform as required; however, the qualitative judgement of the authors can vary when interpreting the term "loss of ability".
With "active", the ISOs denote the effective time to achieve the repair of an item (MART), which accounts for fault localization, correction and checkout time. This definition agrees with what the IEC [112] defines, and commonly calls, "repair time" (refer to Fig. 4 of [111], and Figs. 5 and 6 of [110]).

Mean Time to Repair/Restoration
"Expected time to achieve the following actions:
- time to detect the failure;
- time spent before starting the repair (with administrative, logistics and technical delays);
- effective repair time (MART);
- time before the component is available to be put back into operation (possible other administrative delays)." [110,111]
Comments and notes: the MTTRes (mean time to restoration) defined by the ISOs is intended as a clarification of the MTTR (mean time to repair) from the IEC [113], in which the fault detection time is not considered. Thus, MTTRes is defined as MTTRes = MFDT + MRT, where MFDT and MRT are, respectively:
- the time elapsing from the actual occurrence of the failure of an item to its detection, and
- the time elapsing from the detection of the failure of an item to the restoration of its function.

Mean Downtime "Expectation of the downtime."
The downtime is the time interval during which an item is in a down state, and thus "unavailable" (refer to Figs. 3 and 4 of [111]).
• It can be due to either planned or unplanned maintenance actions. However, only the latter (corrective O&M actions) are considered in this work.
• The downtime includes all the delays between the item failure and the restoration of its service.
• It differs from the MTTR in that it also accounts for "other unplanned outages"; among these are operational problems, restrictions, and machinery shutdown (trip and manual).
Note: a trip is defined as the shutdown of a piece of machinery (activated automatically by the control/monitoring system) from normal operating conditions to full stop. It can be either "real", if it results from the exceedance (monitored or calculated) of a pre-set limit, or "spurious", if it is an unexpected shutdown caused by a failure.
With respect to the quantity characterising the "failure time" several definitions are relevant, and the most recurring statistics analysed are reported in Table A.1. These measures are used as representative of the maintainability characteristics of the failure. They have the dimension of time, with varying resolution, and they can be normalised by either the number of turbines or the failures in the time interval considered. More specifically, the definition of "downtime" varies from the minimum time to perform the repair (MART), to the total time expected from when the system fails to its restoration (MTTR or MTTRes). The more commonly collected mean downtime (MDT) differs from the MTTR in reporting shutdown events due to grid restrictions, weather conditions, and other causes (for more information refer to Table 4 of reference [111]).
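The relationship between these time measures can be sketched as follows. The example assumes that each failure event is logged with the four contributions listed in the definition of the mean time to restoration (detection, delays before the repair, active repair, delays after the repair); the class `RestorationTimes`, the helper `mean`, the population size and all durations are hypothetical and serve only to show how MART, MFDT, MRT and MTTRes relate, and how a downtime figure can be normalised by failures or by turbines.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RestorationTimes:
    detection_h: float      # failure occurrence to detection
    delay_before_h: float   # administrative, logistics and technical delays
    active_repair_h: float  # effective (active) repair time
    delay_after_h: float    # delays before the item is back in operation

    @property
    def repairing_h(self) -> float:
        """Time from detection of the failure to restoration."""
        return self.delay_before_h + self.active_repair_h + self.delay_after_h

    @property
    def restoration_h(self) -> float:
        """Time from occurrence of the failure to restoration."""
        return self.detection_h + self.repairing_h


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values)


# Hypothetical log of two failure events in a population of four turbines.
events = [
    RestorationTimes(2.0, 30.0, 6.0, 2.0),
    RestorationTimes(4.0, 50.0, 10.0, 0.0),
]
n_turbines = 4

mart = mean(e.active_repair_h for e in events)   # mean active repair time
mfdt = mean(e.detection_h for e in events)       # mean fault detection time
mrt = mean(e.repairing_h for e in events)        # detection to restoration
mttres = mean(e.restoration_h for e in events)   # occurrence to restoration

total_h = sum(e.restoration_h for e in events)
per_failure_h = total_h / len(events)   # normalised by number of failures
per_turbine_h = total_h / n_turbines    # normalised by number of turbines

print(mart, mfdt, mrt, mttres, per_failure_h, per_turbine_h)
```

In this sketch MTTRes equals MFDT + MRT by construction. A mean downtime (MDT) would add to these values any outage time not caused by the failure itself, such as grid restrictions or weather conditions.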

Performance indicators
Production factors are used by several authors as an indication of the averaged performance of wind turbines and farms (e.g. Ref. [3,43]). The KPIs used in these papers are explained and reported in Table A.2 and are in line with those adopted by Refs. [2,114]. Among them, the technical availability (A_T) is the most meaningful one for the understanding of unexpected failures. It is defined as the ratio between the uptime of the turbine and the sum of its uptime and downtime; by considering for the latter only the downtime for corrective maintenance (excluding scheduled actions), it gives combined information on the frequency of the failures and on the time needed to restore them. The time-based, or operational (A_O), and energetic (A_E) availability are employed as a measure of the actual performance of turbines and/or farms. A_O is the availability derived by the lifecycle assessment tool of Section 3.3.1. As regards A_E, since the estimation of the potential power output is a complex process subject to high uncertainties, the capacity factor (CF) is often encountered instead.
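To illustrate how an availability figure combines the frequency of failures with the time needed to restore them, the sketch below estimates an availability from a failure rate and a mean downtime per failure. It assumes that the failure rate is expressed per turbine per calendar year and that downtime events do not overlap; the function name and the input values are hypothetical and are not results of this review.

```python
HOURS_PER_YEAR = 8760.0


def availability_from_failure_stats(failures_per_year: float,
                                    mean_downtime_h: float) -> float:
    """Availability = 1 - (expected downtime per year / hours per year)."""
    downtime_h = failures_per_year * mean_downtime_h
    if downtime_h > HOURS_PER_YEAR:
        raise ValueError("expected downtime exceeds one year")
    return 1.0 - downtime_h / HOURS_PER_YEAR


# Hypothetical turbine: 2 failures per year, 87.6 h of downtime per failure.
a_t = availability_from_failure_stats(2.0, 87.6)
print(f"{a_t:.3f}")
```

Here the expected downtime is 175.2 h per year, i.e. 2 % of the year, so the estimate is about 0.98. The same value results from halving the failure rate while doubling the downtime per failure, which is why an availability figure alone cannot separate the two effects.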
A_T: Technical Availability [112]
Definition: fraction of a given period of time in which a turbine is operating according to its design specifications.
Formula: t_available / (t_available + t_unavailable)
t_available: time of full and partial performance; technical standby and requested shutdown; downtime due to environment and grid.
t_unavailable: time of corrective actions and forced outage (excluding missing data and scheduled maintenance).

A_O: Operational Availability [112] (or Time-based Availability)
Definition: share of the time when the system is operating and/or able to operate, compared to the total time.
t_available: time of full and partial performance (considering low wind as well).
t_unavailable: time of all the other cases (excluding missing data).

A_E: Energetic Availability [2]
Definition: amount of energy produced by the system compared to its potential energy production.
Formula: P_actual / P_potential
P_actual: average actual power output.
P_potential: average potential power output (excluding missing data).

CF: Capacity Factor [112]
Definition: ratio between the amount of energy actually produced by the system and what it can theoretically produce.
Formula: P_actual / P
P: rated power output.
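The indicators of Table A.2 can be computed from a yearly breakdown of turbine states, as in the following sketch. The state categories follow the table, but the class `TurbineYear`, its field names and the numbers are assumptions made for illustration; in particular, treating scheduled maintenance as excluded from A_T and as unavailable time for A_O is one reading of the table entries.

```python
from dataclasses import dataclass


@dataclass
class TurbineYear:
    performance_h: float   # full and partial performance (low wind included)
    standby_h: float       # technical standby and requested shutdown
    env_grid_h: float      # downtime due to environment and grid
    scheduled_h: float     # scheduled maintenance
    corrective_h: float    # corrective actions and forced outage
    p_actual_mw: float     # average actual power output
    p_potential_mw: float  # average potential power output
    p_rated_mw: float      # rated power output


def technical_availability(y: TurbineYear) -> float:
    # Scheduled maintenance is excluded from both numerator and denominator.
    available = y.performance_h + y.standby_h + y.env_grid_h
    unavailable = y.corrective_h
    return available / (available + unavailable)


def operational_availability(y: TurbineYear) -> float:
    # Every state other than performance counts as unavailable time.
    available = y.performance_h
    unavailable = y.standby_h + y.env_grid_h + y.scheduled_h + y.corrective_h
    return available / (available + unavailable)


def energetic_availability(y: TurbineYear) -> float:
    return y.p_actual_mw / y.p_potential_mw


def capacity_factor(y: TurbineYear) -> float:
    return y.p_actual_mw / y.p_rated_mw


# Hypothetical 3 MW turbine observed for one year (8760 h in total).
year = TurbineYear(8000.0, 100.0, 200.0, 60.0, 400.0, 1.2, 1.3, 3.0)
a_t = technical_availability(year)
a_o = operational_availability(year)
a_e = energetic_availability(year)
cf = capacity_factor(year)
print(f"A_T={a_t:.3f} A_O={a_o:.3f} A_E={a_e:.3f} CF={cf:.3f}")
```

With these inputs A_T is higher than A_O, since standby, environmental, grid and scheduled downtime reduce only the latter.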

B. Failure statistics for availability calculations
In Table B.1, the failure rate statistics used as an input for the availability calculation in the openO&M tool are reported. The Strath-Off and onshore data are integrated by making assumptions on the time required for performing offshore maintenance. The OWEZ statistics are used for comparison, being the only other offshore survey reporting sufficient information on its reliability and maintainability figures. Additionally, with respect to the onshore studies, only the most complete and consistent studies identified in Section 4.1 are considered, further subdividing the surveys depending on turbine configuration and/or power rating where possible.