Reduction of uncertainties for photovoltaic reference cells

This paper presents the latest implementation of the ESTI reference cell set at the European solar test installation. The set consists of five photovoltaic reference cells used to measure solar irradiance. Each cell has been calibrated multiple times by primary methods. The stability of the set has been maintained and monitored over a long time period (up to 20 years). Taken together, this allowed a rigorous determination of the assigned calibration value for each cell based on the weighted average, together with its uncertainty. This has led to unprecedentedly low uncertainties of 0.23% (k = 2), which are at least a factor of two smaller than achievable with any single primary calibration. Using these cells to calibrate secondary reference cells with a dedicated set-up leads to uncertainties for the secondary calibrated cells which are of the same order as those of the best primary calibration methods.


Introduction
To determine the electrical performance of photovoltaic (PV) devices, they are exposed to natural or simulated sunlight and their electrical characteristics are measured. The calibration result is reported to standard test conditions (STC). In the uncertainty analysis of such calibrations the major component is the uncertainty of the PV reference cell used to determine the irradiance. Therefore, the most promising approach to reduce the overall uncertainty of PV device calibrations is to reduce the calibration uncertainty of the reference cell. In this paper we outline the approach taken and implemented at the European solar test installation (ESTI).

Purpose
Each calibration laboratory needs to perform measurements traceable to SI units. The most difficult part of this, in the calibration of PV devices, is the traceability to the SI irradiance scale, whereas traceability to electric units (current and voltage) or temperature is readily achieved by standard approaches. Methods that lead to traceable calibration of PV reference cells are described in IEC 60904-4 [1], which also lists a number of possible routes. Essentially, traceability can be established directly to the SI irradiance scale by using standard lamps or trap detectors, whereas another common route is to calibrate against the world radiometric reference (WRR) [2], which is established by the world standard group (WSG), a set of absolute cavity radiometers, at the world radiometric centre (WRC) in Davos, Switzerland. Comparisons of both irradiance scales, WRR and SI, have shown that the two scales were indistinguishable [3][4][5] within the uncertainty of the comparison. The uncertainty of this comparison of scales was included in the overall uncertainty budget for calibration of PV reference cells against the SI irradiance scale when measured against an instrument traceable to the WRR. However, a more recent comparison with improved instrumentation [6] has shown that the two scales have an offset of 0.34%. Here we apply this scale shift retroactively. As it has been shown that the WRR is stable, it is deduced that the offset has always been present, but that it had not previously been possible to determine it [6]. Furthermore, the remaining uncertainty of this shift is now smaller than the previous uncertainty associated with the two scales being indistinguishable.

History
Previous attempts to establish primary calibration of PV reference cells started in the 1980s.
Following the final report of a comparison in the framework of the third Photovoltaic Energy Project PEP'93 [7] which was initiated as a seven nation program of reference cell calibration and its correlated publication [8], the so called world photovoltaic scale (WPVS) was established. Essentially a number of stable reference cells were circulated amongst participants and the primary calibration values of four laboratories were used to assign WPVS calibration values to reference cells. The assigned calibration values were based on the arithmetic average of the four results and the uncertainty was taken as the standard deviation of the data set [7,8]. In the final report [7] the estimated measurement uncertainties had been quoted, but not further taken into account. The scatter of the data was considerable, leading to typical expanded uncertainties of 2% (k = 2).
In 1999 the cells were recalibrated (first recalibration of WPVS) by one of the four laboratories (National Renewable Energy Laboratory, NREL, Boulder, Colorado, USA) [9] and new values assigned with the same method as above, but now based on five measurements.
In 2004 the second recalibration of the WPVS took place [10], led by another of the four laboratories (Physikalisch Technische Bundesanstalt, PTB, Braunschweig, Germany). This was organised as a star-like intercomparison, where each participant calibrated their PV reference cells (by traceable primary methods) and then sent them to the lead laboratory (PTB), which calibrated them, after which they were calibrated again by the home institution. While the measurements were completed and results exchanged, only a first brief account has been published [10]; a final report was never issued. Therefore no new WPVS calibration values were assigned to the reference cells.
The ESTI laboratory, which had participated in the original exercise PEP'93, but not with primary methods, established a portfolio of three primary calibration methods [11][12][13][14][15] prior to the second WPVS recalibration. The results showed good agreement, both with historic data as well as with recalibration by PTB. Subsequently ESTI was accredited under ISO 17025 for these primary methods.
In 2008 ESTI established the so called 'ESTI reference cell set', consisting of five c-Si PV reference cells. Three of these cells had participated in all WPVS measurement campaigns, and two only in the second recalibration. However, the latter exercise produced several data points, as the reference cells were calibrated not only by PTB but also before and after at ESTI with various calibration methods. When the ESTI reference cell set was established, the arithmetic average of all primary calibration values was assigned as calibration value and the uncertainty as their standard deviation, following the approach taken by the WPVS previously. The expanded uncertainty was found to be 0.64% (k = 2). The stability of the set has since been checked at least once yearly by verifying the cells against each other.
More recently (2013-2014) additional primary calibration results on most of these cells have become available, by measurements at ESTI [16], intercomparison under the Bureau International des Poids et Mesures (BIPM) by the pilot laboratory PTB [17] and bilateral comparison with the National Institute of Advanced Industrial Science and Technology (AIST, Japan) [18]. Therefore it was necessary to update the assigned calibration values based on these new additional calibration measurements. At the same time it was decided to use a more rigorous data analysis using the weighted average and associated uncertainty. Furthermore recent information about the offset of the WRR against SI irradiance scale was incorporated. This paper reports the implementation and results of these three updates to the ESTI reference cell set, effectively establishing the new and 'true' WPVS.

ESTI PV reference cell set
The ESTI reference cell set was established as described above. It takes the original idea of the WPVS on board, namely that the average of different primary calibrations is a better estimate than any single primary calibration. However, the ESTI reference cell set has some additional features. As it consists of five PV reference cells it allows regular (at least annual) verification of the cells against each other. This would highlight any drift (instability) of any cell in the set. It is highly unlikely that the entire set drifts in the same way, as it contains cells of different make and different design. This has made it possible to verify that these cells are stable over a long time period (up to 20 years) and therefore all historic calibration measurements can be used to assign the best estimate of the average calibration value. The ESTI reference cell set constitutes the highest level (i.e. primary reference) of PV reference cells for measurement of solar irradiance at the ESTI laboratory. The primary reference level is also the highest attainable by any PV device [1]. In the traceability chain to SI units there are reference instruments on two higher levels, primary standards and secondary standards [1]. For ESTI, in its traceability chain to WRR, they are embodied respectively by the WSG at Davos (primary standard) and three absolute cavity radiometers at ESTI (secondary standard). The latter are compared every five years to the WSG during the International Pyrheliometer Comparison (IPC) [19] and, when necessary, a correction factor is determined.
The five cells of the ESTI reference cell set are PX102C, PX201C, 930417-2 (all three participating in WPVS since the beginning), and PX301C and PX304C (figure 1). PX102C was made by Stella and the other four cells by PRC Krochmann, but with different designs (e.g. PX201C and 930417-2) as well as from different batches (e.g. produced in 1990: PX201C and in 2000: PX301C and PX304C). All five devices are encapsulated crystalline silicon cells.
The cells are stored in the ESTI laboratory and not used for daily work. They are regularly re-calibrated by primary methods at ESTI, and elsewhere (NREL, PTB, AIST), and are used once a year to calibrate the set of secondary PV reference cells at ESTI (ca. 30 cells). During this annual calibration exercise the five cells are also compared to each other. The calibration value of each cell is calculated as if it were calibrated using the other cells of the set in turn as reference, and then compared with the assigned calibration value of the cell. This is a verification rather than a calibration, as it is a secondary calibration level, whereas for these cells only primary calibrations are considered in assigning their calibration values. However, the comparison would highlight any instability of any cell within the set. The comparison of only two cells would make it possible to identify an instability (drift), but not to determine to which of the two cells (or both) it has to be attributed. It would also overlook a common drift of both cells. However, using five cells (as in the ESTI reference cell set) it can be established that the entire set is stable, as a common drift of all cells is highly unlikely due to the variety of devices. Obviously there are limits to the detection of such a drift, in particular in the (unlikely) presence of a common drift of all cells. The latter would be revealed by any new primary calibration. Given that the uncertainty of any primary calibration is always larger than the uncertainty of the determined reference value, small drifts of the reference value would remain undetected. The observed data (see figure 2 below), however, do not suggest any significant drift component.
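The mutual verification described above can be sketched as follows. This is a minimal illustration only: it assumes that all cells are measured simultaneously under the same irradiance, that the short-circuit current scales linearly with irradiance, and that spectral mismatch and temperature corrections can be neglected. The function name `cross_check`, the cell labels and all numerical values are hypothetical and are not taken from the ESTI procedure or data.

```python
def cross_check(isc, assigned):
    """Percentage deviation of each cross-calibrated value from the assigned one.

    isc      -- dict: cell name -> short-circuit current measured under the
                same (unknown) irradiance, in A
    assigned -- dict: cell name -> assigned calibration value, in A at 1000 W/m^2
    Returns a dict keyed by (cell under test, cell used as reference).
    """
    deviations = {}
    for cell in isc:
        for ref in isc:
            if ref == cell:
                continue
            # The reference cell yields the irradiance relative to 1000 W/m^2
            # as isc[ref] / assigned[ref]; dividing the current of the cell
            # under test by this ratio gives its cross-calibrated value.
            cross_value = isc[cell] * assigned[ref] / isc[ref]
            deviations[(cell, ref)] = 100.0 * (cross_value / assigned[cell] - 1.0)
    return deviations


# Hypothetical example: three cells at 95% of the reference irradiance,
# with cell_C reading 1% too high (a simulated drift).
assigned = {"cell_A": 0.100, "cell_B": 0.120, "cell_C": 0.110}
isc = {name: 0.95 * value for name, value in assigned.items()}
isc["cell_C"] *= 1.01

for pair, deviation in sorted(cross_check(isc, assigned).items()):
    print(pair, f"{deviation:+.2f}%")
```

In this example the drifting cell deviates by about +1% against every other cell, while the stable cells agree with each other and deviate by about -1% only when the drifting cell serves as reference. With only two cells the two cases could not be told apart, which is the argument for using a larger set.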

Data analysis
For the data analysis we choose the weighted average and its uncertainty, mainly following Cox [20][21][22].
Assumptions. The following assumptions were made:
- The devices (five cells) are stable, as already established above.
- All primary calibration values are stated together with their expanded combined measurement uncertainties (k = 2) representing a Gaussian distribution.
- All primary calibrations and their uncertainties are uncorrelated. For possibly small correlations see discussion.
These are essentially the conditions for the applicability of the weighted average [20]. The last of the three assumptions above is equivalent to the measurement of each institute being realized independently of other institutes' measurements. As will be presented below, here multiple results from several institutes are included. For the justification of this approach see the discussion about correlations.
Weighted average and uncertainty. We then calculate the weighted average $\bar{x}$ and its expanded uncertainty $U_C$ (k = 2) for each cell from the N calibration values $x_i$ and their expanded uncertainties $U_{C,i}$, following Cox [20,21]:

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i / U_{C,i}^2}{\sum_{i=1}^{N} 1 / U_{C,i}^2}$$

$$\frac{1}{U_C^2} = \sum_{i=1}^{N} \frac{1}{U_{C,i}^2}$$

(Note: In standard statistics textbooks normally the standard uncertainty is used, equating to half the expanded uncertainty. As we are interested in the final expanded uncertainty (k = 2) we use $U_{C,i}$ here and the standard uncertainty only for the Chi-squared test below.)

Consistency check with Chi-squared test. Following the approach by Cox [22, p 189], the data set is checked to see whether the application of the weighted average is justified. We calculate

$$\chi^2_{\mathrm{obs}} = \sum_{i=1}^{N} \frac{(x_i - \bar{x})^2}{u_i^2}$$

where $u_i = U_{C,i}/2$ is the standard uncertainty of $x_i$. The weighted average is consistent with the N calibration measurements if

$$\chi^2_{\mathrm{obs}} < \chi^2_{\alpha, N-1}$$

where $\chi^2_{\alpha, N-1}$ denotes the 100α percentage point of the Chi-squared distribution with (N − 1) degrees of freedom. Here α = 0.05 is taken.
If the above condition is fulfilled the weighted average $\bar{x}$ is acceptable as a reference value. Otherwise the weighted average of the largest consistent subset can be determined [22]. However, as shown below, the latter is not required here, as the data satisfy the condition.

In addition, the En number of each calibration result with respect to the weighted average was evaluated, with absolute En numbers below 1 signifying satisfactory agreement, and above 1 unsatisfactory. Given the number of measurements (63, see below), based on 95% coverage we might expect about three unsatisfactory results.

In the earlier comparisons of the WRR and SI irradiance scales [3][4][5] the scales were deemed to be indistinguishable with some associated uncertainty. Hence only this uncertainty between scales was included in the uncertainty budget of calibrations traceable to WRR, but the calibration values were not modified. The latest comparison of scales [6], however, established that there is a systematic shift between the scales, with WRR reading 0.34% higher irradiance than the SI scale. The uncertainty of this shift was given as 0.18% (k = 2) [6]. Furthermore, it was claimed that the WRR in itself has been stable since its introduction in 1979. Previous comparisons had not been able to establish the shift given their uncertainty [6]. This implies that the offset between scales has always been there and therefore should be taken into account retroactively [6,25]. Therefore, all results traceable to WRR have been scaled by 1.0034 (as the calibrations are normalized to 1000 W m⁻² irradiance, the higher irradiance reading of WRR traceable measurements translates to lower calibration values, so they have to be multiplied by the scale offset). A further implication of this is on measurement uncertainty. Previously the scale difference was included as an uncertainty which was typically 0.17% (k = 1) [1]. (Note: this just covers the now established offset of 0.34% in the k = 2 band.)
However, having established the offset, it needs to be corrected explicitly (as outlined above), and the uncertainty contribution of the scale difference then reduces to 0.09% (k = 1) [6]. Therefore the assigned uncertainties of all WRR traceable measurements will potentially change. For the ESTI DSM measurements of 2014 this was calculated, leading to a reduction from 0.58% (k = 2) (without shift) to 0.48% (k = 2) (after shift). Obviously this will influence the results, as any change in uncertainty will influence both the weighted average and its uncertainty. However, this has not been implemented, as it was not possible for those measurements not made at ESTI: the full uncertainty calculations are not available and it is not possible for one laboratory to change the uncertainty stated by another. Consequently, for consistency, the change in uncertainty was also not implemented for the measurements made at ESTI, although this would have been possible.
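The direction of the correction (multiplying rather than dividing) can be checked with a short sketch. It assumes, as stated above, that a calibration value is the measured short-circuit current normalized linearly to 1000 W m⁻². The function names and the numerical current are hypothetical; only the factor 1.0034 is taken from the text.

```python
WRR_TO_SI_FACTOR = 1.0034  # WRR reads 0.34% higher irradiance than the SI scale [6]


def calibration_value(isc, irradiance):
    """Short-circuit current normalized linearly to 1000 W/m^2."""
    return isc * 1000.0 / irradiance


def to_si_scale(value, traceable_to_wrr):
    """Apply the scale shift only to results traceable to WRR."""
    return value * WRR_TO_SI_FACTOR if traceable_to_wrr else value


# Hypothetical measurement: the true (SI) irradiance is 1000 W/m^2, but an
# instrument traceable to WRR reports a value that is 0.34% higher.
isc = 0.1205                       # A, hypothetical
g_si = 1000.0                      # W/m^2
g_wrr = g_si * WRR_TO_SI_FACTOR    # W/m^2 as read on the WRR scale

cv_wrr = calibration_value(isc, g_wrr)   # too low by the scale offset
cv_si = to_si_scale(cv_wrr, traceable_to_wrr=True)

print(f"WRR-traceable value: {cv_wrr:.6f} A")
print(f"after shift:         {cv_si:.6f} A")
```

Because the WRR-traceable irradiance reading sits in the denominator of the normalization, the uncorrected calibration value is too low, and multiplying by the scale factor recovers the value that an SI-traceable irradiance measurement would have given.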

Weighted average and associated uncertainty
The uncertainties of the weighted average range from 0.23% to 0.27%.
In figure 2 the calibration values and associated uncertainties of two cells (930417-2 and PX304C) are presented graphically.
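As an illustration of how such values are obtained, the following is a minimal sketch of an inverse-variance weighted average in which each result is weighted by the inverse square of its expanded uncertainty (k = 2), so that the combined uncertainty is again an expanded one. It assumes uncorrelated results. The function name and the input numbers are hypothetical and are not calibration data from this work.

```python
import math


def weighted_average(values, expanded_uncertainties):
    """Inverse-variance weighted average of uncorrelated results.

    values                 -- calibration values x_i
    expanded_uncertainties -- their expanded uncertainties U_i (k = 2),
                              in the same unit as the values
    Returns (weighted average, expanded uncertainty of the weighted average).
    """
    weights = [1.0 / u ** 2 for u in expanded_uncertainties]
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total_weight
    # 1 / U_C^2 = sum(1 / U_i^2); the coverage factor k = 2 carries through.
    combined = 1.0 / math.sqrt(total_weight)
    return mean, combined


# Hypothetical calibration values (mA) with expanded uncertainties (mA).
values = [120.0, 120.6, 119.7]
uncertainties = [0.6, 1.2, 0.6]

mean, combined = weighted_average(values, uncertainties)
print(f"weighted average = {mean:.3f} mA, U (k = 2) = {combined:.3f} mA")
```

For N results with equal uncertainty U the combined uncertainty reduces to U/√N, which shows how combining several independent primary calibrations can bring the uncertainty of the assigned value well below that of any single calibration.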

Chi-squared test
The consistency was checked with the Chi-squared test (table 1). All five cells pass the test, most with a large margin. This is a strong indication that the data set is consistent and might even indicate that the uncertainty estimates on average are conservative. The original calculation not including the WRR shift (not shown) yielded higher Chi-squared values, but still all cells passed.
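A consistency check of this kind can be sketched as below. The sketch assumes uncorrelated results, converts the expanded uncertainties (k = 2) to standard uncertainties by halving, and compares the observed Chi-squared value with the upper 5% point of the Chi-squared distribution for N − 1 degrees of freedom. The critical values are standard tabulated values for up to ten degrees of freedom and are not taken from the paper; the function name and the input numbers are hypothetical.

```python
# Upper 5% points of the Chi-squared distribution (standard tabulated values),
# indexed by the number of degrees of freedom.
CHI2_CRITICAL_05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}


def chi_squared_check(values, expanded_uncertainties):
    """Return (observed Chi-squared, critical value, consistent?)."""
    # The test uses standard uncertainties, i.e. half the expanded ones.
    std = [u / 2.0 for u in expanded_uncertainties]
    weights = [1.0 / s ** 2 for s in std]
    mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    observed = sum(((x - mean) / s) ** 2 for x, s in zip(values, std))
    dof = len(values) - 1
    if dof not in CHI2_CRITICAL_05:
        raise ValueError("no tabulated critical value for this sample size")
    critical = CHI2_CRITICAL_05[dof]
    return observed, critical, observed < critical


# Hypothetical calibration values (mA) with expanded uncertainties (mA).
observed, critical, consistent = chi_squared_check(
    [120.0, 120.6, 119.7], [0.6, 1.2, 0.6]
)
print(f"chi2_obs = {observed:.3f}, critical = {critical:.3f}, "
      f"consistent = {consistent}")
```

An observed value far below the critical value, as for the cells that pass with a large margin, is what one would see if the stated uncertainties were on the conservative side.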

En numbers
Only one absolute En number is larger than 1 (table 1), again indicating consistency and conservative uncertainty estimates, as a larger number of outliers would be expected (namely three). Not including the WRR shift (not shown) four unsatisfactory results were observed. So obviously the WRR shift improved overall data consistency.
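The expected number of unsatisfactory results quoted above follows from simple arithmetic, and an En number can be sketched in its conventional form. Note the assumption: the expression below is the definition commonly used in interlaboratory comparisons (difference divided by the root sum of squares of the two expanded uncertainties) and may differ from the exact expression used in this work. It also ignores the correlation that arises because each result contributes to the weighted average it is compared with. Function names and the numerical example are hypothetical.

```python
import math


def en_number(value, expanded_u, reference, expanded_u_ref):
    """Conventional En number of a result against a reference value.

    Both uncertainties are expanded uncertainties (k = 2).
    """
    return (value - reference) / math.sqrt(expanded_u ** 2 + expanded_u_ref ** 2)


def expected_unsatisfactory(n_results, coverage=0.95):
    """Expected count of |En| > 1 if the uncertainties are exactly right."""
    return n_results * (1.0 - coverage)


# Hypothetical result 103.0 +/- 3.0 compared with a reference 100.0 +/- 4.0.
print(f"En = {en_number(103.0, 3.0, 100.0, 4.0):.2f}")

# With 63 results and 95% coverage about three unsatisfactory results
# would be expected.
print(f"expected unsatisfactory results: {expected_unsatisfactory(63):.2f}")
```

Observing fewer unsatisfactory results than this expectation is consistent with the individual uncertainty estimates being slightly conservative.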

Use of weighted average for secondary calibration
The ESTI reference cell set is mainly used for calibrating the secondary reference cells at ESTI, establishing the traceability chain to SI units. Therefore, it is important that the set is well controlled and maintained, guaranteeing the stability which is regularly verified. Furthermore the assigned calibration values should be the best possible, representing the state-of-the-art in measurement technology, with the lowest possible uncertainty, as the latter point will determine the uncertainty further down the calibration chain.

Measurement uncertainties
The data consistency checks (Chi-squared and En number) have shown that the single assigned uncertainties are reliable (or if anything slightly too conservative). Therefore, the uncertainty of the weighted average can also be deemed reliable. Obviously this uncertainty of the weighted average is lower than any uncertainty of a single calibration result, which is exactly the purpose of using multiple measurements and combining their information.
The uncertainty of the weighted average determined here, 0.23% (k = 2), is considerably lower not only than that of any single measurement but also than those of the averages used previously. These were essentially the WPVS (2.0% (k = 2)) and the ESTI reference cell set in its original implementation (0.64% (k = 2)). This is partly due to the large historic scatter of data, which has been reduced in recent times, and partly due to implementing the more rigorous data analysis based on weighted averages and their uncertainty rather than simple arithmetic averages and the standard deviation of the results. In fact the historic approach of arithmetic averages and using data scatter for uncertainties did not make use of the (available) information on measurement uncertainty, which is used here.
Using the ESTI reference cell set in the annual calibration of all ESTI secondary PV reference cells on a dedicated set-up [26], yields an uncertainty of 0.48% (k = 2). Incidentally this is the same as the uncertainty of the primary calibration by the DSM method (including WRR shift and its uncertainty). So it is of the same order as the lowest available uncertainty of primary methods (ESTI DSM 0.48% and PTB DSR 0.50%). However, a secondary calibration can be achieved with much less effort, both on instrumentation and manpower (measurements and data analysis) and is therefore less costly and faster. This is mainly due to the fact that a secondary calibration is a transfer between two reference devices of very similar characteristics (material, size, make) whereas a primary calibration is a transfer between instruments of vastly different properties (size, cost, sensitivity, stability).

WRR shift
A few comments regarding the WRR shift seem appropriate.
While the shift was only established recently [6] it has always been there and therefore a shift of all historic data is justified [25]. For future measurements traceable to WRR it will have to be established whether the executing laboratory already took the shift into account (and hence quotes calibration with respect to SI units) or not (and hence is traceable to WRR). In the latter case the shift would have to be applied retroactively by ESTI before including the value to update the reference cell set weighted average. The shift between scales may also be taken care of by a redefinition of the WRR, making a separate shift superfluous [25]. In order to properly compare data from different laboratories and methods the WRR shift needs to be taken into account. It is reassuring that the shift actually improved data consistency (reducing Chi-squared and reducing the number of absolute En numbers larger than 1). In another paper comparing primary reference cell calibrations, it was found that the WRR shift did increase the discrepancies, but that nevertheless all En numbers remained well below 1 [27]. In the past the shift of the WRR (0.34%) was also not easily identifiable between methods, as the data scatter of PV reference cell calibration was historically much larger (e.g. WPVS 2%). However, with the recently improved implementation of methods all uncertainties have reduced and the shift becomes more noticeable. The statistical analysis presented above shows that it is actually a component in the data set which can be discerned once noted. Also, the now achieved low expanded uncertainty of the weighted average of 0.23% (k = 2) is smaller than the WRR shift, so neglecting it would lead to incorrect results. In fact the inclusion of the WRR shift raised the weighted average of the cells by 0.2% on average (as only results traceable to WRR are shifted, data not shown), which is a non-negligible component when the final expanded uncertainty is only 0.23% (k = 2).
An outstanding implication is that the measurement uncertainty calculations for all methods traceable to WRR should be updated, correcting for the shift explicitly leading to a smaller uncertainty contribution from difference in scales between WRR and SI.

Correlations
The assumption of uncorrelated results was made, but some correlations may be present. An obvious example is the WRR shift, as in the case of WRR traceability a number of laboratories and methods take traceability to an irradiance scale and the offset in this scale translates into all results with perfect correlation. After correcting the shift, the contribution of this particular component is not any more major, as the remaining uncertainty is 0.09% (k = 1).
Another aspect of correlation is that here multiple results from several institutes have been included. This is not normally admissible, as such results would be highly correlated. In general the components contributing to measurement uncertainty are of two types, statistical and systematic. The former can be reduced by taking more measurements and averaging, whereas the latter cannot be reduced by this approach. Therefore, the question whether multiple results from a single institution obtained with the same method should be included in the weighted average can be regarded as whether the uncertainty is dominated by statistical or systematic components. However, over a long time period some systematic components may be regarded as random. For example the uncertainty arising from the data acquisition system is analysed. Over short time periods (without recalibration of the data acquisition system) measurements taken using this data acquisition system will be correlated. Over longer time periods (with recalibration of the instrument and its calibrator, with possible replacement by different model or make) the uncertainty can be assumed to vary randomly. Given the time periods involved here, we assume that the re-measurement of a reference cell by the same institute with the same method is not correlated. The only exception would be the calibration by SSM at ESTI performed twice in 2004. In this particular case, however, the largest uncertainty was due to positioning of the cell [11], and therefore statistical in nature. Furthermore, the SSM has a high uncertainty which gives it a low weight in the average. In order to investigate the sensitivity of the results to multiple measurements by the same method from the same laboratory, all multiple measurements were removed. The criteria used were to keep the result with the lowest uncertainty, and if equal to use the latest in time. Performing this, the weighted averages changed on average by ±0.1%. 
The largest deviation was +0.17% for reference cell 930417-2. This was mainly due to the removal of the PTB measurement of 1994 which was the lowest measurement of all.
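The selection rule used in this sensitivity test (one result per laboratory and method, preferring the lowest uncertainty and, if equal, the most recent) can be sketched as follows. The record layout, the function name and the example entries are hypothetical and serve only to illustrate the rule; they are not the calibration data of this work.

```python
def keep_one_per_lab_and_method(results):
    """Keep a single result per (laboratory, method) pair.

    results -- list of dicts with keys 'lab', 'method', 'year', 'value', 'U'
    The result with the lowest uncertainty is kept; if uncertainties are
    equal, the most recent one is kept.
    """
    best = {}
    for result in results:
        key = (result["lab"], result["method"])
        current = best.get(key)
        # Tuple comparison: lower uncertainty first, then later year.
        if current is None or (result["U"], -result["year"]) < (
            current["U"], -current["year"]
        ):
            best[key] = result
    return list(best.values())


# Hypothetical records for one cell (values in mA, U expanded with k = 2).
results = [
    {"lab": "LabA", "method": "M1", "year": 1994, "value": 119.2, "U": 1.0},
    {"lab": "LabA", "method": "M1", "year": 2004, "value": 120.3, "U": 0.5},
    {"lab": "LabA", "method": "M1", "year": 2014, "value": 120.1, "U": 0.5},
    {"lab": "LabB", "method": "M2", "year": 2004, "value": 120.0, "U": 0.8},
]

for kept in keep_one_per_lab_and_method(results):
    print(kept["lab"], kept["method"], kept["year"], kept["value"])
```

Recomputing the weighted average on the reduced list and comparing it with the value from the full list gives the size of the change attributable to repeated measurements.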
While an explicit mathematical treatment of correlation is possible [21], it requires information or assumptions about the size of the correlation, which are not available here. The correlation here is deemed to be small enough to be negligible, as the average is over a number of (leading reference) laboratories and over time. In fact the implementation of the same method develops over time (change of instrumentation and/or procedures), which reduces correlation. For example, it might be expected that the results of a laboratory over time using the same method on a stable artefact would be highly correlated. However, the (extreme) case of the DSR results from PTB for cell 930417-2 between 1994 and 2004 shows a large swing from the lowest of all values to the third highest. This further supports the notion that results on the whole are not (or only weakly) correlated and justifies neglecting correlation.

Conclusions
The implementation of a rigorous assignment of weighted average and uncertainty to a set of five c-Si PV reference cells to be used as irradiance primary references at the ESTI laboratory has been described. This included all available information: calibration results together with their stated measurement uncertainties, as well as scale effects (IEC 60904-3 ed. 1 to ed. 2 and WRR to SI comparison).
This resulted in the most accurate assignment of calibration values to PV reference cells ever, as exemplified by the lowest expanded uncertainty of 0.23% (k = 2) being a factor of two below that of any single method. It will make it possible to calibrate PV reference cells by cheaper and faster secondary methods to the same level or even lower uncertainty than any primary method currently available in the world.
The ESTI reference cell set made this possible together with 20 years of calibration history, well controlled maintenance and regular stability checks. The effort to also execute all the primary calibrations over the years is non-negligible amounting to several man-years. It is believed that the ESTI reference cell set represents the best available reference for PV irradiance measurements in the world. It therefore is the most advanced and to date 'true' implementation of the original concept of WPVS.