Linking the Results of CIPM and RMO Key Comparisons With Linear Trends

A statistical approach for linking the results of interlaboratory comparisons with linear trends is proposed. The approach applies when the comparison artifacts have the same nominal values or the measured quantities have the same magnitudes. Degrees of equivalence between pairs of National Metrology Institutes that have not participated in the same comparison are established, together with their corresponding uncertainties. The approach is applied to link the CCEM-K2 and SIM.EM-K2 comparisons for resistance at the 1 GΩ level.


Introduction
Linking the results of International Committee for Weights and Measures (CIPM) and Regional Metrology Organization (RMO) key comparisons (KCs) is an important part of implementing the CIPM Mutual Recognition Arrangement (CIPM MRA). Recently, several methodologies have been proposed to deal with the linkage problem. Delahaye and Witt [1] proposed a practical method that used an additive correction to link a CIPM KC of 10 pF capacitance standards to the results of a corresponding EUROMET comparison. A similar method was used by Marullo Reedtz and Cerri [2] to link the key comparisons CCEM-K8 and EUROMET.EM.K8. Elster, Link, and Wöger [3] suggested a method based on a ratio correction, which can be applied when the results of the CIPM and RMO comparisons are of different magnitude or different physical dimension. Nielsen [4] and Sutton [5] suggested combining the measurements from CIPM and RMO key comparisons by applying weighted least-squares or generalized least-squares estimation. As pointed out in [3], however, this approach produces a completely new analysis, which will change the existing results. Kharitonov and Chunovkina [6] and Decker et al. [7] have also discussed the linking of CIPM and RMO key comparisons.
Zhang et al. [8] proposed a statistical approach to KCs with linear trends. Later, Zhang et al. [9] extended the results to the case of multiple artifacts. Discussions of key comparisons with trends can also be found in [10] and [11]. In this paper we propose a method to link the existing results from CIPM and RMO KCs both of which have linear trends.
Section 2 provides the statistical models and major results for key comparisons with linear trends based on the general case discussed in [12]. In Sec. 3, the difference between the degrees of equivalence of the two comparisons is defined and used to establish the relationship between these two comparisons. An estimator of this quantity is proposed and is used to estimate the degree of equivalence of a laboratory that participated only in the RMO KC with respect to the key comparison reference value (KCRV) of the corresponding CIPM KC. In Sec. 4, degrees of equivalence, with their corresponding uncertainties, are established between pairs of National Metrology Institutes (NMIs) that each participated in only one of the two comparisons. In this study we assume that the artifacts in the two KCs have the same nominal values or values of the same magnitude. When two comparisons have different nominal values, linking would be a challenge unless there is a strong correlation between the two and the corresponding uncertainty is estimable. As an example, in Sec. 5 the methodology is applied to link the CCEM-K2 and SIM.EM-K2 key comparisons for resistance at the 1 GΩ level.

Statistical Models for Interlaboratory Comparisons With Linear Trends
In some key comparisons, the measurand has a trend or a drift, and thus the measurements of the transport artifacts made by the participating NMIs will show trends. References [8] and [9] proposed statistical approaches to KCs with linear trends for a single artifact and for multiple artifacts, respectively. A recent paper by Zhang et al. [12] provided a generalized method that can deal with the case in which multiple NMIs measure the traveling artifacts more than once and in which the uncertainty structure is more general. Since [8] and [9] can be treated as special cases of [12], we adopt the statistical model and notation of [12] for the comparisons.
We assume that $N$ laboratories participated in the first key comparison, for example, a CIPM KC. We also assume that $P$ artifacts traveled together and that, for each artifact, the $n$th laboratory ($n = 1, \dots, N$) makes $J_n$ measurements with $J_n \ge 1$. For the $p$th artifact ($p = 1, \dots, P$), the $j$th measurement (or the $j$th average of the measurements) made at the $n$th laboratory, $X_{nj}(p)$, is taken at the time $t_{nj}(p)$ ($j = 1, \dots, J_n$). As in [12], we assume that a simple linear regression holds for all the measurements, i.e.,
$$X_{nj}(p) = a_n(p) + \beta(p)\, t_{nj}(p) + \varepsilon_{nj}(p) \qquad (1)$$
for $j = 1, \dots, J_n$, $n = 1, \dots, N$, and $p = 1, \dots, P$, where for a fixed artifact the slope $\beta(p)$ of the trend is the same for all $N$ laboratories, while the intercepts $a_n(p)$ are allowed to differ between laboratories. We further assume that, for each laboratory, the random error in the measurement $X_{nj}(p)$ can be expressed as
$$\varepsilon_{nj}(p) = e_{nj,A}(p) + I_n(p)\, e_{n,B}(p) + [1 - I_n(p)]\, e_{nj,B}(p) \qquad (2)$$
where the indicator $I_n(p) = 1$ when the errors $e_{nj,B}(p)$ are the same for all the measurements made by the $n$th laboratory, and $I_n(p) = 0$ otherwise. The random components $e_{nj,A}(p)$ and $(e_{n,B}(p), e_{nj,B}(p))$ are statistically independent of each other, with standard uncertainties $\sigma_{nj,A}(p)$ and $(\sigma_{n,B}(p), \sigma_{nj,B}(p))$, which are the Type A and Type B evaluations of standard uncertainty, respectively. This indicates that the measurements of different artifacts (whether by the same or by different laboratories) are statistically independent, while the measurements of the same artifact made at the same laboratory can be independent or not, depending on the indicator $I_n(p)$.
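The common-slope, laboratory-specific-intercept structure of the regression model can be illustrated with a short sketch for a single artifact. This is only an unweighted least-squares simplification, not the generalized least-squares estimator of [12], which also accounts for the Type A and Type B uncertainties; the function name `fit_common_slope` and the numbers in the usage example are hypothetical.

```python
def fit_common_slope(times, values):
    """Fit x = a_n + beta * t with one shared slope and one intercept per lab.

    times, values: one list per laboratory, each holding that laboratory's
    measurement times and measured values for a single artifact.
    Unweighted least squares only (an illustrative simplification).
    Returns (intercepts, slope).
    """
    t_means = [sum(t) / len(t) for t in times]
    x_means = [sum(x) / len(x) for x in values]
    sxy = 0.0
    sxx = 0.0
    # Pool the within-laboratory deviations to estimate the shared slope.
    for t_lab, x_lab, t_bar, x_bar in zip(times, values, t_means, x_means):
        for t, x in zip(t_lab, x_lab):
            sxy += (t - t_bar) * (x - x_bar)
            sxx += (t - t_bar) ** 2
    if sxx == 0.0:
        raise ValueError("need a laboratory with at least two distinct times")
    slope = sxy / sxx
    # Each laboratory's line passes through its own mean point.
    intercepts = [x_bar - slope * t_bar for t_bar, x_bar in zip(t_means, x_means)]
    return intercepts, slope


# Hypothetical data: three laboratories, the last with a single measurement.
times = [[0.0, 1.0, 2.0], [0.5, 1.5], [1.0]]
values = [[10.0, 12.0, 14.0], [12.0, 14.0], [11.0]]
intercepts, slope = fit_common_slope(times, values)
```

A laboratory with a single measurement contributes nothing to the slope estimate, yet still receives an intercept, which is why a pilot laboratory with repeated measurements is needed to determine the trend.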
Regarding different artifacts measured by the same laboratory, we understand that: (a) the errors quantified by the Type A uncertainty, i.e., e nj,A ( p), are statistically independent; (b) the errors quantified by the Type B uncertainty, i.e., e nj,B ( p) or e n,B ( p), will have some correlation; (c) since not all artifacts are created equal, a random component remains even when the metrologists make every effort to measure the artifacts in as "correlated" a way as possible. Thus, we think it is reasonable to assume that measurements of different artifacts (whether by the same or by different laboratories) are statistically independent. From (2), when I n ( p) = 1, the Type B uncertainties are the same for all the measurements made on the pth artifact by the nth laboratory. In contrast, when I n ( p) = 0, the Type B uncertainties may differ among the measurements made by the nth laboratory. Without loss of generality, we assume that the pilot laboratory is the first one among all N laboratories, with J 1 > 1.
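The effect of the indicator on the error structure can be made concrete by writing out the covariance matrix of the measurements of one artifact at one laboratory. The sketch below follows the error decomposition described above; the function name `lab_covariance` and its arguments are hypothetical, and it assumes that when the indicator is 0 the Type B errors of different measurements are independent.

```python
def lab_covariance(sigma_a, sigma_b, shared_type_b):
    """Covariance matrix of the J measurements of one artifact at one lab.

    sigma_a: list of Type A standard uncertainties, one per measurement.
    sigma_b: a single Type B standard uncertainty when shared_type_b is True,
             otherwise a list with one Type B value per measurement.
    shared_type_b: plays the role of the indicator I_n(p) (True means 1).
    """
    j = len(sigma_a)
    cov = [[0.0] * j for _ in range(j)]
    for r in range(j):
        for c in range(j):
            if shared_type_b:
                # One common Type B error: every pair of measurements,
                # including a measurement with itself, shares sigma_b**2.
                cov[r][c] = sigma_b ** 2
            elif r == c:
                # Separate Type B errors (assumed independent): diagonal only.
                cov[r][c] = sigma_b[r] ** 2
        # Type A errors are independent and add to the diagonal.
        cov[r][r] += sigma_a[r] ** 2
    return cov


# Hypothetical uncertainties for two measurements at one laboratory.
shared = lab_covariance([1.0, 2.0], 3.0, True)
separate = lab_covariance([1.0, 2.0], [3.0, 4.0], False)
```

With a shared Type B error the off-diagonal entries are nonzero, so the repeated measurements of that laboratory are correlated; with separate Type B errors the matrix is diagonal.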
The generalized least-squares estimators of $a_n(p)$ and $\beta(p)$, which are the best linear unbiased estimators of these parameters, are given by Eqs. (10) and (11) in [12], respectively. With $t_n(p)$ and $X_n(p)$ denoting the weighted means of the measurement times and of the measurements defined in (5), it follows from (5) and (6) that the corresponding uncertainty $u_n(p)$ of $X_n(p)$ for the $p$th artifact in the $n$th laboratory is given by (7).
As discussed in [8], [9], and [12], in cases with trends the KCRV is time-dependent. As in [12], for a fixed set of weights $v = \{v_p\}$ and at the optimal time given by (8) with (9), the corresponding optimal KCRV is given by (10) and its uncertainty by (11). In practice, the $v_p$ can be formed from the mean-square residuals of the $p$th regression line for the pilot laboratory, as in (12) and (13).
In [8], the degree of equivalence of a laboratory with respect to the KCRV at some time $t$ is defined as the difference between the value predicted for that laboratory by the corresponding regression and the KCRV at $t$. From [9] and [12], for the first comparison, the degree of equivalence of the $n$th laboratory with respect to the KCRV at the optimal time $\vec{t} = \vec{t}^{*}$ in (10) is the difference between a weighted mean, over all artifacts, of the values predicted for that laboratory by the corresponding regressions and the KCRV at $\vec{t}^{*}$. It is denoted by $D_{n,\mathrm{KCRV}}$ and given by (14) for $n = 1, \dots, N$. For simplicity, we drop $\vec{t}^{*}$ from the notation $D_{n,\mathrm{KCRV}}$. The uncertainty of $D_{n,\mathrm{KCRV}}$ is provided by Eq. (33) in [12].
For the second comparison, for example, an RMO key comparison, we assume that $M$ laboratories participated and $Q$ artifacts traveled together. We adopt the same statistical assumptions and models for the second comparison as for the first. Where necessary, a prime (′) is used to distinguish quantities in the second comparison from the analogous quantities in the first comparison. Under these assumptions, the corresponding weighted means of the times and of the measurements, as in (5), are $t'_m(q)$ and $Y_m(q)$ for $m = 1, \dots, M$ and $q = 1, \dots, Q$.
Similarly, the corresponding key comparison reference value, KCRV′, at the optimal time $\vec{t}' = \vec{t}'^{*}$ is obtained using the methods outlined above for the first comparison. As in the first comparison, the degree of equivalence of the $m$th laboratory with respect to KCRV′ at the optimal time, denoted by $D'_{m,\mathrm{KCRV}'}$, is then given by (15). We also assume that the artifacts used in the second comparison are different from those used in the first comparison.
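Two of the ingredients described in this section can be sketched for the single-artifact case: the mean-square residual of a pilot-laboratory regression line, and a degree of equivalence as a predicted value minus the KCRV at a chosen time. This is a minimal sketch under stated assumptions: the fit is unweighted, the residual sum of squares is divided by J - 2, and the way the weights v_p are built from the mean-square residuals in [12] is not reproduced. All function names and numbers are hypothetical.

```python
def mean_square_residual(times, values):
    """Mean-square residual of an unweighted straight-line fit.

    Assumes at least three measurements and divides the residual sum of
    squares by J - 2 (an assumption, not taken from [12]).
    """
    j = len(times)
    if j < 3:
        raise ValueError("need at least three measurements")
    t_bar = sum(times) / j
    x_bar = sum(values) / j
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (x - x_bar) for t, x in zip(times, values))
    slope = sxy / sxx
    intercept = x_bar - slope * t_bar
    ssr = sum((x - intercept - slope * t) ** 2 for t, x in zip(times, values))
    return ssr / (j - 2)


def degree_of_equivalence(intercept, slope, t, kcrv_at_t):
    """Predicted value of a laboratory at time t minus the KCRV at t."""
    return intercept + slope * t - kcrv_at_t


# Hypothetical pilot-laboratory data and regression parameters.
msr = mean_square_residual([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0])
doe = degree_of_equivalence(10.0, 2.0, 1.5, 12.5)
```

In the multiple-artifact case described above, the predicted values are first combined into a weighted mean over the artifacts before the KCRV is subtracted.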

The Difference Between the Degrees of Equivalence of the Two Comparisons With Respect to Their KCRVs
To link two comparisons, we must assume that K laboratories, called linking laboratories, participated in both comparisons. In the case of no trend, [7] proposed using a weighted mean of the differences between the measurements in the two comparisons for each linking laboratory. On the other hand, [1] and [2] used a weighted mean of the differences between the degrees of equivalence of the national measurement standards with respect to the KCRVs of the two comparisons for each linking laboratory. For a linking laboratory, this difference between degrees of equivalence contains information not only about the difference between that laboratory's measurements in the two comparisons but also, through the two KCRVs, about the differences between the measurements of the other laboratories. We think that the combined difference used in [1] and [2] better represents the difference between the two comparisons, and we therefore adopt it.
Without loss of generality, we assume that the first $K$ laboratories in both comparisons are the linking laboratories. Namely, for the $p$th artifact ($p = 1, \dots, P$), the $X_k(p)$ ($k = 1, \dots, K$) in (5) are the representative measurements from the linking laboratories in the first comparison, and the $Y_k(q)$ ($k = 1, \dots, K$) are those from the linking laboratories for the $q$th artifact ($q = 1, \dots, Q$) in the second comparison. Note that $K < \min(M, N)$. For the $k$th linking laboratory, as considered in [1] and [2], the difference between the two degrees of equivalence given in (14) and (15) is
$$D_k = D_{k,\mathrm{KCRV}} - D'_{k,\mathrm{KCRV}'} \qquad (16)$$
for $k = 1, \dots, K$. Since the KCRV of a comparison in our case is time-dependent, for the chosen optimal time $D_{k,\mathrm{KCRV}}$ is a relative quantity with respect to that KCRV. We treat $D_k$ as a realization of the difference between the degrees of equivalence of the two comparisons for the $k$th linking laboratory. We assume that $D_k$ is random and follows the statistical model
$$D_k = D + \eta_k$$
where $D$ is the true value of the difference between the degrees of equivalence of the two comparisons and the random error $\eta_k$, $k = 1, \dots, K$, corresponds to the $k$th linking laboratory and has zero mean. We use a weighted mean of the $D_k$ ($k = 1, \dots, K$) to estimate $D$, namely,
$$\bar{D} = \sum_{k=1}^{K} \psi_k D_k \qquad (17)$$
with the weights $\psi_k$ given by (18). The quantity $\bar{D}$ will be used to estimate the differences between the degrees of equivalence of two laboratories of which one participated only in the CIPM KC and the other participated only in the RMO KC, or vice versa. Note that the $\{D_k\}$ are correlated because $D_{k,\mathrm{KCRV}}$ and $D_{j,\mathrm{KCRV}}$, as well as $D'_{k,\mathrm{KCRV}'}$ and $D'_{j,\mathrm{KCRV}'}$, are correlated for any $k \ne j$ with $k, j = 1, \dots, K$. Thus, the variance of $\bar{D}$ with the $\psi_k$ given in (18) is not equal to the value it would take if the $D_k$ were statistically independent of each other; it is calculated using (19) to (24).
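The role of the correlation between the linking-laboratory differences can be illustrated numerically. For fixed weights that sum to one, the variance of a weighted mean is the double sum of the products of the weights with the entries of the covariance matrix, so nonzero covariances change the result relative to the independent case. The sketch below is a generic illustration of that identity; it does not reproduce the weights of (18) or the covariance expressions of (23) and (24), which are taken as inputs here, and the function name and numbers are hypothetical.

```python
def weighted_mean_and_variance(d, psi, cov):
    """Weighted mean of correlated quantities and its variance.

    d:   list of K values D_k.
    psi: list of K fixed weights that sum to 1.
    cov: K x K covariance matrix of the D_k (list of lists).
    Returns (mean, variance), with
    variance = sum over k, j of psi[k] * psi[j] * cov[k][j].
    """
    k_count = len(d)
    if abs(sum(psi) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mean = sum(w * x for w, x in zip(psi, d))
    variance = sum(
        psi[k] * psi[j] * cov[k][j]
        for k in range(k_count)
        for j in range(k_count)
    )
    return mean, variance


# Hypothetical values for two linking laboratories.
d_values = [1.0, 3.0]
weights = [0.5, 0.5]
cov_correlated = [[4.0, 1.0], [1.0, 4.0]]
cov_independent = [[4.0, 0.0], [0.0, 4.0]]
mean_c, var_c = weighted_mean_and_variance(d_values, weights, cov_correlated)
mean_i, var_i = weighted_mean_and_variance(d_values, weights, cov_independent)
```

In this hypothetical case the positive covariance raises the variance of the weighted mean from 2.0 to 2.5, which is why the covariance terms cannot be dropped.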

Pair-Wise Comparisons: Degrees of Equivalence of Pairs of National Measurement Standards
These degrees of equivalence are defined for any pair of different laboratories in the two key comparisons.
(1) For any two laboratories participating in the first comparison, e.g., the CIPM KC (regardless of whether they participated in the RMO KC or not), their degrees of equivalence and the corresponding uncertainties are based on the results from the first comparison.
(2) If two laboratories participated only in the second comparison, or one laboratory participated in both comparisons and the other participated only in the second comparison, then their degree of equivalence is the corresponding one in the second comparison, with its uncertainty.
This linking methodology is based on the model and the approach proposed in [12] for the case of trends. Although the mathematical derivations for the linking, as well as the results from [12], seem complicated, the calculations are straightforward. We used MATLAB [14] to implement the method for the SIM.EM-K1, SIM.EM-K2, and SIM.EM-S6 comparisons [15] as well as for the example in Sec. 5.

Linking the CCEM-K2 and SIM.EM-K2 Comparisons
To illustrate this linking approach, we applied it to the CCEM-K2 key comparison for resistance at the level of 1 GΩ and the SIM.EM-K2 key comparison for resistance at the same level. From 2006 to 2007, the Working Group for Electricity and Magnetism of the Inter-American Metrology System (SIM) conducted the key and supplementary comparisons SIM.EM-K1-K2-S6 to provide the first internationally recognized comparisons of precision resistance measurements for nations of the western hemisphere. Six NMIs participated in the comparisons. The National Institute of Standards and Technology (NIST) provided the comparison standards and acted as the pilot laboratory. Two NIST film-type resistors were used as traveling standards. Over the course of the comparison, the two traveling standards were measured at the pilot laboratory, NIST, during five time periods. For each period, the average of the dates on which the measurements were made was calculated and called the mean date of measurement. In the SIM.EM-K2 comparison, each of the five non-pilot laboratories made measurements during two separate time periods, except one, which measured during only one time period. An uncertainty budget that includes the Type A and Type B evaluations of uncertainty for each NMI's measurement process was also reported.
Table 1 lists the CCEM-K2 results, taken from Table 5 of [16]. In the table, the listed resistance measurements are relative deviations from the nominal value. Namely, the entries for the three artifacts, S/N HR 9101, S/N HR 9102, and S/N HR 9106, are expressed as (measured value − 1 GΩ) × 10^6 / 1 GΩ. NIST was the pilot laboratory and the only laboratory to make multiple measurements, doing so in seven time periods. Figures 1 to 3 show the three regression lines corresponding to the measurements of the three traveling standards made by NIST, the pilot laboratory. The figures also show the measurements made by all participating laboratories.
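The relative-deviation convention used for the tabulated values can be written as a one-line conversion. The sketch below is illustrative only; the function names and the example resistance are hypothetical and are not values from Table 1.

```python
NOMINAL_OHMS = 1e9  # 1 GΩ nominal value of the traveling standards


def relative_deviation_ppm(measured_ohms, nominal_ohms=NOMINAL_OHMS):
    """Return (measured - nominal) * 1e6 / nominal, i.e. parts in 10^6."""
    return (measured_ohms - nominal_ohms) * 1e6 / nominal_ohms


def ohms_from_relative_deviation(deviation_ppm, nominal_ohms=NOMINAL_OHMS):
    """Invert the conversion: recover the resistance in ohms."""
    return nominal_ohms * (1.0 + deviation_ppm * 1e-6)


# Hypothetical resistor measured at 1.000250 GΩ.
deviation = relative_deviation_ppm(1.000250e9)
```

Working in relative deviations keeps the regression inputs at a convenient scale, so that degrees of equivalence and their uncertainties are read directly in parts in 10^6.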
Tables 2 and 3 list the information for the two traveling standards used in the SIM.EM-K2 comparison [15]. Figures 4 and 5 show the five regression lines corresponding to the five laboratories, each with two or more measurements. These two figures were published in [12]. Table 4 lists the degrees of equivalence of national measurement standards with respect to the KCRV (× 10^6; KCRV = 301.0 at t* = 1998.8) and the associated uncertainties from CCEM-K2. The results were calculated based on the statistical analysis from [12]. Table 5 lists the degrees of equivalence of national measurement standards with respect to the KCRV and the associated uncertainties from the SIM.EM-K2 comparison based on [12]. The other results can be found in [12].
There were two linking laboratories: NIST and the National Research Council (NRC) of Canada. Figure 6 shows the degrees of equivalence with respect to the KCRVs for the two comparisons as listed in Tables 4 and 5. From (16) and (20), the D k for k = 1, 2, corresponding to NIST and NRC, and their associated standard uncertainties were calculated. From (17) to (24), the weighted mean of the D k, with weights based on (18), and its standard uncertainty (× 10^6) were calculated to be D = -1.87 and u D = 2.86. Note that the covariances of {D n,KCRV } and {D ′ m,KCRV′ } were not provided by the reports for CCEM-K2 and SIM.EM-K2. Instead, these terms, which are necessary for calculating the uncertainties of D and other terms, were calculated using (23) and (24). For the four remaining NMIs in the SIM.EM-K2 comparison, which did not participate in CCEM-K2, the degrees of equivalence of their national standards with respect to the KCRV of the CCEM-K2 comparison were calculated using (25). Their corresponding standard uncertainties were calculated from (28) and are listed in Table 6.
Thirteen NMIs participated in the CCEM-K2 comparison but did not participate in the SIM.EM-K2 comparison. Similarly, four NMIs participated in the SIM.EM-K2 comparison but did not participate in the CCEM-K2 comparison. Figure 6 shows the degrees of equivalence of the national measurement standards with respect to the KCRVs for the two comparisons. In Fig. 6, the solid squares represent the degrees of equivalence of the CIPM national measurement standards with respect to the CIPM KCRV, i.e., D n,KCRV (n = 1, …, 15) for CCEM-K2, while the open circles represent the degrees of equivalence of the RMO national measurement standards with respect to the RMO KCRV, i.e., D ′ m ,KCRV′ (m = 1, …, 6) for SIM.EM-K2. For the four non-linking laboratories in the RMO comparison, the degrees of equivalence of the RMO national measurement standards with respect to the CIPM KCRV were calculated by (25) and are represented by solid triangles. Pair-wise comparisons between the NMIs of these two groups, i.e., their degrees of equivalence and their associated standard uncertainties, were calculated using (29) to (31) and are listed in Table 7.
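The point estimates behind the linked results can be sketched as an additive correction. The sketch assumes a sign convention in which the linking difference is the CIPM degree of equivalence minus the RMO degree of equivalence for each linking laboratory, so that an RMO-only laboratory is referred to the CIPM KCRV by adding the weighted-mean difference. This convention, the function names, and the laboratory values are assumptions made for illustration; only the value -1.87 (× 10^6) is taken from the text. The associated uncertainties require the covariance terms discussed above and are not computed here.

```python
D_BAR = -1.87  # weighted-mean linking difference (x 10^6) reported in the text


def link_to_cipm_kcrv(d_rmo, d_bar=D_BAR):
    """Refer an RMO-only degree of equivalence to the CIPM KCRV.

    Assumes d_bar was formed as (CIPM value - RMO value) for the linking
    laboratories, so the correction is additive.
    """
    return d_rmo + d_bar


def pairwise_across_comparisons(d_cipm, d_rmo, d_bar=D_BAR):
    """Degree of equivalence between a CIPM-only and an RMO-only laboratory."""
    return d_cipm - link_to_cipm_kcrv(d_rmo, d_bar)


# Hypothetical degrees of equivalence (x 10^6), not values from the tables.
linked = link_to_cipm_kcrv(4.00)
pair = pairwise_across_comparisons(1.00, 4.00)
```

If the opposite sign convention is used for the linking difference, the correction is subtracted instead of added; the structure of the calculation is otherwise unchanged.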

Conclusions
Statistical approaches have been developed recently to deal with interlaboratory comparisons with linear trends. In this paper, a statistical analysis is proposed to link two interlaboratory key comparisons that have the same nominal value, or values of the same magnitude, and that both show linear trends. Two kinds of degrees of equivalence are obtained with their associated uncertainties: those with respect to the KCRV of the CIPM KC for laboratories that did not participate in the CIPM KC, and those between any two laboratories that each participated in only one of the two comparisons.