Statistical Interpretation of Key Comparison Reference Value and Degrees of Equivalence

Key comparisons carried out by the Consultative Committees (CCs) of the International Committee for Weights and Measures (CIPM) or by the Bureau International des Poids et Mesures (BIPM) are referred to as CIPM key comparisons. The outputs of a statistical analysis of the data from a CIPM key comparison are the key comparison reference value, the degrees of equivalence, and their associated uncertainties. The BIPM publications do not discuss the statistical interpretation of these outputs. We discuss their interpretation under the following three statistical models: the nonexistent laboratory-effects model, the random laboratory-effects model, and the systematic laboratory-effects model.

The purpose of a CIPM key comparison is to establish the key comparison reference value¹, the degrees of equivalence², and their associated uncertainties on the basis of the data provided by the participants. This paper is limited to a simple CIPM key comparison, where the common measurand is a physical quantity of stable value during the comparison. Many CIPM key comparisons are not simple because it is often impractical or impossible to realize exactly the same measurand for or by all participants. We use the symbol $Y$ for the stable value of the measurand. The data provided by the participants of a simple CIPM key comparison are paired results and standard uncertainties $[x_1, u(x_1)], \ldots, [x_n, u(x_n)]$, where the results $x_1, \ldots, x_n$ are measurements of $Y$. The outputs of a statistical analysis of these data are the key comparison reference value $x_R$, the degree of equivalence $d_i = x_i - x_R$ of the result $x_i$, the degree of equivalence $d_{i,j} = d_i - d_j = x_i - x_j$ of the results $x_i$ and $x_j$, and their associated standard uncertainties $u(x_R)$, $u(d_i)$, and $u(d_{i,j})$, respectively, for $i, j = 1, 2, \ldots, n$ and $i \neq j$ [1]. The key comparison reference value $x_R$ is an estimate for $Y$. An estimate for $Y$ is a combined result of measurement determined from the data $[x_1, u(x_1)], \ldots, [x_n, u(x_n)]$.
An understanding of the difference between sampling probability distributions, used in classical (frequentist) statistics, and state-of-knowledge probability distributions, used in Bayesian statistics, is necessary for proper analysis and interpretation of the data from a key comparison. Briefly, they are defined as follows. In classical statistics, the value of the measurand is assumed to be an unknown constant, often called the true value, and each result of measurement is regarded as a realization of a random variable with a sampling distribution. A sampling distribution is a probability distribution that describes the relative frequencies of occurrence for all possible results of measurement when the conditions of measurement are hypothesized to be fixed at the intended levels [4]. The metrologist relates the expected values of the sampling distributions for the results of measurement to the value of the measurand. A classical (frequentist) statistical interpretation is a statement that relates the realized measurements to what one might expect if the key comparison could be repeated infinitely many times and throughout these repetitions the hypothesized sampling distributions continued to apply.
In Bayesian statistics, the measurement data are given constants and the value of the measurand is a random variable. A probability distribution for the value of the measurand is a state-of-knowledge distribution that describes the degrees of belief for all possible values that could be attributed to the measurand [4]. The belief is based on all available information, including current results of measurement and scientific judgment based on prior and other data. Similar state-of-knowledge distributions apply to the other parameters involved in assessing the value of the measurand. A Bayesian interpretation is a statement that represents the state of knowledge about the value of the measurand based on state-of-knowledge distributions before measurements are made and a likelihood function conditional on the current measurements [4]. The ISO Guide [5] is consistent with a Bayesian interpretation of measurements but not with a classical (frequentist) interpretation [4].
We refer to the results $x_1, \ldots, x_n$ as laboratory results. The laboratory results $x_1, \ldots, x_n$ are regarded as realizations of random variables $x_1, \ldots, x_n$ with sampling distributions³. We use the symbols $X_1, \ldots, X_n$ for the expected values $E(x_1), \ldots, E(x_n)$ of the sampling distributions of $x_1, \ldots, x_n$, respectively. We refer to the expected values $X_1, \ldots, X_n$ as the laboratory expected values. We use the symbols $\sigma_1, \ldots, \sigma_n$ for the standard deviations $S(x_1), \ldots, S(x_n)$ of the sampling distributions of $x_1, \ldots, x_n$, respectively. Here $S(x_i)$ is the square root of the variance $V(x_i) = E\{[x_i - E(x_i)]^2\}$ of the sampling distribution of $x_i$ for $i = 1, 2, \ldots, n$. The uncertainties $u(x_1), \ldots, u(x_n)$ are statistical estimates of $\sigma_1, \ldots, \sigma_n$, respectively.
References [1] and [2] do not discuss statistical interpretation of these outputs. A statistical analysis of the data from a key comparison and an interpretation of its outputs require assumptions and models about the relationship between the data $[x_1, u(x_1)], \ldots, [x_n, u(x_n)]$ and the value $Y$ of the measurand. In Sec. 2, we discuss two assumptions, labeled Assumption I and Assumption II, about the relationship between the laboratory expected values $X_1, \ldots, X_n$ and $Y$. Then we discuss two classical statistics models, a nonexistent laboratory-effects model and a random laboratory-effects model, based on Assumption I. Next, we propose a systematic laboratory-effects model based on Assumption II. We describe the key comparison reference value, the degrees of equivalence, and their associated uncertainties determined by each of the three statistical models. In Secs. 3 and 4, we discuss statistical interpretations of the pairs $[x_R, u(x_R)]$, $[d_i, u(d_i)]$, and $[d_{i,j}, u(d_{i,j})]$ under the three statistical models. Our conclusion is given in Sec. 5.
¹ "Key comparison reference value: the reference value accompanied by its uncertainty resulting from a CIPM key comparison [1]."
² "Degree of equivalence of a measurement standard: the degree to which the value of a measurement standard is consistent with the key comparison reference value. This is expressed quantitatively by the deviation from the key comparison reference value and the uncertainty of this deviation. The degree of equivalence between two measurement standards is expressed as the difference between their respective deviations from the key comparison reference value and the uncertainty of this difference [1]."
Volume 108, Number 6, November-December 2003, Journal of Research of the National Institute of Standards and Technology

Statistical Assumptions and Models for the Relationship Between the Data and the Value of the Measurand
In this section, we discuss statistical assumptions and models for analyzing the data from a simple CIPM key comparison to determine the key comparison reference value, the degrees of equivalence, and their associated uncertainties.

Assumptions About the Relationship Between the Laboratory Expected Values and the Value of the Measurand
One may either assume that the laboratory expected values $X_1, \ldots, X_n$ are all equal or allow for the possibility that $X_1, \ldots, X_n$ may not be equal.

Assumption I:
The expected values $X_1, \ldots, X_n$ are all equal. Assumption I as stated so far does not specify the relationship between the results $x_1, \ldots, x_n$ and $Y$. Therefore, in concert with Assumption I, it is generally assumed that the common expected value is equal to $Y$, i.e., $X_1 = \cdots = X_n = Y$. Under Assumption I, the results $x_1, \ldots, x_n$ are subject to intralaboratory variations only.

Assumption II:
The expected values $X_1, \ldots, X_n$ may not be equal, i.e., $X_i \neq X_j$ for some $i, j = 1, 2, \ldots, n$ and $i \neq j$. Therefore, some of $X_1, \ldots, X_n$ may differ from the value $Y$ of the measurand. Assumption II as stated so far does not specify the relationship between the results $x_1, \ldots, x_n$ and $Y$. Therefore, in concert with Assumption II, it is generally assumed that $Y$ is either somewhere in the range of results $x_1, \ldots, x_n$ or in the vicinity of this range⁴ [6]. Under Assumption II, the results $x_1, \ldots, x_n$ are subject to both the intralaboratory variations represented by the uncertainties $u(x_1), \ldots, u(x_n)$ and the interlaboratory variation arising from the dispersion of $X_1, \ldots, X_n$ about $Y$. The differences $(X_1 - Y), \ldots, (X_n - Y)$, denoted by $b_1, \ldots, b_n$, are laboratory effects (biases) due to unrecognized sources of error in the results $x_1, \ldots, x_n$. The biases are common to all measurements in a particular laboratory but may be different for different laboratories.

Assumption About the Uncertainties Submitted by the Participants
The standard uncertainties $u(x_1), \ldots, u(x_n)$ submitted by the participants of a key comparison are estimates obtained by combining various estimated components of uncertainty in determining the value $Y$ of the measurand. A combined standard uncertainty $u(x_i)$ may be unreliable for various reasons. For example, a classical (frequentist) Type A component of $u(x_i)$ calculated from a small number of independent measurements is unreliable⁵ [5]. […] determining the key comparison reference value $x_R$, the degrees of equivalence $d_i$ and $d_{i,j}$, and their associated standard uncertainties. In this paper, we do not discuss the additional uncertainty that arises from the unreliability of $u(x_1), \ldots, u(x_n)$.
⁴ […] Likewise, if $X_2$ were equal to $Y$ then the interval $[x_2 \pm 2u(x_2)]$ would represent an approximate range of the plausible values of $Y$, and so on for $X_3, X_4, \ldots, X_n$. It follows from Assumption II that any one or more of the expected values $X_1, \ldots, X_n$ may be close to or equal to $Y$; therefore, the total interval consisting of the union of intervals $[x_i \pm 2u(x_i)]$, for $i = 1, 2, \ldots, n$, represents an approximate range of the plausible values of $Y$. However, most metrologists assign greater belief-probability to the middle than to the ends of the total interval.
⁵ The unreliability of a classical (frequentist) estimate of uncertainty arising from a small number of measurements is quantified by degrees of freedom [5].
Classical (frequentist) statistical analyses and interpretations discussed in this paper are based on the assumption that the estimated uncertainties $u(x_1), \ldots, u(x_n)$ are equal to the true standard deviations $\sigma_1, \ldots, \sigma_n$ of the sampling distributions of $x_1, \ldots, x_n$, respectively. Most metrologists make this assumption. For example, the expression $u(x_W) = 1/\sqrt{\sum_i w_i}$, with $w_i = 1/u^2(x_i)$, for the standard uncertainty of the weighted mean relies on it.
Statistical analyses based on the ISO Guide regard a laboratory expected value $X_i$ as a variable with a state-of-knowledge distribution having expected value $x_i$ and standard deviation $u(x_i)$. Such analyses require the assumption that the estimated uncertainties $u(x_1), \ldots, u(x_n)$ are sufficiently reliable.

Classical (Frequentist) Statistics Models Based on Assumption I
The weighted mean $x_W = \sum_i w_i x_i / \sum_i w_i$ and the expression $u(x_W) = 1/\sqrt{\sum_i w_i}$, where $w_i = 1/u^2(x_i)$ for $i = 1, 2, \ldots, n$, are often used as the key comparison reference value $x_R$ and its associated standard uncertainty $u(x_R)$, respectively. The use of $x_W$ as $x_R$ and $u(x_W)$ as $u(x_R)$ is based on the following classical (frequentist) statistics model.

Nonexistent Laboratory-Effects Model
The results are regarded as realizations of the random variables $x_1, \ldots, x_n$, where
$$x_i = Y + e_i, \quad i = 1, 2, \ldots, n. \tag{1}$$
In this model, the parameter $Y$ is identified with the value of the measurand and the errors $e_1, \ldots, e_n$ are mutually independent random variables with sampling distributions. The sampling distributions of $e_1, \ldots, e_n$ are generally assumed to be normal (Gaussian). The expected values of $e_1, \ldots, e_n$ are assumed to be zero and the variances of $e_1, \ldots, e_n$ are assumed to be $u^2(x_1), \ldots, u^2(x_n)$, respectively. Under model (1), the results $x_1, \ldots, x_n$ are free of laboratory effects (biases). Therefore, we refer to it as a nonexistent laboratory-effects model. The best least-squares estimate for the parameter $Y$ of the nonexistent laboratory-effects model (1) is the weighted mean $x_W = \sum_i w_i x_i / \sum_i w_i$, where $w_i = 1/u^2(x_i)$. The term best least-squares estimate⁶ means that the estimate $x_W$ has the smallest variance among all estimates of $Y$ that are both linear functions of the results $x_1, \ldots, x_n$ and have the expected value $Y$. The standard deviation of the sampling distribution of $x_W$ is $u(x_W) = 1/\sqrt{\sum_i w_i}$. The corresponding degrees of equivalence are $d_i = x_i - x_W$ and $d_{i,j} = x_i - x_j$ for $i, j = 1, 2, \ldots, n$ and $i \neq j$.
Note 1: When not all uncertainties $u(x_1), \ldots, u(x_n)$ are sufficiently reliable estimates of the true standard deviations $\sigma_1, \ldots, \sigma_n$, the true standard deviation of the sampling distribution of the weighted mean $x_W$ may be larger than the true standard deviation of the sampling distribution of the arithmetic mean $x_A = \sum_i x_i / n$. In this case, the weighted mean $x_W$ may be an inferior key comparison reference value to the arithmetic mean $x_A$.
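The weighted-mean analysis of model (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `weighted_mean_krv` and the three-laboratory data set are hypothetical, and the uncertainties are taken to be the true standard deviations, as the model requires.

```python
import math

def weighted_mean_krv(x, u):
    """Return x_W, u(x_W), the degrees of equivalence d_i, and d_{i,j}.

    x: laboratory results, u: their standard uncertainties (all > 0).
    Indices in the returned dictionary are 0-based.
    """
    w = [1.0 / ui**2 for ui in u]              # weights w_i = 1/u^2(x_i)
    sum_w = sum(w)
    x_w = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    u_w = 1.0 / math.sqrt(sum_w)               # u(x_W) = 1/sqrt(sum_i w_i)
    d = [xi - x_w for xi in x]                 # d_i = x_i - x_W
    n = len(x)
    d_pair = {(i, j): x[i] - x[j]              # d_{i,j} = x_i - x_j
              for i in range(n) for j in range(n) if i != j}
    return x_w, u_w, d, d_pair

# Hypothetical data for three laboratories (illustration only).
x = [10.0, 10.2, 9.9]
u = [0.1, 0.2, 0.1]
x_w, u_w, d, d_pair = weighted_mean_krv(x, u)
print(x_w, u_w, d)
```

With these numbers the two laboratories reporting the smaller uncertainty dominate the reference value, which is the behavior that Note 1 cautions about when the reported uncertainties are unreliable.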

Random Laboratory-Effects Model
The classical statistics model based on Assumption I for the situation where the dispersion of results $x_1, \ldots, x_n$ may be more than what can reasonably be attributed to the intralaboratory variances $u^2(x_1), \ldots, u^2(x_n)$ is as follows. The results are regarded as realizations of the random variables $x_1, \ldots, x_n$, where
$$x_i = Y + b_i + e_i, \quad i = 1, 2, \ldots, n. \tag{2}$$
We refer to model (2) as a random laboratory-effects model [7]. Here the term random means that the biases $b_1, \ldots, b_n$ are regarded as random variables with the same sampling distribution, which is assumed to be normal with expected value zero and variance $\sigma_b^2$. Under the random laboratory-effects model (2), the expected value $E(x_i)$ is equal to $Y$ and the variance $V(x_i)$ is equal to $\sigma_b^2 + u^2(x_i)$ for $i = 1, 2, \ldots, n$. The nonexistent laboratory-effects model (1) is a special case of the random laboratory-effects model (2) where $\sigma_b^2 = 0$, which means that the biases $b_1, \ldots, b_n$ are all zero, i.e., $b_1 = \cdots = b_n = 0$. A popular estimate for the parameter $Y$ of model (2) is the weighted mean⁷ $x_W = \sum_i w_i x_i / \sum_i w_i$, where $w_i = 1/[s_b^2 + u^2(x_i)]$ and $s_b^2$ is an estimate of $\sigma_b^2$ determined from the data. The estimate $s_b^2$ inflates each of the intralaboratory variances $u^2(x_1), \ldots, u^2(x_n)$ just enough to account for the dispersion of results $x_1, \ldots, x_n$ that is not accounted for by model (1). Under the assumption that the estimated variances $s_b^2 + u^2(x_1), \ldots, s_b^2 + u^2(x_n)$ are the true variances of the sampling distributions of $x_1, \ldots, x_n$, the best estimate of the parameter $Y$ of model (2) is the weighted mean $x_W$ and the standard uncertainty associated with $x_W$ is $u(x_W) = 1/\sqrt{\sum_i w_i}$. The advantage of model (2) relative to model (1) is that it allows for the possibility that the dispersion of results $x_1, \ldots, x_n$ may be more than what can reasonably be attributed to the intralaboratory variances $u^2(x_1), \ldots, u^2(x_n)$. When the dispersion of $x_1, \ldots, x_n$ is not more than what can reasonably be attributed to $u^2(x_1), \ldots, u^2(x_n)$, the estimate $s_b^2$ is zero. In that case, model (2) yields the same $x_R$ and $u(x_R)$ as model (1). Therefore, there is no disadvantage to using model (2) in place of model (1).
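The surviving text does not say how $s_b^2$ is computed. As a minimal sketch, and purely as an assumption, the code below uses a Mandel-Paule style criterion: it chooses the smallest $s_b^2 \geq 0$ for which the weighted sum of squared deviations $\sum_i w_i (x_i - x_W)^2$ does not exceed $n - 1$, which matches the description of inflating the variances "just enough". The function names and data are hypothetical.

```python
import math

def _weighted_mean(x, var):
    """Weighted mean and its standard uncertainty for given variances."""
    w = [1.0 / v for v in var]
    sum_w = sum(w)
    return sum(wi * xi for wi, xi in zip(w, x)) / sum_w, 1.0 / math.sqrt(sum_w)

def _excess_dispersion(x, u, s_b2):
    """sum_i w_i (x_i - x_W)^2 - (n - 1), with w_i = 1/(s_b2 + u_i^2)."""
    var = [s_b2 + ui**2 for ui in u]
    x_w, _ = _weighted_mean(x, var)
    return sum((xi - x_w)**2 / v for xi, v in zip(x, var)) - (len(x) - 1)

def estimate_s_b2(x, u, rel_tol=1e-12, max_iter=200):
    """Assumed Mandel-Paule style estimate of the between-laboratory variance."""
    if _excess_dispersion(x, u, 0.0) <= 0.0:
        return 0.0                      # dispersion already explained by u(x_i)
    lo, hi = 0.0, max(ui**2 for ui in u)
    while _excess_dispersion(x, u, hi) > 0.0:
        hi *= 2.0                       # expand until the root is bracketed
    for _ in range(max_iter):           # plain bisection on s_b^2
        mid = 0.5 * (lo + hi)
        if _excess_dispersion(x, u, mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * hi:
            break
    return 0.5 * (lo + hi)

def random_effects_krv(x, u):
    """Return x_W, u(x_W), and s_b^2 under the random laboratory-effects model."""
    s_b2 = estimate_s_b2(x, u)
    x_w, u_w = _weighted_mean(x, [s_b2 + ui**2 for ui in u])
    return x_w, u_w, s_b2

# Hypothetical data: results far more dispersed than the stated uncertainties.
print(random_effects_krv([9.0, 10.0, 11.0], [0.1, 0.1, 0.1]))
# Hypothetical data: dispersion consistent with the uncertainties, so s_b^2 = 0.
print(random_effects_krv([10.0, 10.05, 9.95], [0.1, 0.1, 0.1]))
```

When the estimate is zero the function returns the same reference value and uncertainty as the weighted mean of model (1), in line with the statement that model (2) reduces to model (1) in that case. Other estimators of $\sigma_b^2$ exist and would give somewhat different values.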
The random laboratory-effects model (2) of classical statistics is conceptually faulty for the analysis of a CIPM key comparison for the following reasons. First, the participants of a CIPM key comparison are specific NMI laboratories rather than laboratories chosen at random from a large population. Therefore, the biases $b_1, \ldots, b_n$ may not be regarded as random variables with the same sampling distribution. Second, the assumption that the sampling distribution of the biases $b_1, \ldots, b_n$ is a normal distribution with expected value zero may not be justified. The next section introduces a new model that does not assume that the biases $b_1, \ldots, b_n$ are random variables with a normal sampling distribution.

A Model Based on Assumption II and the ISO Guide
A statistical analysis of the data from a simple CIPM key comparison based on Assumption II requires one to account for the uncertainty that arises from the unknown bias in a combined result of measurement that is used as an estimate for Y. Before publication of the ISO Guide, there was no generally accepted approach to account for the uncertainty that arises from an unknown bias. The approach proposed by the ISO Guide to account for the uncertainty that arises from an unknown bias is now generally accepted. So we have used the ISO Guide to develop the following systematic laboratory-effects model.

Systematic Laboratory-Effects Model
We start with a combined result of the form $\sum_i a_i x_i$, where $\sum_i a_i = 1$, that is used as an initial estimate for $Y$. This estimate requires the assumption that $Y$ is within the range of results $x_1, \ldots, x_n$. We refer to the initial estimate as the uncorrected combined result (UCR) and denote it by $x_{UCR} = \sum_i a_i x_i$. If $a_i = w_i / \sum_i w_i$, then $x_{UCR}$ is the weighted mean $x_W = \sum_i w_i x_i / \sum_i w_i$, where $w_i = 1/u^2(x_i)$ for $i = 1, 2, \ldots, n$. If $a_i = 1/n$ for $i = 1, 2, \ldots, n$, then $x_{UCR}$ is the arithmetic mean $x_A = \sum_i x_i / n$. Let $X_{UCR} = \sum_i a_i X_i$ be the expected value of the sampling distribution of $x_{UCR}$. According to Assumption II, the result $x_{UCR}$ is subject to the bias $(X_{UCR} - Y)$. The ISO Guide recommends that the result $x_{UCR}$ should be corrected to counter its possible bias and that the uncertainty associated with the correction should be included in the combined standard uncertainty associated with the corrected result. The bias $(X_{UCR} - Y)$ is an unknown constant, but the correction for bias, denoted by $C$, is a variable with a state-of-knowledge probability distribution. If the expected value and standard deviation of a state-of-knowledge probability distribution for the correction variable $C$ are denoted by $c$ and $u(c)$, respectively, then the correction applied to the result $x_{UCR}$ to counter its possible bias is $c$ and the standard uncertainty associated with the correction is $u(c)$.
In order to specify a state-of-knowledge probability distribution for the correction variable $C$, the laboratory expected values $X_1, \ldots, X_n$ and the value $Y$ of the measurand are regarded as variables with state-of-knowledge distributions and the data $x_1, \ldots, x_n$ and $u(x_1), \ldots, u(x_n)$ are regarded as given constants. A state-of-knowledge distribution for $X_i$ represents the state of knowledge about the value $Y$ of the measurand in the laboratory labeled $i$ for $i = 1, 2, \ldots, n$. The expected value $E(X_i)$ and standard deviation $S(X_i)$ of the variable $X_i$ are assumed to be $x_i$ and $u(x_i)$, respectively, for $i = 1, 2, \ldots, n$ [5], [4]. It follows that $X_{UCR} = \sum_i a_i X_i$ is a variable with a state-of-knowledge probability distribution. The expected value of $X_{UCR}$ is $E(X_{UCR}) = \sum_i a_i E(X_i) = \sum_i a_i x_i = x_{UCR}$. In the expression $(Y - X_{UCR})$ for the negative of bias, treated as a variable, we replace $X_{UCR}$ with its expected value $x_{UCR}$. Then a probability distribution for $C$ represents belief about the possible values of $(Y - x_{UCR})$, where $x_{UCR}$ is a constant and $Y$ is the variable. The belief about possible values of $Y$ is based on all available information, including results of measurement and scientific judgment.
In reference [6], we proposed a triangular distribution for the correction variable $C$, with peak at 0 and default limits $x_{(1)} - x_{UCR} = \min\{x_1 - x_{UCR}, \ldots, x_n - x_{UCR}\}$ and $x_{(n)} - x_{UCR} = \max\{x_1 - x_{UCR}, \ldots, x_n - x_{UCR}\}$. A criticism of the proposed triangular distribution with default limits is that it is determined by the extreme results $x_{(1)} = \min\{x_1, \ldots, x_n\}$ and $x_{(n)} = \max\{x_1, \ldots, x_n\}$, which are sometimes suspected to be in error.
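For the triangular distribution with peak at 0 and the default limits described above, $c$ and $u(c)$ follow from the standard mean and variance of a triangular distribution. The sketch below is illustrative only: the function name and data are hypothetical, and the coefficients $a_i$ are assumed non-negative so that $x_{UCR}$ lies within the range of the results and the limits bracket zero.

```python
import math

def triangular_correction(x, a):
    """Return c and u(c) for a triangular distribution for C with mode 0.

    Limits: lower = min(x) - x_UCR, upper = max(x) - x_UCR.
    x: laboratory results; a: non-negative coefficients that sum to 1.
    """
    x_ucr = sum(ai * xi for ai, xi in zip(a, x))
    lower = min(x) - x_ucr
    upper = max(x) - x_ucr
    # Triangular(lower, upper, mode m): mean = (lower + upper + m) / 3 and
    # variance = (lower^2 + upper^2 + m^2 - lower*upper - lower*m - upper*m) / 18.
    # Here m = 0, so the terms containing m vanish.
    c = (lower + upper) / 3.0
    var_c = (lower**2 + upper**2 - lower * upper) / 18.0
    return c, math.sqrt(var_c)

# Hypothetical data with the arithmetic mean as the uncorrected combined result.
x = [9.0, 10.0, 11.0]
a = [1.0 / 3.0] * 3
c, u_c = triangular_correction(x, a)
print(c, u_c)
```

Because only `min(x)` and `max(x)` enter the limits, a single outlying result changes both $c$ and $u(c)$, which is the criticism raised in the text.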
Here, we propose a discrete-equal-probability distribution that is determined by all of the results $x_1, \ldots, x_n$. The results $x_1, \ldots, x_n$ are plausible values of $Y$ as determined by competent laboratories⁸. So the known constant differences $(x_1 - x_{UCR}), \ldots, (x_n - x_{UCR})$ are plausible values of $(Y - x_{UCR})$. These differences are a statistical basis for specifying a probability distribution for $C$. Let $c_i = x_i - x_{UCR}$ for $i = 1, 2, \ldots, n$. Suppose $c_1, \ldots, c_n$ are assigned probabilities $p_1, \ldots, p_n$. Then the expected value of $C$ is $c = \sum_i p_i c_i$ and its standard deviation is $u(c) = \sqrt{\sum_i p_i (c_i - c)^2}$. Frequently, the available scientific knowledge is inadequate to assign different probabilities $p_1, \ldots, p_n$ to $c_1, \ldots, c_n$. Therefore, we propose the discrete-equal-probability distribution for which $p_i = 1/n$ for $i = 1, 2, \ldots, n$. The expected value and standard deviation of $C$ based on the discrete-equal-probability distribution are $c = x_A - x_{UCR}$ and $u(c) = \sqrt{\sum_i (x_i - x_A)^2 / n}$, respectively, where $x_A = \sum_i x_i / n$ is the arithmetic mean of the results $x_1, \ldots, x_n$.
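The discrete distribution for $C$ is easy to evaluate numerically. The following sketch computes $c$ and $u(c)$ for arbitrary probabilities $p_i$ and defaults to equal probabilities; the function name and the data are hypothetical and serve only to illustrate the formulas.

```python
import math

def discrete_correction(x, x_ucr, p=None):
    """Return c and u(c) for a discrete distribution on c_i = x_i - x_UCR.

    p: probabilities assigned to c_1, ..., c_n; defaults to p_i = 1/n.
    """
    n = len(x)
    if p is None:
        p = [1.0 / n] * n                      # discrete-equal-probability case
    if not math.isclose(sum(p), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    c_vals = [xi - x_ucr for xi in x]          # plausible values of Y - x_UCR
    c = sum(pi * ci for pi, ci in zip(p, c_vals))
    var_c = sum(pi * (ci - c)**2 for pi, ci in zip(p, c_vals))
    return c, math.sqrt(var_c)

# Hypothetical results; x_ucr is taken to be their weighted mean.
x = [10.0, 10.2, 9.9]
x_ucr = 2245.0 / 225.0
c, u_c = discrete_correction(x, x_ucr)
x_a = sum(x) / len(x)
print(c, x_a - x_ucr)      # with equal probabilities these two values agree
print(u_c)
```

With equal probabilities, $u(c)$ is the root-mean-square deviation of the results about their arithmetic mean (divisor $n$, not $n - 1$), so every result contributes and no single extreme value fixes the limits.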
A measurement equation is required to incorporate correction for possible bias in a combined result of measurement for $Y$. The measurement equation that corresponds to the bias $(X_{UCR} - Y)$ in the uncorrected combined result $x_{UCR}$ is $Y = X_{UCR} + C$. This measurement equation is widely applicable in metrology [10]. It suggests the following model for the value $Y$ of the measurand:
$$Y = X_{UCR} + C = \sum_i a_i X_i + C, \tag{3}$$
where $a_1, \ldots, a_n$ are constants such that $\sum_i a_i = 1$. In this model, $X_1, \ldots, X_n$, $X_{UCR}$, $C$, and $Y$ are variables with state-of-knowledge distributions. The expected value and standard deviation of $X_i$ are the given constants $x_i$ and $u(x_i)$, respectively, for $i = 1, 2, \ldots, n$. A state-of-knowledge distribution for the correction variable $C$ is defined independently of the state-of-knowledge distributions for the variables $X_1, \ldots, X_n$, after the latter have been specified. In particular, $X_{UCR}$ and $C$ are independently distributed. We refer to model (3) [represented by Eq. (3)] as a systematic laboratory-effects model to distinguish it from the random laboratory-effects model (2), which regards the biases (systematic errors) $b_1, \ldots, b_n$ as random variables having the same sampling distribution with expected value zero. Suppose the standard deviation of the variable $X_{UCR}$ is $S(X_{UCR}) = u(x_{UCR})$. Then the corrected combined result for $Y$ determined from the systematic laboratory-effects model (3) is $y = x_{UCR} + c$ and its associated standard uncertainty is $u(y) = \sqrt{u^2(x_{UCR}) + u^2(c)}$.
The systematic laboratory-effects model (3) allows for the possibility that not all pairs of the variables $X_1, \ldots, X_n$ may be independently distributed. The variance $V(X_{UCR}) = u^2(x_{UCR})$ is determined from the variances and covariances of the variables $X_1, \ldots, X_n$. When the distributions of $X_1, \ldots, X_n$ are independent and $X_{UCR}$ is the weighted mean $X_W = \sum_i w_i X_i / \sum_i w_i$, where $w_i = 1/u^2(x_i)$, then $u^2(x_{UCR}) = V(X_W) = 1/\sum_i w_i$. When the distributions of $X_1, \ldots, X_n$ are independent and $X_{UCR}$ is the arithmetic mean $X_A = \sum_i X_i / n$, then $u^2(x_{UCR}) = V(X_A) = (1/n^2) \sum_i V(X_i) = (1/n^2) \sum_i u^2(x_i)$⁹.
In order to specify $c$ and $u(c)$, one is free to use any reasonable distribution for $C$, based on scientific judgment. When the discrete-equal-probability distribution is used, $c = x_A - x_{UCR}$ and $u(c) = \sqrt{\sum_i (x_i - x_A)^2 / n}$. In that case, the result of measurement for $Y$ is $y = x_{UCR} + c = x_A$ and its associated standard uncertainty is $u(y) = \sqrt{u^2(x_{UCR}) + \sum_i (x_i - x_A)^2 / n}$. Following the ISO Guide, the result $y$ and uncertainty $u(y)$ determined from the systematic laboratory-effects model (3) are interpreted as the expected value and standard deviation of a state-of-knowledge distribution for the values that could reasonably be attributed to $Y$ based on the data $x_1, \ldots, x_n$ and $u(x_1), \ldots, u(x_n)$ [5], [4], [6]. Thus the key comparison reference value $x_R$ based on the systematic laboratory-effects model (3) is $y$ and the uncertainty $u(x_R)$ is $u(y)$. The corresponding degrees of equivalence are $d_i = x_i - y$ and $d_{i,j} = x_i - x_j$ for $i, j = 1, 2, \ldots, n$ and $i \neq j$. The uncertainties $u(d_i)$ and $u(d_{i,j})$ are determined from state-of-knowledge distributions for the variables $X_1, \ldots, X_n$ and $Y$.
⁹ Since the harmonic mean of positive numbers is less than or equal to their arithmetic mean, $V(X_W) \leq V(X_A)$. When $u^2(x_1), \ldots, u^2(x_n)$ […]
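Putting the pieces of model (3) together, the following sketch computes $y$, $u(y)$, and the degrees of equivalence $d_i$. It assumes that the variables $X_1, \ldots, X_n$ are independently distributed, that the uncorrected combined result is the weighted mean unless coefficients are supplied, and that $C$ has the discrete-equal-probability distribution; the function name and data are hypothetical.

```python
import math

def systematic_effects_krv(x, u, a=None):
    """Return y, u(y), and d_i = x_i - y under model (3).

    Assumes independent X_1, ..., X_n and the discrete-equal-probability
    distribution for the correction variable C.
    """
    n = len(x)
    if a is None:                                   # default: weighted mean
        w = [1.0 / ui**2 for ui in u]
        a = [wi / sum(w) for wi in w]
    x_ucr = sum(ai * xi for ai, xi in zip(a, x))    # uncorrected combined result
    var_ucr = sum((ai * ui)**2 for ai, ui in zip(a, u))   # independence assumed
    x_a = sum(x) / n
    c = x_a - x_ucr                                 # expected value of C
    var_c = sum((xi - x_a)**2 for xi in x) / n      # variance of C
    y = x_ucr + c                                   # corrected combined result
    u_y = math.sqrt(var_ucr + var_c)
    d = [xi - y for xi in x]
    return y, u_y, d

# Hypothetical data for three laboratories.
x = [10.0, 10.2, 9.9]
u = [0.1, 0.2, 0.1]
y, u_y, d = systematic_effects_krv(x, u)
print(y, u_y, d)
```

Under the discrete-equal-probability choice the corrected result equals the arithmetic mean whatever coefficients $a_i$ are used, while $u(y)$ still depends on them through $u(x_{UCR})$.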

Interpretation of the Key Comparison Reference Value and Its Associated Uncertainty

Classical Statistics Models Based on Assumption I
The nonexistent laboratory-effects model and the random laboratory-effects model are based on classical (frequentist) statistics. In particular, the results $x_1, \ldots, x_n$ are regarded as realizations of random variables with sampling distributions and $Y$ is an unknown constant. Therefore, the key comparison reference value $x_R$ is a realization of a random variable with a sampling distribution that has expected value $Y$ and standard deviation $u(x_R) = u(x_W) = 1/\sqrt{\sum_i w_i}$. In the nonexistent laboratory-effects model $w_i$ is $1/u^2(x_i)$ and in the random laboratory-effects model $w_i$ is $1/[s_b^2 + u^2(x_i)]$ for $i = 1, 2, \ldots, n$. The interval $[x_R \pm 2u(x_R)]$ determined from a classical statistics model is a confidence interval for $Y$ computed from the data $x_1, \ldots, x_n$ and $u(x_1), \ldots, u(x_n)$. Imagine that the CIPM key comparison could be repeated infinitely many times in exactly the same conditions using exactly the same instruments and artifacts. Now imagine that throughout these repetitions exactly the same sampling distributions continued to apply to the random variables $x_1, \ldots, x_n$. Then the confidence level is the fraction of the infinitely many hypothetical intervals, such as $[x_R \pm 2u(x_R)]$, that would include $Y$ [4].
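The repeated-comparison thought experiment can be imitated with a small simulation. The sketch below assumes the nonexistent laboratory-effects model (1) with normal sampling distributions and uncertainties equal to the true standard deviations; the function name, the value chosen for $Y$, the standard deviations, and the random seed are all hypothetical.

```python
import math
import random

def simulated_coverage(y_true, sigma, repetitions=20000, seed=1):
    """Fraction of simulated intervals [x_W - 2u(x_W), x_W + 2u(x_W)] that contain Y.

    Each repetition draws x_i ~ Normal(y_true, sigma_i) independently, as in
    model (1), and treats u(x_i) = sigma_i as known.
    """
    rng = random.Random(seed)
    w = [1.0 / s**2 for s in sigma]
    sum_w = sum(w)
    u_w = 1.0 / math.sqrt(sum_w)          # the same in every repetition
    hits = 0
    for _ in range(repetitions):
        x = [rng.gauss(y_true, s) for s in sigma]
        x_w = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
        if abs(x_w - y_true) <= 2.0 * u_w:
            hits += 1
    return hits / repetitions

coverage = simulated_coverage(10.0, [0.1, 0.2, 0.1])
print(coverage)   # should be close to the nominal level of about 0.95
```

The simulation only reproduces the frequentist statement when its assumptions hold: if laboratory biases were present, or if the $u(x_i)$ underestimated the true standard deviations, the observed fraction would fall below the nominal level.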

Systematic Laboratory-Effects Model Based on Assumption II
The key comparison reference value $x_R$ and uncertainty $u(x_R)$ determined from the systematic laboratory-effects model are given constants that represent the expected value and standard deviation of a state-of-knowledge distribution for $Y$ based on the data $x_1, \ldots, x_n$ and $u(x_1), \ldots, u(x_n)$. The interval $[x_R \pm 2u(x_R)]$ determined from the systematic laboratory-effects model is an expanded uncertainty interval for $Y$. The coverage probability (level of confidence) of the interval $[x_R \pm 2u(x_R)]$ is the fraction of a state-of-knowledge distribution for $Y$ that is encompassed by this interval [4].

Interpretation of the Degrees of Equivalence and Their Associated Uncertainties

Classical Statistics Models Based on Assumption I
In the random laboratory-effects model and its special case, the nonexistent laboratory-effects model, the expected values of the sampling distributions of $x_1, \ldots, x_n$ and $x_R$ are all equal to $Y$. Therefore, the expected values of the sampling distributions of all degrees of equivalence $d_i = x_i - x_R$ and $d_{i,j} = x_i - x_j$ are zero, for $i, j = 1, 2, \ldots, n$ and $i \neq j$. This implies that all computed degrees of equivalence, whether small or large, are statistical estimates of zero. In particular, according to these models, all degrees of equivalence published in the key comparison database (KCDB) [11] are estimates of zero.

Systematic Laboratory-Effects Model Based on Assumption II
In the systematic laboratory-effects model, the results $x_1, \ldots, x_n$ are the expected values and the uncertainties $u(x_1), \ldots, u(x_n)$ are the standard deviations of state-of-knowledge distributions for the laboratory expected values $X_1, \ldots, X_n$, treated as variables. It follows that the degree of equivalence $d_i = x_i - x_R = x_i - y$ is the expected value of a state-of-knowledge distribution for the laboratory effect (bias) $X_i - Y$ for $i = 1, 2, \ldots, n$, and the degree of equivalence $d_{i,j} = x_i - x_j$ is the expected value of a state-of-knowledge distribution for the difference $X_i - X_j$ for $i, j = 1, 2, \ldots, n$ and $i \neq j$. The uncertainty $u(d_i)$ is the standard deviation¹⁰ of $X_i - Y$ and the uncertainty $u(d_{i,j})$ is the standard deviation of $X_i - X_j$, for $i, j = 1, 2, \ldots, n$ and $i \neq j$.
¹⁰ The standard deviation of $X_i - Y$ depends on the covariance between $X_i$ and $Y$ for $i = 1, 2, \ldots, n$. Since $Y = X_{UCR} + C = \sum_i a_i X_i + C$ and the variable $C$ is distributed independently of the variables $X_1, \ldots, X_n$, the covariances $C(X_i, Y)$, for $i = 1, 2, \ldots, n$, can be determined from the variances and covariances of $X_1, \ldots, X_n$. Then $u(d_i) = \sqrt{V(X_i - Y)}$, where the variance $V(X_i - Y)$ is equal to $V(X_i) + V(Y) - 2C(X_i, Y)$.
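As an illustration of the variance formula above, the sketch below evaluates $u(d_i)$ and $u(d_{i,j})$ for the special case in which $X_1, \ldots, X_n$ are mutually independent. Independence is an assumption made here for simplicity; in that case $C(X_i, Y) = a_i u^2(x_i)$ and $V(X_i - X_j) = u^2(x_i) + u^2(x_j)$. The function name and data are hypothetical.

```python
import math

def doe_uncertainties(u, a, u_c):
    """Return u(d_i) and u(d_{i,j}) under model (3) for independent X_1, ..., X_n.

    u: standard uncertainties u(x_i); a: coefficients of the uncorrected
    combined result (summing to 1); u_c: standard uncertainty of the correction.
    Indices in the returned dictionary are 0-based.
    """
    n = len(u)
    var_x = [ui**2 for ui in u]
    var_ucr = sum(ai**2 * vi for ai, vi in zip(a, var_x))
    var_y = var_ucr + u_c**2                     # V(Y) = u^2(x_UCR) + u^2(c)
    # V(X_i - Y) = V(X_i) + V(Y) - 2 C(X_i, Y), with C(X_i, Y) = a_i V(X_i)
    u_d = [math.sqrt(var_x[i] + var_y - 2.0 * a[i] * var_x[i]) for i in range(n)]
    u_d_pair = {(i, j): math.sqrt(var_x[i] + var_x[j])
                for i in range(n) for j in range(n) if i != j}
    return u_d, u_d_pair

# Hypothetical inputs: weighted-mean coefficients for u = [0.1, 0.2, 0.1].
u = [0.1, 0.2, 0.1]
a = [4.0 / 9.0, 1.0 / 9.0, 4.0 / 9.0]
u_c = math.sqrt(0.14) / 3.0
u_d, u_d_pair = doe_uncertainties(u, a, u_c)
print(u_d)
```

If some pairs of $X_1, \ldots, X_n$ were correlated, the covariance terms would have to be added to both `var_ucr` and the expression for $C(X_i, Y)$; this sketch does not handle that case.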

Conclusion
We addressed a simple CIPM key comparison where the common measurand is a physical quantity of stable value during the comparison. We discussed statistical interpretation of the key comparison reference value, the degrees of equivalence, and their associated uncertainties determined from the following three statistical models: the nonexistent laboratory-effects model, the random laboratory-effects model, and the systematic laboratory-effects model. The first two models are based on a classical (frequentist) interpretation of measurements. The systematic laboratory-effects model is based on a Bayesian interpretation of measurements.
The key comparison reference value $x_R$ and uncertainty $u(x_R)$ determined from the systematic laboratory-effects model represent the expected value and standard deviation of a state-of-knowledge distribution for the value $Y$ of the measurand. Therefore their statistical interpretation agrees with the ISO Guide. According to the systematic laboratory-effects model, the degree of equivalence $d_i$ and uncertainty $u(d_i)$ are, respectively, the expected value and standard deviation of a state-of-knowledge distribution for the laboratory effect (bias) $X_i - Y$, for $i = 1, 2, \ldots, n$, and the degree of equivalence $d_{i,j}$ and uncertainty $u(d_{i,j})$ are, respectively, the expected value and standard deviation of a state-of-knowledge distribution for the difference $X_i - X_j$, for $i, j = 1, 2, \ldots, n$ and $i \neq j$. Thus the degrees of equivalence determined from the systematic laboratory-effects model quantify the agreements and disagreements of laboratory results. Therefore, the systematic laboratory-effects model is suitable for the data analysis of a simple CIPM key comparison.