Comprehensive Review and Critical Evaluation of the Half-Life of Tritium

As part of the preparation and calibration of three new National Institute of Standards and Technology (NIST) tritiated-water radioactivity Standard Reference Materials (SRMs), we have performed a comprehensive review and critical evaluation of the half-life of tritium (hydrogen-3). Twenty-three experimentally determined values of the half-life of tritium, reported between 1936 and 2000, were found. Six of these values were updated by later values. Two values were limits. Two values were deemed to be outliers. The 13 remaining values were evaluated in several ways. The results are compared with the results of other recent evaluations and all are found to be in good agreement. Our final recommended value for the half-life of tritium is the average of the adopted values from the four most recent evaluations, (4500 ± 8) d, where 8 d corresponds to one standard uncertainty.


Introduction
The history of tritium (hydrogen-3) is an interesting one [1]. The first measurement of the half-life of tritium was reported by McMillan [2] in 1936, more than 3 years before Alvarez and Cornog [3] reported the discovery of radioactive tritium and made their own measurements of the half-life [4,6]. McMillan measured the rate of decay of the radiation from a beryllium target that had been irradiated with deuterons for about a year in the cyclotron at the University of California at Berkeley. McMillan thought that the radiation might be from beryllium-10. It was realized several years later [5,6] that the radiation was actually from tritium.
Since that time, there have been numerous measurements of the half-life of tritium. As part of the preparation and calibration of three new National Institute of Standards and Technology (NIST) tritiated-water radioactivity Standard Reference Materials (SRM 4361C, SRM 4926E, and SRM 4927F), we have performed a comprehensive review and critical evaluation of the reported half-lives. All of the experimentally determined values of the half-life of tritium [2,4, known to the evaluators as of March 2000 are shown in Table 1. These measurements were reported between 1936 and 2000. The 23 half-life values listed are the direct result of experimental measurements carried out by the author(s) of the cited references in the table. The most recent direct experimental measurement [26] was performed as part of the calibration of the new NIST standards. In addition, one half-life value was reported [27] (not included in Table 1) that was calculated using published experimental values of the tritium beta end-point energy, published experimental values of the heat output per gram of tritium, and a theoretically derived ratio of the average beta decay energy to the beta end-point energy.

Screening of the Data
The values shown in Table 1 were first screened. The screened values are shown in Table 2. The screening was done as follows:
A. We obtained a copy of each publication and carefully read it.
B. We verified the values listed in Table 1 as being the reported values.
C. We examined the data presented in the publication to obtain the best value of the reported half-life in days. In most cases, the time was actually measured in days and the decay constant was actually computed in terms of reciprocal days or reciprocal seconds. Where only the half-life in years was reported, it was converted to the half-life in days (by multiplying by 365.2422 d per mean solar year). The preferred unit for the tritium half-life is the day because: (i) it is a well-defined unit, equal to 86 400 s, and the second is a unit of the International System of Units (SI); (ii) it is the most appropriate unit for most calculations, since decay times are almost always actually measured in days; and (iii) it eliminates the conversion and confusion associated with different "years" (calendar, solar, sidereal, etc.).
D. We determined the meaning of the author's stated uncertainty (confidence limit, probable error, standard deviation, etc.). (In some cases, it was not possible to determine the meaning of the author's stated uncertainty.) We then calculated the author's equivalent standard uncertainty (i.e., the author's equivalent estimated standard deviation).
E. We made an independent estimate of the standard uncertainty of the reported half-life. If the author's equivalent standard uncertainty was within a factor of 2 of our estimate, then we used the author's equivalent standard uncertainty. If not, we used our estimate (see Sec. 3, Reevaluation of Uncertainties).
F. We determined whether the reported value updated an earlier reported value, either the half-life or the uncertainty. An earlier value was considered to be updated by a later value if the data upon which the later value was based included the data upon which the earlier value was based. When this was the case, the earlier value was omitted from further evaluation. Six values were omitted because of later updates [4,10,15,17,20,21].
G. We determined whether the reported value was a limit or was an outlier. Two values are limits [2,6]. Two values are clearly outliers [7,9], each having a difference of more than 50 standard deviations from the mean of the remaining distribution. Two other values [8,14] are marginal (see Sec. 4, Test for Normality of Data).
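Two of the screening steps are purely arithmetic and can be sketched in a few lines: the year-to-day conversion of step C and the factor-of-2 rule of step E. The function names are hypothetical, and reading "within a factor of 2" as a ratio between 0.5 and 2 (inclusive) is an assumption, since the text does not say how the boundary cases were treated.

```python
# Conversion factor quoted in step C (days per mean solar year).
DAYS_PER_MEAN_SOLAR_YEAR = 365.2422


def years_to_days(half_life_years: float) -> float:
    """Convert a half-life reported in mean solar years to days (step C)."""
    return half_life_years * DAYS_PER_MEAN_SOLAR_YEAR


def adopted_uncertainty(author_u: float, evaluator_u: float) -> float:
    """Choose the standard uncertainty to carry forward (step E).

    Assumption: "within a factor of 2" means 0.5 <= author_u / evaluator_u <= 2.
    Both arguments are standard uncertainties in the same unit (days).
    """
    if author_u <= 0 or evaluator_u <= 0:
        raise ValueError("uncertainties must be positive")
    ratio = author_u / evaluator_u
    return author_u if 0.5 <= ratio <= 2.0 else evaluator_u


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from Table 1 or Table 2.
    print(years_to_days(12.32))
    print(adopted_uncertainty(author_u=5.0, evaluator_u=8.0))  # author's value kept
    print(adopted_uncertainty(author_u=1.0, evaluator_u=8.0))  # evaluators' estimate used
```

The rule is deliberately asymmetric in effect: it leaves a plausible author uncertainty alone and replaces only those that differ from the independent estimate by more than the evaluators trust their own estimate.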

Reevaluation of Uncertainties
In an evaluation such as this, which includes values reported from 1936 to 2000 in Table 1 and from 1947 to 2000 in Table 2, the most difficult problem is to evaluate the uncertainty associated with each measurement in a consistent way. Once one has a set of consistent uncertainty estimates, the various statistical treatments can be carried out and the results of the various treatments can be meaningfully compared.
Since the mid-1980s, most authors have reported their measurement uncertainties more thoroughly and more in accord with internationally accepted guidelines [28]. Before 1980, most authors reported uncertainties whose meanings were often unstated. Even when stated, the uncertainties varied widely for seemingly similar measurements.
Therefore, as part of this evaluation, we made an independent estimate of the standard uncertainty of each reported half-life. We recognize that there is a large uncertainty associated with each of our estimates. Hence, if the author's equivalent standard uncertainty was within a factor of 2 of our estimate, then we used the author's equivalent standard uncertainty. If not, then we used our estimate.

Test for Normality of Data
We tested the data, both n = 11 data points and n = 13 data points, for normality (strictly speaking, for not non-normality) using the probability plot correlation coefficient test for normality developed by Filliben [29]. The results are shown in Figs. 1 and 2. The test statistic, r, is the normal probability plot correlation coefficient. For n = 11, r = 0.961, and the probability that the data are normally distributed is approximately 0.3. Based upon this probability, the assumption that the data are normally distributed would usually be accepted. For n = 13, r = 0.952, and the probability that the data are normally distributed is approximately 0.15. The assumption that the data are normally distributed is now more marginal, although typically a probability of less than 0.10, or perhaps even less than 0.05, is required before rejecting the hypothesis of normality.
We have included all 13 data points in Table 2. Because of the marginally normal distribution of the data points for n = 13, the statistical calculations were carried out with n = 11 and with n = 13 to see if there was any significant difference in the results.

Fig. 1. Normal probability plot for the n = 11 data set. The abscissa is the median order statistic from a normal N(0,1) distribution as given by Filliben [29]. The test statistic r is the normal probability plot correlation coefficient (i.e., the correlation coefficient for the linear regression line that is shown).

Fig. 2. Normal probability plot for the n = 13 data set. The abscissa is the median order statistic from a normal N(0,1) distribution as given by Filliben [29]. The test statistic r is the normal probability plot correlation coefficient (i.e., the correlation coefficient for the linear regression line that is shown).
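The test statistic r described above can be sketched as follows. This is a minimal illustration, not the evaluators' own computation. It assumes the widely used approximation to Filliben's uniform order-statistic medians (m_n = 0.5^(1/n), m_1 = 1 − m_n, and m_i = (i − 0.3175)/(n + 0.365) otherwise), mapped through the inverse standard normal distribution. The function names are hypothetical. Converting r into a probability requires Filliben's tabulated percentage points, which are not reproduced here.

```python
import math
from statistics import NormalDist


def filliben_uniform_medians(n: int) -> list[float]:
    """Approximate medians of the uniform order statistics (assumed form)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def normal_ppcc(data: list[float]) -> float:
    """Normal probability plot correlation coefficient r.

    r is the Pearson correlation between the sorted data and the
    N(0,1) order-statistic medians (the plot's ordinate and abscissa).
    """
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 data points")
    x = sorted(data)
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(p) for p in filliben_uniform_medians(n)]
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    s_xz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_zz = sum((b - mean_z) ** 2 for b in z)
    return s_xz / math.sqrt(s_xx * s_zz)


if __name__ == "__main__":
    # Illustrative data only; these are not the half-lives of Table 2.
    sample = [4479.0, 4486.0, 4492.0, 4497.0, 4500.0, 4504.0, 4511.0, 4530.0]
    print(round(normal_ppcc(sample), 3))
```

Values of r close to 1 indicate that the ordered data fall nearly on a straight line against the normal medians. A single discrepant point pulls r down quickly, which is why adding the two marginal values lowers r in the n = 13 case.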

Data Evaluation Methods
The values shown in Table 2 were evaluated using three statistical methods, both without (n = 11) and with (n = 13) the first and last entries [8,14]. The results are shown in Table 3. The evaluation methods used were as follows (u denotes the estimated standard uncertainty):
A. Determine the median and the estimated standard deviation of the median. This method is very robust with regard to outliers. We have used the method of Müller [30] to obtain the estimated standard deviation of the median. (The Müller paper appears in this issue of the Journal immediately following this paper.)
B. Determine the weighted mean using equal weights of $w_i = (1/u_i^2)_{\mathrm{avg}}$ and the estimated standard deviation of this mean. The equally weighted mean (usually called the unweighted mean if using weights $w_i = 1$) is unaffected by the individual stated uncertainties and does not reflect the fact that measurement capabilities have improved over time. The concern with this method is that the results may be influenced too much by the values with stated uncertainties higher than $(u_i)_{\mathrm{avg}}$. The estimated mean is not affected by the actual values of the weights, as long as all of the weights are equal. The reason that we set the weights equal to the average value of $1/u_i^2$ is so that we can calculate the estimated standard deviation of the mean in the same ways that we use with method C.
C. Determine the weighted mean using weights $w_i = 1/u_i^2$ and the estimated standard deviation of this mean. This method minimizes the estimated variance and emphasizes the stated uncertainties very strongly. The concern with this method is that the results may be influenced too much by the values with the smallest stated uncertainties, some of which may be underestimated.

Table 3. The half-life of tritium and the estimated standard deviation of the mean calculated using three statistical evaluation methods with n = 11 and with n = 13. u denotes the estimated standard uncertainty. The half-lives and uncertainties used are shown in Table 2. (See Table 4 for our final recommended values.)

Formulas Used
The estimated standard deviation of the median was computed using the method of Müller [30]: where MAD is the mean absolute deviation from the median, and n is the number of data points (11 or 13).
The estimated mean, denoted by $m$, was computed from

$$m = \frac{\sum_i w_i x_i}{\sum_i w_i} \tag{2}$$

where the $x_i$ are the experimentally determined values of the half-life of tritium shown in Table 2 and the $w_i$ are the corresponding assigned weights. The estimated variance of the mean, denoted by $s_m^2$, was computed as

$$s_m^2 = \frac{\sum_i w_i (x_i - m)^2}{\nu \sum_i w_i} \tag{3}$$

where $\nu = n - 1$ is the degrees of freedom. The estimated standard deviation of the mean, denoted by $s_m$, is the square root of the estimated variance of the mean. If the quantity

$$\frac{\sum_i w_i (x_i - m)^2}{\nu} \tag{4}$$

is equal to one, then Eq. (3) reduces to simply

$$s_m^2 = \frac{1}{\sum_i w_i} \tag{5}$$

This will be the case if the weights used are equal to the inverse of the actual variances (i.e., if each $w_i = 1/(x_i - m)^2$).
We have never seen an experimental data set for which Eq. (4) was actually equal to one (certainly not any data set where the uncertainty of each data point was evaluated by a different experimenter). Nonetheless, Eq. (5) is often used, perhaps because of computational convenience. The reduced chi-squared, $\chi^2/\nu$, and the Birge ratio, $R$, are measures of the degree to which the weights used are, in fact, equal to the inverse of the actual variances. If the reduced chi-squared and the Birge ratio are significantly larger than one, then the data are suspect and it is likely that at least some of the weights are overestimated (i.e., at least some of the variances are underestimated). Likewise, if the reduced chi-squared and the Birge ratio are significantly smaller than one, it is likely that at least some of the weights are underestimated.
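The relationship between the two variance estimates can be made concrete with a short sketch. It assumes the standard weighted-mean forms: m = Σw_i x_i / Σw_i for the mean, s_m² = Σw_i (x_i − m)² / (ν Σw_i) for Eq. (3), and s_m² = 1/Σw_i for Eq. (5), with ν = n − 1. It also assumes the usual definition of the Birge ratio as the square root of the reduced chi-squared. The function name, the `equal_weights` switch (intended to mimic method B), and the input numbers are hypothetical.

```python
import math


def weighted_mean_stats(x, u, equal_weights=False):
    """Weighted mean with both estimates of its standard deviation.

    x: measured values; u: their standard uncertainties (same unit).
    equal_weights=True replaces every weight by the average of 1/u_i**2,
    which is how method B is described in the text.
    """
    n = len(x)
    if n < 2 or n != len(u):
        raise ValueError("need at least 2 values and matching uncertainties")
    w = [1.0 / ui ** 2 for ui in u]
    if equal_weights:
        w = [sum(w) / n] * n
    sum_w = sum(w)
    m = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    nu = n - 1  # degrees of freedom
    reduced_chi2 = sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / nu
    return {
        "mean": m,
        "reduced_chi2": reduced_chi2,
        "birge_ratio": math.sqrt(reduced_chi2),
        "s_m_eq3": math.sqrt(reduced_chi2 / sum_w),  # scatter-based estimate
        "s_m_eq5": math.sqrt(1.0 / sum_w),           # from stated uncertainties only
    }


if __name__ == "__main__":
    # Illustrative numbers only; these are not the half-lives of Table 2.
    stats = weighted_mean_stats([10.0, 12.0, 14.0], [1.0, 1.0, 1.0])
    for key, value in stats.items():
        print(f"{key}: {value:.4f}")
```

Under these assumed forms, the ratio of the Eq. (3) variance to the Eq. (5) variance is exactly the reduced chi-squared, and the ratio of the standard deviations is the Birge ratio. A reduced chi-squared well above one therefore signals directly how much Eq. (5) understates the scatter actually seen in the data.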
For example, if we use the n = 11 data set with the authors' equivalent standard uncertainties, then we get m = 4496.3 d, and Eq. (5) underestimates the variance of the mean by a factor of 18.2 (underestimates the standard deviation of the mean by a factor of 4.3). This is the result of the very low uncertainties in Refs. [13] and [19]. If we use the reevaluated standard uncertainties shown in Table 2 for the n = 11 data set, then we get m = 4496.7 d, and Eq. (5) now underestimates the variance of the mean by only a factor of 1.61 (underestimates the standard deviation of the mean by a factor of 1.3).
If we use the reevaluated standard uncertainties shown in Table 2 for the n = 13 data set, then we get m = 4497.0 d. It is our experience that most experimenters tend to underestimate their own uncertainties, so that Eq. (5) almost always gives a smaller value than Eq. (3). In Table 3 we present the estimated standard deviations of the mean calculated using Eq. (3) and using Eq. (5). As expected, the values calculated using Eq. (5) are significantly smaller than the values calculated using Eq. (3).

Discussion of Results
We cannot emphasize strongly enough that estimated uncertainties have large uncertainties. We used the half-lives and the reevaluated standard uncertainties shown in Table 2 to calculate the values shown in Table 3. The estimated standard deviations of the mean vary by a factor of 2 or more (the estimated variances of the mean by a factor of 4 or more), depending upon the equation (and the inherent assumptions) used to calculate them. We think that it is important for experimenters, and reviewers as well, to explicitly state how each estimated uncertainty was obtained. Each adopted value resulting from this evaluation is the grand average of the results obtained using methods A, B, and C with n = 11 and with n = 13 (see Table 3 and Sec. 5, Data Evaluation Methods). Whether based upon 11 data points or 13 data points, the average value obtained for the half-life of tritium is almost exactly the same (4499.3 d and 4499.6 d). The average standard uncertainty (estimated standard deviation of the mean) is slightly larger with n = 13 than with n = 11 (8.7 d vs 7.5 d).

Comparison with Other Evaluations
Others have also compiled and evaluated the half-life of tritium. The first compilation of nuclear data for radioactive isotopes was published by Fea in 1935 [31]. In 1940, Livingood and Seaborg [32] published the first in a series of compilations [32,33,34,35,36,38,44,48] that has become the Table of Isotopes, now in its eighth edition. The first compilation of adopted or recommended values was that of Goldstein and Reynolds [37] in 1966. Estimated uncertainties were given as ranges (<1 %, 1 % to 5 %, >5 %). Adopted values, although not called that, were also given in Refs. [38] and [44], but no uncertainty estimates were provided. Table 4 is a summary of the evaluations that have been published since 1960. As more independent measurements of the half-life of tritium have been reported, the published adopted or recommended values have converged. There seems to be very good agreement among the four most recent evaluations with regard to the adopted half-life of tritium and, except for Ref. [48], with regard to the adopted standard uncertainty. The half-life and uncertainty given in Ref. [48] were taken from the Evaluated Nuclear Structure Data File (ENSDF) [46] and the uncertainty appears to be too high by about a factor of two. We are still trying to determine the origin of these values, which appear to have been in ENSDF since about 1987.

Table 4, final row: Average of adopted values from Refs. [47,48,49] and Table 3. Footnote a: Our final recommended value for the standard uncertainty is the average of three of the four most recent adopted uncertainties. The uncertainty given in Ref. [48] was omitted because it appears to be too high by about a factor of 2.

Final Recommended Value
Our final recommended value for the half-life of tritium is the average of the adopted values from the four most recent evaluations, (4500 ± 8) d, where 8 d corresponds to one standard uncertainty. See Table 4.