An International Marine-Atmospheric 222Rn Measurement Intercomparison in Bermuda Part II: Results for the Participating Laboratories

As part of an international measurement intercomparison of instruments used to measure atmospheric 222Rn, four participating laboratories made nearly simultaneous measurements of 222Rn activity concentration in commonly sampled, ambient air over approximately a 2 week period, and three of these four laboratories participated in the measurement comparison of 14 introduced samples with known, but undisclosed (“blind”) 222Rn activity concentration. The exercise was conducted in Bermuda in October 1991. The 222Rn activity concentrations in ambient Bermudian air over the course of the intercomparison ranged from a few hundredths of a Bq · m−3 to about 2 Bq · m−3, while the standardized sample additions covered a range from approximately 2.5 Bq · m−3 to 35 Bq · m−3. The overall uncertainty in the latter concentrations was in the general range of 10 %, approximating a 3 standard deviation uncertainty interval. The results of the intercomparison indicated that two of the laboratories were within very good agreement with the standard additions, and almost within expected statistical variations. These same two laboratories, however, at lower ambient concentrations, exhibited a systematic difference with an averaged offset of roughly 0.3 Bq · m−3. The third laboratory participating in the measurement of standardized sample additions was systematically low by about 65 % to 70 %, with respect to the standard addition which was also confirmed in their ambient air concentration measurements. The fourth laboratory, participating in only the ambient measurement part of the intercomparison, was also systematically low by at least 40 % with respect to the first two laboratories.


Introduction
An international measurement intercomparison of instruments used to measure 222Rn in marine atmospheres was organized by Drexel University and conducted in Bermuda in October 1991. This paper, the second of two in a series, provides the intercomparison results for the participating laboratories. The intercomparison exercise consisted of two components: (1) measurement comparisons among four laboratories of commonly sampled ambient air over approximately a 2-week test period; and (2) measurement comparisons among three of these four laboratories of a select number of introduced samples with known, but undisclosed (i.e., “blind”) 222Rn activity concentration that could be related to U.S. national 226Ra standards.
The first paper in this series [1] provides detailed descriptions of the experimental arrangements, the 226Ra source calibrations used to obtain known 222Rn activity concentrations, the methodology for the standardized sample additions, and the protocol for the experimental aspects of the intercomparison.
The standardized sample additions were provided by the National Institute of Standards and Technology (NIST), and were made with a commercially available 226Ra source that was calibrated by NIST in terms of the available 222Rn concentration as a function of constant flow rate through the source. The source was employed in conjunction with a specially designed manifold, also provided by NIST, that allowed the determination of well-known dilution factors for the standardized sample additions, which were introduced into a common stream line on a sampling tower used by the participants for their measurements. Several confirmatory measurements were also performed during the course of the intercomparison by collecting “grab samples” from the manifold and returning them to NIST for assay of the 222Rn activity concentrations. These confirmatory measurements were made to ensure that the radium source and manifold did not perform differently at the test site in Bermuda than at NIST, where the calibrations were made.
The participating laboratories in the intercomparison were: (hereafter referred to as Lab F, Lab E, Lab A, and Lab D, respectively). The latter three laboratories (E, A, and D) performed simultaneous measurements from a common stream line on an ambient air sampling tower. Lab F, in contrast, sampled ambient air nearly adjacent to the inlet of the sampling tower and, therefore, did not participate in the intercomparison of standardized sample additions, but only in the intercomparison of ambient air measurements.
The experimental configuration used for the intercomparison is illustrated in Fig. 1. Additional details may be found in Collé et al. [1]. The mean 222Rn activity concentrations (over some given sampling-time interval) used for the intercomparisons, and as labelled in the figure, are defined [1] as follows: C_0 is the mean concentration provided by NIST for the standardized sample additions (independently evaluated for each of their measurement/sampling intervals); C_A is the mean concentration in ambient air; and C_1 is the mean concentration in the main stream line sampled by Labs E, A, and D (i.e., C_1 = C_0 + C_A).

Measurement Methods of the Participating Laboratories
Each of the instruments and measurement methodologies employed by the four participating laboratories was based on a different analytical approach. The main characteristics of each are summarized in Table 1. The tabulation comprises data and information supplied by the participating laboratories. Such a comparative summary is, of course, limited, inasmuch as it is nearly impossible to describe and characterize such complex instruments and methods fully and adequately within the broad, general categories of characteristics given in Table 1. Readers interested in further detail on any of the instruments or methods are therefore encouraged to consult the original references or to obtain additional information directly from the respective participating laboratory.
The instrumentation used by Lab E was based on a two-filter 222Rn measurement method [2]. In brief, the method is as follows. Sample air is pumped through two filters in series that are separated by a large decay volume to permit decay of the radon. The first filter removes all progeny in the 222Rn subseries (218Po-214Pb-214Bi-214Po). In the decay volume, some of the 222Rn in the flow stream undergoes radioactive decay, producing new progeny. These progeny are collected on the second, downstream filter and measured with an α-sensitive scintillation detector. The 222Rn concentration in the sample air is derived from the resulting α-activity counting rate. The Lab E instrument consists of a 500 L cylindrical decay chamber and utilizes flow rates typically ranging from 350 L · min−1 to 400 L · min−1. The instrument is fully automated. The second filter is configured as a ribbon, capable of forward and backward motions that allow rewinding of the filter [3]. For this intercomparison exercise, the instrument used sampling intervals of 1 h to collect the 222Rn progeny, after which the filter was transported to a ZnS(Ag) detector for measurement intervals of 1 h. While one section of the filter is being counted, another section is used in sampling. Additional characteristics and performance data are summarized in Table 1.
The instrument used by Lab A was modelled on that described by Whittlestone [7]. The methodology is based on a modification of the two-filter method that incorporates an aerosol particle generator. Its main principles of operation are as follows. Sample air was continuously pumped at a flow rate of 400 L · min−1 through a drum of volume 200 L to allow decay of short-lived (55 s half-life) 220Rn (thoron), then through an absolute filter into a large plastic chamber of volume 2000 L. Sub-micron particulate condensation nuclei (CN) were injected into the chamber so that the 222Rn progeny became attached to the CN rather than to the chamber walls. The attached progeny, after an average residence time of 50 min in the chamber, were filtered onto a membrane filter that was continuously monitored with a ZnS(Ag) scintillator. Average 222Rn concentrations were inferred from 30 min accumulations of counts from the decay of the 222Rn progeny on the filter. A calibrated CN counter was incorporated into the system to account for changes in the detector efficiency as a function of CN concentration. Table 1 contains additional information on the instrument's performance and characteristics.
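To give a sense of scale for the two-filter method described above for Lab E, the following minimal sketch estimates the mean residence time in the decay volume and the fraction of 222Rn atoms that decay in transit. It assumes simple plug flow (residence time equal to volume divided by flow rate) and a 222Rn half-life of 3.8235 d, a value that is not stated in this paper; the function name is illustrative only.

```python
import math

# Assumed 222Rn half-life (3.8235 d); this value is not given in the paper.
RN222_HALF_LIFE_S = 3.8235 * 86400.0


def decay_in_transit(volume_l, flow_l_per_min, half_life_s=RN222_HALF_LIFE_S):
    """Return (residence time in s, fraction of atoms decaying in transit).

    Assumes plug flow, so every atom spends volume/flow in the chamber.
    """
    residence_s = volume_l / flow_l_per_min * 60.0
    decay_constant = math.log(2.0) / half_life_s
    fraction = 1.0 - math.exp(-decay_constant * residence_s)
    return residence_s, fraction


# Lab E values quoted in the text: 500 L chamber, about 400 L/min.
residence_s, fraction = decay_in_transit(500.0, 400.0)
print(f"residence time: {residence_s:.0f} s, fraction decayed: {fraction:.2e}")
```

Under these assumptions only a few 222Rn atoms in every ten thousand decay inside the chamber, which suggests why large decay volumes and high flow rates are used at marine concentration levels.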
The 222Rn measurement results of Lab F were based on inferring 222Rn concentrations from collection and assay of the short-lived 222Rn progeny that are attached to aerosols in ambient air. The methodology assumes that the 222Rn and its progeny are in radioactive equilibrium (or are in a state of known equilibrium ratio).
The assumption of radioactive equilibrium (with an equilibrium ratio of unity) is usually considered to be valid at sampling locations far from continental sources of radon and not influenced by local land masses. The instrument used by Lab F consisted of a large circular filter fixed on a rotating disk that divided the filter into 12 sampling locations. The instrument was automated for simultaneous sampling and measurement intervals of 2 h. After an aerosol sample was collected at one sampling location on the filter, it was automatically rotated to an α-sensitive scintillation detector for counting while the next sample was collected. As for the previously described instruments, performance data and additional details on this instrument may be found in Table 1.
The methodology utilized by Lab D was a non-filter method based on separating radon from air samples and subsequently assaying it in one of six Lucas-type ZnS(Ag) scintillation cells. The instrument was fully automated and operated with the following sequential steps. An air sample at a flow rate of 28 L · min−1 was aspirated into the instrument, compressed and dried, and flowed through a cooled charcoal trap where the radon was separated from the air stream by adsorption. Following a sampling interval of 2 h, the collected radon was heat- and vacuum-transferred to a secondary cooled charcoal trap to effect separation from other gases contained in the main charcoal trap. A pre-evacuated scintillation cell, having a pre-determined background counting rate, was then filled by transferring the radon from the secondary trap. Sample separation and processing times were of the order of 2.5 h and, with the sampling interval of 2 h, resulted in a sampling frequency of about one sample every 4.5 h. Reported 222Rn activity concentrations were derived from the total α-counting rates from the scintillation cells after about 4 h, when the 218Po and 214Po progeny follow 222Rn decay in transient equilibrium. Again, further detail and performance data are provided in Table 1.

Reported Measurement Results
The measurement results of the participating laboratories, as reported by them, over the intercomparison period October 5-17, 1991 are summarized in Figs. 2 through 4. The figures provide reported values of the mean 222Rn activity concentrations over their individual sampling/measurement intervals for both the standardized sample additions (C_1) and for the ambient air measurements (C_A). The times, Greenwich Mean Time (GMT) in units of 1991 Julian date, correspond to approximate mid-point times for each sampling/measurement interval. The concentrations reported by each laboratory were converted to common units of Bq · m−3 for comparison. Figure 2 provides the first four days of ambient air measurements; Fig. 3 gives the reported concentrations over the course of the 15 standard additions; and Fig. 4 shows the results of four days of ambient measurements following the standard addition period.
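As an aid to reading the time axes, the sketch below converts a GMT mid-point time to a fractional day number. It assumes that “1991 Julian date” means day of year within 1991 and that 1 January 00:00 corresponds to 1.0; the convention actually used in the figures is not stated here, and the function name and the example time are hypothetical.

```python
from datetime import datetime


def fractional_day_of_year(moment):
    """Day of year with fractional part; 1 January 00:00 maps to 1.0 (assumed)."""
    start_of_year = datetime(moment.year, 1, 1)
    elapsed = moment - start_of_year
    return 1.0 + elapsed.total_seconds() / 86400.0


# Mid-point of a hypothetical 1 h interval centred on 5 October 1991, 12:00 GMT.
print(fractional_day_of_year(datetime(1991, 10, 5, 12, 0)))  # 278.5
```

With this convention the intercomparison period of October 5-17, 1991 spans day numbers 278 through 290.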
A complete tabulation of the reported values for all four laboratories is provided in Table A of Appendix A to this paper.
As indicated previously, the results for Lab F represent only measurements of C_A. Lab F reported both mean activity concentrations for the measured 212Pb daughter activity and the inferred 222Rn activity for intervals of 2 h. Only the 222Rn concentration values are presented and treated here. Lab F separately noted two conditionals: 222Rn concentration values that corresponded to 212Pb concentrations having what were considered by Lab F to be abnormally high values due to local land influences; and values suspected to have been influenced by rainfall (see Appendix). These conditions affected only a small fraction of the data values: 13 and 5, respectively, out of a total of 142 reported values. No effort was made to separately treat these conditional values. The uncertainty associated with the 222Rn concentrations in the range of about 0.07 Bq · m−3 to 0.2 Bq · m−3 was estimated and reported to be approximately ±20 %. This uncertainty was reported to correspond to two standard deviations for an assumed Poisson-distributed statistical “counting error” (based on the square root of the total number of detected counts) as well as contributions due to the uncertainties in detection efficiency and flow-rate measurements.
The results in Figs. 2 through 4 reported by Lab E are for sampling/measurement intervals of 1 h. The uncertainties for these values, which were reported to correspond to a one standard deviation interval for the assumed Poisson-distributed statistical “counting error,” are shown in Fig. 5, where the reported uncertainty, expressed in percent, is given as a function of the mean activity concentration. The data plotted in Fig. 5 are the values reported for all of the Lab E results. The functional form of the data shown is, of course, directly proportional to C^(−1/2), where C is the reported concentration (since the variance equals the mean C with the assumption of a Poisson distribution). The minor discontinuities in the plotted data arise from the limited number of reported significant figures in the original data set. As indicated, the reported uncertainties for just the statistical “counting error” range from well over 100 % at concentrations of a few hundredths of 1 Bq · m−3 to less than 1 % at concentrations of about 30 Bq · m−3. These uncertainties do not necessarily represent the inherent or minimum obtainable precision of the two-filter technique. Those reported here are significantly influenced by the count rate arising from a thoron (220Rn) contamination that is due to trace quantities of thorium in the materials used to construct their decay chamber. This thoron background is treated as part of the overall counting system background. The magnitude of this contribution may be appreciated by considering that the 39 % relative uncertainty at a concentration of 0.1 Bq · m−3 would decrease to 19 % in the absence of the thoron. This contamination problem can be eliminated, as is done by Lab A, by using a plastic or fiberglass decay chamber with a conducting inner surface. It has been reported that Lab E has subsequently eliminated the thoron contamination problem by coating the welds in the decay chamber with a white epoxy [6].
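The C^(−1/2) scaling and the effect of a background contribution can be illustrated with a short sketch. It uses a deliberately simplified model that is an assumption here, not Lab E's actual uncertainty treatment: the variance is taken as the total number of detected counts and the background is treated as exactly known. The function name is illustrative, and the "implied" count values are derived only from the 39 % and 19 % figures quoted above.

```python
import math


def relative_counting_uncertainty(net_counts, background_counts=0.0):
    """One-standard-deviation relative uncertainty of a net count.

    Simplified model (an assumption): the variance is the total number of
    detected counts, and the background itself is known exactly.
    """
    return math.sqrt(net_counts + background_counts) / net_counts


# Without background the uncertainty scales as C^(-1/2):
# four times the counts gives half the relative uncertainty.
u_100 = relative_counting_uncertainty(100.0)
u_400 = relative_counting_uncertainty(400.0)

# Net counts implied by the quoted 19 % (no thoron) at 0.1 Bq/m3,
# and the background needed to raise that to the quoted 39 %.
net = 1.0 / 0.19 ** 2
background = (0.39 * net) ** 2 - net
print(f"{u_100:.3f} {u_400:.3f} implied background/net: {background / net:.1f}")
```

Within this simplified model, the quoted figures would correspond to a thoron-related background roughly three times the net signal at 0.1 Bq · m−3.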
The results in Figs. 2 through 4 for Lab A are also for intervals of 1 h. As indicated previously, however, their instrument recorded continuously and gave averaged results that were “smoothed” by a time constant of approximately 90 min. As a result of the smoothing, the evaluation of their results could not be as direct, or as subjectively unequivocal, as that of the other laboratories. The measurement uncertainties reported by Lab A for their data set are given in Fig. 6. This uncertainty was stated to correspond to a one standard deviation statistical “counting error” combined with the estimated uncertainty in correcting for detection efficiency variations that arise because of changes in particulate concentrations in their delay tank. The lower edge of the data set plotted in Fig. 6 corresponds to the identical kind of C^(−1/2) functional form described for Lab E in Fig. 5. The large positive deviations correspond to periods when the delay chamber condensation nuclei concentration was low because of a power supply fault, which resulted in a detection efficiency well below optimum. Over the course of the intercomparison, the radon concentration ranged mainly from 0.2 Bq · m−3 to 40 Bq · m−3. The uncertainties for Lab A in this concentration range varied from 12 % to 0.7 %, which is comparable with those of Lab E (20 % to 0.7 %). However, the uncertainties at lower concentrations were markedly less for Lab A, presumably because of the thoron contamination in the Lab E decay chamber. For example, at a concentration of 0.1 Bq · m−3 the uncertainty for Lab A was 8 %, whereas that for Lab E was 40 %.
The reported measurement results for Lab D given in Figs. 2 through 4 are mean concentrations averaged over sampling intervals of 2 h. Associated relative uncertainties for a one standard deviation interval were reported to range from ±3.4 % to ±3.9 % for the entire data set, and include contributions from (1) the measurement variability, which was stated to have a relative magnitude of about 2.8 % across the range of observed activity concentrations; (2) the uncertainty associated with a 226Ra/222Rn reference standard used for calibration; and (3) the flow meter uncertainty over the range of flow rates used.

Intercomparison of Standardized Sample Additions
Derivation of the mean 222Rn activity concentrations C_0 for the standardized sample additions provided by NIST to the participating laboratories is treated in extenso in Collé et al. [1]. Fifteen additions, each having a duration of 4 h (except for #13, which was 3 h), were provided over the period October 9-13, 1991. They are summarized in Table 2. One of them (addition #4) had to be discarded from the analysis because of an experimental blunder. The activity concentrations C_0 were derived independently from well-known dilution factors that were in turn obtained from simultaneous flow-rate measurements for each sampling/measurement interval (from some given start time t_a to a stop time t_z) for each participating laboratory. Lab E, with well-defined t_a to t_z sampling intervals of 1 h, thus received, over the course of the 14 valid additions, 53 standardized samples for comparison. The Lab A additions, because of its averaged “smoothed” measurement results, utilized the mean C_0 for the entire 4 h or 3 h t_a to t_z sample intervals. Lab D, which had only one 2 h t_a to t_z sampling/measurement interval enveloped within each 4 h or 3 h sample interval, therefore also received 14 standardized samples for comparison. The concentrations C_0 for the intercomparisons ranged from approximately 2.5 Bq · m−3 to 35 Bq · m−3. Based on a very detailed uncertainty analysis, the overall propagated uncertainty in C_0 was in the range of 6 % to 13 % for a 3 standard deviation uncertainty interval [1].
Table 3 contains the reported measurement results by Lab E for the mean concentrations C_1(E) in the main sampling stream line and, for comparison, the mean concentrations C_0 provided by NIST. The derived values of C_0 are given in Collé et al. [1]. The results for C_1(E) and C_0 over the entire course of the standardized sample additions are illustrated in Fig. 7.
Inasmuch as, by definition, C_1(E) includes the ambient 222Rn activity concentration whereas C_0 does not, an assumed C_A(E) was selected to compare not only the concentration ratios C_1(E)/C_0, but also (C_1(E) − C_A(E))/C_0. The ambient concentrations C_A(E) were very approximate values selected from the Lab E data set from adjacent measurement intervals that were not believed to be influenced by the standard sample additions. The large intervals between ambient concentration measurements, which might otherwise have very suspect assumed C_A(E) values, could be somewhat verified by normalizing the Lab E results to the uninfluenced Lab F results. Figure 8 shows the results of the reported measurements of C_A(F) by Lab F (solid line) over the course of the standardized sample additions, along with the reported Lab E values (crossed circles) in the intervals not influenced by the additions. As indicated in the figure, there appears to be scant cause to suspect that there were any masked irregularities in C_A during the standardized sample additions, and it seems reasonable that assumed values of C_A could be obtained from interpolations. The particular choices of assumed C_A(E) values in Table 3 may appear to be high in comparison to “eye-smoothing” interpolations with Fig. 8 (particularly for additions #1, #2, and #3). However, the choices were deliberately conservative, such that the two ratios C_1(E)/C_0 and (C_1(E) − C_A(E))/C_0 in Table 3 were almost extreme limits on the influence of assumed C_A values. The first ambient measurement result C_A(E) by Lab E following a standardized sample addition was invariably elevated above subsequent C_A(E) measurements. This effect is due to the continuous dilution in the decay chamber from one sampling cycle to the next. This incomplete removal of previously introduced activity is most pronounced in changes from very high standard addition concentrations to ambient levels. Based on the decay chamber volume (500 L) and flow rate (400 L · min−1), the removal is virtually complete within a few minutes. Conservatively, therefore, in the absence of any other information, these first C_A(E) values were included in the average to obtain the assumed C_A(E) values that comprised part of the measured value of C_1(E). In all cases, the contribution of C_A(E) to C_1(E) was sufficiently small (< 10 %) that the effect on the comparison of the concentration ratios was not significant.
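The statement that removal is virtually complete within a few minutes can be checked with a minimal sketch. It assumes a perfectly mixed chamber flushed at constant flow, so that previously introduced activity is diluted exponentially with a time constant equal to volume divided by flow rate; real mixing in the Lab E chamber may differ, and the function name is illustrative.

```python
import math


def residual_fraction(elapsed_min, volume_l=500.0, flow_l_per_min=400.0):
    """Fraction of previously introduced activity left in a well-mixed chamber."""
    time_constant_min = volume_l / flow_l_per_min  # 1.25 min for the Lab E values
    return math.exp(-elapsed_min / time_constant_min)


for minutes in (1.25, 5.0, 10.0):
    print(f"{minutes:5.2f} min: {residual_fraction(minutes):.4f}")
```

Under this assumption less than 2 % of the earlier activity remains after 5 min and well under 0.1 % after 10 min, both short compared with the 1 h sampling interval.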

Lab E Results
Subsequent to the above analysis by NIST, Lab E independently re-evaluated the background ambient concentrations using a less conservative approach of excluding the first ambient values following a standard addition. The effect of these background choices on the (C_1(E) − C_A(E))/C_0 results was negligible. The change in (C_1(E) − C_A(E))/C_0 for any standard addition was typically less than 1 %, and ranged to 2.3 % in the worst case.
The comparisons of Table 3 indicate excellent agreement between Lab E and NIST. The mean concentration ratio over the 53 comparisons is 0.97 (excluding ambient influence) and 0.94 (including ambient influence), with a standard deviation of the mean (s_m) of approximately 2 % and a correlation coefficient r = 0.98. Considering the one standard deviation statistical “counting error” uncertainty in C_1(E) of several percent alone (see Fig. 5) and the 3 % to 4 % uncertainty in C_0 (for a one standard deviation interval as given in the uncertainty analysis of Collé et al. [1]), the comparisons indicate that there are no statistically significant differences between the Lab E results and the NIST standardized sample additions. In addition, no significant systematic trends as a function of 222Rn concentration were evident. As seen in Fig. 7, the comparative concentrations scale in good agreement over the entire range. If anything, it appears, and surprisingly so, that the agreements (in terms of C_1(E)/C_0 differences) are, in general, better at lower concentrations (4 Bq · m−3 to 20 Bq · m−3) than at the higher concentrations (25 Bq · m−3 to 35 Bq · m−3). This is not universally the case, however, if one considers the agreement on a relative basis, such as for standard addition #15 at the lowest introduced concentration. Additions #1, #2, and #3, and then #11 and #12 (see Fig. 7), are all in the general concentration range around 30 Bq · m−3 to 35 Bq · m−3; and for these cases, the concentration ratios C_1(E)/C_0 and (C_1(E) − C_A(E))/C_0 appear to exhibit the greatest variations and deviations from unity. Nevertheless, the agreement is remarkably consistent across the entire concentration range.
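The summary statistics quoted above (mean ratio, standard deviation of the mean, and correlation coefficient) can be sketched as follows. The input values are made up purely for illustration and are not the Table 3 data; the function name is hypothetical, and expressing s_m relative to the mean ratio is an assumption about how the quoted percentage was formed.

```python
import math


def ratio_summary(measured, reference, ambient=None):
    """Mean ratio, relative standard deviation of the mean, and Pearson r.

    measured  : reported C_1 values
    reference : corresponding C_0 values
    ambient   : optional assumed C_A values subtracted from C_1
    """
    if ambient is None:
        ambient = [0.0] * len(measured)
    net = [m - a for m, a in zip(measured, ambient)]
    ratios = [n / c0 for n, c0 in zip(net, reference)]
    count = len(ratios)
    mean_ratio = sum(ratios) / count
    variance = sum((x - mean_ratio) ** 2 for x in ratios) / (count - 1)
    s_m = math.sqrt(variance / count) / mean_ratio  # relative to the mean

    # Pearson correlation between net measured and reference concentrations.
    mean_net = sum(net) / count
    mean_ref = sum(reference) / count
    cov = sum((n - mean_net) * (c - mean_ref) for n, c in zip(net, reference))
    var_net = sum((n - mean_net) ** 2 for n in net)
    var_ref = sum((c - mean_ref) ** 2 for c in reference)
    r = cov / math.sqrt(var_net * var_ref)
    return mean_ratio, s_m, r


# Made-up values for illustration only; these are NOT the Table 3 data.
c0 = [10.0, 20.0, 30.0]
c1 = [9.5, 21.5, 30.5]
ca = [0.5, 0.5, 0.5]
mean_ratio, s_m, r = ratio_summary(c1, c0, ca)
print(f"mean ratio {mean_ratio:.3f}, s_m {100 * s_m:.1f} %, r {r:.3f}")
```

Calling the function with and without the `ambient` argument gives the two ratios, (C_1(E) − C_A(E))/C_0 and C_1(E)/C_0, that are compared in the text.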
One may note that the first measurements of C_1(E) for additions #1 and #2 appear to be abnormally high compared to the following three in each series. A question has arisen as to whether the NIST manifold was completely flushed of accumulated radon prior to the commencement of these additions. This could possibly account for the large initial values. Although the incomplete removal of previously accumulated radon cannot be absolutely excluded as a possibility (particularly in the first few standard additions, e.g., #1, #2, and #3, when the NIST persona grata was somewhat unfamiliar with the test site's experimental layout), it is not believed to have occurred. Also, these large positive deviations in additions #1 and #2 appear to be somewhat matched by the negative deviations of almost comparable magnitude of addition #11, for example, thereby suggesting that the exhibited deviations are statistical in nature.
One last observation may be made in regard to Fig. 8. The abrupt and large increase in the Bermudian ambient concentration following addition #15 was similarly exhibited in both the Lab F and Lab E data. The same trend is also seen in the data of Lab A and Lab D (Fig. 4). The introduced 222Rn activity concentration C_0 for addition #15 was approximately 2.5 Bq · m−3. The observed increase in the natural ambient concentration following addition #15 went to experimentally determined C_A levels of roughly 1 Bq · m−3 to 2 Bq · m−3. This was an unexpected result (based on what the organizers led NIST to believe would be typical ambient concentrations). It was fortuitous in that it occurred at the conclusion of the standardized sample additions, and of good fortune in that it provided an almost complete overlap in the 222Rn activity concentrations covered in the intercomparison of Bermudian natural ambient air concentration levels C_A (< 0.01 Bq · m−3 to 2 Bq · m−3) and in the intercomparison of introduced C_0 concentrations (≈ 2.5 Bq · m−3 to 35 Bq · m−3). Nature, and her attendant Minerva, cooperated.

Lab A Results
The measurement results of Lab A for the mean 222Rn activity concentration C_1 compared to the concentrations C_0 provided by NIST for the 14 valid standardized sample additions are given in Table 4. Figure 9, which will greatly assist in the understanding of the analysis and interpretation of the comparisons, illustrates these data. The values of mean C_0 averaged over the entire sample addition interval, as derived and given in Collé et al. [1], are shown as shaded regions. The plotted data points are the reported results of Lab A (see Fig. 3 and Appendix A).
It must be emphasized that the experimental design and protocol used for the standardized additions were inherently ill-suited to evaluating the Lab A performance. The continuous and slow response of the Lab A instrument is highly suitable for continuous monitoring of slowly varying ambient concentrations. The artificially imposed 222Rn concentration step functions of the standard additions place the evaluation of Lab A at a serious disadvantage compared to automated instruments that operate with finite sample collection and measurement time intervals.
The analysis of the Lab A data is complicated in that the results C_1 are hourly averages of continuously accumulated data “smoothed” by a time constant of 90 min. In effect, the average C_1(t_i) reported during some arbitrary time-interval period t_i is influenced not only by the current concentration C_0(t_i) in the sampling stream line as it enters the Lab A delay tank, but also by the concentration C_0(t_(i−j)) that entered the delay tank in previous periods t_(i−j) (where j = 1, 2, 3, ...). By numerical integration, the concentration C_1(t_i) in period t_i (excluding influences from ambient concentrations) may be expressed as a sum of contributions from the current and preceding periods, in which α_(i−j) is a removal function that takes into account the removal of 222Rn from the tank by decay and by ventilation during the period t_(i−j). Thus, to obtain any given concentration C_1(t_i), the averaged measured response at a given time must be unfolded from all previous measurements of C_1 during the influencing period. Using an assumed time constant of 90 min, as recommended by Lab A, this results in an ensemble of equations which must be simultaneously χ²-minimized and reduced to arrive at an unfolded set of uninfluenced C_1 values. A few preliminary attempts were made to try to “unfold” this integral data set, but the attempts were soon abandoned because of the inherently large resulting uncertainties and failures of the minimizations to converge. It was apparent that the quality of the data could not possibly justify the numerical exercise.
An alternative analysis procedure was sought. From inspection of Fig. 9, and on consideration of the simple physics involved, one might have the following qualitative expectations: over the course of a 4 h sample addition with mean C_0, the first hourly-averaged value C_1(A1) would be very low in comparison to C_0; the next hourly-averaged value C_1(A2) would be greater than C_1(A1), but would probably still underestimate C_0; the third value C_1(A3) would again be larger than C_1(A2), but might begin to approximate the range of C_0; and the fourth value C_1(A4) might not be very different from C_1(A3). Thus, a simple and reasonable analysis approach would be to compare C_1(A3) and C_1(A4) to C_0. In fact, this is the procedure adopted and presented in Table 4. To facilitate understanding the interpretations, the values of C_1(A4) are enlarged in the plot of Fig. 9. It could be argued that this approach is somewhat subjective; but, given the limitations of the adopted standard addition procedure for comparisons with this continuous measurement method, no other analysis procedure seemed feasible.
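The qualitative expectations above can be illustrated with a deliberately simple sketch. It assumes that the instrument behaves as a single first-order lag with the 90 min time constant quoted by Lab A, which is a simplification and not Lab A's actual response function; the function name and the unit step are hypothetical.

```python
import math


def hourly_averages(start_level, step_level, n_hours=4, time_constant_min=90.0):
    """Hourly means of a first-order response to a step change.

    Assumed model (not Lab A's actual response function):
        y(t) = step_level + (start_level - step_level) * exp(-t / tau)
    averaged analytically over each consecutive 60 min interval.
    """
    tau = time_constant_min
    averages = []
    for hour in range(n_hours):
        t0, t1 = 60.0 * hour, 60.0 * (hour + 1)
        decay_integral = tau * (math.exp(-t0 / tau) - math.exp(-t1 / tau))
        mean_value = step_level + (start_level - step_level) * decay_integral / 60.0
        averages.append(mean_value)
    return averages


# Response to a unit step from zero, as fractions of the introduced C_0.
rising = hourly_averages(0.0, 1.0)
print([round(value, 2) for value in rising])
```

For a unit step this gives hourly averages of roughly 0.27, 0.63, 0.81, and 0.90 of C_0, which mirrors the ordering described above. A pure lag of this kind can never overshoot C_0, so it should be read only as an illustration of the trend and not as a substitute for the instrument's real response.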
Even this approach is only partially applicable in the cases of adjoining standard sample addition intervals when the mean C_0 is adjusted dramatically from one interval to the next. Influences from preceding concentrations C_0 require the passage of at least approximately 4 h to 5 h intervals, as clearly seen in the return to ambient concentrations after additions #1 and #3 in Fig. 9. This is obviously affecting the results of addition #7 after the abrupt change from #6, and that of #14 following #13. This is the situation for decreasing step changes in C_0. There is yet one more complicating feature in the data set of Lab A. It is apparent that the values of C_1(A4) for additions #5, #9, and #10, in which the following additions (#6, #10, and #11) are increasing step changes, are abnormally high. This strongly suggests a difference in timing between that reported for the Lab A data and that for the standard additions, as was discussed at considerable length in Collé et al. [1], in which the onset of activity concentration C_0 for addition #N is reflected in the reported result C_1(A4) for addition #(N − 1). It, furthermore, calls into question the results of the comparison C_1(A4)/C_0 for additions #5, #9, and #10 in Table 4. Figure 9 indicates that the effect of the previous concentration on C_1(A3) is not severe for an increasing step change. However, for a decreasing step, the contribution remaining at C_1(A3) from the preceding peak can cause a substantial over-estimate of C_0. This is readily apparent for additions #7, #14, and #15. The relative magnitude of this over-response at C_1(A3) is roughly 10 %.
Interestingly, Lab A, in reporting its results, suggested using the value C 1(A2) for the intercomparison since they concluded that the maximum response C 1 (max) to a step change in concentration C 0 would occur somewhat after 2 h following the time of change and that C 1(A2) would give values within a few percent of C 1 (max). However, the comparison C 1(A2) /C 0 was generally much worse than C 1(A3) /C 0 or C 1(A4) /C 0 , particularly in the cases of decreasing step changes. The suggested timing difference also complicates even this conclusion.
Nonetheless, even with these inherent limitations, the results of Table 4 indicate a reasonably good agreement between Lab A and NIST. The mean value of C 1(A3) /C 0 is approximately 1.10 with a standard deviation of the mean (s m ) of 4 % and a correlation coefficient of r = 0.975. One should not place too much emphasis on this 10 % agreement (or the even better agreement in C 1(A4) /C 0 ) since the actual magnitude could very well be somewhat coincidental considering the magnitudes of the statistical variations and the subjective aspects of the comparison. Figure 9, however, clearly illustrates the general tracking and reasonably good agreement of their ''smoothed'' C 1 data with C 0 .
Lastly, it should be mentioned that considering the limitations of the experimental design for comparing the Lab A data to the standard additions, the quality of the data comparisons did not warrant attempts to account for contributions from ambient concentrations C A .
Lab A, following the original NIST analysis and report of the comparisons, suggested that the data set indeed warrants reconsideration. Lab A's independent re-analysis involved using only C 1(A3) values, making background corrections using the assumed C A(E) values of Table 3, and correcting these (C 1(A3) − C A(E) ) values by ''subtracting 10 % of the difference between the current and previous value when the previous value was larger, in order to account for the slow response of the detector.'' Lab A thereby concluded that with the background subtraction and the correction on the basis of 10 % of the concentration change, comparison to NIST ''substantially improves,'' resulting in a mean corrected C 1(A3) /C 0 ratio of 1.027 with a relative standard deviation of the mean of 3.3 % based on 14 comparisons and with a correlation coefficient of 0.978. This may be compared to the NIST analysis of Table 4. It must also be emphasized that the 10 % correction suggested by Lab A was not known prior to the intercomparison, and was only derived as a result of the Lab A to NIST standard addition comparisons.
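The correction quoted from Lab A can be illustrated with a minimal sketch. It is not Lab A's actual procedure: the function name, the input values, and the choice to apply the 10 % rule to the background-subtracted values of consecutive additions are all assumptions made here for illustration.

```python
def lab_a_corrected(c1_a3, c_ambient, fraction=0.10):
    """Sketch of the quoted correction (assumed reading, not Lab A's code).

    c1_a3     -- third hourly-averaged values, one per addition, in time order
    c_ambient -- assumed ambient background for each addition
    fraction  -- share of the drop from the previous value that is subtracted
    """
    # Background subtraction: C1(A3) - CA(E) for each addition.
    net = [c - b for c, b in zip(c1_a3, c_ambient)]
    corrected = []
    for i, value in enumerate(net):
        # Only a decreasing step (previous value larger) is corrected.
        if i > 0 and net[i - 1] > value:
            value = value - fraction * (net[i - 1] - value)
        corrected.append(value)
    return corrected


# Hypothetical values in Bq per cubic metre, not taken from Table 4.
example = lab_a_corrected([10.0, 5.0, 8.0], [0.0, 0.0, 0.0])
```

In this hypothetical series only the second value follows a larger one, so only it is reduced; the increasing step to the third value is left untouched, consistent with the observation that increasing steps have little effect on C 1(A3).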

Lab D Results
The reported results for Lab D are summarized in Table 5, which contains the reported values of C 1(D) and estimated C 0(D) that were derived from their own assumed values for C A . The values for C 0 provided by NIST were again taken from Collé et al. [1]. Comparisons of both C 1(D) /C 0 and C 0(D) /C 0 indicate a substantial systematic difference between the results of Lab D and those of NIST. The systematic proportional bias of approximately 0.37 in the concentration ratios (with s m = 4.5 % and r = 0.981) was invariant over the range of concentrations. Lab D attributed this effect to a calibration error introduced by using the assumed calibration factors provided by the manufacturer of a commercially-available, flow-through 226 Ra calibration source that was used by Lab D for their calibrations. Deviations in possibly both the 226 Ra activity content of commercial sources, as well as in the functional form for the available 222 Rn concentration as a function of flow rate for flow-through sources, were demonstrated in Collé et al. [1].
The C 0 values provided by NIST for Lab D and those provided for Lab E are, of course, highly correlated, being derived from slightly different subsets of the identical simultaneous flow-rate measurement data set. Therefore, it is not unexpected that the exclusively positive deviations from the mean concentration ratios as seen in additions #1, #2, #3, #9, and #12 (or the almost exclusively negative deviations from the mean as seen in additions #5, #6, #7, #8, and #15), although similarly occurring in the comparisons of the data for both Lab D and Lab E, are anything but random variations.

Intercomparison of Ambient Air Concentrations
For the intercomparison of mean 222 Rn activity concentrations C A in ambient Bermudian air among all four participating laboratories, the data set of Lab E was selected to provide the necessary normalization. It was chosen because it was not only the largest and most complete data set available, but it also seemed to be the data set whose C A values were most in the midrange of all reported C A values.
The comparisons were made by selecting pairs of the nearest adjacent values of C A , in terms of their reported midpoint times, for Lab E and for the other participating laboratory. The paired values [C A(E) and C A(LAB) where LAB = F,A,D] could then be used to form a set of comparison ratios C A(LAB) /C A(E) . For example, a C A(F) value for Lab F whose midpoint time was 1700 hours GMT would be compared to the C A(E) values at both 1630 and 1730 for Lab E. The Lab A C A(A) and Lab E C A(E) values typically had the same midpoint times and could be compared directly. No attempt was made to try to account for the ''smoothing'' effect in the Lab A data. The C A(D) results of Lab D similarly were compared to the nearest C A(E) values that were reported both before and after the midpoint time for the C A(D) results. In no cases were any comparisons made for values whose midpoints were separated by more than 1.5 h. Also, every effort was made to avoid the selection of paired C A(E) and C A(LAB) values that might have been influenced by the introduced activity of the standardized sample additions. This pair-wise selection of reported values, then comprised the data sets that were analyzed over the course of the entire 2 week intercomparison period (excluding the intervals for the standardized sample additions).
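The pairing rule described above can be sketched as follows. This is only an illustration under stated assumptions: midpoint times are expressed as decimal hours on a common clock, the function name and sample values are hypothetical, and the exclusion of intervals influenced by the standardized sample additions is not modelled.

```python
def pair_nearest(lab_series, ref_series, max_gap_h=1.5):
    """Pair each laboratory value with the nearest reference values.

    lab_series, ref_series -- lists of (midpoint_time_h, concentration)
    max_gap_h              -- largest allowed separation of midpoints, in hours
    Returns a list of (lab_value, ref_value) pairs.
    """
    pairs = []
    for t_lab, c_lab in lab_series:
        # Identical midpoint times are compared directly.
        same = [c for t, c in ref_series if t == t_lab]
        if same:
            pairs.append((c_lab, same[0]))
            continue
        before = [(t, c) for t, c in ref_series
                  if t < t_lab and t_lab - t <= max_gap_h]
        after = [(t, c) for t, c in ref_series
                 if t > t_lab and t - t_lab <= max_gap_h]
        if before:  # nearest reference value reported before the midpoint
            pairs.append((c_lab, max(before, key=lambda p: p[0])[1]))
        if after:   # nearest reference value reported after the midpoint
            pairs.append((c_lab, min(after, key=lambda p: p[0])[1]))
    return pairs


# Hypothetical example mirroring the 1700 versus 1630 and 1730 case.
ratios = [lab / ref for lab, ref in
          pair_nearest([(17.0, 0.3)], [(16.5, 0.5), (17.5, 0.6), (20.0, 0.9)])]
```

The reference value at 20.0 h is ignored because its midpoint lies more than 1.5 h from the laboratory midpoint.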
Before addressing these pair-wise comparisons to the Lab E results, it should be understood that similar independent evaluations were also performed by making direct comparisons of C A(A) to C A(F) , C A(D) to C A(F) , and C A(D) to C A(A) . In all cases, the results of the various comparison estimators (see Table 6) were statistically consistent and redundant with the results that follow.

Lab F to Lab E Comparison
For the comparison between Lab F and Lab E, the data set comprised 202 paired C A(F) and C A(E) values. The ratios C A(F) /C A(E) ranged from a minimum of about 0.06 to a maximum of 2.7, with median and mean values of approximately 0.60 and 0.65, respectively. These descriptive statistical estimates are summarized in Table 6. A frequency distribution for the ratios C A(F) /C A(E) is given in Fig. 10. Clearly, the measurements of C A by Lab F are systematically low with respect to Lab E. This is even more apparent in the scatter diagram of Fig. 11 where the values of C A(F) are plotted as a function of C A(E) . The dotted line in the figure is the ''helping line'' for when the comparison ratios are equal to unity, i.e., for the ''ideality'' C A(F) = C A(E) . As indicated in the figure, the entire set of C A(F) values are below those of C A(E) for all C A(E) values greater than 0.25 Bq · m−3. The one glaring datum exception (at C A(E) ≈ 0.4 Bq · m−3 and C A(F) ≈ 1 Bq · m−3) is obviously a fluke. The wide scatter of data in Fig. 11 is indicative of the inherent irreproducibility for these low-level measurements of ambient air, at least on a comparative basis for the results between laboratories. Recalling Fig. 5, the one standard deviation statistical ''counting error'' alone for C A(E) = 0.5 Bq · m−3 to 1 Bq · m−3 was in the range of 10 % to 5 %. One of the participant laboratories suggested that the wide dispersion in C A(F) /C A(E) ratios referred to above was not due to an inherent low-level measurement irreproducibility, but rather is a reflection of variations of the radon progeny to radon equilibrium ratios. The Lab F instrument at the relatively high ambient radon concentrations occurring at the time of the intercomparison had very good measurement repeatability (perhaps less than a few percent), yet the Lab F results would be strongly influenced by any changes in the equilibrium ratios which are in turn a function of condensation nuclei concentrations.
Explicit correlations between radon concentrations and equilibrium ratios as a function of condensation nuclei concentrations, however, are beyond the scope of this work. Nevertheless, C A(F) and C A(E) are highly correlated with correlation coefficient r = 0.88, as seen in the general trend of the data points in Fig. 11. The results of a linear regression on sets of two measurement results which are estimates of the same quantity (e.g., in this case C A(F) = α + β C A(E) for the variables C A(F) and C A(E) ) that are estimating the ''true'' C A are informative. The slope β (for intercept α ≈ 0) is often a better comparison estimator of the agreement between the two variables than is an estimator of central tendency (e.g., the mean or median) alone. This is particularly true when the variables have large statistical variability. Means of ratios (e.g., C A(F) /C A(E) ) can be substantially influenced by even a few exceedingly high or exceedingly low values, and their exclusive use in comparisons can lead to seriously misleading or distorted results.
(This point will be demonstrated later in the discussion of Lab D comparison results.) The linear regression results for the Lab F comparison are given in Table 6. The intercept α is negligible, and the slope β = 0.59 is in very good agreement with both the mean (0.65) and median (0.60). One can therefore conclude that, for the purposes of this comparison, the measurement results of Lab F are somewhat systematically low with respect to those of Lab E by roughly 40 %. Excessive confidence should not be placed in this rough estimate because of the large data variability. Yet, it clearly is the general trend.
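The sensitivity of a mean of ratios to a few extreme values, in contrast to a regression slope, can be illustrated with a small sketch. The data below are synthetic and hypothetical (a reference series and a second series that is 0.6 times the reference, with one low-level value made too high); they are not values from the intercomparison, and the ordinary least-squares fit is an assumption about the type of regression.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Synthetic reference concentrations and a proportionally low second series.
ref = [0.05, 0.2, 0.5, 1.0, 1.5, 2.0]
lab = [0.6 * c for c in ref]
lab[0] = 0.15  # a single outlier at the lowest concentration (ratio near 3)

ratios = [l / r for l, r in zip(lab, ref)]
intercept, slope = fit_line(ref, lab)

print("mean of ratios  :", mean(ratios))
print("median of ratios:", median(ratios))
print("regression slope:", slope)
```

With these synthetic numbers the single outlier raises the mean of the ratios well above the underlying factor of 0.6, while the median and the regression slope remain close to it.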
The Lab F measurement results might suggest that 222 Rn was not in secular equilibrium with its daughters, and that the equilibrium ratios substantially differed from unity.

Lab A to Lab E Comparison
Descriptive statistical estimates for the comparison between Lab A and Lab E are also given in Table 6. The comparison consisted of 104 paired C A(A) and C A(E) values. The ratios C A(A) /C A(E) ranged from a minimum of approximately 0.9 to a maximum of 16. The median and mean values are 1.3 and 1.9, respectively, indicating that the Lab A values are in general systematically high with respect to those of Lab E. The frequency distribution of Fig. 12, and, even more so, the scatter diagram of C A(A) versus C A(E) of Fig. 13, confirm this generality. Considering that both Lab E and Lab A seemed to be in fairly good agreement with the standardized sample additions provided by NIST (Tables 2 and 3), their apparent systematic differences for the comparison of ambient air concentrations may seem dichotomous. Part of this may be understandable by inspection of Fig. 13 and the results of the linear regression in Table 6. Note that the slope β is very nearly equal to unity which in itself would indicate good agreement with Lab E (and NIST by inference). However, the intercept of the regression is substantial, α = 0.27, which implies a uniformly offsetting bias of about 0.3 Bq · m−3 that is more significant at low concentrations (e.g., ambient air) than at higher concentrations (e.g., in the standardized sample additions). Possible reasons for this non-zero offset in terms of the Lab A and Lab E instrumentation are unknown. Both the generally higher results of Lab A and the non-zero offset in comparison to Lab E may, of course, be due to unaccounted thoron ( 220 Rn) background contributions with the Lab A results. Equally possible, the apparent offset may not necessarily be attributable to an offset by the Lab A results, but rather may be due to a negative offset in the Lab E results due to an over-subtraction of background by Lab E. The latter possibility

Lab D to Lab E Comparison
As for the previous comparisons, the comparison between Lab D and Lab E is summarized in Table 6 and in the frequency distribution and scatter diagram of Figs. 14 and 15. In this case, 72 paired C A(D) and C A(E) values were compared yielding for the ratio C A(D) /C A(E) a minimum of 0.3, a maximum of 3.5, a median of 0.4, and a mean of 0.63. Obviously, the Lab D results are systematically low with respect to the Lab E measurements. The results are entirely consistent with the comparisons of the standardized sample additions (Table 5). The linear regression slope β (0.33) is a much better indicator than the mean (0.63) and is in very good agreement with the mean ratios (0.37 and 0.36) given in Table 5.
The intercept of the regression is 0.076 Bq · m−3. If one corrects this value by a reciprocal correction factor of 0.361 obtained from the results of the standardized addition comparisons (Table 5), then one obtains a corrected value of 0.21 Bq · m−3 for the Lab D to Lab E regression intercept. This value is remarkably close to that obtained for the Lab A to Lab E comparison intercept (0.27 Bq · m−3) and adds considerable credence to the possibility that the Lab E results have a negative bias due to over correcting for background.
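Written out, the reciprocal correction of the intercept amounts to dividing the regression intercept by the factor 0.361. The symbol for the corrected intercept is introduced here only for illustration and does not appear in the original analysis.

```latex
\alpha_{\mathrm{corrected}}
  = \frac{0.076\ \mathrm{Bq \cdot m^{-3}}}{0.361}
  \approx 0.21\ \mathrm{Bq \cdot m^{-3}}
```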
One may conclude, nevertheless, that for the purpose of this intercomparison, the measurement results of Lab D are proportionately biased and systematically low with respect to the measurements of Lab E by roughly 65 % to 70 %, irrespective of possible systematic background-correction offset biases.

Discussion of Findings and Summary Thoughts
The results of the intercomparison itself are nearly self-evident on examination of the summaries in Tables 3, 4, and 5 for standardized sample additions, and Table 6 for the ambient concentrations.
Lab E and Lab A were in excellent agreement with the NIST additions. The vast majority of reported values for both laboratories, over the entire 2.5 Bq · m−3 to 35 Bq · m−3 concentration range, were within the 6 % to 13 % relative 3 standard deviation uncertainty interval associated with the NIST additions. The Lab D results for the standardized additions were obviously proportionately biased by a mean factor of about 0.36, which may be attributable to an instrument calibration error. The Lab D reported concentrations, on correcting all of Lab D's values by this common factor, are then in good agreement with those of both Lab E and Lab A. Further, the Lab E data, Lab A data, and corrected Lab D data track the fluctuations of radon concentrations very well over the entire range of concentrations even though the number of data values from the three laboratories was substantially different.
The intercomparison of reported ambient concentration values was performed by normalizations to the Lab E data. The results were statistically invariant of this arbitrary choice; normalizing to any other laboratory set results in redundant and statistically equivalent comparison estimators, e.g., means, linear regression slopes, etc. (see Table 6). These ambient concentration results reinforced the findings obtained from the standardized sample additions. The slope (β) of a linear regression of the reported ambient concentrations from the two laboratories under comparison was considered the best available comparison estimator. Concentration pairs selected for the comparison regressions were the measurement values nearest to each other in terms of their midpoint times. For comparison of Lab A to Lab E, β = 0.97 ± 0.11 which again is indicative of the good agreement between these two laboratories' results. This uncertainty interval and all of the uncertainty intervals that follow are assumed to correspond to three standard deviations and are assumed to provide an uncertainty interval having a high level of confidence of roughly 95 % to 99 %. For the Lab D to Lab E comparison, β = 0.33 ± 0.06, which exhibits a proportional bias of the same magnitude found in the comparison to standardized sample additions. Again, if one evaluates the Lab A data, Lab E data, and corrected Lab D data (after correcting all of the Lab D values by a common reciprocal correction factor of 0.36), one finds that all three laboratories are in good agreement in tracking changes in the ambient concentrations from the relatively high 2 Bq · m−3 ambient levels down to concentrations of a few hundredths of a Bq · m−3. Differences were greatest at the very lowest ambient concentrations because of an apparent offset in the regressions among the laboratories (refer to the discussions in Sec. 5).
The offset (regression intercept α) between Lab A and Lab E was α = (0.27 ± 0.11) Bq · m−3, which has been suggested to be attributable to either an unaccounted 220 Rn (thoron) contamination background for Lab A or an over-corrected background for Lab E. The offset between Lab D and Lab E was nearly negligible, α = (0.076 ± 0.068) Bq · m−3. However, if one invokes the same correction as before to the Lab D data, then the corrected intercept α = (0.23 ± 0.22) Bq · m−3 would seem to support the latter overcorrected background argument. The magnitude of the uncertainty intervals for the intercept values, however, does not make the argument compelling. For the Lab F to Lab E comparison, β = 0.59 ± 0.07, indicating an approximate 40 % disagreement which, as noted, was surprising compared to the reported results of previous intercomparisons. This difference is equally manifest in direct comparisons between the Lab F and Lab A measurement data and in direct comparisons between Lab F data and corrected Lab D data. The Lab F to Lab E intercept, α = (−0.04 ± 0.07) Bq · m−3, is truly negligible. This result does not support the previously observed offset between Lab A and Lab E even if the Lab F data is renormalized (i.e., ''corrected'') by the reciprocal of the observed β between Lab F and Lab E. Throughout the course of the ambient concentration intercomparison period, over all concentration ranges, the Lab F data also appropriately scaled (by a factor of roughly 0.6) with respect to the Lab A data and similarly scaled (by a factor of roughly 1.8) with respect to the Lab D data.
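The scaling factors and the corrected intercept quoted above can be checked for rough consistency with the regression slopes against Lab E. The sketch below assumes that a ratio of two slopes relative to Lab E approximates the direct scaling between two laboratories (ignoring intercepts and scatter) and that the corrected Lab D intercept is the reported intercept divided by the Lab D to Lab E slope; neither assumption is stated explicitly in the analysis, and the variable names are hypothetical.

```python
# Regression slopes relative to Lab E, as quoted in the text.
beta_f_e = 0.59   # Lab F to Lab E
beta_a_e = 0.97   # Lab A to Lab E
beta_d_e = 0.33   # Lab D to Lab E

# Assumed approximation: direct scaling between two laboratories
# taken as the ratio of their slopes relative to Lab E.
f_over_a = beta_f_e / beta_a_e   # compare with "roughly 0.6"
f_over_d = beta_f_e / beta_d_e   # compare with "roughly 1.8"

# Assumed form of the intercept correction: divide by the Lab D slope.
alpha_d_e = 0.076                        # Bq per cubic metre
alpha_d_corrected = alpha_d_e / beta_d_e

print(round(f_over_a, 2), round(f_over_d, 2), round(alpha_d_corrected, 2))
```

Under these assumptions the three computed values fall close to the quoted 0.6, 1.8, and 0.23 Bq per cubic metre.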
A considerable number of additional statistical analyses beyond those reported here were performed on the data sets. These included: (1) sequential time analyses to determine if there were any time dependencies or correlations in the observed measurement differences between the participating laboratories or with the NIST standardized sample additions; (2) regression analyses between the results of each participating laboratory and the NIST additions; (3) regression analyses between every combination of participating laboratory pair; (4) χ²-tests for all regressions and comparison frequency distributions; (5) divisions of the ambient concentration comparison data for the participating laboratory pairs first into halves (and then into thirds), and testing the resulting subsets of data for differences in the various means using t-tests, and for homogeneity in the various subset sample means and variances using χ²- and F-tests; and (6) sequential two-variable analysis-of-variance (ANOVA) techniques for differences in similarly constructed subset means and variances. The results of these analyses were not reported in an attempt at brevity in an already too-lengthy paper, and since they added nothing to the analyses that were reported nor to the findings and conclusions.
In order to maintain the integrity of the intercomparison, Dr. R. Collé, representing NIST, retained overriding authority among the co-authors with regard to the statements of the results and the conclusions as reported herein. The data interpretations, design of the evaluation procedures, and statistical analyses are his, and his alone. Each of the participating laboratories provided supplemental information, such as descriptions of their respective instruments and measurement methodologies, contributed to the discussion, and had an opportunity to comment on the NIST analyses.
The design of the intercomparison was as near as the investigators could come to conducting a ''blind'' comparative exercise, as described in the proposal submitted by the organizers of the intercomparison, the Drexel University investigators, to the National Science Foundation. The exercise was ''blind'' in several regards: (1) the standard sample additions provided by NIST were introduced with undisclosed 222 Rn activity concentrations, and the NIST results were not disclosed and released until all of the participating laboratories had provided their respective measurement data; (2) the timing and duration of the NIST standard sample additions were also largely unknown to the participating laboratories; (3) NIST participation required that once the participating laboratories reported their measurement values, the data would be analyzed and reported without subsequent modifications or corrections to the originally reported values; (4) although NIST knew the approximate activity concentrations at the time of the introduction of the standardized sample additions, the final mean activity concentrations provided to each participating laboratory during their respective sampling intervals were not known by NIST until after reduction of the extensive flow rate data base; and (5) NIST provided the standardized sample additions largely in complete ignorance of the underlying 222 Rn ambient concentration at the time of the additions.
It must be emphasized that this intercomparison was not intended to critique or evaluate the various instruments or measurement methodologies in terms of their advantages, disadvantages, or suitability for performing continuous 222 Rn monitoring in marine atmospheres. The sole intent was, as stated, to provide an unbiased and refereed intercomparison of 222 Rn activity concentration measurements made by four laboratories, and to provide three of these laboratories, under somewhat inappropriate and limiting experimental test conditions for at least one laboratory, with introduced samples containing known, but undisclosed (i.e., ''blind'') 222 Rn activity concentrations that could be referenced to U.S. national standards. To wit, for purposes of this intercomparison exercise, the various instruments and measurement methodologies were viewed much as how the diversity of prevailing religious modes of worship was regarded by magistrates in the late Roman empire: all equally true; all equally false; all equally useful [14].
This intercomparison was not without imposed limitations. Although all of the principal measurement methodologies used to continuously monitor marine air masses (where typical concentrations are much lower than found in continental air masses and where humidity levels vary) were represented by the participating laboratories, funding and space limitations of the experimental site could not accommodate other groups making atmospheric radon measurements. It is arguable that the intercomparison exercise was too short in time duration. Excluding the period for the intercomparison of standardized sample additions, the time for intercomparison of ambient air concentrations was of the order of 2 weeks, and even this period was not continuous. Without question, continuous intercomparison measurements over longer time intervals, two or more uninterrupted weeks or even months, would have been much better. Equally, it would have been more useful to conduct correlations with meteorological data and with 222 Rn progeny measurements and equilibrium ratios. These correlations would have been useful to discriminate the air masses' origin, to know if they are of direct continental origin, or marine origin, or mixed. In this way, it might have been possible to discern if the apparent discrepancies among some of the participating laboratories were different in relation to differing meteorological situations. The differences between the measurements of Lab A and Lab E, for example, were not apparent in previous intercomparisons [11]. These kinds of efforts, however, were beyond the scope of this work.

Conclusions
This exercise was unique among other environmental intercomparisons, and it fulfilled two major objectives.
Firstly, this work provided an unbiased, refereed basis for comparing the measurement results and performance of four principal instruments (as employed by four different laboratories) that are used to measure 222 Rn activity concentrations for marine-atmospheric studies. Collectively, these instruments and laboratories represent those responsible for a significant fraction of the atmospheric 222 Rn measurements made over the past decade. The intercomparison utilized a common standardized, in situ , reference basis that could be directly related to U.S. national, and internationally recognized, 226 Ra and 222 Rn standards. The findings of the intercomparison may assist various users (e.g., those in the global modelling community) in applying the available and future 222 Rn measurement data bases in a more reliable and effective manner.
Secondly, the work went beyond serving the needs of this particular intercomparison. It also demonstrated the broader utility of the calibration protocol and the methodology for the standardized sample additions that were developed for it [1]. Most environmental measurement intercomparisons of field instruments in actual use merely rely on evaluating the relative performance of the participants, or some comparison to the pooled results. This exercise demonstrated, for the very first time, the capability of providing a standardized reference basis even for such low-level, field-measurement intercomparisons. The developed methodologies presented here could, of course, be adopted with slight modifications to cover other 222 Rn concentration ranges and other applications, and could be employed in many other types of 222 Rn environmental, field-measurement intercomparisons.

Appendix A. Reported Measurement Results of the Participating Laboratories
The complete data set of measurement results from the participating laboratories, as reported by them, over the intercomparison period October 5-17, 1991 is given in Table A. The table includes reported values of the mean 222 Rn activity concentrations over their individual sampling/measurement intervals for both the standardized sample additions (C 1 ) and for the ambient air measurements (C A ). The times (Table A, column 1) correspond to approximate mid-point times for each sampling/measurement interval. The concentrations reported by each laboratory were converted to common units of Bq · m−3 for comparison. The results for Lab F represent only measurement of C A .