Integrated exposure uptake biokinetic model for lead in children: empirical comparisons with epidemiologic data.

The concept of model validation is evolving in the scientific community. This paper addresses the comparison of observed and predicted estimates as one component of model validation as applied to the integrated exposure uptake biokinetic (IEUBK) model for lead in children. The IEUBK model is an exposure (dose)-response model that uses children's environmental lead exposures to estimate risk of elevated blood lead (typically > 10 micrograms/dl) through estimation of lead body burdens in a mass balance framework. We used residence-specific environmental lead measurements from three epidemiologic datasets as inputs for the IEUBK model to predict blood lead levels, and compared these predictions with blood lead levels of children living at these residences. When the IEUBK modeling focused on children with representative exposure measurements, that is, children who spent the bulk of their time near the locations sampled, there was reasonably close agreement between observed and predicted blood lead distributions in the three studies considered. Geometric mean observed and predicted blood lead levels were within 0.7 microgram/dl, and proportions of study populations expected to be above 10 micrograms/dl were within 4% of those observed.

The widespread potential for environmental and occupational lead exposure, and the variety of associated adverse health effects at relatively low exposure levels have been described extensively in the scientific literature; the findings and literature sources have been reviewed and summarized in a number of U.S. government reports (1)(2)(3)(4)(5). For risk assessment purposes, multiple regression and correlation models relating environmental lead levels and blood lead levels have been difficult to generalize to communities or neighborhoods where such data were not specifically collected. The U.S. Environmental Protection Agency (U.S. EPA) developed the integrated exposure uptake biokinetic (IEUBK) model for lead in children (6,7) as an alternative or complement to these stochastic models, to estimate the potential for blood lead concentrations above a specific level of concern, currently 10 µg/dl (4), among children exposed to lead in their environments. The IEUBK model differs from correlation models in that it is a dose-response model that uses children's lead exposures (doses) over time to estimate likely lead body burdens.
It is essential to demonstrate the usefulness of predictions from models used in support of regulatory decisions. The process of model evaluation involves several distinct principles and activities; most of these principles as they apply to the IEUBK model have been addressed in a variety of publications and are summarized below. The remaining principle, the comparison of model predictions with epidemiologic data, is the primary focus of this paper. This will be addressed through an overview of IEUBK model predictions and their intended use, criteria for relevant data sets for carrying out the empirical comparisons, and the choice of statistical methods for supporting the evaluation.

Background

Overview of IEUBK Model Evaluation
The concept of model evaluation has been evolving in the scientific community (8)(9)(10)(11). The U.S. EPA has articulated a set of principles essential in evaluating models for regulatory use, in the U.S. EPA guidance on peer review of environmental regulatory modeling (11), and in the Validation Strategy for the IEUBK Model for Lead in Children (12). These principles address several distinct but dependent stages: the soundness of the scientific foundations of the model structure and the adequacy of parameter estimates, verification of translation of mathematical relationships into computer code, and evaluation of whether model predictions are in reasonable agreement with relevant experimental and observational data. The IEUBK model has been evaluated along these lines several times since its inception.
Environmental Health Perspectives * Vol 106, Supplement 6 * December 1998

The current version (version 0.99d) is an expansion of models used by the U.S. EPA air and water programs in support of regulations. The earliest version (13), used by the Office of Air Quality Planning and Control, was peer reviewed by the Clean Air Scientific Advisory Committee's Lead Exposure Subcommittee in 1988 and judged to be scientifically sound (14). Predictions generated by this version were confirmed using a cross-sectional study of children in the lead smelter community of East Helena, Montana; this work was described by Johnson and Paul in 1986 (15), Marcus and Cohen in 1988 (16), and in a U.S. EPA Office of Air Quality Planning and Standards staff paper in 1989 (13), and consisted of empirical comparisons of observed and predicted blood lead distributions. This successfully confirmed model was expanded to include a total lead exposure component, with fetal exposure, nonlinear kinetics for plasma/red cell partitioning and for gut absorption, and much greater variety of time-varying lead exposure sources (17). The Science Advisory Board's (SAB) Indoor Air Quality and Total Human Exposure Committee reviewed this version in 1992, and concluded:

"... we are convinced that the approach followed in developing the UBK model was sound, and constitutes a valuable initiative in dealing with program needs in evaluating and controlling human exposures to lead. It can effectively be applied to many current needs even as it continues to undergo refinement for other applications, based upon experience gained in its use. The refinements will not only improve the scientific basis for evaluating and controlling lead, an essential Agency responsibility, but also provide a basis for the use of the model for other toxicants that present similar challenges." (18)

Version 0.99d reflects the recommendations of this second review, including improved guidance materials and documentation of the scientific foundations of the model's structure, parameters, and equations. More recent experimental data were identified and incorporated into this version as improved parameter estimates, while the overall framework remained the same as that reviewed by the SAB in 1992. The documentation supporting version 0.99d was completed in 1994 (6,7) and is summarized separately in this series (19).
Building on this foundation, an independent code verification and validation exercise has been completed and is also reported separately in this series (20). The main conclusion was that version 0.99d does accurately carry out the operations and calculations that were intended. Preliminary results of empirical comparisons of IEUBK version 0.99d predictions with three datasets were reported in 1995 (21), and are reported here in more detail.

Goal of IEUBK Empirical Comparisons
As elaborated elsewhere (6,7,19), the IEUBK model is a synthesis of many scientific studies of lead biokinetics, contact rates of children with contaminated media, and the presence and behavior of environmental lead. The model was designed to agree with observational, real-world data through its calibration with community-specific datasets (7). It stands to reason, however, that usefulness of its predictions varies within the broader range and combination of conditions that the model covers because the separate studies providing its parameters were not designed to span completely coincident ranges of environmental and population-specific conditions. A range-finding exercise exploring what levels of agreement are possible will help IEUBK model users better understand the strengths and weaknesses of the model and suggest areas for additional research and improvement. For example, child's age is an explicit factor in the IEUBK model. Although the full age range under 84 months is often recommended as a basis for lead risk assessment (22), some applications may apply only to children at one extreme of the range, or only to the most sensitive subpopulation.
In this context, empirical comparisons of model predictions with real-world data involve understanding the IEUBK model's intended use, identifying data that span at least similar conditions, and recognizing the limitations of the observed conditions for model evaluation. The IEUBK model functions primarily to estimate the risk of elevated blood lead levels, i.e., the probability of a given child or group of children having blood lead concentrations exceeding a specified level of concern (6). Currently, U.S. EPA's target is to limit individual risk of exceeding 10 µg/dl to no more than 5% (14,22). The IEUBK's estimated risk of elevated blood lead levels corresponds to cumulative exposure to a multimedia set of environmental lead levels, generally at and around a residence, with which a child or group of children would have contact while living there. This estimated risk is intended to describe the potential for elevated blood lead for any children who would have similar exposure, not just the current residents. For example, a typical application of the model is to estimate the potential for elevated blood lead levels for children who would live in residential developments to be built on currently undeveloped but lead-contaminated land.
The IEUBK model estimates risk of elevated blood lead under the assumption of lognormality of blood lead levels. The model supplies the starting point estimate of blood lead taken as a geometric mean (GM) blood lead level, and generates a blood lead distribution using an individual geometric standard deviation (GSD) derived from community blood lead studies based in children's residential settings (1,6). This individual GSD reflects substantial variability in interindividual behavior (e.g., length of exposure to measured media, extent of mouthing behavior, time since last meal, variability in dietary intake) and biology (e.g., lead absorption rates as affected by genetics or nutritional status, including blood iron level) (6).
As an illustration, consider a situation in which a combination of exposures to lead in soil, dust, water, diet, and air results in an IEUBK-predicted GM blood lead of 5 µg/dl for children under 7 years of age. Using the recommended GSD of 1.6 (6), 95% of children with similar exposure are expected to have blood lead levels between 2.0 and 12.6 µg/dl. Using the same distribution, there is a 7% probability that an individual child exposed to the same conditions would be estimated to have a blood lead greater than 10 µg/dl, or equivalently, 7% of all children exposed to those conditions would be estimated to have a blood lead greater than 10 µg/dl. Note that it is not the goal of the IEUBK model to match the measured blood lead level of a specific child. The IEUBK model is primarily a probabilistic model, not a substitute for medical evaluation of a particular child. Returning to the example above, suppose that two children live at the residence where the lead exposures considered for the model prediction were measured, and that the children's measured blood lead levels were 8 and 11 µg/dl. These are consistent with the model's GM prediction of 5 µg/dl, and 95% confidence interval (CI) of 2.0 to 12.6 µg/dl. Even an observed blood lead level outside the 95% CI, such as 1.5 µg/dl, is consistent with the model prediction. Theoretically, however, there is only a 0.5% chance that such a low blood lead would occur under these conditions. Data most useful for evaluating IEUBK model predictions should ideally involve measurements of both environmental lead levels and the amount of lead taken into the body, as well as the children's body burdens of lead, i.e., lead levels in blood, bone, and other tissues, all at many time points over an extended period. To the best of our knowledge, such detailed data do not exist, and would be difficult and expensive to collect, even if it were acceptable to study children experimentally.
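The figures in the illustration above (GM of 5 µg/dl, GSD of 1.6) follow directly from the lognormal assumption. The following is a minimal sketch of that arithmetic only, not of the IEUBK model itself; the function name `lognormal_summary` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def lognormal_summary(gm, gsd, level=10.0, coverage=0.95):
    """Return (lower, upper, p_exceed) for a lognormal blood lead distribution.

    gm and gsd are the geometric mean and geometric standard deviation;
    level is the blood lead level of concern in ug/dl.
    """
    std_normal = NormalDist()
    mu, sigma = math.log(gm), math.log(gsd)
    z_crit = std_normal.inv_cdf(0.5 + coverage / 2)   # about 1.96 for 95%
    lower = math.exp(mu - z_crit * sigma)
    upper = math.exp(mu + z_crit * sigma)
    p_exceed = 1.0 - std_normal.cdf((math.log(level) - mu) / sigma)
    return lower, upper, p_exceed


lower, upper, p_exceed = lognormal_summary(gm=5.0, gsd=1.6)

# Probability of a blood lead as low as 1.5 ug/dl under the same distribution.
p_below = NormalDist().cdf((math.log(1.5) - math.log(5.0)) / math.log(1.6))

print(f"95% range: {lower:.1f} to {upper:.1f} ug/dl")
print(f"P(> 10 ug/dl) = {p_exceed:.0%}, P(< 1.5 ug/dl) = {p_below:.1%}")
```

Working on the log scale keeps the interval symmetric in ratio terms: the limits are the GM divided and multiplied by the same factor, roughly 2.5 for a GSD of 1.6.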
Lacking epidemiologic studies that have been carried out for the purpose of model confirmation, some opportunistic use can be made of observational studies of environmental and blood lead levels that have been conducted for public health evaluations at lead-contaminated sites. Longitudinal measurements would be preferred for IEUBK model confirmation to understand both exposure patterns and changes in blood lead with age, but these have been collected much less often than cross-sectional data. Even so, longitudinal data may not be absolutely necessary if the cumulative exposures and body burdens can be assumed to be comparable to those that must be inferred from cross-sectional data, allowing for the practical difficulties in measuring actual exposure.
An observational study should meet several requirements in order to serve as a basis of comparison with an exposure-response model such as the IEUBK model (12):
* A sufficiently large lead-exposed sample of children 84 months of age and younger (age must be known), selected either by random sampling or near-census, helping assure that a wide enough range of children's behaviors is spanned by the data;
* Blood lead levels linked with environmental lead levels, all analyzed by accepted methods and collected within approximately 1 month of each other, at the time of year likely to demonstrate peak blood lead levels (usually late summer) (23), to be on an equivalent basis with other epidemiologic studies including those used to calibrate the IEUBK model;
* Environmental lead concentrations in all media to which each child was primarily exposed (usually soil, interior dust, and drinking water) that can be expected to have been reasonably constant over at least the last 3 months preceding the blood lead measurement and that adequately characterized the child's exposure to lead (i.e., no missing data);
* Behavioral and demographic data such as time spent outside or away from home;
* Documentation of quality assurance and quality control procedures to address reproducibility of measurements; and
* Documentation of other sources of lead, such as local data on lead in air and food if possible, but possibly less important because these sources have seen great reductions (4,24); traditional medicines or parents' occupational exposure; residence-specific data concerning X-ray fluorescence (XRF) levels and condition of lead-based paint.

The first five attributes are essential.
Knowledge of other sources of lead is important, but it may be possible to carry out some useful comparisons without information on these additional sources of lead if the study is large enough and if it can be assumed that the influences of these sources are relatively minimal and randomly distributed throughout the dataset. In the case of lead-based paint, while residence-specific measurements and observations will help interpret local conditions, exposure to lead-based paint is best assessed through dust and soil lead measurements, as discussed below. Datasets already used by the U.S. EPA to calibrate the IEUBK model (7) are specifically excluded, as they already generate good agreement of model predictions, by design.

Study Population
For this first empirical comparisons exercise of IEUBK version 0.99d, we chose a set of studies that conformed well with the selection criteria discussed above and had the additional advantages of using very similar methods of environmental sampling and lead analysis, and of including extensive behavioral and demographic information collected by a standardized questionnaire administered at all sites. This multisite study of lead exposure and blood lead in Palmerton, Pennsylvania; Madison County, Illinois; Jasper County, Missouri; and Galena, Kansas, was designed by the Agency for Toxic Substances and Disease Registry (ATSDR) and the U.S. EPA, and conducted in 1991, to evaluate populations of all ages near these Superfund National Priorities List sites for possible health effects related to chronic, low-level lead and cadmium exposure associated with nearby, but no longer active, smelter operations. Analytical methods, quality control, and quality assurance are described in ATSDR's final report of their health evaluation at these sites (25). ATSDR's Division of Health Studies maintained the database containing all blood lead data and personal identifiers, in order to safeguard confidentiality of the participants. We worked with a subset of this database, configured as a SAS dataset, containing one record for each individual child up to 84 months of age, with environmental lead measurements and questionnaire responses but no information that would permit identification of individuals or residences.
In general, the study populations were random samples, with some minor qualifications. Children 72 to 84 months of age were somewhat underrepresented because children 6 to 71 months of age were oversampled relative to the older participants (25). In the Jasper County sample, all homes where children had elevated blood lead levels were subjected to environmental sampling, but only a randomly selected subset of other homes was sampled; children with high blood lead levels have thus been overrepresented in this portion of the database used for this empirical comparisons exercise. As the Galena, Kansas, and Jasper County, Missouri, datasets had been designed with a common comparison group, we combined them to maximize sample sizes for comparisons within subsets of each dataset. In two of the datasets, a substantial number of siblings were included. These records were retained also to maximize sample sizes for comparisons within subsets of each data set and because the different ages within families lead to somewhat independent exposures and blood lead levels despite the same measured environmental lead levels.
Next, the datasets were trimmed by excluding records with incomplete exposure characterization. From the maximum number of records, those missing any values for child's soil lead, dust lead, water lead, or blood lead were excluded. If there had been children who had lived in their residences less than 3 months, the minimum applicable period for generating IEUBK predictions (6,7), these records would have been excluded as well. Children reported by their parents to be away from home more than 10 hr/week (such as at a babysitter or daycare facility) were excluded because there was no information concerning lead exposure at the secondary locations. The cutoff of 10 hr seemed to be a reasonable acknowledgment of family activities, such as visiting friends and family or going grocery shopping. The cutoff was relaxed to 20 hr/week for the Pennsylvania dataset, however, because of the small sample size.
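The exclusion rules described above can be sketched as a simple record filter. This is an illustration under stated assumptions: the class `ChildRecord`, its field names, and the example values are hypothetical, since the variables in the actual SAS dataset are not given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChildRecord:
    # Hypothetical field names; the actual dataset variables are not given in the text.
    soil_pb: Optional[float]       # soil lead concentration
    dust_pb: Optional[float]       # interior dust lead concentration
    water_pb: Optional[float]      # drinking water lead concentration
    blood_pb: Optional[float]      # measured blood lead; only checked for presence
    months_at_residence: float
    hours_away_per_week: float


def keep_record(rec: ChildRecord, max_hours_away: float = 10.0,
                min_months: float = 3.0) -> bool:
    """Apply the exclusion rules; blood lead values themselves are never inspected."""
    required = (rec.soil_pb, rec.dust_pb, rec.water_pb, rec.blood_pb)
    if any(value is None for value in required):
        return False                   # incomplete exposure characterization
    if rec.months_at_residence < min_months:
        return False                   # residence too short for an IEUBK prediction
    return rec.hours_away_per_week <= max_hours_away


# Made-up records for illustration only.
records = [
    ChildRecord(250.0, 400.0, 1.0, 6.2, 24, 0),
    ChildRecord(250.0, None, 1.0, 5.0, 24, 0),      # missing dust lead
    ChildRecord(900.0, 1100.0, 2.0, 9.8, 36, 15),   # away 15 hr/week
]

kept_default = [r for r in records if keep_record(r)]
kept_relaxed = [r for r in records if keep_record(r, max_hours_away=20.0)]
print(len(kept_default), len(kept_relaxed))
```

The `max_hours_away` argument mirrors the relaxed 20 hr/week cutoff used for the Pennsylvania dataset.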
Individual measured blood lead levels were not examined until after generating IEUBK predictions; this information had no part in identifying the records to be excluded.

Table 2 provides a brief description of the environmental sampling methods for each study. Although these studies were carried out with a high level of consistency, some unavoidable differences among them required judgments concerning which measurements to use as IEUBK inputs:
* There were some differences in sources of soil samples between the three datasets. In the Kansas/Missouri and Illinois datasets, composite soil samples avoided the drip-lines of the residences, and emphasized play areas relative to other parts of the yard. In Pennsylvania, soil samples were not composited over the entire yard, so it was necessary to choose a combination of measurements to average that would characterize areas children were more likely to use. The average of the bare and play areas was judged to be as similar to the composite measurements from the other two studies as possible; play area alone was considered but was only available for 26 children away from home no more than 20 hr/week.
* Drinking water samples were first-draw only in the Kansas/Missouri and Illinois datasets. First-flush, or overnight stagnation, samples tend to reflect the maximum possible water concentration but not children's typical exposure to lead in drinking water (6). Because the majority of water lead measurements were below the level of detection, however, no adjustment was made to project more typical water lead concentrations from these already very low measurements. For the Pennsylvania set, 30-min stagnation samples were available, and were considered typical for estimating children's water lead consumption (6).
* Indoor dust at each of the sites was sampled using similar low flow rate vacuum methods, but the locations of samples differed across the studies. The Illinois and Pennsylvania composites included dust collected from the entrance to the household, which would be expected to reflect a higher level of lead contamination, if present, from soil tracked inside but where children do not necessarily play (26,27). Most significantly, Illinois samples included dust from window wells and sills, which tend to have higher lead concentrations than floors when lead-based paint and other exterior lead contamination is present (5,28), whereas dust samples from the other sites did not include these locations.
* Lead-based paint exposure is best represented for IEUBK predictions by the appropriate dust and soil lead measurements (6), as this is the most common source of exposure for lead-based paint because of children's mouthing behavior (4). Less than 10% of children are expected to exhibit pica for paint chips (29,30).

IEUBK Model Predictions
In addition to the identification of representative exposure inputs to the IEUBK model, the appropriateness of default values for several other input parameters should always be considered for each site-specific use (6). These parameters include dietary lead intake, dirt ingestion rates, and bioavailability of the lead compounds present at each site. There were no data available concerning dirt ingestion or dietary lead intake that suggested changes to the IEUBK model defaults, either on a community or on an individual basis. A recent swine bioavailability study of Jasper County soils suggested an absolute lead bioavailability slightly higher than 30% for subareas with primarily mining and mill wastes as opposed to smelter wastes (31), and a similar study of Palmerton soils estimated absolute lead bioavailability centering on 30% (32). For the purposes of this exercise, lead bioavailability was kept at the default of 30% for all three datasets; results for the Jasper County dataset can be interpreted by town, according to the relative prevalence of the different soil types. The IEUBK model was run for each dataset, using child-specific inputs for age, soil lead concentration, dust lead concentration, and water lead concentration (6). This step generated a GM blood lead for each child-specific set of lead inputs. The blood lead predictions for each set of inputs were added to the original datasets to facilitate comparisons of observed and predicted blood lead levels according to the categories of demographic and behavioral variables contained in the datasets.

Statistical Analysis
This evaluation was seen as a range-finding exercise, exploring what is possible and suggesting areas for additional research and improvement. Comparison of descriptive measures, here GMs and the probability of exceeding 10 µg/dl, is a straightforward approach. The size of the difference between two measures and the sizes of the associated confidence intervals are more informative than one-dimensional p-values (33) resulting from statistical testing.
For the observed blood lead levels, the percentage exceeding 10 µg/dl was determined by the number of children observed to have blood lead levels > 10 µg/dl among all children in the sample. The associated 95% CI was calculated using exact tabled values (34). For each IEUBK prediction, the probability of exceeding 10 µg/dl was calculated from the GM with a GSD of 1.6 (6). The average of these individual exceedance probabilities was then calculated for each dataset. The average individual exceedance probabilities were treated as binomial probabilities for the purposes of estimating 95% CIs.
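The calculation described above can be sketched as follows. Two points are assumptions rather than the documented method: the exact interval is computed here as a Clopper-Pearson interval (standing in for the tabled values), and the averaged predicted probability is converted to an equivalent count before the same interval is applied. All data values are made up for illustration.

```python
import math
from statistics import NormalDist

GSD = 1.6      # individual GSD used in the text
LEVEL = 10.0   # level of concern, ug/dl


def exceedance_probability(gm, gsd=GSD, level=LEVEL):
    """P(blood lead > level) for a lognormal distribution with the given GM and GSD."""
    z = (math.log(level) - math.log(gm)) / math.log(gsd)
    return 1.0 - NormalDist().cdf(z)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve(increasing_fn, target, iters=60):
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if increasing_fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson (exact) interval for k exceedances among n children."""
    lower = 0.0 if k == 0 else _solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else _solve(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Made-up values for illustration only.
observed = [3.1, 4.5, 6.0, 8.2, 11.4, 12.9, 5.5, 7.7, 2.8, 9.6]
predicted_gm = [3.5, 4.0, 6.8, 7.5, 9.9, 11.0, 5.0, 6.1, 3.0, 8.4]

n = len(observed)
k_obs = sum(1 for pb in observed if pb > LEVEL)
p_pred = sum(exceedance_probability(gm) for gm in predicted_gm) / n

print("observed:", k_obs / n, exact_ci(k_obs, n))
# Assumption: the averaged probability is converted to an equivalent count.
print("predicted:", round(p_pred, 3), exact_ci(round(p_pred * n), n))
```

Note that the predicted exceedance probability is an average over children with different GMs, so a child with a predicted GM well below 10 µg/dl still contributes a small nonzero probability.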
In addition to GMs and exceedance probabilities calculated for each study population, we used two approaches to identify ranges or subsets of child populations where IEUBK predictions may be less useful. First, the observed and predicted GM blood lead levels were compared for subgroups determined by factors that contribute to variability in exposure: child's age, locality, presence of lead-based paint, or time away from home or outside. Exceedance probabilities were not calculated for the subgroups. Because of the smaller group sizes, comparisons of upper percentile values could be substantially weaker than for the datasets as a whole. Second, scatter plots of observed and predicted blood lead levels were examined for systematic differences.
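A subgroup comparison of observed and predicted GMs might be organized as in the sketch below. The text does not state how the CIs for the GMs were computed; the interval here uses a normal approximation on the log scale, which is an assumption and will be too narrow for very small subgroups. Function names and data values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(values, coverage=0.95):
    """Return (GM, lower, upper); bounds are None when fewer than two values."""
    logs = [math.log(v) for v in values]
    center = mean(logs)
    if len(logs) < 2:
        return math.exp(center), None, None
    # Assumption: normal approximation for the CI of the mean log value.
    z_crit = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z_crit * stdev(logs) / math.sqrt(len(logs))
    return math.exp(center), math.exp(center - half_width), math.exp(center + half_width)


def compare_by_subgroup(records):
    """records: iterable of (subgroup_label, observed, predicted_gm) tuples."""
    groups = defaultdict(lambda: ([], []))
    for label, observed, predicted in records:
        groups[label][0].append(observed)
        groups[label][1].append(predicted)
    return {
        label: {"observed": geometric_mean_ci(obs), "predicted": geometric_mean_ci(pred)}
        for label, (obs, pred) in groups.items()
    }


# Made-up records for illustration only: (age group, observed, predicted), ug/dl.
records = [
    ("1-2 yr", 6.0, 5.5), ("1-2 yr", 8.5, 7.0), ("1-2 yr", 5.2, 6.1),
    ("4-5 yr", 4.0, 3.2), ("4-5 yr", 5.5, 3.9), ("4-5 yr", 3.1, 2.8),
]

summary = compare_by_subgroup(records)
for label, stats in summary.items():
    print(label, stats)
```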

Overall Comparisons
Geometric mean IEUBK model predictions were within 1 µg/dl of GM-observed blood lead levels for all three datasets: for Kansas/Missouri, 0.6 µg/dl less than the observed GM; for Illinois, a 0.0 µg/dl difference; and for Pennsylvania, 0.7 µg/dl greater than the observed GM. Table 3 presents a summary of the correspondence of observed and predicted mean blood lead levels for the three study groups. Note that the 95% CIs for the GMs overlap substantially within each dataset. Figure 1 illustrates these relationships further, in the context of the environmental soil and dust lead levels. These results demonstrate the plausibility of IEUBK (Table 4). The IEUBK model's predicted incidence of elevated blood lead levels was within 4 percentage points of the percentage observed to be above 10 µg/dl: for Kansas/Missouri, 20% observed versus 18% predicted; for Illinois, 19% observed versus 23% predicted; and for Pennsylvania, 29% observed versus 31% predicted. Here again, the substantially overlapping CIs for these exceedance probabilities do not indicate any important differences between the observed and predicted exceedance probabilities for these datasets.

(Tables 5, 6). The Pennsylvania comparisons are presented for completeness (Table 7), but the relatively large CIs indicate that the sample sizes available within these subcategories precluded drawing strong conclusions.

Age. In the Kansas/Missouri and Illinois datasets, observed blood lead levels for children less than 1 year old (2.9 and 3.8 µg/dl, respectively) were lower than those observed for the other age groups; there were no children under 1 year old in the Pennsylvania set. Observed blood lead levels were generally highest for children 1 to 2 years of age in all three datasets (6.1, 7.3, and 7.3 µg/dl, respectively), then decreased with increasing age. Predicted blood lead levels followed this same pattern.
Because soil and dust lead levels were comparable across age groups (not shown), it is reasonable to assume that variability in predicted blood lead levels reflects the model's age-related parameters and algorithms rather than a coincidental gradient of environmental lead levels.
Observed and predicted GM blood lead levels agreed within 0.5 µg/dl for children 1 to 4 years of age in the Kansas/Missouri and Illinois datasets. For children older than 4 years, GM-predicted blood lead levels were consistently lower than observed across all three datasets. Although the difference between observed and predicted was greater for the group < 1 year old than for the other age groups, this is not a strong result, as the sample sizes are small for the youngest age group and the CIs overlap substantially. Relatively greater uncertainty in such factors as daily dirt ingestion rates (35), lead absorption rates (36), and amount of lead transferred during gestation (37) identifies this age group as deserving further study [see also discussion in SAB report (18)]. In the meantime, it is probably most practical not to generate predictions specifically for this age group but still to include it when generating predictions for all children up to 7 years old.

Time Away from Home. In the Kansas/Missouri and Illinois sets, GM-observed blood lead levels were similar whether children were reported to spend all of their time at home or whether they spent up to 10 hr/week away. For children not away from home, GM-predicted blood lead levels tended to correspond to the pattern seen for overall agreement, with the GM prediction lower than the GM observed for Kansas/Missouri children and very similar for Illinois children. In both sets, predictions were about 1.5 µg/dl higher for children away up to 10 hr/week than for those not away, reflecting higher soil lead and dust lead levels measured at these homes relative to those of the other children. CIs for blood lead measurements and predictions all overlapped substantially.

Time Spent Outside. Observed blood lead generally increased with increasing hours per day a child was reported to play outside (Tables 5 through 7), indicating some impact of increased exposure to soil lead.
In addition, there was some correspondence of higher age with increasing time outside (not shown), but as blood lead levels were observed to decrease with age (in larger groups than here), the overall increase in blood lead with time spent outside appears fairly robust.
Predicted blood lead levels were relatively independent of time spent outside. This result was expected because the IEUBK model assumes that, on average, 45% of the dirt that children typically ingest is soil (6). Also, soil lead and dust lead levels showed no particular association with time-spent-outside categories (not shown). For each category, the CIs for observed and predicted mean blood lead levels overlapped; there is no strong difference between the observed and predicted blood lead levels within each time-outside category.
Takes Food Outside. Observed blood lead levels were about 1 µg/dl higher on average for children who were reported to take food outside with them to play than those who did not, in all datasets. This suggests a higher level of soil ingestion for the children who took food outside relative to that for the other children. GM-predicted blood lead levels were similar for these two categories, as expected, and within 0.4 µg/dl of the GM-observed blood lead level for the children who did not take food outside in the Kansas/Missouri and Illinois datasets.
Locality. In the Kansas/Missouri and Illinois datasets, observed GM blood lead levels varied across localities. For the Kansas/Missouri set, children in Neosho and Duenweg had the lowest observed blood lead levels, and children in Oronogo had the highest, on average. The predicted GM blood lead levels followed the same pattern. In several instances the observed and predicted mean blood lead levels differed by more than 1 µg/dl, but the sample sizes available were generally small, and wide CIs overlapped considerably. In the Illinois dataset, observed blood lead levels decreased with distance from the smelter, as did the predicted blood lead levels, on average. In addition, the predicted mean blood lead levels were within 0.9 µg/dl of the mean observed blood lead levels for each sector.

Lead-Based Paint. As noted in Table 2, there were XRF measurements of indoor paint for all three datasets. Use of the presence of interior lead-based paint (XRF ≥ 1 mg/cm2) as an indicator of exposure to lead-based paint, however, is incomplete without some knowledge of the condition of the paint. Nevertheless, in case of an overt trend, we compared observed and predicted blood lead levels categorized by presence of lead-based paint.
The datasets were not consistent with respect to observed blood lead levels. Kansas/Missouri children in homes with interior XRF < 1 mg/cm2 had lower observed blood lead levels than those living in homes with interior XRF ≥ 1 mg/cm2, by about 2 µg/dl, whereas there was no apparent difference for the Illinois children according to presence of lead-based paint. In both datasets, predicted blood lead levels were lower for children in homes with interior XRF < 1 mg/cm2 than for those in homes with interior XRF ≥ 1 mg/cm2, by about 3 µg/dl. GM-measured dust lead concentrations were higher in both datasets for children in homes with interior lead-based paint than for the other children. Soil lead concentrations were also higher for children in homes with interior lead-based paint, indicating that exterior sources of lead have to be considered simultaneously, in addition to considering the condition of both interior and exterior lead-based paint.

Correspondence for Individual Children
Bearing in mind that the IEUBK model is not intended to be used to replicate the observed blood lead levels of specific children, the individual correspondence of observed and predicted blood lead levels can still be examined. Figures 2 through 4 show plots of observed versus predicted blood lead levels on a child-specific basis. These figures follow the same format as Figure 5, with the parallel lines representing 95% IEUBK model prediction limits. Figure 5 illustrates the intended correspondence between observed and predicted blood lead levels, assuming the correctness of IEUBK model parameters and the absence of significant error in measured environmental lead levels. Approximately 20% of the observed blood lead levels fall outside the prediction limits rather than the 5% expected and illustrated in Figure 5.
Figure 2. Correspondence between observed blood lead levels and IEUBK-predicted blood lead distributions for Kansas/Missouri children away from home ≤ 10 hr/week (x-axis: IEUBK-predicted GM blood lead, µg/dl). Two points were left out because the observed blood lead levels were below the range of the graph, < 1 µg/dl.
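As an illustration of how 95% prediction limits of this kind can be computed, the following minimal Python sketch treats each child's blood lead as lognormal around the IEUBK-predicted GM and counts the observations that fall outside the central 95% range. The GSD of 1.6 is assumed here as the model's default individual GSD. The function names and the five observed/predicted pairs are hypothetical and are not values from the three datasets.

```python
import math

def prediction_limits(gm, gsd=1.6, z=1.96):
    """Central 95% limits of a lognormal distribution with the given GM and GSD."""
    factor = gsd ** z  # multiplicative half-width on the original (ug/dl) scale
    return gm / factor, gm * factor

def fraction_outside(observed, predicted_gm, gsd=1.6):
    """Fraction of children whose observed level lies outside their own limits."""
    outside = 0
    for obs, gm in zip(observed, predicted_gm):
        lower, upper = prediction_limits(gm, gsd)
        if obs < lower or obs > upper:
            outside += 1
    return outside / len(observed)

# Hypothetical observed and IEUBK-predicted GM values (ug/dl), illustration only.
observed = [3.0, 12.0, 7.5, 25.0, 1.5]
predicted = [5.0, 6.0, 8.0, 7.0, 6.5]

lower, upper = prediction_limits(5.0)
print(f"limits for GM 5.0: {lower:.2f} to {upper:.2f} ug/dl")
print(f"fraction outside limits: {fraction_outside(observed, predicted):.2f}")
```

If the model parameters and exposure inputs were exactly right, about 5% of observations would be expected outside such limits. That is the benchmark against which the observed fraction is judged.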
Other explanatory variables available in the datasets, such as qualitative behavioral information, may account for some of the differences seen. For example, in the Illinois dataset, among the children whose measured blood lead levels were higher than the IEUBK prediction interval (Figure 3), 78% took food with them outside to play, compared with 24% of those whose measured blood lead levels were lower than the IEUBK prediction interval, and 45% for the rest of the children. One interpretation of Figures 2 through 4 is that the individual GSD is too low, even though the GSD was intended to include a plausible range of biologic and behavioral variability.
Note that in Figure 3, only three of the IEUBK predictions > 30 µg/dl corresponded to observed blood lead levels within the prediction limits. Although all of these predictions were included in all of the summary measures and comparisons discussed here, we recommend that the IEUBK model not be relied upon for exposure combinations leading to a predicted mean blood lead level greater than 30 µg/dl because the exact nature of the nonlinear relationship between lead exposure and blood lead is less certain in this range of blood lead levels (18). Since the level of concern is currently 10 µg/dl, this is more an academic issue than a practical limitation.

Discussion
This is the most extensive comparison of a biologically based blood lead model with real-world data of which we are aware. Within the scope of these comparisons, IEUBK-predicted blood lead levels agree with observed blood lead levels within 1 µg/dl, and IEUBK-predicted risk of blood lead greater than 10 µg/dl agrees with observed population exceedances within 4%. We conclude that this is reasonably close agreement.
The agreement of observed and predicted blood lead levels was closer for the subgroup of children with the highest blood lead levels (those 1-4 years of age), and for children who did not take food with them outside to play in the Kansas/Missouri and Illinois datasets. In general, however, it was difficult to draw strong conclusions about most subgroups because of the smaller sample sizes.
The only limit we have identified for IEUBK model predictions is to place less reliance on predictions > 30 µg/dl because of limited supporting data. There are several other reasons for not identifying specific ranges of environmental lead levels as being less or more suitable IEUBK inputs. First, the multisource nature of lead exposure requires consideration of joint distributions of lead from all sources; separate source-specific ranges of environmental lead levels are not useful. Also, variation in bioavailability of lead compounds from those prevalent in these studies would complicate extrapolating levels identified here to other settings. Note that although agreement between GM observations and predictions was somewhat looser across geographic subgroups, they were still consistent with the geographic pattern of soil and dust lead levels observed, further supporting the overall agreement seen across these datasets.
Environmental Health Perspectives * Vol 106, Supplement 6 * December 1998
Second, uncertainty in environmental lead measurements is also an important consideration in understanding limitations on model use. Consider again Figures 2 through 4, any one of which might suggest that model predictions tend to be higher than observed at the higher end of the blood lead distribution and lower than observed at the lower end. Given the agreement of the GM blood lead levels and of the probability of exceeding 10 µg/dl across the three datasets, it is less likely that the default GSD is too low. One plausible explanation is that the exposure estimates both under- and overestimated individual children's cumulative lead exposure due to the cross-sectional measurement of lead levels from limited areas of each child's sphere of activity.
For instance, the predictions in Figure 3 that were > 30 µg/dl corresponded to homes with dust lead measurements > 15,000 ppm. Recall that the composite dust samples in the Illinois dataset included dust from window sills and entryways, areas where children can have exposure but perhaps not on a regular, daily basis. If samples from each subarea were not collected proportionally according to the children's typical activities, the lead measurement of the composite will not represent the actual exposure. This variability in the estimated exposure level, often called measurement error in the statistical literature (38,39), contributes to a reduction in both slope and correlation estimates as a function of the magnitude of this extra variability. It is important to note that the term measurement error is not used here to suggest in any way that these studies were carried out inappropriately. Even when conscientious efforts have been made to identify play areas, there is still enough variability between children, in the frequency and the type of use of the areas, that data allowing a clear distinction between lead exposure (leading to a most typical blood lead level or GM) and individual variability in response to that exposure are difficult to collect.
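The reduction in slope and correlation attributed to measurement error can be illustrated with a small simulation. The sketch below is not part of the original analysis. It assumes classical, unbiased error added to a log-scale exposure, and the slope, standard deviations, and sample size are all hypothetical.

```python
import random

def slope_and_corr(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

rng = random.Random(0)        # fixed seed so the run is repeatable
n = 20000
true_slope = 0.5              # hypothetical log-log exposure-response slope
sd_true, sd_error, sd_resid = 1.0, 0.5, 0.4   # hypothetical log-scale SDs

x_true = [rng.gauss(0.0, sd_true) for _ in range(n)]           # true exposure
y = [true_slope * x + rng.gauss(0.0, sd_resid) for x in x_true]  # response
x_meas = [x + rng.gauss(0.0, sd_error) for x in x_true]        # measured exposure

slope_true, corr_true = slope_and_corr(x_true, y)
slope_meas, corr_meas = slope_and_corr(x_meas, y)

# Classical attenuation factor: share of measured variance that is true signal.
reliability = sd_true ** 2 / (sd_true ** 2 + sd_error ** 2)

print(f"slope with true exposure:     {slope_true:.3f}")
print(f"slope with measured exposure: {slope_meas:.3f}")
print(f"expected attenuated slope:    {true_slope * reliability:.3f}")
print(f"correlation, true vs measured exposure: {corr_true:.3f} vs {corr_meas:.3f}")
```

In this setup the error is unbiased, yet the fitted slope shrinks by roughly the reliability ratio and the correlation falls as well, which is the pattern the paragraph above describes.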
We have concentrated on grouped measures of agreement, assuming that measurement error in most environmental lead studies is generally unbiased and mainly increases the variability in measurements. In addition, there is reason to believe that exceedance probabilities based on error-prone environmental measurements may be biased upward (40). We undertook a sensitivity analysis of the possible impact of measurement error on exceedance probabilities. First, Figure 6 illustrates cumulative distributions of exceedance probabilities corresponding to the range of blood lead levels seen in the Illinois study, as estimated from the measured blood lead levels and from model predictions. Note that the IEUBK model-based exceedance probabilities are somewhat higher than observed for 10 µg/dl and higher blood lead levels.
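For readers who wish to reproduce this kind of comparison, the sketch below shows one way exceedance probabilities might be computed. It assumes that each child's predicted blood lead is lognormal with the IEUBK-predicted GM and a GSD of 1.6 (taken here as the model default), and that the population-level predicted risk is the average of the child-specific probabilities. The function names and example values are hypothetical.

```python
import math

def exceedance_probability(gm, level=10.0, gsd=1.6):
    """P(blood lead > level) for a lognormal with the given GM and GSD."""
    z = (math.log(level) - math.log(gm)) / math.log(gsd)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability

def predicted_population_risk(predicted_gms, level=10.0, gsd=1.6):
    """Average of the child-specific exceedance probabilities (an assumption)."""
    probs = [exceedance_probability(gm, level, gsd) for gm in predicted_gms]
    return sum(probs) / len(probs)

def observed_exceedance(observed, level=10.0):
    """Fraction of measured blood lead levels above the given level."""
    return sum(1 for value in observed if value > level) / len(observed)

# Hypothetical values (ug/dl), for illustration only.
predicted_gms = [4.0, 6.5, 9.0, 12.0]
observed = [3.5, 8.0, 11.0, 14.0]

for level in (5.0, 10.0, 15.0, 20.0):
    risk = predicted_population_risk(predicted_gms, level)
    seen = observed_exceedance(observed, level)
    print(f"> {level:4.1f} ug/dl: predicted {risk:.2f}, observed {seen:.2f}")
```

Evaluating both functions over a grid of blood lead levels yields a pair of exceedance curves in the same form as the observed-versus-predicted comparison described for Figure 6.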
Next, lacking data for the within-residence variability of environmental lead levels for this study, we borrowed an estimate from another study having several dust lead measurements for each residence studied (28). Variance in blood lead levels associated with the median within-residence variability of measured lead levels (GSD = 1.65) was subtracted from the overall variability in the predicted blood lead distribution. This removal of measurement error from the overall variability results in a model-based distribution of exceedance probabilities that agrees quite closely with the observed distribution (Figure 7). This demonstration is intended to serve as an illustration only, as the estimate of measurement error was based on one medium only in an unrelated city. On the other hand, it appears to be a realistic amount of variability, given that the Illinois dataset had relatively variable dust lead measurements.
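The variance subtraction can be sketched on the log scale as follows. The text does not state how within-residence variability in dust lead (GSD = 1.65) was translated into blood lead variance, so the sketch assumes a hypothetical log-log sensitivity (`elasticity`) of blood lead to the measured medium. The overall GSD of 2.0 used in the example is likewise hypothetical.

```python
import math

def adjusted_gsd(total_gsd, media_gsd=1.65, elasticity=0.5):
    """GSD of predicted blood lead after removing measurement-error variance.

    elasticity is a hypothetical log-log sensitivity of blood lead to the
    measured medium; it is not a value reported in the paper.
    """
    var_total = math.log(total_gsd) ** 2
    var_error = (elasticity * math.log(media_gsd)) ** 2
    if var_error >= var_total:
        raise ValueError("measurement-error variance exceeds total variance")
    return math.exp(math.sqrt(var_total - var_error))

# Hypothetical overall GSD of the predicted blood lead distribution.
print(f"adjusted GSD: {adjusted_gsd(2.0):.3f}")
```

Because variances, not standard deviations, are subtracted, a moderate measurement-error component narrows the predicted distribution only modestly. That narrowing lowers the upper-tail exceedance probabilities when the central estimate lies below the level of concern.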
Figure 6. Comparison of the probability of exceeding specific blood lead levels for observed and predicted blood lead levels in Illinois children. This figure illustrates cumulative distributions of probabilities of exceeding the blood lead levels on the x-axis (blood lead, µg/dl), corresponding to the range of blood lead levels seen in the Illinois study. The symbols show exceedance probabilities estimated from the measured blood lead levels, and the curve shows the exceedance probabilities estimated from the IEUBK model predictions summarized in Table 4.
Figure 7. Comparison of the probability of exceeding specific blood lead levels for observed and measurement-error-adjusted predicted blood lead levels in Illinois children. In the predicted curve the variance of modeled blood lead levels is reduced by an amount attributable to a within-residence error in measured environmental lead levels (GSD = 1.65). This demonstration is intended to serve as an illustration only, as the estimate of measurement error was based on one environmental medium only in an unrelated city (28).
A number of demographic variables have been associated with children's blood lead levels, e.g., parent's education, socioeconomic status, child's sex, and race (23,24). Interactions between these factors can be expected to vary across communities and population groups, and have confounded efforts to generalize the results of regression models developed for specific communities. Such variables are difficult to accommodate in an exposure-response model, however, because it seems unlikely that data will become available that will allow, for example, reducing dirt ingestion rates by some fixed amount for each year of graduate school the parents had. The IEUBK-predicted distribution acknowledges the influences of these demographic variables through the individual GSD, which was estimated in a context that limited variability in environmental lead exposure, minimizing the influence of measurement error, and captured all other sources of variability in blood lead levels within the range of available data (6). The IEUBK model is flexible in allowing the use of site-specific model parameters when adequate community-specific measurements are available. The results of these comparisons supported the use of the model defaults for these communities.
The procedures used here to evaluate IEUBK model predictions should not be confused with those for generating predictions to be used in risk assessments. An evaluation of model plausibility considers whether, given well-characterized contact with environmental lead, the predicted blood lead distribution agrees with the distribution of observed blood lead, allowing for the limitations associated with both the observations and the model. For risk assessment at a residential level, however, the exposure assessment must consider how the current environmental lead levels at a home could result in a blood lead level exceeding a given value in any child, not just the children currently living there.
Specifically, it is not necessary to survey how many children are in day care, for how long, or how long children play outdoors for each community, unless they are expected to differ markedly from children already studied. Model predictions for residences with children away from home most of the week may not agree with observed blood lead levels (although they should if the secondary exposures are similar to those measured). Even so, these predictions still provide an estimate of the relative hazards of lead exposure in these homes if circumstances should change, with children spending much more time there.
It should also be noted that the level of agreement shown in this exercise is somewhat dependent on the environmental sampling methods used in these studies. Alternate collection methods for sampling dust and soil, including XRF measurements of soil, high-flow-rate samplers for dust, or different procedures for sieving samples before analysis, may lead to different concentrations from the same areas (28). Environmental lead concentrations generated by other methods may be used in the IEUBK model, but predictions must be interpreted accordingly.
As mentioned earlier, we did not pursue statistical significance testing, even though a number of statistical approaches for comparing observations with predictions are in common use. Also, estimates of sensitivity, specificity, and positive predictive value were considered inappropriate for this exercise because neither the predicted nor the observed blood lead levels are the indicators that these procedures require. Recall that the point of an IEUBK model prediction is not a specific blood lead level but a distribution of plausible blood lead levels leading to a probability of elevated blood lead, not a yes/no indicator. In addition, at the time the studies considered in this evaluation were conducted, the Centers for Disease Control and Prevention expected that a proficient laboratory would measure "blood lead levels to within several micrograms per deciliter of the true value (for example, within 4 or 6 µg/dl of a target value)" (4). Use of individual cross-sectional blood lead measurements would lead to some misclassification of elevated blood lead.
Statistical significance testing depends upon a well-defined hypothesis to be tested, including levels of practical significance between the quantities being compared. Risk assessors and managers can help determine whether agreement of mean blood leads within 1 µg/dl or risk of elevated blood lead within 5%, for example, will be adequate, depending on the purpose of a particular risk assessment. Statistical significance testing also depends on understanding what sample sizes will allow these identified differences to be detected, if truly present. In an opportunistic mode of using available studies, as in this exercise, some studies will be large enough that a trivial difference (e.g., 0.1 µg/dl between observed and predicted) can be determined to be statistically significant, whereas other studies are small enough that an important difference cannot be substantiated with statistical testing.
Appropriate statistical procedures will be more constructive when a study can be designed for the explicit purpose of evaluating a model, including a thorough exposure assessment.
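The dependence of statistical significance on sample size can be made concrete with a simple calculation. The sketch below is an illustration under stated assumptions, not an analysis performed in this evaluation. It applies a one-sample z-test on the log scale, treats the predicted GM as fixed and known, and assumes a GSD of 1.6 for the observed blood lead levels. The GM values and sample sizes are hypothetical.

```python
import math

def z_for_gm_difference(gm_observed, gm_predicted, n, gsd=1.6):
    """z statistic for the log-scale difference between observed and predicted GM."""
    standard_error = math.log(gsd) / math.sqrt(n)
    return (math.log(gm_observed) - math.log(gm_predicted)) / standard_error

def two_sided_p(z):
    """Two-sided p-value from a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Trivial difference (0.1 ug/dl) in a very large hypothetical study.
p_large = two_sided_p(z_for_gm_difference(5.1, 5.0, n=10000))
# Larger difference (1 ug/dl) in a small hypothetical study.
p_small = two_sided_p(z_for_gm_difference(6.0, 5.0, n=20))

print(f"0.1 ug/dl difference, n = 10000: p = {p_large:.5f}")
print(f"1.0 ug/dl difference, n = 20:    p = {p_small:.3f}")
```

Under these assumptions the 0.1 µg/dl difference is statistically significant at the 0.05 level while the 1 µg/dl difference is not. This is why a test result alone says little about practical agreement unless the study was sized for the difference of interest.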
Note that the overall percentage of children exceeding 10 µg/dl in these datasets ranged from 19 to 29%. The concordance of model predictions with these observations confirms the usefulness of model predictions for a range of environmental lead conditions somewhat higher than those associated with the target of limiting risk of elevated blood lead to no more than 5%. The consistency across these datasets suggests that the useful range of environmental lead levels will be extended beyond those considered here. Future empirical comparisons will include datasets with lower overall exceedances of the blood lead level of concern.