The integrated exposure uptake biokinetic model for lead in children: independent validation and verification.

The U.S. Environmental Protection Agency employs a model, the integrated exposure uptake biokinetic (IEUBK) model for lead in children, for the assessment of risks to children posed by environmental lead at hazardous waste sites. This paper describes the results of an effort to verify the consistency of the documentation with the computer model and to test the computer code using a group that is independent of those involved in the model development. This review concluded that the IEUBK model correctly calculates the equations specified in the IEUBK model theory documentation. However, several issues were identified in the model documentation, model performance, and the C++ programming language code (i.e., IEUBK model source code) documentation. These issues affect the ability of an independent reviewer to understand the workings of the IEUBK model but not the model's reliability. As a result of these findings, recommendations have been provided for updating the model documentation and for associated adjustments to the source code documentation.

The integrated exposure uptake biokinetic (IEUBK) model for lead in children (1) is a mathematical model that has been developed for integrating lead exposure across multiple exposure pathways for children. This model provides a multipathway analysis of the impact of environmental lead levels that relies upon site-specific information. Before adoption of the IEUBK model as a tool to assess lead risks at sites, the U.S. Environmental Protection Agency (U.S. EPA) employed a range of 500 to 1000 ppm to support site cleanup decisions. In many instances, interpretation of this range resulted in the use of 500 ppm for residential cleanups and 1000 ppm for industrial cleanups. Today, the U.S. EPA's use of the IEUBK model encourages use of site-specific information for residential settings and often results in cleanup levels that are higher than the 500 ppm that would have been used to protect the health of children.
This paper is based on a presentation at the Workshop on Model Validation Concepts and Their Application to Lead Models held 21-23 October 1996 in Chapel Hill, North Carolina. Manuscript received at EHP 16 January 1998; accepted 12 May 1998.
The views expressed are those of the authors and do not necessarily reflect those of the U.S. Environmental Protection Agency.
IV&V results presented in this paper are drawn from information developed through U.S. Environmental Protection Agency contract 68W10055 ("Mission Oriented Systems Engineering Support"). The authors are especially grateful to the task leader D. Miller, who managed or performed most of the testing described.
Address correspondence to L. Zaragoza, Office of Solid Waste and Emergency Response (5204G), U.S. Environmental Protection Agency, 401

The IEUBK model relates environmental concentrations of lead with potential blood lead levels in children exposed to contaminated media. The IEUBK model is structured so that the environmental concentration-blood lead relationship in children is established through four distinct components: exposure, uptake, biokinetics, and blood lead distributions. These four model components are designed to run as distinct but interrelated modules, and can be described as follows:
* The exposure component computes an exposure/intake dose, expressed as micrograms of lead per day, based on media-specific lead concentrations and consumption rates (in cubic meters of air inhaled per day, grams of soil ingested per day, or liters of water ingested per day).
* The uptake component estimates the biologic uptake and transfer of lead from the gastrointestinal tract or lungs to the blood (in micrograms per day) for children ages 0 to 7 years (0-84 months).
* The biokinetic component estimates the transfer of absorbed lead between blood and other vital tissues and its elimination through excretory pathways. The outcomes are calculated in discrete time fractions for the period of 0 to 84 months.
* The probability distribution component produces a graphic illustration of the probability of blood lead levels exceeding the level of concern (default value of 10 µg/dl) for a particular age group (or time period) for up to 84 months. The user can then explore an array of possible changes in exposure media that would reduce the probability that blood lead concentrations would be above this level of concern.
The IEUBK model is a simulation model that should be viewed as a tool for making rapid calculations and recalculations of a complex set of equations that includes a large number of exposure, uptake, and biokinetic parameters. In addition to assessing childhood lead exposure and retention, the IEUBK model results can be a useful component of remediation strategies for lead in the environment.
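The arithmetic of the exposure component can be illustrated with a minimal sketch: each medium contributes an intake dose equal to its lead concentration multiplied by its consumption rate. The function name, the media concentrations, and the consumption rates below are hypothetical values chosen for illustration; they are not IEUBK model defaults, and the sketch omits the age dependence of the actual model.

```python
# Minimal sketch of the exposure/intake-dose arithmetic.
# All concentrations and consumption rates are hypothetical, not IEUBK defaults.

def daily_intake_ug(concentration, consumption_rate):
    """Intake dose in ug/day = media lead concentration x consumption rate."""
    return concentration * consumption_rate

# (lead concentration, consumption rate) per medium
media = {
    "air": (0.1, 5.0),      # ug/m^3 of air, m^3 inhaled per day
    "soil": (200.0, 0.1),   # ug/g of soil, g ingested per day
    "water": (4.0, 0.5),    # ug/L of water, L ingested per day
}

intakes = {name: daily_intake_ug(c, r) for name, (c, r) in media.items()}
total_intake = sum(intakes.values())  # ug of lead per day, before uptake

for name, value in intakes.items():
    print(f"{name}: {value:.2f} ug/day")
print(f"total: {total_intake:.2f} ug/day")
```

The intake dose computed this way is only the input to the uptake component; it is not the amount of lead that reaches the blood.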
The IEUBK model is a product of many years of development within the U.S. EPA. The initial efforts to model lead emerged from the Office of Air Quality Planning and Standards with the development of the National Ambient Air Quality Standard for Lead and subsequently from the Office of Water in the National Primary Drinking Water Regulation for Lead. Both of these offices employed mathematical modeling to estimate the impact of lead on child blood lead levels.
Environmental Health Perspectives * Vol 106, Supplement 6 * December 1998
Lead is one of the most prevalent toxic chemicals found at Superfund sites. U.S. EPA data show that lead is among the contaminants most frequently used in the scoring of sites with the Hazard Ranking System, which is the primary tool the U.S. EPA employs to add hazardous waste sites to the National Priorities List (NPL) (2). NPL sites are those hazardous waste sites that are determined to warrant national concern. At present there are about 1200 proposed and final NPL sites across the United States.
One of the uses of this IEUBK model is to support the implementation of the 1994 interim directive of the Office of Solid Waste and Emergency Response (3) for the assessment of soil lead risks. The interim directive explains how the IEUBK model results can be used as a tool to assist in determining site-specific cleanup levels. In this context, the IEUBK model can be viewed as a predictive tool for estimating changes in blood lead concentrations as exposures are changed. Also, the IEUBK model could be viewed as a useful tool to aid the agency in making more informed choices about the concentrations of lead that might be expected to impact human health. However, it is important to recognize that the outputs of the IEUBK model alone do not determine cleanup levels; other factors as outlined in the National Contingency Plan are also considered in determining cleanup levels.
The need for the review described in this paper is detailed in the validation strategy for the IEUBK model (4). This validation strategy specifically calls for a four-step validation process. These steps are described below.
a) The scientific foundations of the model structure. Does the model adequately represent the biological and physical mechanisms of the modeled system? Are the mechanisms understood sufficiently to support modeling?
b) Adequacy of parameter estimates. How extensive and robust are the data used to estimate model parameters? Does the parameter estimation process require additional assumptions and approximations?
c) Verification. Are
The work described below has been performed to address step c of the validation process.

Methodology
The IEUBK Independent Validation and Verification (IV&V) project was designed to verify that the computational algorithms can accurately solve the governing equations and parameters and that the code is fully operational (i.e., the code is robust and serviceable).
The IV&V team investigated other validation methodologies to determine whether the approach taken here is consistent with current practices. Although we were unable to locate validation studies for other models that specifically use differential equations, the general validation and verification process applied to the IEUBK model is consistent with software testing standards used. In particular, the standard software test process outlined advocates considering automated scripts or drivers, and interface simulators as tools for software testing. The Mathcad 5.0 software (5), which was used to reproduce the IEUBK model functions, is such a tool. In addition, standardized testing routines and reporting formats, as well as configuration control and problem logs, were used to track issues identified in the validation.
To conduct the IV&V of the IEUBK model, the following activities were planned and implemented:
* Review of IEUBK model documentation and other related materials for completeness and accuracy.
* Comparison of the IEUBK model source code equations and parameters with the IEUBK model's theory.
* Verification of the IEUBK model source code, including coding of equations, for completeness and correctness.
* Examination of the efficiency of the IEUBK model source code.
* Review of the IEUBK model for compliance with agency standards for the development of scientific information systems.
* Pathway tests that evaluated the performance of the system on various hardware configurations.
The following discussion focuses on the results for the first three items described above. The IV&V work undertaken was based on review of a copy of the source code, publicly available documentation, and a copy of the IEUBK model source code, which is commercially available. The U.S. EPA used a group to perform the work described below that was independent of the parties involved in the original development and programming of the IEUBK model.

Findings
The findings of these activities, summarized below, are drawn from the IV&V report (6). The details of the validation strategy and technical approach can be found in the IV&V report. The IV&V report concludes that the IEUBK model correctly calculates the equations specified in the IEUBK model theory.

Verification of the Execution of the Linear Equations in the Code
The results of verifying the execution of the linear equations for calculation of environmental pathway exposure and uptake of lead, and for calculation of compartmental lead transfer times, fluid volumes, and organ weights, are presented in Table 3.

Verification of the Execution of the Differential and Associated Equations in Isolation
The IV&V report documents the execution of the differential equations in the IEUBK model and their associated equations in the calculation of compartmental lead masses, tissue lead masses, and blood lead concentration at birth, and the overall blood lead concentrations in the Technical Support Document (TSD). A summary of the results of this activity is presented in Table 4.

Testing of the Overall Set of Equations
The overall set of equations in the IEUBK model source code was verified and the results are summarized in Table 5.

Verification of the Input/Output Routines
The IV&V report verified the input/output routines of the IEUBK model and documented all discrepancies in the input/output routines that control user-entered parameter values. A summary of the results is presented in Table 6.

Verification activity
* Review to determine consistency of parameter coding in the source code modules.
* Verify that the assignment of default values to parameters is made in clearly identified blocks of the IEUBK model source code and prior to the use of those parameters.
* Verify that units of measurement are identified in comment statements in the IEUBK model source code.
Summary of results
* Several inconsistencies between documentation and the IEUBK model parameter values, parameter names, and a default value were identified. To improve the consistency of parameter coding, the IV&V report recommends that all external parameter declarations be made prior to the internal parameter declarations, and that all default value assignments be grouped so that IEUBK model users can easily identify them.
* Very few of the parameters used in the IEUBK model are defined by comment statements in the source code. The IV&V report recommends grouping parameter declarations into general categories and adding a comment block immediately prior to the parameter declarations that defines the group.
* The units of measurement for default parameter values are not identified in the IEUBK model source code. Units of measure should be identified as part of the comment statement that defines the parameter.

Investigation of Geometric Mean Calculations
The IEUBK model source code for the calculation of the geometric mean was reviewed. Two functions were identified in the IEUBK model source code for the calculation of the geometric mean. These functions use a time-weighted average of monthly blood lead levels to provide a central tendency blood lead level corresponding to each year of age. This time-weighted average is algebraically equivalent to an arithmetic mean of the monthly predicted blood lead levels. The IV&V report noted an apparent conflict between an average calculated as an arithmetic mean, yet subsequently used as a geometric mean. The U.S. EPA considers the algorithm correct for the reasons discussed below. Any deficiency in the approach is due more to a lack of adequate documentation, which will be clarified.
The IEUBK model generates a central tendency estimate of blood lead for a typical individual smoothed over each year of age up to 7 years. The time-weighted average blood lead level is the last step in the deterministic, biokinetic component of the model before the model estimates individual risk of elevated blood lead in its statistical component. The time-weighted average deals with variability over time in lead biokinetics. Given the current level of understanding of these processes, equal weighting of monthly central tendency estimates is the most defensible approach to estimating an overall central tendency blood lead level. There is no evidence that variability over time should be characterized by lognormal variability.
The statistical component of the IEUBK model then generates a lognormal distribution of individual blood lead levels around the best central tendency estimate of blood lead for a given lead exposure scenario. There is strong support for the observation that distributions of individual blood lead levels are adequately characterized by lognormal distributions. It is especially important to note that the variability over time is distinct from the individual variability in blood lead addressed by the statistical component of the model. In addition, the lognormal variability used by the IEUBK model was estimated as a cross-sectional measure of variability.
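The two steps described above can be sketched as follows: an equal-weight average of monthly central tendency estimates, which is then used as the geometric mean of a lognormal distribution of individual blood lead levels. This is a minimal illustration rather than the IEUBK model source code; the function names, the monthly values, and the geometric standard deviation of 1.6 are assumptions made for the example.

```python
import math

def yearly_central_estimate(monthly_pbb):
    """Equal-weight (time-weighted) average of monthly blood lead values, ug/dl.

    With equal monthly weights this is simply the arithmetic mean.
    """
    return sum(monthly_pbb) / len(monthly_pbb)

def prob_exceeding(gm, gsd, level=10.0):
    """P(blood lead > level) for a lognormal distribution.

    gm  : geometric mean (here, the yearly central tendency estimate)
    gsd : geometric standard deviation (assumed value, for illustration)
    """
    z = (math.log(level) - math.log(gm)) / math.log(gsd)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical monthly central tendency estimates for one year of age (ug/dl).
monthly = [4.2, 4.4, 4.6, 4.8, 5.0, 5.1, 5.2, 5.3, 5.3, 5.4, 5.4, 5.3]
central = yearly_central_estimate(monthly)
risk = prob_exceeding(central, gsd=1.6)
print(f"central estimate: {central:.2f} ug/dl, P(>10 ug/dl): {risk:.3f}")
```

The averaging step handles variability over time for a typical individual, while the lognormal step handles variability between individuals; keeping the two functions separate mirrors that distinction.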

Comments on the Graphics Presented by the IEUBK Model
After the calculation of the means and the use of internal parameters were reviewed, each of the IEUBK model subroutines that generate the graphs of the probability distributions was reviewed. The IV&V report concludes that the graphical functions of the IEUBK model work properly.

Table 3. Results for the verification of simple linear equations in the IEUBK model source code, summarized from the IV&V report.

Verification activity: Verify the correct representation in the IEUBK model source code of the equations in the TSD equation dictionary.
Summary of results: The majority of the equations as listed in the TSD equation dictionary are correctly represented in the IEUBK model source code. A few equations were different. However, these differences were either explained in the TSD or were not expected to change the result of the IEUBK model calculation (e.g., typographical errors in the TSD).

Verification activity: Verify the correct call of default values of independent parameters and constants into the equation.
Summary of results: The majority of the equations tested called all of their default parameters correctly. The IV&V report recommends that the IEUBK model source code be searched for all occurrences of multiple parameter declarations to ensure that they are consistent. This discrepancy is not expected to significantly affect the results calculated by the IEUBK model. Differences in the parameter declarations cause the values used by the IEUBK model to differ only in the fourth or fifth decimal place or beyond, whereas output is reported to the user to the first or second decimal place.

Verification activity: Verify the correct execution of the IEUBK model algorithms and correct reporting of the dependent parameters using Mathcad, version 5.0 (5).
Summary of results: The linear equations execute correctly in the IEUBK model source code.

Table 4. Results of the verification of the execution of differential equations, summarized from the IV&V report.
Verification activity: Verify correct representation in the IEUBK model source code of the difference equations in the TSD equation dictionary.
Summary of results: The majority of equations in the TSD and the IEUBK model source code match. Differences do not affect the IEUBK model results.

Verification activity: Verify the correct specification of the 4-hr time interval.
Summary of results: The IEUBK model executes each equation 180 times for each monthly period and therefore is correct.

Verification activity: Verify the correct call of default values of independent parameters and constants into the equation.
Summary of results: All default values are called correctly. Values computed for each difference and associated equation are correct.

Verification activity: Examine use of valid and numerically stable differential equation solution methods in implementing the difference equations.
Summary of results: The backward Euler solution method used by the IEUBK model is a well-tested, widely used method of solving differential equations. No change to this solution method is recommended.

Verification activity: Compare the results of the multiple test scenarios for the Mathcad and the IEUBK models.
Summary of results: The IV&V report noted that the IEUBK model source code was confusing for three variables used to calculate soil and dust exposures but emphasized that the IEUBK model does execute the exposure equations for soil and dust correctly. The logic of the code is simply hard to follow because it is not consistent.
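The 4-hr time interval and the backward Euler method noted in Table 4 can be illustrated with a minimal sketch. A 4-hr step gives 6 steps per day, or 180 steps in a 30-day month, which matches the count reported in the table. The two-compartment system below (blood and one tissue pool) and its rate constants are hypothetical and far simpler than the IEUBK biokinetic component; the sketch only shows how an implicit step is formed and solved.

```python
HOURS_PER_STEP = 4
DAYS_PER_MONTH = 30  # assumption: a 30-day month
STEPS_PER_MONTH = DAYS_PER_MONTH * 24 // HOURS_PER_STEP  # 180 steps
STEP_DAYS = HOURS_PER_STEP / 24.0

def backward_euler_step(blood, tissue, uptake, k_bt, k_tb, k_out, h=STEP_DAYS):
    """One backward (implicit) Euler step for the hypothetical system

        d(blood)/dt  = uptake - (k_bt + k_out) * blood + k_tb * tissue
        d(tissue)/dt = k_bt * blood - k_tb * tissue

    The implicit update solves (I - h*A) x_new = x_old + h*u, here a
    2x2 linear system solved with Cramer's rule.
    """
    a11 = 1.0 + h * (k_bt + k_out)
    a12 = -h * k_tb
    a21 = -h * k_bt
    a22 = 1.0 + h * k_tb
    b1 = blood + h * uptake
    b2 = tissue
    det = a11 * a22 - a12 * a21
    new_blood = (b1 * a22 - a12 * b2) / det
    new_tissue = (a11 * b2 - a21 * b1) / det
    return new_blood, new_tissue

def simulate(months, uptake, k_bt, k_tb, k_out, blood=0.0, tissue=0.0):
    """Advance the lead masses (ug) through whole months of 4-hr steps."""
    for _ in range(months * STEPS_PER_MONTH):
        blood, tissue = backward_euler_step(blood, tissue, uptake, k_bt, k_tb, k_out)
    return blood, tissue

# Hypothetical uptake (ug/day) and rate constants (1/day), for illustration only.
print(STEPS_PER_MONTH, simulate(12, uptake=10.0, k_bt=0.1, k_tb=0.02, k_out=0.05))
```

Because each step solves for the new state implicitly, the update remains stable for any positive step size in this linear system, which is one reason the method is commonly chosen for compartmental models with widely differing rate constants.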

Model Testing
IEUBK model testing also includes a sensitivity analysis, which covers work that is not addressed in these proceedings. However, several questions have been raised by users and addressed. For example, one user questioned results showing that uptake from water and from diet decreased as soil lead concentrations were increased.
According to the model theory, and as executed in the model source code, media uptake (soil, water, diet, or other) through the gastrointestinal tract is a function of the environmental exposure and of two uptake factors, passive and saturable uptake. The passive, or nonsaturable, uptake is a linear function of exposure and has no limit as environmental concentrations increase. The saturable uptake component is a function of uptake from all media and has an upper limit. Therefore, as uptake from one medium (in this case, soil) increases because of increased environmental exposure, uptake from other media (in this case, water and diet) will in fact decrease. Although this is not intuitive at first, it is consistent with model theory. This relationship is shown graphically in Figures 1 and 2. The test variable used in Figure 1 is derived from the uptake equations in the model equation set U-1 as defined in the TSD. PAF represents the passive absorption factor, AVINTAKE(t) is the total available intake, and SATINTAKE(t) is the half-saturation absorbable lead intake, or the potential uptake of lead when the saturable uptake pathway is 50% saturated. As the diagram shows, the test variable decreases to a constant value as the total available intake increases with increased environmental exposure from any medium. Uptake from each medium is proportional to the lead exposure in that medium and to the test variable, so that as the test variable decreases, uptake from a particular medium will decrease if the exposure to that medium is constant. That is, the total available intake will increase as soil lead levels increase, but exposures to diet and water will remain constant. Therefore, the total uptake from diet and water will decrease. This relationship is demonstrated for water in Figure 2.
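The behavior described above can be reproduced with a minimal sketch. It assumes that the test variable has the form PAF + (1 - PAF) / (1 + AVINTAKE / SATINTAKE), which is consistent with the description in the text (a passive term with no limit, a saturable term that is half saturated when AVINTAKE equals SATINTAKE) but is not quoted from equation set U-1. The parameter values and intakes are hypothetical.

```python
def uptake_test_variable(av_intake, paf, sat_intake):
    """Assumed form of the test variable: passive fraction plus a saturable
    fraction that shrinks as total available intake grows."""
    return paf + (1.0 - paf) / (1.0 + av_intake / sat_intake)

def uptakes(soil_intake, water_intake, paf=0.2, sat_intake=100.0):
    """Return (water uptake, total uptake) in ug/day for given available intakes.

    paf and sat_intake are illustrative values, not confirmed model defaults.
    """
    av_intake = soil_intake + water_intake
    tv = uptake_test_variable(av_intake, paf, sat_intake)
    return water_intake * tv, av_intake * tv

# Water intake is held constant while the soil contribution is increased.
for soil in (0.0, 50.0, 200.0, 1000.0, 5000.0):
    water_uptake, total_uptake = uptakes(soil, water_intake=2.0)
    print(f"soil intake {soil:7.1f}: water uptake {water_uptake:.3f}, "
          f"total uptake {total_uptake:.1f}")
```

Under this assumed form, uptake from water falls toward PAF times the water intake as soil intake rises, while total uptake keeps increasing because the passive term has no upper limit.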
As shown in Figures 1 and 2, the soil lead levels were increased well beyond the recommended limit for application of the IEUBK model of < 5000 ppm lead in soil. The model results indicated that there would be no upper limit to either the total uptake or the blood lead level as soil lead levels increased. This would also be true as exposure to any other medium was increased. The predicted blood lead levels resulting from each soil lead level are provided in Figure 3. No other parameters were changed during this analysis. Given the U.S. EPA's goal of protecting a typical child or similarly exposed group of children to the 95% confidence limit from a blood lead level of 10 µg/dl, the higher soil and blood lead levels were examined only to evaluate IEUBK model performance. An important limitation to the application of the IEUBK model is that it has been designed and should be used to evaluate exposure scenarios that are at least similar to the exposure scenarios used for calibration and empirical validation. In general, the soil lead levels used in these studies are below 5000 ppm lead in soil. Similarly, environmental exposures associated with blood lead levels above 30 µg/dl are above the range of values that have been used in the calibration and empirical validation of this model. Thus, the utility of these figures is to demonstrate the integrity of the mathematical relationships tested.

Discussion
The findings of this work show that the IEUBK model performs calculations in a manner that is consistent with the model documentation. However, this work also shows that the IEUBK model documentation should be improved. The IV&V report recommends that the U.S. EPA revise the documentation. The documentation revisions should also address comments in the IEUBK model source code, which do not affect the integrity of computations, but will improve the transparency of computations (9).

Table 6. Results of the verification of the input/output routines, summarized from the IV&V report.

Verification activity
* Enter modified parameter values into the IEUBK model and use the C debugger to view how the IEUBK model calls and inputs the modified value.
* Analyze the IEUBK model to determine how results that are generated and stored in the calculation routines are communicated to the user or to output data files.
* Observe the output of results to the screen or to data files, and verify that the results correspond to the values actually generated by the IEUBK model source code.
Summary of results
* Regardless of the number of digits (up to six) that a user inputs, the data entry screen on the IEUBK model formats the parameter to between one and three decimal places, but only after the cursor is moved off the data entry field, then back to the same field. The user then may not know which figure (the rounded or the actual figure key entered) is used in IEUBK model calculations. The IV&V report recommends that the IEUBK model source code be revised so that actual values entered are used.
* No quality assurance/quality control checks are coded into the IEUBK model for parameter values that are loaded from existing ASCII data files. The IV&V report recommends that the IEUBK model developers revise the file input routines to include quality assurance/quality control procedures for data loaded from saved ASCII data files.
* The IEUBK model reuses variable names to improve the readability of the IEUBK model source code. The repeated use of variable names can cause confusion for users attempting to trace function paths in the IEUBK model source code. The IV&V report recommends that identical model variable and parameter names be changed in the IEUBK model source code to clarify the IEUBK model functions. This issue does not affect the results calculated by the IEUBK model.
* Data for gastrointestinal values/bioavailability may be entered from any media data entry screen. It is not made clear that the user is changing the same parameter values regardless of which media the bioavailability data are entered from. This problem does not affect the results of the IEUBK model, but can allow users to construct model scenarios that are different from what the user intended. The IV&V report recommends that a single data entry option for these parameters be created in the data entry menu.
* The parameters PaintConc and PaintFraction are not reported to the user. The statement that is intended to store values for both PaintConc and PaintFraction instead contains the parameter names OccupConc and OccupFraction. This does not affect the results calculated by the IEUBK model, only the input parameters that are reported to the user.
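The kind of quality assurance/quality control check recommended for values loaded from saved ASCII data files can be sketched as follows. This is an illustration only: the "name = value" file layout, the parameter names, and the plausibility ranges are all hypothetical and do not describe the actual IEUBK file format or its permitted ranges.

```python
# Hypothetical parameter names and plausibility ranges; not taken from the IEUBK model.
ALLOWED_RANGES = {
    "soil_conc_ug_per_g": (0.0, 5000.0),
    "water_conc_ug_per_l": (0.0, 1000.0),
}

def parse_parameters(text):
    """Parse 'name = value' lines and return (accepted values, list of problems)."""
    values, problems = {}, []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, sep, raw = line.partition("=")
        name, raw = name.strip(), raw.strip()
        if not sep or name not in ALLOWED_RANGES:
            problems.append(f"line {line_no}: unrecognized entry {line!r}")
            continue
        try:
            value = float(raw)
        except ValueError:
            problems.append(f"line {line_no}: {name} is not a number: {raw!r}")
            continue
        low, high = ALLOWED_RANGES[name]
        if not (low <= value <= high):  # also rejects NaN
            problems.append(f"line {line_no}: {name}={raw} outside [{low}, {high}]")
            continue
        values[name] = value
    return values, problems

sample = "soil_conc_ug_per_g = 200\nwater_conc_ug_per_l = -4\npaint = abc\n"
accepted, issues = parse_parameters(sample)
print(accepted)
for issue in issues:
    print(issue)
```

Reporting every rejected line, rather than stopping at the first, lets a user correct a saved file in one pass and makes it clear which values will actually be used in calculations.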
The U.S. EPA is undertaking an effort that will address the documentation issues identified in the IV&V work. However, the U.S. EPA also recognizes that there are performance advantages to reprogramming the computer code for a Microsoft Windows environment (10). For this reason, the U.S. EPA has initiated work to both address the documentation issues and reprogram the computer code for a Windows environment. The questions from this workshop and other communications with outside scientists and technical experts have also identified a need for more information to better understand the U.S. EPA approach to assessing lead risks for Superfund sites. As such, the technical review work group is developing a lead website (http://www.epa.gov/superfund/oerr/inLpro/lead/index.htm).