Analysis of pediatric blood lead levels in New York City for 1970-1976.

A study was completed of more than 170,000 records of pediatric venous blood lead levels and supporting demographic information collected in New York City during 1970-1976. The geometric mean (GM) blood lead level shows a consistent cyclical variation superimposed on an overall decreasing trend with time for all ages and ethnic groups studied. The GM blood lead levels for blacks are significantly greater than those for either Hispanics or whites. Regression analysis indicates a significant statistical association between GM blood lead level and ambient air lead level, after appropriate adjustments are made for age and ethnic group. These highly significant statistical relationships provide extremely strong incentives and directions for research into causal factors related to blood lead levels in children.


Introduction
The occurrence of elevated blood lead levels in children is a problem of major concern to public health officials. In order to locate and treat children whose lead levels are sufficiently high to represent a threat to their health and well-being, local communities have conducted screening programs, often with the support of Federal grants. Data collected by these screening programs (lead levels as well as demographic and environmental characteristics) represent an important source of information for both evaluation and analysis. Such data can be used to measure the progress of programs in controlling lead levels in children and to illuminate relationships between the body burden of lead and demographic/environmental factors.
Published summary statistics from lead screening programs (1,2) are of little value in addressing these issues. Specifically, these summaries provide no supportive demographic data and scant environmental data. In addition, they report only the number of cases which are of immediate medical concern: i.e., instances of lead poisoning and elevated lead levels. What is of more interest is the distribution of lead levels in the entire population screened. Furthermore, the definition of a "case" has changed over time as perception of or concern for the problem has changed (3), making historical comparisons difficult. Detailed analysis is thus possible only by investigating an original data base which is well maintained and available in a convenient form.
The New York City Bureau of Lead Poisoning Control (4) was created in early 1970 and since that time has conducted a large-scale screening program. In addition, the Bureau has maintained records in computer readable form for all children screened by venipuncture from the program's inception to the present. This data base includes a number of demographic characteristics (e.g., date of birth and ethnic group), as well as the date of sampling and an indicator of geographic location. Certain analyses of this screening data have been previously performed (4-6), but only for the period extending through June 1972. These analyses included examination of the influence of ethnic origin and time of sampling, but only for those instances identified as cases.
The present study also examines the New York City screening data, but with emphasis on the variation in lead levels of the entire screened population. Moreover, the expanded analysis presented here is based on over 300,000 records collected between March 1970 and December 1976. This number of records is sufficiently large to support valid statistical analysis of the effects of several important variables, and the period of time covered by the data base is long enough to permit observation of meaningful trends.

Methodology
Blood samples were collected for lead analysis by both public and private health providers. The reason for collection may have been a decision on the part of an individual health provider or the result of an organized large-scale outreach operation. In addition to mobile vans, over 200 fixed drawing points throughout the City served as collection centers for blood samples during the screening program.
Prior to April, 1973, all screening was done on 3-5 ml samples obtained by venipuncture. Each sample was then analyzed for blood lead concentration. After this date, most of the initial screening was performed using micro amounts of capillary blood, which were analyzed for either blood lead or free erythrocyte protoporphyrin (FEP). If this initial test proved "positive" (3), a venous blood sample was obtained for confirmation. However, a large number of venous samples were still drawn for initial screening purposes after April 1973. Venous samples were also obtained for monitoring the progress of treated children previously identified as having elevated blood lead levels.
The data base under consideration consists of 344,512 records for children tested during 1970-1976 on the basis of venous blood samples. This total includes 178,533 records identified on the collection form as first screening tests, 25,293 records representing confirmatory tests of an initial suspected high lead level, and 140,546 records not identified on the collection form as belonging to either of these groups. Of the total data base, 140 records were rejected because of erroneous or inconsistent data elements. Table 1 summarizes by year the number of records in each of these categories. The present analysis reports exclusively on the 178,533 records positively identified as first (venous) screening tests, although a parallel analysis of the 140,546 records not identified as to reason for test shows almost identical results.
All venous blood samples collected were analyzed by the New York City Health Department Laboratory. Lead was analyzed by atomic absorption (7), and the procedure remained the same throughout the period 1970-1976. Since the Laboratory participated in the Center for Disease Control (CDC) blood lead proficiency testing program, some indication of the accuracy of its analyses is available from that program (8). The differences between the Laboratory's results and the program's reference values were found to be approximately normally distributed with a mean of -3.0 µg/100 ml and a standard deviation of 6.2 µg/100 ml.
The lead analysis results, together with supporting data such as ethnic group, sex, date of birth, FEP, and location and date of sampling, were entered into a computerized data processing system. Blood lead levels (in µg/100 ml) were entered as coded values in 10 µg/100 ml intervals. The intervals 5-14, 15-24, 25-34, ... were employed prior to January 1, 1975, and the intervals 10-19, 20-29, 30-39, ... subsequent to this date. For the purposes of our analysis, each such interval has been represented by an integral midpoint (e.g., 10, 20, 30, ... for records with sampling date prior to January 1, 1975). A detailed investigation of this procedure has shown that our results are fairly insensitive to the particular choice of midpoint.
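The interval coding described above can be illustrated with a short Python sketch. The function name `interval_midpoint` is hypothetical, and the midpoints 15, 25, 35, ... used for the later coding scheme are an assumption; the text gives explicit midpoints (10, 20, 30, ...) only for records sampled before January 1, 1975.

```python
from datetime import date

# Date on which the coded intervals changed from 5-14, 15-24, ... to 10-19, 20-29, ...
CODING_CHANGE = date(1975, 1, 1)


def interval_midpoint(lower_bound: int, sampled_on: date) -> int:
    """Return an integral midpoint (ug/100 ml) for a coded 10-unit interval.

    lower_bound is the lower end of the interval (e.g. 5 for 5-14).
    Before the coding change the text gives midpoints 10, 20, 30, ...;
    the midpoints 15, 25, 35, ... for the later scheme are an assumption.
    """
    expected_remainder = 5 if sampled_on < CODING_CHANGE else 0
    if lower_bound % 10 != expected_remainder:
        raise ValueError("interval does not match the coding scheme for this date")
    return lower_bound + 5


early = interval_midpoint(5, date(1974, 6, 1))   # interval 5-14 under the early scheme
late = interval_midpoint(10, date(1976, 6, 1))   # interval 10-19 under the later scheme
```

Because the two schemes are offset by 5 µg/100 ml, the choice of midpoint matters mainly when comparing records across the 1975 boundary, which is where the insensitivity check reported above is relevant.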
The ethnic group distribution of the screened children varies somewhat by year of sampling (Table 2), as does the age distribution. In order to ensure against possible biases resulting from variations in the ethnic group/age composition of the screened population by year, the data were disaggregated into subpopulations defined by ethnic group (black, white, Hispanic), age group (1-12, 13-24, ..., 61-72, >72 months), and quarterly sampling date (Jan-Mar, ..., Oct-Dec) for each year. Analysis showed that within each such subpopulation, the distribution of blood lead levels is quite closely approximated by a lognormal distribution (9); other studies (10-12) have also verified that such a distribution is appropriate for characterizing blood lead levels. Accordingly, the geometric mean and geometric standard deviation (9), as opposed to their arithmetic counterparts, were used to summarize blood lead levels in each subpopulation.
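As a minimal sketch of the summary statistics named above, the geometric mean and geometric standard deviation can be computed from the logarithms of the individual levels. The function name `geometric_summary` and the three sample values are illustrative only, and the use of the sample (n - 1) variance is an assumption; the text does not state which divisor was used.

```python
import math


def geometric_summary(levels):
    """Return (geometric mean, geometric standard deviation) of positive values.

    Both are obtained by exponentiating the mean and the standard deviation
    of the natural logarithms. The (n - 1) divisor is an assumption.
    """
    logs = [math.log(x) for x in levels]
    n = len(logs)
    mean_log = sum(logs) / n
    var_log = sum((v - mean_log) ** 2 for v in logs) / (n - 1)
    return math.exp(mean_log), math.exp(math.sqrt(var_log))


# Made-up values, not study data: each level is twice the previous one.
gm, gsd = geometric_summary([10.0, 20.0, 40.0])
```

The geometric standard deviation is a multiplicative factor: for a lognormal distribution, roughly two-thirds of values lie between GM/GSD and GM × GSD.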

Results
Previous investigations of blood lead levels (1, 5, 6, 13) have emphasized the percent of the studied population with blood lead levels exceeding some specified "threshold" value. However, to obtain insight into the status of population subgroups it is more informative to study the actual distributions of blood lead levels in the population.
Blood lead levels in New York City have been found to depend in a systematic way on ethnic group, age, and quarterly sampling date. Therefore, the distributions of blood lead levels were analyzed within the more homogeneous subpopulations defined by these three variables. For example, Figure 1 shows the cumulative distribution curves for black children, aged 25-36 months, who were tested during July-September of 1971 and a similar subgroup tested in 1976. Figure 2 shows normal probability plots of the logarithm of blood lead level for the same two subpopulations. The near-linearity of these probability plots indicates that the response variable (blood lead) can be adequately represented by a lognormal distribution. Analogous figures for the other distinct subpopulation groups likewise reveal a lognormal behavior, and also illustrate the pronounced shift of the distribution curves to lower lead levels with time. This downward trend is reflected in a time-decreasing geometric mean blood lead level for all ages and ethnic groups. By contrast, the geometric standard deviation appears fairly stable, usually remaining in the range of 1.3-1.5 µg/100 ml.
Figure 3 illustrates a typical time series of geometric mean blood lead levels by quarterly sampling date. The geometric mean blood lead levels exhibit a pronounced cyclic seasonal pattern superimposed on a downward trend. In each of the 28 sampling periods, the geometric mean blood lead level for Hispanics is lower than that for blacks, with the actual difference decreasing in time; 79% of these differences are statistically significant at the p = 0.05 level. The time series for whites (not shown) approximately follows that given here for Hispanics; only 18% of the differences between whites and Hispanics are significant at the p = 0.05 level. Furthermore, it is observed that the seasonal peak in geometric mean blood lead level invariably occurs in July-September of each year.
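The near-linearity of a normal probability plot of log blood lead, used above as evidence for lognormality, can be quantified by the correlation between the sorted log values and the corresponding standard normal quantiles. The following Python sketch is an illustration, not the procedure used in the study; the function name and the plotting positions (i - 0.5)/n are assumptions, and the sample is synthetic.

```python
import math
from statistics import NormalDist


def probability_plot_correlation(levels):
    """Correlation between sorted log levels and standard normal quantiles.

    Values close to 1 indicate a nearly straight normal probability plot
    of log(level), i.e. approximately lognormal data.
    """
    logs = sorted(math.log(x) for x in levels)
    n = len(logs)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_q = sum(quantiles) / n
    mean_l = sum(logs) / n
    cov = sum((q - mean_q) * (v - mean_l) for q, v in zip(quantiles, logs))
    var_q = sum((q - mean_q) ** 2 for q in quantiles)
    var_l = sum((v - mean_l) ** 2 for v in logs)
    return cov / math.sqrt(var_q * var_l)


# Synthetic, exactly lognormal sample with GM = 20 and GSD = 1.4 (made-up values).
sample = [20.0 * 1.4 ** NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
r_lognormal = probability_plot_correlation(sample)

# A sample that is uniform on the raw scale gives a visibly curved plot.
r_uniform = probability_plot_correlation([float(v) for v in range(1, 51)])
```

In practice the coefficient would be computed separately within each ethnic group, age group, and quarter, mirroring the disaggregation used for the figures.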
Plots for other age groupings are qualitatively similar in appearance but vertically displaced depending on age; these plots display the same ethnic group differences, persistent seasonal variation, and long-term downward trend. An interesting observation, common to all ages and ...
The variation of geometric mean blood lead level with age can be seen in Figure 4, which graphs the annual arithmetic average of the quarterly geometric mean blood lead levels against age group for blacks, whites, and Hispanics in 1971. Age profiles for other years are similar in shape but are vertically displaced according to year. Regardless of year, however, such graphs show that blood lead levels are lowest for the 1-12 month group and generally achieve the highest level between 2 and 4 years (usually at 3 years). Figure 4 again illustrates the variation in average blood lead level by ethnic group, with blacks having significantly higher levels than either Hispanics or whites.
Blood lead levels as a function of year, ethnic group, and age are displayed in Table 3, which presents annual averages of blood lead levels for each combination of ethnic group and age group. The averages tabulated represent the unweighted arithmetic means of the quarterly geometric means within a given calendar year. The yearly decreases in blood lead levels are also apparent here.
A preliminary analysis of the influence of geographical location on blood lead levels has also been performed using information on the New York City health district where the blood sample was collected. The health district may be assumed to be a reasonable surrogate for the general location and environment of the child. A problem with analysis by location is that at this level of disaggregation the number of subjects in a given time, age, ethnic and location group is rather small. In order to increase the sample size, analysis was carried out for the 18-36 month age group ...
Table 3. Annual averages of geometric mean blood lead levels (avg. blood lead level, µg/100 ml).
The results obtained are consistent with those presented earlier. Namely, the same ethnic group differences are observed, and there is a similar seasonal variation in blood lead levels superimposed on a downward yearly trend. Figure 5 compares the variation of the yearly averages for the different location groups for blacks, and Figure 6 shows a similar graph for Hispanics. It is difficult to arrive at firm conclusions about the influence of location from these results. The most outstanding feature appears to be the more rapid decrease of the average blood lead level with time in Brownsville for both blacks and Hispanics. Another observation which may be noted, at least prior to 1973, is that average blood lead levels in Manhattan were lower than those in Brooklyn.

Discussion
The data base compiled by the New York City Bureau of Lead Poisoning Control provides an especially valuable resource for analyzing blood lead levels of a large number of children over a relatively long time period. The most important findings of our study have been: (1) a decreasing trend in blood lead levels with time for all ages and ethnic groups; (2) a marked similarity in both temporal and seasonal variation of blood lead levels for all ages and ethnic groups; (3) a recurring seasonal pattern in blood lead levels, with an invariable peak in the summer; (4) a pronounced variation in blood lead levels by age, with a maximum level occurring for children 2-4 years old; and (5) a definite variation in blood lead levels by ethnic group. Several of these conclusions are in agreement with those of earlier studies (4-6), in which similar age and seasonal patterns were reported in the proportion of children with blood lead levels exceeding a certain arbitrarily defined threshold value. However, it appears from our analysis that such patterns are characteristic of the entire population, and not simply phenomena peculiar to children with elevated blood lead levels. Indeed, it can be shown that the proportion of a studied population exceeding some arbitrary level is a computable statistical quantity which is dependent upon the parameters of the distribution function. In the case of a lognormal distribution, these parameters are the geometric mean and geometric standard deviation (9).
The similarity in blood lead variation for the different age groups is especially interesting and raises certain questions about exposure pathways. Pica has often been afforded a major role in lead intake among children. In addition, normal oral exploration and hand-to-mouth activity have also been cited as contributory factors, with the source of lead primarily being lead-based paint chips and dust (9,14).
The highest incidence of pica overlaps that age bracket (18-30 months old) where other studies find the highest incidence of elevated blood lead levels (14), and which is also associated with the highest geometric mean blood lead level in this study. While normal or abnormal ingestion may explain these higher blood lead levels, it does not explain the seasonal variation in blood lead levels, which is virtually the same in very young children (1-12 months) as it is in older children (over 72 months). The latter two age groups are not usually considered to be those in which pica is a common occurrence.
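The earlier remark that the proportion of a population exceeding a threshold is determined by the geometric mean and geometric standard deviation can be made concrete. For a lognormal distribution, the fraction above a level c is 1 - Φ((ln c - ln GM) / ln GSD), where Φ is the standard normal distribution function. The sketch below is illustrative; the function name and the numerical values are assumptions, not results from the study.

```python
import math
from statistics import NormalDist


def fraction_exceeding(threshold, gm, gsd):
    """Fraction of a lognormal population with level above `threshold`.

    gm and gsd are the geometric mean and geometric standard deviation;
    gsd must be greater than 1 and threshold and gm must be positive.
    """
    z = (math.log(threshold) - math.log(gm)) / math.log(gsd)
    return 1.0 - NormalDist().cdf(z)


# Made-up parameters: the same threshold applied to two geometric means
# with a common geometric standard deviation.
higher_gm = fraction_exceeding(40.0, gm=25.0, gsd=1.4)
lower_gm = fraction_exceeding(40.0, gm=18.0, gsd=1.4)
```

With the geometric standard deviation held fixed, any shift in the geometric mean changes the fraction above every threshold, which is why seasonal and long-term patterns seen in counts of "cases" would also appear in the distribution as a whole.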
Seasonal variation in the number of children with elevated blood lead levels has been previously observed (15-17). Our analysis shows that this fluctuation is a characteristic of the entire screened population, represented by temporal shifts in the population distribution curves. Explanations for this seasonal variation have, for the most part, been speculative. It may be that sunlight increases the absorption of lead from the intestine (16). On the other hand, seasonal variations in blood lead values may also result from seasonal variations in lead exposure. The similarity in seasonal patterns for the different age groups suggests that the reasons for the variation (e.g., exposure, metabolism) should be common to all the age groups.
Since there may be several significant sources of lead exposure, the downward trend in blood lead levels observed here does not allow a simple explanation. Several possible contributors to this decrease may be suggested: the active educational and screening program of the New York City Bureau of Lead Poisoning Control; decrease in the amount of lead-based paint exposure as a result of rehabilitation and removal (through demolition, arson, abandonment) of older housing stock; or changes in environmental lead exposure. At the present time, preliminary quantitative data are available on only one potential exposure source, ambient air lead, which also shows seasonal variation and long-term changes similar to blood lead. The available data are limited to a single monitoring station in New York City for the same time period during which blood lead data were collected, and it is of interest to explore whether or not a statistical relationship exists between these two variables.
Air lead measurements have been recorded since 1963 on a monthly basis at a height of 56 m by the Department of Energy, Health and Safety Laboratory on the West Side of Manhattan (18). ... similarity in cyclic behavior and overall decline. The qualitative similarity to the data presented in Figure 3 is readily apparent. It would be unwise to assume that the air lead level measured at one location in New York City would be the same as levels measured at other locations, or that it necessarily represents the level to which the general population is exposed. Local variations in sources, traffic patterns, and meteorological factors would be expected to affect ambient air lead levels. However, it is not unreasonable to assume that the measured ambient level at the single sampling station could be used as a general indicator of trends for the entire city.
A regression analysis was performed in order to quantify the statistical relation of various factors to blood lead level variations. The simplified additive model chosen for study was

$$Y = a_0 + \sum_{j=1}^{9} a_j X_j + e$$

where $Y$ is the quarterly geometric mean for the subpopulation defined by age dummy variables $X_1, \ldots, X_6$ and ethnic group dummy variables $X_7, X_8$. In addition, $X_9$ represents the quarterly air lead level (in µg/m³), and $e$ is a statistical disturbance term. A least-squares fit was obtained for this model by using the OMNITAB package (19). Table 4 summarizes the results of this regression analysis. It can be seen that all three variables (air lead, ethnic group, and age) are highly significant in a statistical sense. The regression analysis also confirms the variation of blood lead levels with age and ethnic group, depicted graphically in Figure 4. The most statistically significant explanatory variable is seen to be air lead (t = 19.0, p < 0.0001), followed by the dummy variable for blacks (t = 17.0, p < 0.0001). One must exercise due caution in interpreting this simple model as a quantitative description of causal factors related to blood lead levels in children. This model does not contain variables related to possible sources of lead exposure other than air. Similarly, the use of a single air lead monitoring station (the only data currently available) may not adequately reflect ambient air lead levels in the environment at the point of exposure. Work is currently underway to obtain analyses for air lead samples collected at 40 monitoring sites throughout the city during 1970-1976, after which the relationship between blood lead levels and air lead levels can be investigated in detail at a local level.
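The structure of the additive model above (an intercept, six age dummy variables, two ethnic group dummy variables, and air lead) can be illustrated with a small pure-Python least-squares sketch. This is not the OMNITAB computation used in the study. The choice of reference categories, the function names, and every number below are assumptions made for illustration; the data are synthetic and noise-free, so the fit simply recovers the coefficients used to generate them.

```python
AGE_GROUPS = ["1-12", "13-24", "25-36", "37-48", "49-60", "61-72", ">72"]
ETHNIC_GROUPS = ["white", "black", "Hispanic"]


def design_row(age, ethnic, air_lead):
    """Row [1, X1..X6, X7, X8, X9]; the first category of each factor is the
    reference level (an assumed choice, not stated in the text)."""
    age_dummies = [1.0 if age == g else 0.0 for g in AGE_GROUPS[1:]]
    ethnic_dummies = [1.0 if ethnic == g else 0.0 for g in ETHNIC_GROUPS[1:]]
    return [1.0] + age_dummies + ethnic_dummies + [air_lead]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes a is square and nonsingular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) a = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up coefficients a0..a9, used only to generate synthetic responses.
true_coef = [15.0, 2.0, 4.0, 4.5, 4.0, 3.0, 1.0, 3.0, 0.5, 5.0]
rows, y = [], []
for age in AGE_GROUPS:
    for ethnic in ETHNIC_GROUPS:
        for air in (0.8, 1.0, 1.2, 1.5):  # hypothetical air lead values, ug/m3
            row = design_row(age, ethnic, air)
            rows.append(row)
            y.append(sum(c * v for c, v in zip(true_coef, row)))

coef = fit_ols(rows, y)  # matches true_coef up to rounding error
```

With real data each row would correspond to one ethnic group, age group, and quarter, with $Y$ the quarterly geometric mean for that subpopulation; t statistics such as those quoted above would additionally require the residual variance and the inverse of X'X.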
Another reason for exercising caution in quantitatively interpreting such a regression model arises from the fact that the population screened prior to April 1973 may be different from that screened after this date. Accordingly, separate regression analyses have been performed for these two populations. In both cases, the variables representing air lead, ethnic group and age group are again highly statistically significant in relation to blood lead level. The "pre-1973" and "post-1973" regression models explained 65% and 63% of the total variation in blood lead levels, respectively, and the same ethnic group and age group dependence of blood leads was observed in both models. There are quantitative differences in the model coefficients, which may be representative of differences in the nature of the sampled population. Additional data are currently being sought which may provide useful insight into whether or not the population sampled by venous puncture is indeed different from that screened by the micro method.
The present study represents only the beginning of several systematic analyses that can be conducted using the extensive New York City data base. These results indicate that a highly significant statistical relation exists between blood lead levels and variables representing age, ethnic group, and a measure of ambient air lead. While causal relationships, if any, between these variables and lead levels in children cannot be derived from statistical considerations alone, the present analysis does provide extremely strong incentives and directions for future research.
We are indebted to Dr. Vic Hasselblad (Environmental Protection Agency) and Mr. Ray Beauchesne (Department of Housing and Urban Development) for contributing their particular technical expertise. This study could not have been accomplished without the efforts of Mr. Tom Kaiser and his staff at the New York City Bureau of Lead Poisoning Control.