Ambient Air Pollution and Risk of Congenital Anomalies: A Systematic Review and Meta-analysis

Objective We systematically reviewed epidemiologic studies on ambient air pollution and congenital anomalies and conducted meta-analyses for a number of air pollutant–anomaly combinations. Data sources and extraction From bibliographic searches we extracted 10 original epidemiologic studies that examined the association between congenital anomaly risk and concentrations of air pollutants. Meta-analyses were conducted if at least four studies published risk estimates for the same pollutant and anomaly group. Summary risk estimates were calculated for a) risk at high versus low exposure level in each study and b) risk per unit increase in continuous pollutant concentration. Data synthesis Each individual study reported statistically significantly increased risks for some combinations of air pollutants and congenital anomalies, among many combinations tested. In meta-analyses, nitrogen dioxide (NO2) and sulfur dioxide (SO2) exposures were related to increases in risk of coarctation of the aorta [odds ratio (OR) per 10 ppb NO2 = 1.17; 95% confidence interval (CI), 1.00–1.36; OR per 1 ppb SO2 = 1.07; 95% CI, 1.01–1.13] and tetralogy of Fallot (OR per 10 ppb NO2 = 1.20; 95% CI, 1.02–1.42; OR per 1 ppb SO2 = 1.03; 95% CI, 1.01–1.05), and PM10 (particulate matter ≤ 10 μm) exposure was related to an increased risk of atrial septal defects (OR per 10 μg/m3 = 1.14; 95% CI, 1.01–1.28). Meta-analyses found no statistically significant increase in risk of other cardiac anomalies and oral clefts. Conclusions We found some evidence for an effect of ambient air pollutants on congenital cardiac anomaly risk. Improvements in the areas of exposure assessment, outcome harmonization, assessment of other congenital anomalies, and mechanistic knowledge are needed to advance this field.


Review
There is growing epidemiologic evidence for adverse effects on the fetus and newborn from maternal prenatal exposure to ambient air pollution (Glinianaia et al. 2004b; Maisonet et al. 2004; Šrám et al. 2005). Air pollutants such as carbon monoxide (CO), sulfur dioxide (SO2), and particulate matter (PM) have been associated with increased infant mortality, particularly postneonatal respiratory mortality, low birth weight, and preterm birth (Glinianaia et al. 2004a, 2004b; Maisonet et al. 2004; Šrám et al. 2005). Inconsistencies and uncertainties remain concerning the effects of specific pollutants and pollutant mixtures and critical exposure periods (Ritz and Wilhelm 2008; Woodruff et al. 2009). There are new concerns that air pollution may also play a role in causing congenital anomalies, and such an effect is both biologically plausible and of public health importance (Ritz 2010). Congenital anomalies are a main cause of infant mortality and an important contributor to childhood and adult morbidity. Major congenital anomalies in surviving infants often have serious medical and/or cosmetic consequences that commonly require surgery and lead to reduced survival rates into adulthood (Tennant et al. 2010). Major structural congenital anomalies are diagnosed in 2-4% of births, but in most cases their etiology remains unknown (Weinhold 2009).
Recently, there has been a steep increase in the number of air pollution studies with congenital anomalies as the primary health outcome. The first publication appeared in 2002 (Ritz et al. 2002), the next in 2005 (Gilboa et al. 2005), and more have followed in the last few years (Dadvand et al. 2011a, 2011b; Dolk et al. 2010; Hansen et al. 2009; Hwang and Jaakkola 2008; Kim et al. 2007; Marshall et al. 2010; Rankin et al. 2009; Strickland et al. 2009). Because the existing studies have few a priori hypotheses and tested many different pollutant-outcome combinations, a systematic assessment of the consistency of associations across studies is needed. Furthermore, in this rapidly evolving field, timely evaluation of methods and results of existing studies can help to inform and improve the design of future research. Here we therefore provide a systematic review of studies on ambient air pollution and congenital anomalies and develop recommendations for future research.
There have been few meta-analyses of other environmental exposures and risk of congenital anomalies (e.g., Nieuwenhuijsen et al. 2009) because of great variability in exposure assessment, outcome ascertainment, data analysis, and result reporting. Air pollution is one area where exposure estimates appear reasonably comparable. To illustrate the challenges faced by meta-analyses in this field and to offer recommendations for improvement, we conducted meta-analyses for a number of air pollutant-anomaly combinations.

Materials and Methods
Search methods. We followed published guidelines for the reporting of this review and meta-analysis (Moher et al. 2009; Stroup et al. 2000). A bibliographic search was carried out in the MEDLINE search engine (National Library of Medicine 2010). The Medical Subject Heading (MeSH) terms "congenital abnormalities," "pregnancy," and "air pollution" and non-MeSH terms "birth defect," "congenital anomalies," "cardiac anomalies," "congenital heart disease," and "oral clefts" were used in the syntax. We also searched references in published articles and reviews on this topic. From this search we selected articles that a) were original epidemiologic studies; b) were written in the English language; c) defined all or subgroups of congenital anomalies, congenital malformations, or birth defects as outcome; and d) studied human prenatal exposure to ambient air pollution using measured concentrations of air pollutants. Studies with purely ecologic exposure assessments (e.g., maternal residence in a polluted vs. unpolluted area) or studies with quantitative traffic density data but without pollutant data (e.g., Cordier et al. 2004) were not included in the review. We identified one article unpublished at the time of the last MEDLINE search but now published (Dadvand et al. 2011b). We searched conference proceedings in the ISI Web of Science (Thomson Reuters 2010) for abstracts of other unpublished studies, using the same search terms as above, but we found no other studies that fall into this category.
Meta-analysis. We conducted meta-analyses to obtain summary risk estimates for the association between congenital anomaly groups and ambient air pollutant concentrations. Meta-analyses were conducted only if at least four studies had risk estimates [usually odds ratios (ORs)] for the same pollutant and comparable anomaly group. Although no guidelines exist as to the minimum number of studies for meta-analysis, we considered four the minimum number to justifiably run a meta-analysis; this corresponded to having a minimum of 500 cases of congenital anomaly included in each analysis. A list of the specific outcome groups used in the different studies, and the congenital anomaly groups considered comparable for the purposes of these meta-analyses, is available in Supplemental Material, Table 1 (doi:10.1289/ehp.1002946). Meta-analyses calculated two types of summary risk estimates: a) for the comparison of congenital anomaly risk at high versus low exposure level in each study and b) for congenital anomaly risk per unit increase in continuous pollutant concentration. For each of these, we selected risk estimates from the main, confounder-adjusted models presented in each study, not those of sensitivity analyses. If both single- and multiple-pollutant models were presented, we selected the single-pollutant model. If studies presented results for more than one pregnancy period (Hwang and Jaakkola 2008; Ritz et al. 2002), we selected the period most appropriate for the development of congenital anomalies and most used in the other studies: month 2 or weeks 3-8 of gestation. The Strickland et al. (2009) study used two groups of ventricular septal defects (VSDs): muscular and perimembranous type. We calculated the combined risk estimate for these two groups and then entered the combined estimate in meta-analyses.
Two studies (published in three reports) were located in the same study area, covered overlapping time periods, and analyzed some of the same anomaly groups (VSDs, coarctation of the aorta, tetralogy of Fallot) in relation to SO2 exposure (Dadvand et al. 2011a, 2011b; Rankin et al. 2009). We entered risk estimates from these publications separately in the SO2 meta-analyses, never at the same time. Because the Dadvand et al. (2011b) substudy had risk estimates for other air pollutants and covered a more recent time period, we used this as the main analysis for SO2 and considered the inclusion of the Dadvand et al. (2011a) and Rankin et al. (2009) studies to be sensitivity analyses.
In the high- versus low-exposure meta-analysis, we selected from the published reports risk estimates for the fourth compared with the first quartile of exposure, if categorical risk estimates were presented (Dadvand et al. 2011a, 2011b; Gilboa et al. 2005; Marshall et al. 2010; Rankin et al. 2009; Ritz et al. 2002). Some studies presented risk estimates only for continuous exposure (Hansen et al. 2009; Hwang and Jaakkola 2008; Strickland et al. 2009); we converted these to the risk estimate per interquartile increase in exposure, using the interquartile ranges (IQRs) given in the reports, because these were considered to compare most closely with estimates from the categorical studies. Information to calculate midquartile points was not given in these articles, so we could not compare the 12.5th and 87.5th percentiles. One study (Dolk et al. 2010) presented risk estimates for the 90th versus 10th percentile exposure, calculated from a continuous pollutant model, and this estimate was used for the high versus low comparison.
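The conversion of a continuous risk estimate to an estimate per interquartile increase can be sketched as follows. This is a minimal illustration that assumes a log-linear exposure-response relationship, so the OR scales as a power of the ratio of increments; the function name `or_per_iqr` and the numbers in the usage note are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def or_per_iqr(or_per_unit, ci_low, ci_high, unit, iqr):
    """Rescale an OR (and its 95% CI) reported per `unit` of exposure
    to an OR per interquartile range `iqr`, in the same exposure units.

    Under a log-linear model, log(OR) is proportional to the exposure
    increment, so each value is raised to the power iqr / unit.
    """
    scale = iqr / unit
    return tuple(math.exp(scale * math.log(v))
                 for v in (or_per_unit, ci_low, ci_high))


# Hypothetical example: OR = 1.10 (95% CI, 1.00-1.21) per 10 ppb,
# rescaled to a hypothetical IQR of 20 ppb.
rescaled = or_per_iqr(1.10, 1.00, 1.21, unit=10.0, iqr=20.0)
```

With these illustrative inputs the point estimate becomes 1.10 squared, because the IQR is twice the reporting increment.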
Meta-analyses for continuous exposure summarized risk estimates per unit increase in pollutant concentration, assuming that the natural logarithm of the relative risk of congenital anomaly varied linearly with the ambient air pollutant concentration. From each publication we selected the risk estimate per continuous unit increase in pollutant concentration, if available (Dadvand et al. 2011a; Dolk et al. 2010; Hansen et al. 2009; Hwang and Jaakkola 2008; Ritz et al. 2002; Strickland et al. 2009). Three articles described continuous exposure estimates in the text but did not show the risk estimates (Dadvand et al. 2011b; Gilboa et al. 2005; Marshall et al. 2010); for these we obtained the continuous estimates directly from the authors. To allow comparison of effects among the different studies, units were converted to 10 μg/m3 for particulate matter with diameter ≤ 10 μm (PM10), 10 ppb for ozone (O3) and nitrogen dioxide (NO2), 1 ppm for CO, and 1 ppb for SO2. Exposure estimations that were expressed in a mass per volume unit (e.g., micrograms per cubic meter) instead of parts per million, billion, or hundred million (pphm) were converted to parts per billion using the general equation at 1 atm and 25°C: 24.45 × concentration (micrograms per cubic meter)/molecular weight. An exception was made for PM10, which was always expressed in micrograms per cubic meter.
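The unit conversion described above can be illustrated with a short sketch. The factor 24.45 and the formula come from the text; the molecular weights are standard values supplied here as an assumption, and the function names and example figures are hypothetical.

```python
import math

MOLAR_VOLUME = 24.45  # litres per mole at 1 atm and 25 °C (factor given in the text)

# Standard molecular weights in g/mol (assumed here, not quoted from the article).
MOLECULAR_WEIGHT = {"NO2": 46.01, "SO2": 64.07, "O3": 48.00, "CO": 28.01}


def ugm3_to_ppb(conc_ugm3, pollutant):
    """Convert a gas concentration from micrograms per cubic meter to ppb."""
    return MOLAR_VOLUME * conc_ugm3 / MOLECULAR_WEIGHT[pollutant]


def or_per_ppb_increment(or_reported, increment_ugm3, pollutant, target_ppb):
    """Re-express an OR reported per `increment_ugm3` (micrograms per cubic
    meter) as an OR per `target_ppb`, assuming a log-linear relationship."""
    increment_ppb = ugm3_to_ppb(increment_ugm3, pollutant)
    return math.exp(math.log(or_reported) * target_ppb / increment_ppb)


# Hypothetical example: an OR of 1.05 per 10 micrograms per cubic meter of NO2,
# re-expressed per 10 ppb NO2.
or_10ppb = or_per_ppb_increment(1.05, 10.0, "NO2", 10.0)
```

Because 10 μg/m3 of NO2 corresponds to roughly 5.3 ppb, an OR per 10 ppb is further from one than the same OR reported per 10 μg/m3.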
We obtained summary risk estimates in meta-analyses using fixed- or random-effects models. For each pollutant-outcome analysis we first tested for heterogeneity in the risk estimates using the Q-test (Cochran 1954). When the result of the Q-test showed evidence for heterogeneity (p < 0.1), we used a random-effects analysis, following the method of DerSimonian and Laird (1986). Otherwise, a fixed-effect analysis was conducted using the Mantel-Haenszel method (Mantel and Haenszel 1959). Meta-analyses calculated summary risk estimates (ORs) weighted by the inverse variance of each study, taking into account whether a fixed or random model was used. We used the R statistical software package for all analyses (version 2.11.0; R Project for Statistical Computing 2010).
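The decision rule described above (Q-test, then a fixed or a DerSimonian-Laird random-effects summary) can be sketched in a few lines. This is an illustrative reimplementation in Python, not the R code used for the analyses. It assumes that each study contributes only an OR with a 95% CI, so the fixed-effect branch uses generic inverse-variance pooling rather than the Mantel-Haenszel method, which needs the underlying counts. All function names are hypothetical.

```python
import math

Z95 = 1.96  # normal quantile used to back-calculate SEs from 95% CIs


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    # Recurrence for the regularized upper incomplete gamma function.
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool_odds_ratios(studies, alpha_het=0.10):
    """Pool (OR, ci_low, ci_high) tuples with inverse-variance weights.

    Uses a fixed-effect summary unless Cochran's Q gives p < alpha_het,
    in which case DerSimonian-Laird random-effects weights are used.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are required")
    y = [math.log(o) for o, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * Z95) for _, lo, hi in studies]
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    p_het = chi2_sf(q, df)
    model, tau2 = "fixed", 0.0
    if p_het < alpha_het:
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - df) / c)  # between-study variance estimate
        w = [1.0 / (s ** 2 + tau2) for s in se]
        sw = sum(w)
        model = "random"
    est = sum(wi * yi for wi, yi in zip(w, y)) / sw
    se_est = math.sqrt(1.0 / sw)
    return {
        "model": model,
        "or": math.exp(est),
        "ci": (math.exp(est - Z95 * se_est), math.exp(est + Z95 * se_est)),
        "q": q,
        "p_het": p_het,
        "tau2": tau2,
    }
```

Calling `pool_odds_ratios` on a list of study-level estimates returns the summary OR, its 95% CI, and the heterogeneity statistics that determined which model was applied.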
We also produced forest plots to show ORs from each of the individual studies included in the meta-analyses and the estimation of the summary OR (Light and Pillemer 1984). The sizes of the markers of each OR in the plots represent the relative weight each study contributed to the summary estimation. To analyze potential for publication bias, we conducted a weighted Egger test, a linear regression in which the response is the estimated effect and the explanatory variable is a precision term (1/SE) (Egger et al. 1997). A large deviation from zero of the slope term suggests publication bias.
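As an illustration of the regression idea behind the Egger test, the sketch below implements the classic unweighted form from Egger et al. (1997): the standard normal deviate (effect divided by its SE) is regressed on precision (1/SE), and an intercept far from zero suggests funnel-plot asymmetry. This parameterization is an assumption made for illustration and may differ from the weighted variant used in the analyses; the function name is hypothetical, and the p-value (from a t distribution with n − 2 degrees of freedom) is left to the reader.

```python
import math


def egger_regression(log_ors, ses):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns the intercept (asymmetry indicator), its standard error,
    the t statistic, the slope, and the residual degrees of freedom.
    """
    n = len(log_ors)
    if n < 3:
        raise ValueError("at least three studies are required")
    z = [y / s for y, s in zip(log_ors, ses)]   # standard normal deviates
    x = [1.0 / s for s in ses]                  # precisions
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same SE; regression undefined")
    slope = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sxx
    intercept = mz - slope * mx
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    sigma2 = rss / (n - 2)
    se_int = math.sqrt(sigma2 * (1.0 / n + mx ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "se_intercept": se_int,
            "t": t_stat, "slope": slope, "df": n - 2}
```

When all studies estimate the same effect regardless of their precision, the fitted line passes through the origin and the slope equals the common log OR.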

Results
We identified 10 studies, one divided into two substudies (Dadvand et al. 2011a, 2011b), published between 2002 and 2011 (Table 1). Four studies were conducted in the United States (in California, Texas, Georgia, and New Jersey), three in England (two in the northern region and one in four English regions including the northern region), and one each in Australia, Taiwan, and South Korea. The studies focused mainly on cardiac anomalies (Dadvand et al. 2011a, 2011b; Gilboa et al. 2005; Hansen et al. 2009; Ritz et al. 2002; Strickland et al. 2009) and/or orofacial clefts (Gilboa et al. 2005; Hansen et al. 2009; Hwang and Jaakkola 2008; Marshall et al. 2010; Ritz et al. 2002). Only two studies (Dolk et al. 2010; Rankin et al. 2009) included the full spectrum of major structural anomalies. The South Korean study was a prospective birth cohort study that included all congenital anomalies as one group (14 cases) (Kim et al. 2007). Most other studies used a registry-based case-control design, selecting cases from routine congenital anomaly registries and controls from birth registries. One study was a registry-based cohort, using anomaly registries as source of case ascertainment and birth registry data for denominators (Dolk et al. 2010). Strickland et al. (2009) used a time-series design to link daily exposure estimates to daily congenital anomaly rates (for a given conception date); again, data on the congenital anomalies came from a routine register.
Major differences are apparent in the diagnostic coding systems, congenital anomaly grouping methods, and case definitions among the studies [see Supplemental Material,  (2009) based their groupings of cardiac anomalies and orofacial clefts on the anatomic classification used in the first study by Ritz et al. (2002); inclusion and exclusion of chromosomal, syndromic, and multiple anomalies still differed among these studies. The Georgia study (Strickland et al. 2009) used a more detailed system of diagnostic codes for cardiac anomalies. The U.K. studies (Dadvand et al. 2011a, 2011b; Dolk et al. 2010; Rankin et al. 2009) all used a coding and grouping system based on the International Classification of Diseases, 9th Revision (World Health Organization 1975) and the minor anomaly exclusion criteria proposed by the European Surveillance of Congenital Anomalies (EUROCAT 2009).
In all but one study (Dadvand et al. 2011a), exposure assessments were based on the routine measurements of air pollutant concentrations at fixed-site air pollution monitoring stations. Most commonly, exposures were assigned using the monitoring station nearest to the maternal residence at the time of the birth. Distances of residence from the monitors varied and in some studies inclusion of study subjects was limited by their distance to the monitor [e.g., 16 km in the Ritz et al. (2002) study, 10 km in the Rankin et al. (2009) study]. Number and density of monitors also varied, as did the maximum distance of subjects from the nearest monitor (Table 1). Dadvand et al. (2011a) developed a spatiotemporal model in which black smoke and SO2 concentrations at the maternal residence were predicted using concentrations measured at 56 monitors and data on traffic, meteorology, and land cover. The Georgia study (Strickland et al. 2009) differed from others in that temporal data from one central monitoring station in the county were related to the vulnerable window of each pregnancy in a time-series analysis. Nearly all other studies defined windows of pregnancy susceptible to the development of the congenital anomalies under study (usually weeks 3-8 of gestation) and averaged exposure over those windows. The Dolk et al. (2010) study assigned mean pollutant concentrations measured in one year (1996) to the census ward level of residence and was not able to estimate exposure in pregnancy time windows.
Most studies focused on the most commonly monitored air pollutants: PM10 (nine studies), SO2 (eight studies), NO2 (seven studies), CO (seven studies), and O3 (seven studies). Less frequently studied pollutants were black smoke, nitric oxide (NO), nitrogen oxides (NOx), and particulate matter with aerodynamic diameter ≤ 2.5 μm (PM2.5).
Pollutant concentration distributions showed different patterns across studies (Table 1); for example, average CO concentrations were highest in the California study (Ritz et al. 2002), but O3 concentrations were highest in the Taiwan study (Hwang and Jaakkola 2008). In most studies where this information was provided, pollutant concentrations did not increase by more than a factor of 2 between the 25th and 75th percentile (Table 1).
Cardiac anomalies. Cardiac anomalies were analyzed in seven studies. Each study included at least six separate cardiac anomaly groups [see Supplemental Material, Table 1 (doi:10.1289/ehp.1002946)] and tested these against three to six pollutant groups. Each study reported only one or a few statistically significantly increased risks with increased exposure among the multiple associations tested; few of these occurred in more than one study: CO was related to higher risk of VSD in two studies, with ORs for fourth- versus first-quartile exposure of 2.95 [95% confidence interval (CI), 1.44-6.05 (Ritz et al. 2002)] and 1.66 [95% CI, 1.37-2.02 (Dadvand et al. 2011b)]. The studies in California and Australia both reported raised risks of pulmonary artery and valve defects in association with O3 exposure [OR quartile 4 vs. quartile 1 = 2.94; 95% CI, 1.00-6.05 (Ritz et al. 2002); OR per 5 ppb = 2.96; 95% CI, 1.34-7.52 (Hansen et al. 2009)]. Various inverse associations (decreasing risks with increasing exposure) were also observed.
We conducted meta-analyses for 18 combinations of pollutants and cardiac anomaly groups for which four or more studies published results [for summary results, see Table 2; for full results, see Supplemental Material, Table 2 (doi:10.1289/ehp.1002946)]. The summary risk estimates from these meta-analyses were generally close to one, with a range of summary ORs for continuous exposure from 0.87 to 1.20, and for high versus low exposure from 0.80 to 1.23. Heterogeneity tests showed evidence for heterogeneity among studies (p < 0.10) in fewer than half of the analyses conducted, most consistently related to analyses of VSDs. Egger test p-values were statistically significant for only 3 of the 68 meta-analyses we conducted (see Supplemental Material, Table 2), indicating that wide-scale publication bias is unlikely. We found statistically significantly increased summary risk estimates for continuous NO2 exposure and risk of coarctation of the aorta (OR per 10 ppb = 1.17; 95% CI, 1.01-1.36) and tetralogy of Fallot (OR per 10 ppb = 1.20; 95% CI, 1.02-1.42), for continuous PM10 exposure and atrial septal defect (ASD; OR per 10 μg/m3 = 1.14; 95% CI, 1.01-1.28), and for continuous SO2 exposure and risk of coarctation of the aorta (OR per 1 ppb = 1.07; 95% CI, 1.01-1.13) and tetralogy of Fallot (OR per 1 ppb = 1.03; 95% CI, 1.01-1.05) (  (Hansen et al. 2009); OR quartile 4 vs. quartile 1 = 1.6; 95% CI, 1.1-2.2 ]. None of the studies reported increased risks of cleft palate alone. Meta-analyses summarizing risk estimates for exposure to five pollutants and risk of cleft lip with or without cleft palate and cleft palate alone were all close to one, and none reached statistical significance (Table 3). We found evidence for significant heterogeneity (p < 0.1) in the CO and SO2 analyses. O3 exposure and cleft lip with or without cleft palate showed an association of borderline statistical significance (OR per 10 ppb = 1.10; 95% CI, 0.99-1.21) (Table 3, Figure 1F).
Other congenital anomalies. Two studies examined a range of anomalies other than cardiac anomalies and orofacial clefts (Dolk et al. 2010; Rankin et al. 2009). They observed statistically significantly increased risks of omphalocele in relation to PM10 concentrations [90th vs. 10th percentile: OR = 2.17; 95% CI, 1.00-4.71 (Dolk et al. 2010)] and of nervous system anomalies in relation to black smoke concentration [OR = 1.10/mg/m3; 95% CI, 1.03-1.18 (Rankin et al. 2009)]. No other anomaly groups/subtypes were at an increased risk in relation to the pollutants studied.

Discussion
The evidence base for an effect of exposure to ambient air pollutants on congenital anomaly risk is small. Individual studies reported increased risks for some combinations of air pollutants and congenital anomalies, mostly cardiac anomalies, but these occurred among many associations tested in each study. Meta-analyses suggest that NO2 and SO2 exposures were related to statistically significant increases in risk of coarctation of the aorta and tetralogy of Fallot, and PM10 exposure to an increase in risk of ASDs; we based summary risk estimates on few studies (n = 4), but the total numbers of cases included were relatively large (between 655 and 951). Meta-analyses found no statistically significant increase in risk of other cardiac anomalies and orofacial clefts in relation to air pollution exposure. This review and its meta-analysis raise important issues that may guide both the design and presentation of future studies.
A common feature of all the reviewed studies was the use of routine monitoring stations as the basis for exposure assessment. Exposure indices were usually calculated from pollutant measurements at the nearest monitoring station or as a distance-weighted average of measurements of all stations in the area; these methods apply a similar exposure to a relatively large geographic area and thereby measure predominantly community-wide variations in air pollution. This approach will be more appropriate for pollutants that vary at large geographic scale (e.g., SO2, O3) than for those that may have a much finer spatial distribution/resolution (e.g., CO, NO2). Traffic exhaust fumes are the main source of air pollutants such as NO2 and PM2.5 in urban areas, as well as a main source of suspected causative agents for adverse birth outcomes, such as polycyclic aromatic hydrocarbons (PAHs) (Perera et al. 2003; Ritz and Wilhelm 2008). More precise spatial models, based on dispersion or land-use regression models that take into account local road networks and other predictive variables, are therefore increasingly being recommended and used in research on adverse birth outcomes (Aguilera et al. 2009; Gilliland et al. 2005; Madsen et al. 2010; Nethery et al. 2008; Ritz and Wilhelm 2008; Slama et al. 2007, 2008; Wu et al. 2009). Only one of the reviewed studies (Dadvand et al. 2011a) used a spatiotemporal model for black smoke and SO2 exposure in which concentrations of these pollutants were predicted from monitoring data combined with data from traffic, meteorology, and land use. Application of these types of models to future studies of congenital anomalies would be a step toward more accurate exposure assessment for some of the pollutants of interest. Characterization and quantification of errors in exposure estimates from these models will be available from validation studies (Nethery et al. 2008), and efforts should be made to also incorporate these in evaluations of congenital anomaly risk. Furthermore, air pollution studies of other birth outcomes have moved toward trying to estimate more specifically the exact pollutants responsible (e.g., transition metals contained in PM, PAHs, aromatic hydrocarbons) (e.g., Aguilera et al. 2009; Perera et al. 2003; Slama et al. 2007), rather than focusing on the regulated pollutants, and this would be a valuable direction for future congenital anomaly studies.
Nearly all the reviewed studies were based on registry information. In such studies, residential addresses are available only at birth, not in the first trimester of pregnancy, the most relevant period for causation of congenital anomalies (Moore and Persaud 1998). Furthermore, residential addresses only account for exposures near the home, not in other situations thought to make an important contribution to personal exposure, such as work location, commuting, and indoor air pollution sources (Nethery et al. 2008; Setton et al. 2010). However, congenital anomaly research in this field cannot easily move away from routinely registered data. Large case-control studies with good information on residential history and time-activity patterns may be a useful step forward, if they can be combined with accurate exposure assessments. Pregnancy cohort studies in Europe are currently pooling resources to study effects of air pollution and other birth outcomes using land-use regression exposure models [European Study of Cohorts for Air Pollution Effects (ESCAPE) Project 2009], but few such cohorts are large enough to conduct meaningful analyses of congenital anomalies or include thorough enough ascertainment of congenital anomalies among pregnancy terminations as well as live births. Information from such studies may, however, provide useful information about the effect of including information on residential mobility and time-activity patterns on risk estimates for other birth outcomes (e.g., Aguilera et al. 2009; Chen et al. 2010; Lupo et al. 2010).
The most difficult issue in our attempt to combine data from the different studies was the use of very different criteria for definition and classification of congenital anomaly subgroups. Moreover, we found differences in inclusion and exclusion of cases with other anomalies in the same and other organ systems, with chromosomal abnormalities, and with other syndromic conditions. We compared the different groupings and exclusions used [see Supplemental Material, Table 1 (doi:10.1289/ehp.1002946)] but found that in some instances it is not possible to deduce from the articles which inclusions and exclusions were made. For our meta-analyses we selected a few groups of relatively comparable anomaly groups. However, even for these, studies differed in their approach to classification. VSDs, for example, were treated as one anomaly group by some studies (Dadvand et al. 2011a, 2011b; Gilboa et al. 2005; Hansen et al. 2009; Rankin et al. 2009; Ritz et al. 2002), as four different anomalies by another (Strickland et al. 2009), and were excluded in yet another (Dolk et al. 2010). Notably, we found the largest evidence for heterogeneity for VSDs. In general, congenital heart defects form a very heterogeneous set of conditions, notoriously difficult to classify (Botto et al. 2007; Strickland et al. 2008). Several classification systems have been proposed (Botto et al. 2007; Jacobs et al. 2007), but diagnostics information in routine registries may often not be specific enough to apply these. Further international harmonization efforts in this area [e.g., those undertaken as part of EUROCAT (2010) and the International Clearinghouse for Birth Defects (2010)] will be of great value. Classification of orofacial clefts is more straightforward than that of cardiac anomalies, but even here differences in exclusions have resulted in somewhat nonhomogeneous case groups for the meta-analyses. We encourage future studies to base their classifications and exclusions on those of previous studies in the same field, where possible.
The meta-analysis results should be interpreted with caution because they are based on few studies, and because some were subject to some degree of statistical heterogeneity. On the other hand, the total numbers of cases included in the meta-analyses were large, ranging from 500 to > 3,700, depending on the anomaly group. We found significant results for some pollutant-cardiac anomaly combinations, but only two of these (NO2 and tetralogy of Fallot, and SO2 and coarctation of the aorta) were robust to the exclusion of the study with the largest weight, and we could not confirm significant findings from the continuous exposure analyses with high- versus low-exposure analyses. It is not clear which of these latter analyses is the most appropriate: In continuous analyses one assumes a (log) linear relationship between exposure and outcome; in the field of congenital anomaly, linearity cannot automatically be assumed, because selective survival of more viable fetuses related to the same exposure may lead to nonlinearity (Ritz 2010). On the other hand, the high versus low analysis combines categories of very different exposure levels across studies and has lower power because of smaller numbers of cases in these more extreme exposure categories. We recommend that studies report both types of analyses, in annexes of sensitivity analyses where appropriate, in order to aid future meta-analyses.
Heterogeneity in the studies we reviewed may arise from inherent differences between the study settings, as well as from differences in study designs and analysis methods. First, the study areas were different with respect to exposure levels and ranges, pollutant mixtures, and underlying anomaly risks; this may have given rise to different dose-response relationships. With respect to study design, we considered the assessment of exposure to be relatively similar: Studies assessed mostly the same pollutants, same exposure windows, and similar distance-based exposure indices. Heterogeneity in our analyses was not particularly related to studies that deviated from the rest with respect to their approach to exposure assessment such as the Strickland et al. (2009) study with a purely temporal exposure model and the Dolk et al. (2010) study with a purely spatial model. As described above, ascertainment and classification of congenital anomalies differed greatly among studies, and this may well have played a role in the observed heterogeneity. Because all reviewed studies used routinely registered data, selection and recall biases likely did not play a large role in most of the reviewed studies. However, two of the reviewed studies (Gilboa et al. 2005;Hansen et al. 2009) matched controls to cases by, respectively, mother's county of residence and mother's distance of residence from an air pollution monitor. This may introduce bias by reducing exposure contrast between cases and controls.
Covariates included in analyses differed and residual confounding structures may differ among the studies, thus leading to heterogeneity in study results. However, the number of known risk factors for congenital anomalies is extremely small, and none of the suspected risk factors, such as smoking, alcohol, folate deficiency, or socioeconomic status, are likely to explain a large proportion of cases. Maternal smoking, for example, has been established as a risk factor for orofacial clefts with relative risk estimates around 1.2-1.3 (Lie et al. 2008; Little et al. 2004; Meyer et al. 2004), but evidence for an association with risk of other congenital anomalies is weak (EUROCAT 2004). Only two of the reviewed studies controlled for tobacco smoking (Gilboa et al. 2005; Marshall et al. 2010).
Possible mechanisms of teratogenicity of air pollutants at this stage remain speculative, but several mechanisms have been hypothesized for effects of PM on fetal growth, including oxidative stress, placental inflammation, and changes in coagulation, as reviewed by Kannan et al. (2006). Congenital anomalies may be induced through similar effects on early fetal growth, as well as by air pollutants influencing the migration and differentiation of neural crest cells, by interactions with the metabolism and detoxification of other xenobiotics, or by indirect effects through maternal immunologic reactions, such as infection or asthma, or medication related to these conditions (Dolk et al. 2010; Ritz et al. 2002; Šrám et al. 2005). In animal studies, maternal exposure to O3, NO2, and CO has produced embryotoxic effects, as well as teratogenic effects such as skeletal and neuromuscular anomalies (Garvey and Longo 1978; Kavlock et al. 1979; Longo 1977; Singh 1988). Genetic polymorphisms in developmental and detoxification genes have been described as potentially increasing individual susceptibility to the teratogenic effects of maternal smoking (e.g., Chevrier et al. 2008; Lammer et al. 2005); evaluation of similar gene-environment interactions may aid mechanistic understanding in future studies of air pollution and congenital anomalies. Interactions between PM and specific micro- and macronutrients in the diet (antioxidants, antiinflammatory factors) have been proposed to play a role in the causation of low birth weight, fetal growth restriction, and preterm birth, through joint oxidative stress and inflammatory mechanisms (Jedrychowski et al. 2010; Kannan et al. 2006); such interactions also warrant exploration in the causation of congenital anomalies.

Recommendations
This review of 10 studies finds some evidence of an effect of air pollutants on congenital anomaly risk, so it is worth considering directions for further research in this area. Air pollution is a ubiquitous exposure, and small increases in risk may therefore carry a large public health implication; congenital anomalies are rare but serious outcomes that have a large impact on infant mortality and morbidity and on morbidity in later life. The statistically significant increases in relative risk we observed in meta-analyses were between 3% and 20% per unit of exposure. Exposure units used were of the same order of magnitude as, or well within, the IQRs of these pollutants in most study areas (Table 1). Therefore, relative risk increases of this order of magnitude may be considered important for public health. Existing studies have so far not used some of the recent advances in exposure assessment used in other areas of air pollution research. A case can thus be made for well-designed studies that can make improvements in at least some of the following areas.
Exposure assessment. As outlined above, exposure indices with better spatial accuracy (while maintaining an accurate temporal component) should be used to assess exposure to traffic-related air pollution, because this is the main source of air pollution in urban areas. Better identification of the effects of specific exposures and exposure mixtures, and the integration of mobility and time-activity patterns, are recommended for congenital anomaly research in the same way as for other birth outcomes. Matching of cases and controls by exposure-related variables such as maternal area of residence should be avoided in future studies.
Outcome harmonization. International harmonization of coding and classification of anomalies would greatly aid future systematic evaluations in this field. Any future studies should carefully report the exact classifications and exclusions used and attempt to use the same classifications as previous studies, at least in sensitivity analyses.
Other congenital anomalies. Many of the above studies focused on cardiac anomalies and orofacial clefts. These are two of the largest anomaly groups, and they have been suspected to be related to other environmental exposures (Dolk and Vrijheid 2003). However, studies with sufficient statistical power should also focus on other anomalies for which there may be an environmental etiology (e.g., neural tube defects, limb defects, gastroschisis) (EUROCAT 2004).
Mechanisms and individual susceptibility. Improved knowledge on potential mechanisms of teratogenic action of air pollutants is needed; this may be achieved by the integration of mechanistic biomarkers. Evaluation of interactions with genetic polymorphisms and dietary factors may also provide further insights into mechanisms.
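The argument under Recommendations that relative risk increases of 3% to 20% per exposure unit matter for public health depends on how those units compare with the contrasts seen in study areas. As a minimal sketch, the snippet below rescales an odds ratio reported per fixed unit to another exposure increment, assuming a log-linear exposure-response relation. The function name `rescale_odds_ratio` and the increments of 5 ppb NO2 and 3 ppb SO2 are hypothetical illustrations and are not values taken from Table 1; the two odds ratios are the summary estimates for tetralogy of Fallot reported in the abstract.

```python
import math


def rescale_odds_ratio(or_per_unit, unit, new_increment):
    """Rescale an odds ratio reported per `unit` of exposure to `new_increment`.

    Assumes the log odds change linearly with exposure, so that
    OR(new_increment) = OR(unit) ** (new_increment / unit).
    """
    if or_per_unit <= 0 or unit <= 0:
        raise ValueError("odds ratio and unit must be positive")
    return math.exp(math.log(or_per_unit) * new_increment / unit)


# Summary odds ratios for tetralogy of Fallot as reported in the abstract.
or_no2_per_10ppb = 1.20
or_so2_per_1ppb = 1.03

# Hypothetical exposure contrasts (illustrative only, not from Table 1).
hypothetical_no2_increment_ppb = 5.0
hypothetical_so2_increment_ppb = 3.0

or_no2 = rescale_odds_ratio(or_no2_per_10ppb, 10.0, hypothetical_no2_increment_ppb)
or_so2 = rescale_odds_ratio(or_so2_per_1ppb, 1.0, hypothetical_so2_increment_ppb)

print(f"OR per {hypothetical_no2_increment_ppb:g} ppb NO2: {or_no2:.3f}")
print(f"OR per {hypothetical_so2_increment_ppb:g} ppb SO2: {or_so2:.3f}")
```

The same exponent can be applied to the confidence limits. The rescaling is only as good as the log-linearity assumption, which the reviewed studies do not establish.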