Case-Crossover Analysis of Air Pollution Health Effects: A Systematic Review of Methodology and Application

Background Case-crossover is one of the most used designs for analyzing the health-related effects of air pollution. Nevertheless, no one has reviewed its application and methodology in this context. Objective We conducted a systematic review of case-crossover (CCO) designs used to study the relationship between air pollution and morbidity and mortality, from the standpoint of methodology and application. Data sources and extraction A search was made of the MEDLINE and EMBASE databases. Reports were classified as methodologic or applied. From the latter, the following information was extracted: author, study location, year, type of population (general or patients), dependent variable(s), independent variable(s), type of CCO design, and whether effect modification was analyzed for variables at the individual level. Data synthesis The review covered 105 reports that fulfilled the inclusion criteria. Of these, 24 addressed methodological aspects, and the remainder involved the design’s application. In the methodological reports, the designs that yielded the best results in simulation were symmetric bidirectional CCO and time-stratified CCO. Furthermore, we observed an increase across time in the use of certain CCO designs, mainly symmetric bidirectional and time-stratified CCO. The dependent variables most frequently analyzed were those relating to hospital morbidity; the pollutants most often studied were those linked to particulate matter. Among the CCO-application reports, 13.6% studied effect modification for variables at the individual level. Conclusions The use of CCO designs has undergone considerable growth; the most widely used designs were those that yielded better results in simulation studies: symmetric bidirectional and time-stratified CCO. However, the advantages of CCO as a method of analysis of variables at the individual level are put to little use.


Review
The first epidemiologic studies on the impact of air pollution on health were undertaken as a consequence of the extreme pollution episodes that took place in the decades from 1930 to 1960. The association between air pollution and certain health variables was made clear by simple graphic representations or by comparisons of mortality rates for these time periods (Firket 1931; Logan 1953). Since that time, air pollution levels have fallen substantially, such that longer time series are required to evaluate their effects on health. To this end, in the 1970s epidemiologists began to use dynamic regression models, in which the relationship between the dependent and explanatory variables was distributed over time rather than being expected to occur simultaneously. Moreover, investigators were able to control for residual autocorrelation, with the error being specified by means of autoregressive integrated moving-average (ARIMA) models. The problem with these types of models is that they assume that the dependent variable is normally distributed, which is extremely rare for daily counts of morbidity and mortality events (Saez et al. 1999).
The early 1990s saw the appearance of linear models based on Poisson regression, in which a parametric approach was used to control for trend and seasonality because the event counts more typically have a Poisson distribution. These models use the variable "time" and its transforms, quadratic and sinusoidal functions (sine or cosine) of different frequency and amplitude, to control for the effect on the dependent variable (mortality or morbidity) of unmeasured variables that may vary seasonally, such as pollen concentration, meteorological variables, and influenza outbreaks, or that may have a trend, such as changes in a city's population distribution, in order to ascertain the effect of such variables on the dependent variable (Saez et al. 1999). Insofar as changes in a city's population pyramid are concerned, Poisson regression is particularly useful when only cases, rather than the entire population, can be enumerated, because this form of regression analysis does not require knowledge of the denominator as long as population flux is in steady state (Loomis et al. 2005).
Nevertheless, Poisson regression poses the problem that, if any of these unmeasured variables follows a cyclical component of varying frequency and width (as might be the case of pollen concentration or influenza), the parametric functions of time or of its sinusoidal transforms cannot be easily "adapted" to such changes. These limitations led to the development of nonparametric Poisson regression with the application of generalized additive models (GAMs) that use nonparametric functions of the variable "time" (Kelsall et al. 1997), which adapt flexibly to the irregular cyclic components of unmeasured variables and allow for flexible fits for important variables, such as temperature, barometric pressure, and relative humidity, thus reducing any potential confounding due to these factors.
volume 118 | number 8 | August 2010 • Environmental Health Perspectives
One difficulty with this method is that the number of degrees of freedom of the smoothed nonparametric function must be specified by the researcher, with discrepancies arising as to the most appropriate way to calculate this. Because inappropriate determination of the number of degrees of freedom can lead to bias in the estimates of nonparametric Poisson designs, epidemiologists focused on the case-crossover (CCO) design, which purported to control time trends. The CCO design was proposed by Maclure (1991) to identify risk factors of acute events; it is characterized by the fact that each subject serves as his or her own control by assessing referent exposure at a point in time prior to the event. By virtue of its design, this type of study controls for the influence of confounding variables that remain constant in the subject at both dates, that of the event and that of the referent time, such as sex, smoking history, occupational history, and genetics. This design was initially used to assess the effect of exposures measured at an individual level (e.g., telephone calls and traffic accidents; physical or sexual activity and acute myocardial infarction) and was not applicable to exposures with a time trend, such as air pollution. Thus, if an investigator selected exposure control dates before the event, and there was a trend, prior exposures would be systematically higher or lower than at the date of the event. To circumvent this bias, Navidi (1998) developed a variant of this design, bidirectional CCO, which is conceptually characterized by having control time periods before and after the event, something that made it possible to control for the effect of long-term trend and seasonality on the variable "exposure."
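The trend argument above can be made concrete with a small numerical sketch. It assumes a purely linear long-term trend in exposure; the function names, baseline, and slope are hypothetical and chosen only for illustration. With a referent taken 7 days before the event, the case-referent contrast equals 7 times the slope even when the pollutant has no effect, whereas referents placed symmetrically before and after the event cancel the trend.

```python
def exposure(day, baseline=50.0, slope=0.5):
    """Hypothetical pollutant level that follows a purely linear long-term trend."""
    return baseline + slope * day


def referent_contrast(event_day, offsets):
    """Event-day exposure minus the mean exposure over the referent days."""
    referents = [exposure(event_day + k) for k in offsets]
    return exposure(event_day) - sum(referents) / len(referents)


# Unidirectional: a single referent 7 days before the event.
unidirectional = referent_contrast(100, offsets=[-7])
# Bidirectional (symmetric): referents 7 days before and 7 days after the event.
bidirectional = referent_contrast(100, offsets=[-7, 7])

print(f"unidirectional contrast: {unidirectional:.2f}")  # 3.50, i.e. 7 * slope
print(f"bidirectional contrast:  {bidirectional:.2f}")   # 0.00, trend cancels
```

The cancellation is exact only for a linear trend; with seasonality or other curvature the symmetric contrast is reduced rather than eliminated, which is one reason the spacing of referent days matters.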
This design was already appropriate for ecologic-type exposures, such as air pollution, because the existence of registries means that the values of such exposure can be ascertained even after the event. In addition, pollution values are not affected by the presence of prior morbidity and mortality events. In the CCO design, the referent time periods represent the counterfactual exposure experience of the individual, had he or she not become sick; because in air pollution pre- and postevent exposure values are independent of the hazard-period exposure, those that are postevent referent can be appropriate. One advantage of CCO design over Poisson regression is its ability to assess potential effect modification (i.e., statistical interaction) at the individual level rather than at the group level (Figueiras et al. 2005). As an alternative analytic methodology to Poisson regression, the CCO approach allows for direct modeling of interaction terms, rather than depending on multiple subgroup analyses (Figueiras et al. 2005).
We conducted a systematic review of the CCO design used to study the relationship between air pollution and morbidity and mortality, from both a methodologic and an applied standpoint.

Materials and Methods
We conducted a bibliographic search in January 2009 using the MEDLINE (National Library of Medicine, Bethesda, MD, USA) and EMBASE (Elsevier, New York, NY, USA) databases and the key words case-crossover* and pollution*; the time frame was 1999 through 2008. From the total number of papers, we selected a series of reports based on the language used and the topic addressed in the title and/or abstract, thereby eliminating all that were not written in English or Spanish or that did not address the subject targeted for study. All the reports chosen in this way were reviewed, and additional reports were selected from among those cited in the respective references.
The reports retrieved were classified into two major groups: methodology reports in which new CCO designs were described or existing designs compared, generally by means of simulation studies, and application reports, in which some CCO design was applied for the purpose of analyzing the relationship between air pollution and health.
The methodology reports were in turn classified into those that conducted simulation studies to compare CCO designs with one another or with other designs, such as Poisson time-series, and those that described theoretical aspects pertaining to CCO design.

Results
Our review of methodological aspects revealed a trend in CCO bidirectional designs with regard to the choice of control periods (Table 1). The main bidirectional CCO designs, in chronological order of appearance, were as follows: a) full-stratum CCO, one of the designs initially proposed by Navidi (1998), in which all the days of the series  except that of the event were taken as controls; b) random matched-pair CCO, which was also proposed by Navidi (1998) and consisted of taking any day of the series before or after the event, at random; c) symmetric CCO, proposed by Bateson and Schwartz (1999), which consisted of taking 2 days of the series as the controls, one before and one after the event, equidistant from the latter; d) time-stratified CCO, a design proposed by Lumley and Levy (2000), consisting of taking as control one or more days falling within the same time stratum as that in which the event occurred; for example, if "month" is established as the time stratum and the event occurs on, say, a Monday, then this is compared with all the Mondays in that same month; and e) semisymmetric CCO, proposed by Navidi and Weinhandl (2002), which consists of randomly choosing as control only one of the two controls used by symmetric CCO.
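The referent-selection rules listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from any of the cited reports: all function names are hypothetical, the time stratum is taken to be the calendar month combined with the day of the week, the lag for the symmetric designs defaults to 7 days, and referents that fall outside the available series are not handled.

```python
import random
from datetime import date, timedelta


def symmetric(event, lag=7):
    """One referent before and one after the event, equidistant from it."""
    return [event - timedelta(days=lag), event + timedelta(days=lag)]


def semisymmetric(event, lag=7, rng=random):
    """Randomly keep only one of the two symmetric referents."""
    return [rng.choice(symmetric(event, lag))]


def time_stratified(event):
    """All other days sharing the event's month and day of the week."""
    day = event.replace(day=1)
    referents = []
    while day.month == event.month:
        if day.weekday() == event.weekday() and day != event:
            referents.append(day)
        day += timedelta(days=1)
    return referents


def full_stratum(event, series_start, series_end):
    """Every day of the series except the event day."""
    n_days = (series_end - series_start).days + 1
    days = (series_start + timedelta(days=i) for i in range(n_days))
    return [d for d in days if d != event]


def random_matched_pair(event, series_start, series_end, rng=random):
    """A single day drawn at random from the rest of the series."""
    return [rng.choice(full_stratum(event, series_start, series_end))]


event = date(2008, 3, 17)  # a Monday
print(symmetric(event))        # 10 and 24 March 2008
print(time_stratified(event))  # the other Mondays of March 2008
```

For the Monday used here, the time-stratified referents are 3, 10, 24, and 31 March, which include both days chosen by the symmetric design with a 7-day lag.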
Simulation studies compare model predictions based on repeated samples drawn from a data set that represents the entire population of interest and for which true values are known because they were determined by the investigator when the data set was created in order to represent a scenario of interest. They compare the performance (manner of functioning or operating) of different CCO designs based on such indicators as efficiency (with relative increases in variance or standard error indicating less efficiency), bias (the difference between the model-estimated value and the true value of the parameter being estimated), and coverage (the proportion of replicates whose 95% confidence intervals include the true value of the coefficient). Simulation studies yielded the following results, in chronologic order (summarized in Table 2). Navidi (1998), in a simulation scenario based on real data for particulate matter (PM) with aerodynamic diameter ≤ 10 μm (PM10) and an unmeasured confounding variable that generated a long-term trend, compared unidirectional with bidirectional full-stratum CCO designs and observed that the bidirectional design resulted in less bias. Bateson and Schwartz (1999), in a simulation scenario based on real PM10 data and an unmeasured confounding variable that generated long-term trend and seasonality (short-term trends), compared the Poisson time-series regression design against different CCO designs, such as unidirectional, full-stratum, random matched-pair, and symmetric, with control periods ranging from 1 to 4 weeks before and after the event. The results of this simulation showed that, whereas the symmetric CCO design performed best in terms of bias, it nevertheless displayed a lower efficiency (66%) than did the Poisson time-series designs.
Lumley and Levy (2000) compared symmetric with time-stratified CCO designs in a simulation scenario based on real black smoke data and an unmeasured confounding variable that generated long-term trend and seasonality; they observed better performance with the time-stratified CCO design, although both displayed a small degree of bias. Lee et al. (2000), in a simulation scenario based on real mortality data and an unmeasured confounding variable that generated seasonality, compared unidirectional design with symmetric CCO and found that the latter performed better, although bias increased when the number of seasonality waves was incomplete. Bateson and Schwartz (2001) set out to study the best distance at which to use control days in symmetric CCO design, in a scenario with trend and seasonality, in which all the variables were simulated. They studied control days ranging from 1 to 28 days before and after the event and observed that confounding was minimized when the spacing was equal to the period of exposure. Levy et al. (2001a), in a simulation scenario based on real black smoke data and an unmeasured confounding variable that generated long-term trend but no seasonality, compared unidirectional with symmetric design, using different numbers of control periods at different intervals from the event period. They also examined the influence of autocorrelation (correlation of a temporal series variable with its own previous or posterior values) between control periods and of overlapping (bias resulting from the use of incorrect referent periods), and concluded that the symmetric CCO design performed better, with less bias when the distance of the control periods from the event was 7 days and when autocorrelation and overlapping were avoided.
Navidi and Weinhandl (2002) conducted a simulation in a scenario based on real PM10 data and an unmeasured confounding variable that generated long-term trend and seasonality, in which they compared Poisson time-series design with the following CCO designs: symmetric with control periods separated by 7 days with respect to the case date, semisymmetric with the control period separated by 7 days with respect to the case date, random matched pair, and full-stratum. They concluded that the semisymmetric design performed best. Fung et al. (2003) conducted a simulation in a scenario based on real data for PM with aerodynamic diameter ≤ 2.5 μm (PM2.5) and an unmeasured confounding variable that generated long-term trend and seasonality, in which they compared Poisson time-series design against unidirectional, symmetric, and semisymmetric CCO designs. They concluded that, although the symmetric design displayed a better performance in terms of bias than did the other CCO designs studied, it was nonetheless similar to that of the Poisson time-series design, which showed better coverage and statistical power thanks to its greater efficiency. Figueiras et al. (2005), in a simulation scenario based on real PM10 data and an unmeasured confounding variable that could generate long-term trend and seasonality, compared the Poisson time-series design with a number of CCO designs: symmetric, semisymmetric, time stratified, full symmetric (14 control periods before and after event) analyzed by longitudinal designs, and full semisymmetric (seven control periods before and after event) analyzed by longitudinal designs. They reported that the full semisymmetric design displayed the least bias together with the best coverage and statistical power but proved unstable when the beta value (strength of association between the pollutant and the event) varied with respect to the usual values.
Although semisymmetric CCO displayed fewer biases than did symmetric or time-stratified CCO (both of which yielded similar results), it suffered from the drawback of having a lower statistical power.
It is particularly interesting to note that three of these simulation studies (Bateson and Schwartz 1999; Figueiras et al. 2005; Navidi and Weinhandl 2002) generated data for simulations using the same equations to determine trend and seasonality, before going on to use different real pollution data, such that comparable scenarios were investigated by each set of investigators.
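The performance indicators used to compare designs in these simulation studies (bias, coverage, and efficiency) can be computed from the replicate estimates as sketched below. This is a generic illustration, not the procedure of any cited study: the function names are hypothetical, the replicates are drawn from normal distributions purely to have something to summarize, and the two "designs" with standard errors of 0.010 and 0.012 are invented values.

```python
import random
import statistics


def summarize(estimates, std_errors, true_beta):
    """Bias, empirical variance, and 95% CI coverage across simulation replicates."""
    bias = statistics.mean(estimates) - true_beta
    variance = statistics.variance(estimates)
    covered = sum(
        est - 1.96 * se <= true_beta <= est + 1.96 * se
        for est, se in zip(estimates, std_errors)
    )
    return {"bias": bias, "variance": variance, "coverage": covered / len(estimates)}


def relative_efficiency(reference, candidate):
    """Variance of the reference design over that of the candidate (<1: candidate is less efficient)."""
    return reference["variance"] / candidate["variance"]


# Invented replicates for two hypothetical designs sharing the same true coefficient.
rng = random.Random(0)
true_beta = 0.05
design_a = [rng.gauss(true_beta, 0.010) for _ in range(1000)]
design_b = [rng.gauss(true_beta, 0.012) for _ in range(1000)]

summary_a = summarize(design_a, [0.010] * 1000, true_beta)
summary_b = summarize(design_b, [0.012] * 1000, true_beta)
print(summary_a)
print(summary_b)
print("relative efficiency of B versus A:", relative_efficiency(summary_a, summary_b))
```

In a real simulation study the estimates and standard errors would come from fitting each design to every replicate data set rather than from a random number generator.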
In a separate study, Peters et al. (2006) analyzed a real database by means of a CCO and an alternative design (Poisson time-series design or Cox regression analysis) and then compared the results, observing that the time-stratified CCO design yielded results and conclusions similar to those of the Poisson time-series design and Cox regression analysis.
CCO studies of the relationship between pollution and health. CCO designs are increasingly being applied to the task of analyzing the relationship between air pollution and its short-term effects on health (Figure 2). Tables 3-5 provide a detailed description of the studies published to date.
The reports published by Lee and Schwartz (1999) and Neas et al. (1999) were the first studies to report the relationship between air pollution and health using a CCO design. These studies performed a reanalysis of the effects of air pollution on mortality in the cities of Seoul and Philadelphia, respectively, and obtained a relationship that proved statistically significant. These results are similar to those previously obtained with the Poisson time-series design and thus strengthen the case for causality, inasmuch as the same relationship was observed when different statistical methods were applied. Analysis of which CCO designs were most commonly used in the published reports showed that 7.7% of these were unidirectional and the remainder bidirectional. The most frequently used bidirectional designs were symmetric (42.2% of studies) and time stratified (48.9% of studies). The semisymmetric bidirectional design was used in only one study. Figure 2 depicts the time trend in the use of the different CCO designs. Although unidirectional designs were used in the initial period, they were gradually discarded. Most of the published studies used a 1-day control period, but six studies used a 1-hr control period.
Notes to the table of simulation studies (only the last row survives: Figueiras et al. 2005; Yes c; Yes c; PM10; S; Barcelona 1995-1997). -, simulation site only. a BS, black smoke; PM2.5, PM with aerodynamic diameter ≤ 2.5 µm; PM10, PM with aerodynamic diameter ≤ 10 µm; S, simulated. b S, simulated (variable generated mathematically on the basis of other variables that enter into the simulation); C, created (variable generated artificially, although not on the basis of other variables that enter into the simulation). c The simulations by Bateson and Schwartz (1999), Navidi and Weinhandl (2002), and Figueiras et al. (2005) share the same simulation scenario, in the sense that these authors use the same equation to generate trend and seasonality in the data series.
Most of the studies that employed symmetric CCO designs used day 7 before and after the event as the control days (n = 23), although a variety of other schemes were also used (Table 3). Studies that used time-stratified CCO typically selected a control day on the same day of the week during the same month as the event, although other schemes (e.g., selecting days during the same month with comparable temperature) were also used (Table 4). Studies that used unidirectional CCO designs used a variety of schemes to select control days (e.g., day 7 before the event) (Table 5).
The dependent variables studied were mortality related in 25 cases and morbidity related in the remainder: hospital admissions in 35 studies, hospital emergencies in 7 studies, episodes of arrhythmias recorded in pacemakers in 5 studies, telephone calls to medical emergencies in 2 studies, and others based on disease-specific registers, such as stroke (1 study), cardiac arrest (3 studies), and ischemic heart disease (2 studies).
In 77 studies, the air pollutant analyzed was particulate level, mostly measured as PM10 (61 studies), followed by PM2.5 (22 studies), black smoke (11 studies), haze coefficient (3 studies), total suspended PM (4 studies), sulfate particles (1 study), and PM with aerodynamic diameter < 7 μm (1 study). Insofar as gaseous air pollutants were concerned, sulfur dioxide was used in 47 studies, nitrogen dioxide in 48, ozone in 44, carbon monoxide in 43, and oxides of oxygen (Ox), oxides of nitrogen (NOx), and nitrogen oxide in 1 study each.
In most cases, the general population was studied. Patients were studied in only 9 studies: cardiac pacemaker carriers in 5, chronic obstructive pulmonary disease patients in 2, and asthma and heart failure patients in 1 study each.
Of all the studies that addressed application of CCO designs, 11 (13.6%) made use of analysis of effect modification of variables at the individual level.
Common steps and requirements for CCO study designs. The procedures followed in conducting a study into the relationship between air pollution and health, taking all reports on CCO design methodology and application into account, are outlined in the Appendix.
In brief, CCO studies begin by confirming that data meet a series of necessary requisites and end with a sensitivity analysis, after passing through a series of intermediate steps that include the transformation of the database into a matrix with CCO structure.

Discussion
This is the first systematic review to cover the application of CCO designs to the study of the health effects of air pollution. Use of CCO designs has risen steeply in recent years and from 2003 in particular, reaching a peak in 2006. Most of the new CCO designs that gradually appeared were based on simulation studies, which in many cases neither relied on the same scenarios nor assessed performance for variables with special characteristics, for example, discontinuous exposures. Most application studies have tended to study the effect of particulates on morbidity, yet few studies have taken advantage of the strength of CCO designs to assess potential effect modifications with individual variables.

CCO versus Poisson.
The increase in the use of the CCO design appears to coincide with problems in using Poisson regression models with GAMs: in 2002, Dominici et al. (2002) discovered that the most frequently used statistical packages gave rise to unstable estimators because of inadequate convergence criteria, and standard errors could also be underestimated because of the presence of concurvity in the data (Ramsay et al. 2003).
Table notes on control-period schemes and abbreviations. =month, =weekday: all the days in the same month as the case that fell on the same day of the week; =month, =weekday, =hour: hours that coincide with those of the case, on days in the same month as the case that fell on the same day of the week; =month, days =temperature: days in the same month as the case date and having a temperature equal to that of the case date; =month, all days but 2 days between: all days in the same month as that of the case except 2 days between each control day. d BS, black smoke; CO, carbon monoxide; NO2, nitrogen dioxide; O3, ozone; PM10, PM with aerodynamic diameter ≤ 10 µm; PM2.5, PM with aerodynamic diameter ≤ 2.5 µm; SO2, sulfur dioxide; TSP, total suspended PM. e Ep, episode; ETC, emergency telephone calls; HA, hospital admission; HE, hospital emergency; M, mortality.
In part, the CCO design represents a solution to the problems posed by GAM methods, but a period of time is required before it can become generalized. For instance, we observed no marked increase in the use of these designs until some years after the discovery of GAM-related problems; a peak in use occurred 2 years after the discovery of the problems of concurvity (the analog of collinearity for nonlinear relationships). Currently, other (e.g., geographic) methods are also being used to analyze the link between air pollution and health.

Different CCO designs and their evolution.
We observed an ongoing effort to perfect the CCO design dating from the initial unidirectional design up to the bidirectional designs with their subtypes. Successive simulation studies have focused on studying the designs that yielded the best results in previous simulations. Symmetric bidirectional CCO and time-stratified CCO most often proved to be best in different simulations. In contrast, the semisymmetric design yielded contradictory results: in some simulation studies it proved better than the symmetric design, but other studies gave opposite results (Fung et al. 2003), which could be due to differences in the simulation scenario. One consistent finding, however, is that the statistical efficiency of semisymmetric CCO is low compared with that of the symmetric or time-stratified CCO methods.
The rapid adoption of symmetric and time-stratified CCO designs is noteworthy, in that these began to be applied in the very same year in which their methodology was first proposed in the scientific literature. In contrast, the semisymmetric CCO design was first proposed in 2002, yet the first report in which it was used to analyze the relationship between air pollution and health was published in 2004.
One possible explanation for the fact that different designs are used in practice is that they were discovered at different points in time: unidirectional were described before bidirectional methods, and within bidirectional methods, symmetric was described before time-stratified CCO. Unidirectional methods are being used less frequently because of important disadvantages, such as poor control of trends.
Of the three bidirectional methods, semisymmetric is used very little because of its negligible statistical power. Symmetric and time-stratified designs had a similar percentage of use, with a trend toward greater use of time-stratified designs, possibly because, from a theoretical point of view, they solve the "overlap bias" that symmetric designs otherwise display. However, simulation studies are not conclusive when it comes to comparing time-stratified with symmetric designs; for example, in their simulation study, Lumley and Levy (2000) reported that the time-stratified method was superior, but Figueiras et al. (2005) did not find this method to be better than the symmetric CCO.
The fact that the CCO designs most often used to analyze the relationship between air pollution and health are symmetric and time stratified, plus the rapid adoption of these same two models (they began to be used in the same year as they were proposed in the literature), together indicate that there is an interest in the correct application of this methodology. The control periods most frequently used for the symmetric design are 7 days before and after the case day, and for the time-stratified design, the control periods are all days within the same month that fall on the same day of the week as the case. Thus, these two approaches prevent problems of autocorrelation and control for the effect of day of the week.
Interpretation of application studies. In studies that use the CCO design to analyze the relationship between air pollution and health, the most frequently used outcome is that of hospital admissions. The greater use of hospital admissions than mortality as an outcome may be because, on the one hand, the hospital admission variable entails a greater number of events, thereby affording greater statistical power, and on the other hand, the time period from exposure until the event is shorter for hospital admissions than for mortality, thereby requiring a smaller number of lags, thus facilitating statistical analysis (American Thoracic Society 1985). The type of pollutant most frequently analyzed with CCO designs is airborne particulates, possibly because these have been widely studied and because exposure data are readily available. In terms of type of population, these studies seldom target diseased populations but focus instead on general populations, possibly because of the difficulty of obtaining records for a specific disease population (Filleul et al. 2004).

Lessons learned and new challenges.
Although the application of nonparametric Poisson models amounted to a great advance over earlier designs, enabling more flexible control of unmeasured confounding variables that change over time, the problems detected, such as the difficulty in setting the number of degrees of freedom, seem to have heightened interest in other alternatives, such as CCO. These approaches make it possible to control for the influence of trend and seasonality by design. Initially, these designs resulted in [...] There are no known study characteristics that would favor using one referent period over another, because the heterogeneity of the simulation studies in terms of their scenarios and results renders it impossible to draw any conclusion in this regard. Likewise, simulation studies have tended to concentrate on PM, and no simulation study has assessed the behavior of CCO designs with discontinuous exposures (e.g., a high-ozone day). In this type of exposure, where high proportions of cases and controls assume a value of zero, Poisson time series might, from a theoretical point of view, perform better than CCO methods, because CCO comparisons are made within the same person and strata in which the case and control periods have the same exposure value contribute no information when analyzed with conditional logistic regression. However, we are not aware of any simulation studies that have tested whether this assumption has any relevance in practice.
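The theoretical point about discontinuous exposures can be checked directly from the conditional logistic likelihood. The sketch below uses a hypothetical function name and a binary exposure (1 = high-pollution day, 0 = otherwise) with two referent periods. It shows that a stratum in which the case and referent periods share the same exposure value contributes the same log-likelihood, namely minus the logarithm of the number of periods, whatever the coefficient, so such strata carry no information about the association.

```python
import math


def stratum_loglik(beta, case_x, referent_xs):
    """Conditional logistic log-likelihood contribution of one case-crossover stratum."""
    all_x = [case_x] + list(referent_xs)
    return beta * case_x - math.log(sum(math.exp(beta * x) for x in all_x))


betas = (-1.0, 0.0, 1.0)

# Concordant stratum: case and both referent periods unexposed.
concordant = [stratum_loglik(b, 0.0, [0.0, 0.0]) for b in betas]
# Discordant stratum: case period exposed, both referent periods unexposed.
discordant = [stratum_loglik(b, 1.0, [0.0, 0.0]) for b in betas]

print("concordant:", concordant)  # identical for every beta: -log(3)
print("discordant:", discordant)  # changes with beta
```

When a large share of strata are concordant, the effective sample size of the CCO analysis is therefore smaller than the number of events suggests.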
Theoretically, one of the great advantages of CCO designs is that individual data can be included to estimate effect modifications, but in practice most CCO-based studies on the relationship between air pollution and health do not analyze effect modification at the individual level. The scant use of this advantage might be due to the lack of availability of data at this level (Filleul et al. 2004).
Furthermore, thanks to the CCO design, we have more scientific evidence of the short-term association between air pollution and health, because at times reanalyses using CCO methodology have been run on data previously analyzed with Poisson methods, and similar results have been obtained (Lee and Schwartz 1999).
One possible challenge is the application of mixed models to the analysis of CCO designs, something that, on the one hand, could furnish greater statistical power and, on the other, could extend CCO designs to spatial-temporal models. Figueiras et al. (2005) attempted to apply longitudinal models to CCO designs but observed that, in the presence of autocorrelation, estimates might be biased. New approaches in this field could solve these problems.
From the standpoint of statistical analysis, Lu et al. (2008) have proposed that CCO models should be checked to see whether the assumptions for using CCO methodology are satisfied, via a series of diagnostic tools such as plotting the data. In practice, however, we have detected no CCO study on the relationship between air pollution and health that checked the models. Furthermore, there are no formulas for calculating sample size (or statistical power) in CCO designs, and indeed, one study (Symons et al. 2006) applied a simulation to calculate the lower bound of detectable effects. A possible risk of CCO designs lies in "model shopping," whereby multiple analyses are performed using different designs, and only the most interesting are then shown (Mittleman 2005). This problem

Appendix: Applying CCO Designs to Study the Relationship between Air Pollution and Health
The steps to be followed to conduct a study into the relationship between air pollution and health, taking all reports on CCO design methodology and application into account, can be summarized as follows: 1. Confirm that the study variables meet the conditions for being able to study the association using a CCO: a. Exposure variables must be transitory (prolonged exposures such as radon would not be valid). b. Event variables must be acute (events such as cancer would not be valid). c. Proportion of missing data must be small. 2. The databases obtained can be classified into one of the following types: a. Contain only ecologic temporal cluster data. b. Contain ecologic temporal and spatial cluster data. c. Individual data available-this enables effect modification to be subsequently studied at the level of variables having characteristics pertaining to individuals. 3. For exposure variables, compute the individual (0, 1, 2, 3) or combined lags (0, 1, 2-3 . . .) depending on the nature of the dependent variable (longer lags are needed for mortality than for morbidity variables). 4. Transform the database into a matrix with a CCO structure, that is, with as many strata as there are events, and in each stratum there is a case period that would be formed by exposure at the time of the event (or the corresponding lag) and one (or more) control periods that would be formed by exposure in the periods selected as controls (e.g., in a symmetric CCO design, these could be day 7 before and after the event). For an ecologic database consisting solely of temporal cluster data, calculations are simplified because: a. There are macros in S-Plus that transform an ecological matrix into a symmetric CCO, semisymmetric CCO, or time-stratified CCO (these may be requested from the corresponding author). b. There is the possibility of conducting CCO studies using an ecologic matrix, with weighting for the daily number of events in the regression models. 
The advantages are that transformation into a CCO matrix is not necessary, the size of the database is smaller, and computing time is shorter. 5. To relate dependent and independent variables, perform the statistical analysis according to the following steps: a. Construct a baseline model by introducing variables, such as temperature, ambient humidity, and atmospheric pressure. For these types of environmental variables, nonlinear risk exposure relationships might have to be checked. For the purpose, use can be made of different smoothers, such as natural splines, penalized splines, or smoothing splines. To decide whether a variable is retained in the model or the number of degrees of freedom of the smooth function, use the minimization criterion of the Akaike information criterion (Figueiras and Cadarso-Suarez 2001). b. Construct the single-pollutant models by adding the pollutants to the baseline model. c. Construct the multipollutant models by adding those pollutants to the baseline model that have obtained a given p-value in the single-pollutant model. d. Analyze possible effect modification by reference to the statistical significance of the interaction term. e. Analyze statistical power (Symons et al. 2006). 6. Check the models according to the method proposed by Lu et al. (2008). 7. Conduct a sensitivity analysis by analyzing the models using another type of CCO design or even a Poisson time series. 8. Report the results obtained.
can be solved, in part, by means of a sensitivity analysis, in which the authors show the results obtained with different CCO methods, and even compare the results against a generalized linear model with a Poisson response.
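Steps 3 and 4 of the Appendix (computing lags and building the CCO structure) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the S-Plus macros mentioned in the Appendix: the function names, the dictionary-based exposure series, and the choice of calendar month plus day of week as the time stratum are all assumptions made for the example, and the toy exposure values are invented.

```python
from datetime import date, timedelta


def symmetric_bidirectional_controls(event_day, gap=7):
    """Control days `gap` days before and after the event (gap=7 is an assumption)."""
    return [event_day - timedelta(days=gap), event_day + timedelta(days=gap)]


def time_stratified_controls(event_day):
    """Other days in the same calendar month that fall on the event's weekday.

    Month-by-weekday strata are one common choice, assumed here for illustration.
    """
    controls = []
    day = event_day.replace(day=1)
    while day.month == event_day.month:
        if day != event_day and day.weekday() == event_day.weekday():
            controls.append(day)
        day += timedelta(days=1)
    return controls


def lagged_exposure(series, day, lags=(0,)):
    """Mean exposure over the given lags; returns None if any value is missing.

    lags=(0,) is the same-day value, lags=(2, 3) is the combined lag 2-3.
    """
    values = [series.get(day - timedelta(days=lag)) for lag in lags]
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def build_cco_rows(event_days, series, select_controls, lags=(0,)):
    """One stratum per event: a case row plus its control rows."""
    rows = []
    for stratum, event_day in enumerate(event_days):
        case_exposure = lagged_exposure(series, event_day, lags)
        if case_exposure is None:
            continue  # a stratum without case exposure carries no information
        rows.append({"stratum": stratum, "day": event_day,
                     "case": 1, "exposure": case_exposure})
        for control_day in select_controls(event_day):
            exposure = lagged_exposure(series, control_day, lags)
            if exposure is not None:  # drop controls with missing exposure
                rows.append({"stratum": stratum, "day": control_day,
                             "case": 0, "exposure": exposure})
    return rows


# Toy daily pollutant series for January 2024 (invented values: day of month).
series = {date(2024, 1, 1) + timedelta(days=i): float(i + 1) for i in range(31)}
events = [date(2024, 1, 17)]

rows_ts = build_cco_rows(events, series, time_stratified_controls, lags=(0, 1))
rows_sym = build_cco_rows(events, series, symmetric_bidirectional_controls)
```

The resulting rows have the structure expected by a conditional logistic regression routine, with `stratum` as the matching variable. Passing a different `select_controls` function is one way to run the sensitivity analysis across CCO designs described in step 7.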
Limitations of our review. In assessing the reports that use effect modification with individual data, we encountered difficulties regarding use of different terminologies: some used the term "modification" to classify what is in reality "stratification into subgroups"; others referred to stratification but did not clarify whether different statistical models were used for each group of subjects of the variable "stratification," or whether an interaction term was introduced into the model to assess effect modification. Furthermore, as with any systematic review, publication bias may be present.
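To make the distinction between the two approaches concrete, both can be written out for a conditional logistic CCO model. This is an illustrative sketch rather than a formulation taken from any of the reviewed reports. The notation is assumed here: $x_{it}$ is the exposure for event $i$ in period $t$, $t_i$ is the case period, $W_i$ is the set formed by the case period and its control periods, and $z_i$ is a binary individual-level characteristic.

```latex
% Interaction approach: a single model with a product term between x and z
L(\beta, \gamma) = \prod_{i}
  \frac{\exp\{(\beta + \gamma z_i)\, x_{i t_i}\}}
       {\sum_{t \in W_i} \exp\{(\beta + \gamma z_i)\, x_{i t}\}}

% Stratified approach: a separate model fitted within each subgroup g
L_g(\beta_g) = \prod_{i:\, z_i = g}
  \frac{\exp(\beta_g\, x_{i t_i})}
       {\sum_{t \in W_i} \exp(\beta_g\, x_{i t})},
  \qquad g \in \{0, 1\}
```

Because $z_i$ is constant within each stratum, its main effect cancels from the likelihood and only the product term remains. In this simplest case, with no other covariates, the two forms are reparametrizations of each other ($\beta_0 = \beta$, $\beta_1 = \beta + \gamma$), but the interaction form also yields a direct test of $\gamma = 0$. Once other covariates such as temperature are added, separate models allow every coefficient to differ between subgroups, whereas the interaction model holds those coefficients in common.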

Conclusions
The CCO design could be an attractive alternative to Poisson time-series analysis with GAM, but its advantages and drawbacks are still in the process of being understood. The use of CCO designs to study the relationship between air pollution and health has experienced a great upsurge, but with few exceptions, full advantage has not been taken of the design in terms of effect modification or spatial-temporal analyses. Moreover, although a number of simulations have been conducted to study the performance of CCO designs, their performance with discontinuous exposures, such as ozone, remains to be studied. A further, very important challenge would be to undertake an in-depth longitudinal analysis of CCO designs, which would enhance their statistical power and enable them to be applied to spatial-temporal models.