MatchIt: Nonparametric Preprocessing for Parametric Causal Inference

MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing data with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. MatchIt is an R program, and also works seamlessly with Zelig.


What MatchIt does
MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models and reducing model dependence by preprocessing data with semi-parametric and non-parametric matching methods. After appropriately preprocessing data with MatchIt, researchers can use whatever parametric model and software they would have used without MatchIt, without other modification, and produce inferences that are more robust and less sensitive to modeling assumptions. MatchIt reduces the dependence of causal inferences on commonly made, but hard-to-justify, statistical modeling assumptions via the largest range of sophisticated matching methods of any software we know of. The program includes most existing approaches to matching and even enables users to access methods implemented in other programs through its single, unified, and easy-to-use interface. In addition, we have written MatchIt so that adding new matching methods to the software is as easy for anyone with the inclination as it is for us.

Software requirements
MatchIt works in conjunction with the R programming language and statistical software (R Development Core Team 2011) and will run on any platform where R is installed (Windows, Unix, or Mac OS X). MatchIt is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=MatchIt and has been tested on the most recent version of R. A good way to learn R, if you do not know it already, is to learn Zelig, which includes a self-contained introduction to R and can be used to analyze the matched data after running MatchIt (Imai, King, and Lau 2011, 2008a).

Overview
MatchIt is designed for causal inference with a dichotomous treatment variable and a set of pretreatment control variables. Any number or type of dependent variables can be used. (If you are interested in the causal effect of more than one variable in your data set, run MatchIt separately for each one; it is unlikely that any one parametric model will produce valid causal inferences for more than one treatment variable at a time.) MatchIt can be used for other types of causal variables by dichotomizing them, perhaps in multiple ways (see also Imai and van Dyk 2004). MatchIt works for experimental data, but is usually used for observational studies where the treatment variable is not randomly assigned by the investigator, or the random assignment goes awry.
We adopt the same notation as in Ho, Imai, King, and Stuart (2007). Unless otherwise noted, let i index the n units in the data set, n_1 denote the number of treated units, n_0 denote the number of control units (such that n = n_0 + n_1), and x_i indicate a vector of pretreatment (or control) variables for unit i. Let t_i = 1 when unit i is assigned treatment, and t_i = 0 when unit i is assigned control. (The labels "treatment" and "control" and values 1 and 0 respectively are arbitrary and can be switched for convenience, except that some methods of matching are keyed to the definition of the treated group.) Denote y_i(1) as the potential outcome of unit i under treatment, the value the outcome variable would take if t_i were equal to 1, whether or not t_i in fact is 0 or 1, and y_i(0) as the potential outcome of unit i under control, the value the outcome variable would take if t_i were equal to 0, regardless of its value in fact. The variables y_i(1) and y_i(0) are jointly unobservable: for each i, we observe one y_i = t_i y_i(1) + (1 − t_i) y_i(0), and not the other.
Also denote a fixed vector of exogenous confounders (measured at pretreatment) as X_i. These variables are defined in the hope or under the assumption that conditioning on them appropriately will make the treatment assignment ignorable; see Ho, Imai, King, and Stuart (2007) for details. Measures of balance should be computed with respect to all of X, even if some methods of matching only use some components.

Preprocessing via matching
If t_i and X_i were independent, we would not need to control for X_i, and any parametric analysis would effectively reduce to a difference in means of Y for the treated and control groups. The goal of matching is to preprocess the data prior to the parametric analysis so that the actual relationship between t_i and X_i is eliminated or reduced, without introducing bias or losing too much efficiency.
When matching we select, duplicate, or selectively drop observations from our data, and we do so without inducing bias as long as we use a rule that is a function only of t_i and X_i and does not depend on the outcome variable Y_i. Many methods of this kind of preprocessing are included here, including exact, subclassification, nearest neighbor, optimal, and genetic matching. For many of these methods the propensity score, defined as the probability of receiving the treatment given the covariates, is a key tool. In order to avoid changing the quantity of interest, most MatchIt routines work by retaining all treated units and selecting (or weighting) control units to include in the final data set; this enables one to estimate the average treatment effect on the treated (the purpose of which is described in Section 2.3). MatchIt implements the rules for matching and evaluates the choices among them. Matching sometimes increases efficiency by eliminating heterogeneity or deleting observations outside of an area where a model can reasonably be used to extrapolate, but one needs to be careful not to lose too many observations in matching, or the loss of efficiency will outweigh the reduction in bias.
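Because the propensity score is so central to these methods, it may help to see one estimated outside of MatchIt. The following base-R sketch, using simulated data rather than anything from MatchIt itself, fits the standard logistic regression:

```r
# Sketch: estimating a propensity score with logistic regression.
# The data are simulated for illustration; MatchIt performs this
# step internally when a logit distance measure is used.
set.seed(1)
n  <- 200
x1 <- rnorm(n)                                   # continuous covariate
x2 <- rbinom(n, 1, 0.5)                          # binary covariate
t  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 + 0.6 * x2))
d  <- data.frame(t = t, x1 = x1, x2 = x2)

ps.model <- glm(t ~ x1 + x2, data = d, family = binomial)
d$pscore <- fitted(ps.model)                     # estimated Pr(t = 1 | x)
```

The matching methods described below then operate on this scalar score rather than on the full covariate vector.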
The simplest way to obtain good matches (as defined above) is to use one-to-one exact matching, which pairs each treated unit with one control unit for which the values of X i are identical. However, with many covariates and finite numbers of potential matches, sufficient exact matches often cannot be found. Indeed, many of the other methods implemented in MatchIt attempt to balance the overall covariate distributions as much as possible, when sufficient one-to-one exact matches are not available.
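To make the exact-matching idea concrete, here is a minimal base-R sketch (simulated data; not MatchIt's implementation) that keeps only covariate patterns appearing in both groups:

```r
# Sketch: the exact matching idea in base R. A covariate pattern is
# usable only if it contains at least one treated and one control unit;
# all other units are dropped. Simulated data for illustration only.
set.seed(2)
d <- data.frame(t  = rbinom(60, 1, 0.4),
                x1 = sample(0:1, 60, replace = TRUE),
                x2 = sample(1:3, 60, replace = TRUE))
pattern <- interaction(d$x1, d$x2, drop = TRUE)
keep    <- names(which(tapply(d$t, pattern,
                              function(z) any(z == 1) && any(z == 0))))
matched <- d[pattern %in% keep, ]
# Within every retained pattern, treated and control units now have
# identical covariate values.
```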
A key point in Ho, Imai, King, and Stuart (2007) is that matching methods are not themselves methods of estimation: every use of matching in the literature involves an analysis step following the matching procedure, and almost all analyses use a simple difference in means. That procedure is appropriate only if exact matching was conducted. In almost all other cases, some adjustment is required, and there is no reason to degrade your inferences by using an inferior method of analysis such as a difference in means even when improving your inferences via preprocessing. Thus, with MatchIt, you can improve your analyses in two ways. MatchIt analyses are "doubly robust" in that if either the matching procedure or the parametric analysis model is correct (but not necessarily both) your inferences will be statistically consistent. In practice, the modeling choices you make at the analysis stage will be much less consequential if you match first.

Checking balance
The goal of matching is to create a data set that looks closer to one that would result from a perfectly blocked (and possibly randomized) experiment. When we get close to attaining this goal, we break the link between the treatment variable and the pre-treatment controls, which makes the parametric form of the analysis model less relevant or irrelevant entirely. To break this link, we need the distribution of covariates to be the same within the matched treated and control groups.
A crucial part of any matching procedure is, therefore, to assess how close the (empirical) covariate distributions are in the two groups, which is known as "balance." Because the outcome variable is not used in the matching procedure, any number of matching methods can be tried and evaluated, and the one matching procedure that leads to the best balance can be chosen.
MatchIt provides a number of ways to assess the balance of covariates after matching, including numerical summaries such as the mean difference ("Mean Diff.") or the difference in means divided by the treated group standard deviation, and summaries based on quantile-quantile plots that compare the empirical distributions of each covariate. The widely used procedure of doing t-tests of the difference in means is highly misleading and should never be used to assess balance; see Imai, King, and Stuart (2008b). These balance diagnostics should be performed on all variables in X, even if some are excluded from one of the matching procedures.
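The standardized difference in means mentioned above is simple to compute directly; a minimal sketch follows (the helper name std.diff is ours, not a MatchIt function):

```r
# Sketch: standardized difference in means, i.e., the raw mean
# difference divided by the treated-group standard deviation.
std.diff <- function(x, t) {
  (mean(x[t == 1]) - mean(x[t == 0])) / sd(x[t == 1])
}

# Tiny worked example: treated values 1, 2; control values 3, 4.
std.diff(c(1, 2, 3, 4), c(1, 1, 0, 0))  # (1.5 - 3.5) / sd(c(1, 2)), about -2.83
```

Values near zero indicate good balance on that covariate.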

Conducting analyses after matching
The most common way that parametric analyses are used to compute quantities of interest (without matching) is by (statistically) holding some explanatory variables constant, changing others, and computing predicted or expected values and taking their difference or ratio, all by using the parametric functional form. In the case of causal inference, this would mean looking at the effect on the expected value of the outcome variable when changing T from 0 to 1, while holding the pretreatment control variables X constant at their means or medians. This, and indeed any other appropriate analysis procedure, would be a perfectly reasonable way to proceed with analysis after matching. If it is the chosen way to proceed, then either treated or control units may be deleted during the matching stage, since the same parametric structure is assumed to apply to all observations. In other instances, researchers wish to reduce the assumptions inherent in their statistical model and so want to allow for the possibility that their treatment effect varies over observations. In this situation, one popular quantity of interest is the average treatment effect on the treated (ATT). For the treated group, the potential outcomes under control, Y_i(0), are missing, whereas the outcomes under treatment, Y_i(1), are observed, and the goal of the analysis is to impute the missing outcomes Y_i(0) for observations with T_i = 1. We do this via simulation using a parametric statistical model such as linear regression, logit, or others (as described below). Each potential outcome imputed from the model is a predicted value of the dependent variable for unit i under the counterfactual condition where T_i = 0. The in-sample average treatment effect on the treated can then be obtained by averaging the difference between the observed Y_i(1) and the imputed Y_i(0) over all observations i where in fact T_i = 1.
Most MatchIt algorithms retain all treated units and choose some subset of, or repeated units from, the control group, so that estimating the ATT is straightforward. If one chooses options that allow matching with replacement, or any solution that has different numbers of controls (or treated units) within each subclass or stratum (such as full matching), then the parametric analysis following matching must accommodate these procedures, such as by using fixed effects or weights, as appropriate.
(Similar procedures can also be used to estimate various other quantities of interest such as the average treatment effect by computing it for all observations, but then one must be aware that the quantity of interest may change during the matching procedure as some control units may be dropped.) The imputation from the model can be done in at least two ways. Recall that the model is used to impute the value that the outcome variable would take among the treated units if those treated units were actually controls. Thus, one reasonable approach would be to fit a model to the matched data and create simulated predicted values of the dependent variable for the treated units with T i switched counterfactually from 1 to 0. An alternative approach would be to fit a model without T by using only the outcomes of the matched control units (i.e., using only observations where T i = 0). Then, given this fitted model, the missing outcomes Y i (0) are imputed for the matched treated units by using the values of the explanatory variables for the treated units. The first approach will usually have lower variance, since all observations are used, and the second may have less bias, since no assumption of constant parameters across the models of the potential outcomes under treatment and control is needed. See Ho, Imai, King, and Stuart (2007) for more details.
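The second approach, fitting the model to the matched controls only and imputing Y(0) for the treated units, can be sketched in a few lines of base R. The data below are simulated with a known treatment effect purely for illustration; MatchIt and Zelig automate the real version:

```r
# Sketch: impute Y(0) for treated units from a control-only model.
# Simulated data with a true treatment effect of 1.5, for illustration.
set.seed(4)
n <- 200
x <- rnorm(n)
t <- rbinom(n, 1, plogis(x))
y <- 1 + 2 * x + 1.5 * t + rnorm(n)
d <- data.frame(y = y, t = t, x = x)

fit0   <- lm(y ~ x, data = d[d$t == 0, ])         # fit to controls only
y0.hat <- predict(fit0, newdata = d[d$t == 1, ])  # imputed Y_i(0) for treated
att    <- mean(d$y[d$t == 1] - y0.hat)            # in-sample ATT estimate
```

After matching, the same computation relies far less on the outcome model being correctly specified.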
Other quantities of interest can also be computed at the parametric stage, following any procedures you would have followed in the absence of matching. The advantage is that if matching is done well your answers will be more robust to many small changes in parametric specification.

Quick overview
The main command matchit() implements the matching procedures. The general syntax is:
R> m.out <- matchit(treat ~ x1 + x2, data = mydata)
where treat is the dichotomous treatment variable, and x1 and x2 are pre-treatment covariates, all of which are contained in the data frame mydata. The dependent variable (or variables) may be included in mydata for convenience but is never used by MatchIt or included in the formula. This command creates the MatchIt object called m.out. Printing the object (for example, by typing m.out at the R prompt) produces a quick summary of the results.

Examples
To run any of the examples below, you first must load the library and data:
R> library("MatchIt")
R> data("lalonde")
Our example data set is a subset of the job training program analyzed in LaLonde (1986) and Dehejia and Wahba (1999). MatchIt includes a subsample of the original data consisting of the National Supported Work Demonstration (NSW) treated group and the comparison sample from the Panel Study of Income Dynamics (PSID). The variables in this data set include participation in the job training program (treat, which is equal to 1 if participated in the program, and 0 otherwise), age (age), years of education (educ), race (black, which is equal to 1 if black, and 0 otherwise; hispan, which is equal to 1 if hispanic, and 0 otherwise), marital status (married, which is equal to 1 if married, 0 otherwise), high school degree (nodegree, which is equal to 1 if no degree, 0 otherwise), 1974 real earnings (re74), 1975 real earnings (re75), and the main outcome variable, 1978 real earnings (re78).

Exact matching
The simplest version of matching is exact. This technique matches each treated unit to all possible control units with exactly the same values on all the covariates, forming subclasses such that within each subclass all units (treated and control) have the same covariate values. Exact matching is implemented in MatchIt using method = "exact". Exact matching will be done on all covariates included on the right-hand side of the formula specified in the MatchIt call. There are no additional options for exact matching. (Exact restrictions on a subset of covariates can also be specified in nearest neighbor matching; see below.) The following example can be run by typing demo("exact") at the R prompt:
R> m.out <- matchit(treat ~ educ + black + hispan, data = lalonde,
+                   method = "exact")

Subclassification
When there are many covariates (or some covariates can take a large number of values), finding sufficient exact matches will often be impossible. The goal of subclassification is to form subclasses, such that in each the distribution (rather than the exact values) of covariates for the treated and control groups are as similar as possible. Various subclassification schemes exist, including the one based on a scalar distance measure such as the propensity score estimated using the distance option (see Section 4.1.3). Subclassification is implemented in MatchIt using method = "subclass".
The following example script can be run by typing demo("subclass") at the R prompt:
R> m.out <- matchit(treat ~ re74 + re75 + educ + black + hispan + age,
+                   data = lalonde, method = "subclass")
The above syntax forms six subclasses, which is the default number, based on a distance measure (the propensity score) estimated using logistic regression. By default, each subclass will have approximately the same number of treated units.
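A rough sketch of what propensity-score subclassification does, in base R with simulated data (MatchIt's actual implementation handles edge cases more carefully):

```r
# Sketch: split units into 6 subclasses at quantiles of the treated
# units' estimated propensity scores, so each subclass holds roughly
# the same number of treated units. Simulated data for illustration.
set.seed(5)
n  <- 300
x  <- rnorm(n)
t  <- rbinom(n, 1, plogis(x))
ps <- fitted(glm(t ~ x, family = binomial))

breaks <- quantile(ps[t == 1], probs = seq(0, 1, length.out = 7))
breaks[1] <- 0; breaks[length(breaks)] <- 1      # cover the full (0, 1) range
subclass <- cut(ps, breaks = breaks, labels = FALSE, include.lowest = TRUE)
table(subclass[t == 1])   # roughly equal treated counts per subclass
```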
Subclassification may also be used in conjunction with nearest neighbor matching described below, by leaving the default of method = "nearest" but adding the option subclass. When you choose this option, MatchIt selects matches using nearest neighbor matching, but after the nearest neighbor matches are chosen it places them into subclasses, and adds a variable to the output object indicating subclass membership.

Nearest neighbor matching
Nearest neighbor matching selects the r (default = 1) best control matches for each individual in the treatment group (excluding those discarded using the discard option). Matching is done using a distance measure specified by the distance option (default = logit, i.e., a logistic regression model is used to estimate the propensity score, defined as the probability of receiving treatment, conditional on the covariates). Matches are chosen for each treated unit one at a time, with the order specified by the m.order command (default = largest to smallest). At each matching step we choose the control unit that is not yet matched but is closest to the treated unit on the distance measure.
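The greedy algorithm just described is easy to sketch in base R (simulated distance values; this is not MatchIt's own code):

```r
# Sketch: greedy 1:1 nearest neighbor matching. Treated units are
# processed from largest to smallest propensity score, and each takes
# the closest control unit not yet used. Simulated scores only.
greedy.match <- function(ps, t) {
  t.idx <- which(t == 1)
  c.idx <- which(t == 0)
  order.t <- t.idx[order(ps[t.idx], decreasing = TRUE)]
  used  <- logical(length(c.idx))
  pairs <- matrix(NA_integer_, nrow = length(t.idx), ncol = 2,
                  dimnames = list(NULL, c("treated", "control")))
  for (k in seq_along(order.t)) {
    i <- order.t[k]
    dist <- abs(ps[c.idx] - ps[i])
    dist[used] <- Inf               # each control may be used at most once
    j <- which.min(dist)
    used[j] <- TRUE
    pairs[k, ] <- c(i, c.idx[j])
  }
  pairs
}

set.seed(6)
ps <- runif(40)
t  <- rep(c(1, 0), c(10, 30))
m  <- greedy.match(ps, t)           # 10 pairs, no control reused
```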
Nearest neighbor matching is implemented in MatchIt using the method = "nearest" option.

Optimal matching
The default nearest neighbor matching method in MatchIt is "greedy" matching, where the closest control match for each treated unit is chosen one at a time, without trying to minimize a global distance measure. In contrast, "optimal" matching finds the matched samples with the smallest average absolute distance across all the matched pairs. Gu and Rosenbaum (1993) find that greedy and optimal matching approaches generally choose the same sets of controls for the overall matched samples, but optimal matching does a better job of minimizing the distance within each pair. In addition, optimal matching can be helpful when there are not many appropriate control matches for the treated units.
Optimal matching is performed with MatchIt by setting method = "optimal", which automatically loads an add-on package called optmatch (Hansen 2004). The following example, which can also be run by typing demo("optimal") at the R prompt, conducts 2:1 optimal ratio matching (via the ratio option) based on the propensity score from the logistic regression.

Full matching
Full matching is a particular type of subclassification that forms the subclasses in an optimal way (Rosenbaum 2002; Hansen 2004). A fully matched sample is composed of matched sets, where each matched set contains one treated unit and one or more controls (or one control unit and one or more treated units). As with subclassification, the only units not placed into a subclass will be those discarded (if a discard option is specified) because they are outside the range of common support. Full matching is optimal in terms of minimizing a weighted average of the estimated distance measure between each treated subject and each control subject within each subclass.
Full matching can be performed with MatchIt by setting method = "full". Just as with optimal matching, we use the optmatch package (Hansen 2004), which automatically loads when needed. The following example with full matching (using the default propensity score based on logistic regression) can also be run by typing demo("full") at the R prompt:
R> m.out <- matchit(treat ~ age + educ + black + hispan + married +
+                   nodegree + re74 + re75, data = lalonde, method = "full")

Genetic matching
Genetic matching automates the process of finding a good matching solution (Diamond and Sekhon 2010). The idea is to use a genetic search algorithm to find a set of weights for each covariate such that a version of optimal balance is achieved after matching. As currently implemented, matching is done with replacement using the matching method of Abadie and Imbens (2006), and balance is determined by two univariate tests, paired t-tests for dichotomous variables and a Kolmogorov-Smirnov test for multinomial and continuous variables, but these options can be changed.
Genetic matching can be performed with MatchIt by setting method = "genetic", which automatically loads the Matching package (Sekhon 2011). The following example of genetic matching (using the estimated propensity score based on logistic regression as one of the covariates) can also be run by typing demo("genetic"):
R> m.out <- matchit(treat ~ age + educ + black + hispan + married +
+                   nodegree + re74 + re75, data = lalonde, method = "genetic")

Quick overview
To check balance, use summary(m.out) for numerical summaries and plot(m.out) for graphical summaries.
The summary() command
The summary() command gives measures of the balance between the treated and control groups in the full (original) data set, and then in the matched data set. If the matching worked well, the measures of balance should be smaller in the matched data set (smaller values of the measures indicate better balance).
The summary() output for subclassification is the same as that for other types of matching, except that the balance statistics are shown separately for each subclass, and the overall balance in the matched samples is calculated by aggregating across the subclasses, where each subclass is weighted by the number of units in the subclass. For exact matching, the covariate values within each subclass are guaranteed to be the same, and so the measures of balance are not output for exact matching; only the sample sizes in each subclass are shown.
Balance statistics: The statistics the summary() command provides include means, the original control group standard deviation (where applicable), mean differences, standardized mean differences, and (median, mean, and maximum) quantile-quantile (Q-Q) plot differences. In addition, the summary() command will report (a) the matched call, (b) how many units were matched, unmatched, or discarded due to the discard option (described below), and (c) the percent improvement in balance for each of the balance measures, defined as 100((|a| − |b|)/|a|), where a is the balance before matching and b is the balance after matching. For each set of units (original and matched data sets, with weights used as appropriate in the matched data sets), the following statistics are provided:
1. Means Treated and Means Control show the weighted means in the treated and control groups.
2. SD Control is the standard deviation calculated in the control group (where applicable).
3. Mean Diff. is the difference in means between the groups.
4. The final three columns of the summary output give summary statistics of a Q-Q plot (see below for more information on these plots). These columns give the median, mean, and maximum distance between the two empirical quantile functions (treated and control groups). Values greater than 0 indicate deviations between the groups in some part of the empirical distributions. The plots of the two empirical quantile functions themselves, described below, can provide further insight into which part of the covariate distribution differs between the two groups.
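The percent-improvement formula can be checked by hand; a one-line sketch follows (pct.improve is our name for illustration, not a MatchIt function):

```r
# Sketch: percent balance improvement, 100 * (|a| - |b|) / |a|,
# where a is imbalance before matching and b is imbalance after.
pct.improve <- function(a, b) 100 * (abs(a) - abs(b)) / abs(a)

# A standardized mean difference falling from 0.40 to 0.10 is a
# 75 percent improvement (up to floating-point rounding).
pct.improve(0.40, 0.10)
```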
Additional options: Three options to the summary() command can also help with assessing balance and respecifying the propensity score model, as necessary. First, the interactions = TRUE option with summary() shows the balance of all squares and interactions of the covariates used in the matching procedure. Large differences in higher order interactions usually are a good indication that the propensity score model (the distance measure) needs to be respecified. Similarly, the addlvariables option with summary() will provide balance measures on additional variables not included in the original matching procedure. If a variable (or interaction of variables) not included in the original propensity score model has large imbalances in the matched groups, including that variable in the next model specification may improve the resulting balance on that variable. Because the outcome variable is not used in the matching procedure, a variety of matching methods can be tried, and the one that leads to the best resulting balance chosen. Finally, the standardize = TRUE option will print out standardized versions of the balance measures, where the mean difference is standardized (divided) by the standard deviation in the original treated group.

The plot() command
We can also examine the balance graphically using the plot() command, which provides three types of plots: jitter plots of the distance measure, Q-Q plots of each covariate, and histograms of the distance measure. For subclassification, separate Q-Q plots can be printed for each subclass. The jitter plot for subclassification is the same as that for other types of matching, with the addition of vertical lines indicating the subclass cut-points. In the Q-Q plots, points far from the 45-degree line indicate differences between the empirical distributions of the two groups. The jitter plot, which can be created by setting type = "jitter", shows the overall distribution of propensity scores in the treated and control groups; the size of each point is proportional to the weight given to that unit, and observation names can be interactively identified by clicking the first mouse button near the units. With the histogram option, set with type = "hist", four histograms are provided: the original treated and control groups and the matched treated and control groups.

Conducting analyses after matching
Any software package may be used for parametric analysis following MatchIt. This includes any of the relevant R packages, or other statistical software by exporting the resulting matched data sets using R commands such as write.csv() and write.table() for ASCII files or write.dta() in the foreign package for a Stata binary file. When variable numbers of treated and control units have been matched to each other (e.g., through exact matching, full matching, or k:1 matching with replacement), the weights created by MatchIt should be used (e.g., in a weighted regression) to ensure that the matched treated and control groups are weighted up to be similar. Users should also remember that the weights created by MatchIt estimate the average treatment effect on the treated, with the control units weighted to resemble the treated units. See below for more detail on the weights. With subclassification, estimates should be obtained within each subclass and then aggregated across subclasses. When it is not possible to calculate an effect within each subclass, again the weights can be used to weight the matched units.
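As a concrete illustration of using the weights in the outcome analysis, here is a base-R sketch with made-up weights (in practice the weights come from the weights column of MatchIt's matched data set):

```r
# Sketch: weighted outcome regression after matching. The weights here
# are invented for illustration; real ones are produced by MatchIt and
# appear in the matched data (e.g., via match.data()).
set.seed(7)
n <- 100
d <- data.frame(y = rnorm(n),
                t = rbinom(n, 1, 0.5),
                x = rnorm(n))
d$w <- ifelse(d$t == 1, 1, runif(n, 0.2, 1))  # hypothetical match weights

fit <- lm(y ~ t + x, data = d, weights = w)
coef(fit)["t"]   # weight-adjusted estimate of the treatment effect
```

The same weights can be passed to any modeling function that accepts case weights.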
In this section, we show how to use Zelig with MatchIt. Zelig (Imai, King, and Lau 2011) is an R package that implements a large variety of statistical models (using numerous existing R packages) with a single easy-to-use interface, gives easily interpretable results by simulating quantities of interest, provides numerical and graphical summaries, and is easily extensible to include new methods.

Quick overview
The general syntax is as follows. First, we use match.data() to create the matched data set from the MatchIt output object (m.out) by excluding unmatched units from the original data and including information produced by the particular matching procedure (primarily the matched data themselves, but also derived quantities such as weights, subclasses, or the distance measure).
R> m.data <- match.data(m.out)
where m.data is the resulting matched data set. Zelig analyses all use three commands: zelig(), setx(), and sim(). The basic statistical analysis is performed first:
R> z.out <- zelig(Y ~ treat + x1 + x2, model = mymodel, data = m.data)
where Y is the outcome variable, mymodel is the selected model, and z.out is the output object from zelig(). This output object includes estimated coefficients, standard errors, and other typical outputs from your chosen statistical model. Its contents can be examined via summary(z.out) or plot(z.out), but the idea of Zelig is that these statistical results are typically only intermediate quantities needed to compute your ultimate quantities of interest, which in the case of matching are usually causal inferences. To get these causal quantities, we use Zelig's other two commands: setx() to set the explanatory variables at their means (the default) and change the treatment variable from a 0 to a 1, and sim() to compute the resulting estimates of the causal effects and examine a summary.

Model-based estimates
In our first example, we conduct a standard parametric analysis and compute quantities of interest in the most common way. We begin with nearest neighbor matching with a logistic regression-based propensity score, discarding units with the hull.control option (King and Zeng 2006, 2007):
R> m.out0 <- matchit(treat ~ age + educ + black + hispan + nodegree +
+                    married + re74 + re75, method = "nearest",
+                    discard = "hull.control", data = lalonde)
Then we check balance using the summary and plot procedures (which we do not show here).
When the best balance is achieved, we run the parametric analysis (two variables are dropped because they are exactly matched):
R> z.out0 <- zelig(re78 ~ treat + age + educ + black + nodegree +
+                  re74 + re75, data = match.data(m.out0), model = "ls")
and then set the explanatory variables at their means (the default) and change the treatment variable from a 0 to a 1:
R> x.out0 <- setx(z.out0, treat = 0)
R> x1.out0 <- setx(z.out0, treat = 1)
and finally compute the result and examine a summary:
R> s.out0 <- sim(z.out0, x = x.out0, x1 = x1.out0)
R> summary(s.out0)

Average treatment effect on the treated
We now illustrate how to estimate the average treatment effect on the treated in a way that is quite robust. We do this by estimating the coefficients in the control group alone.
We begin by conducting nearest neighbor matching with a logistic regression-based propensity score:
R> m.out1 <- matchit(treat ~ age + educ + black + hispan + nodegree +
+                    married + re74 + re75, method = "nearest", data = lalonde)
Then we check balance using the summary and plot procedures (which we do not show here). We re-estimate the matching procedure until we achieve the best balance possible.
The running examples here are meant merely to illustrate, not to suggest that we have achieved the best balance. Then we go to Zelig, and in this case choose to fit a linear least squares model to the control group only:
R> z.out1 <- zelig(re78 ~ age + educ + black + hispan + nodegree +
+                  married + re74 + re75, data = match.data(m.out1, "control"),
+                  model = "ls")
where the "control" option in match.data() extracts only the matched control units and ls specifies least squares regression. In a smaller data set, this example should probably be changed to include all the data in this estimation (using data = match.data(m.out1)) and to include the treatment indicator (which is excluded in the example since it is a constant in the control group). Next, we use the coefficients estimated in this way from the control group, and combine them with the values of the covariates set to the values of the treated units. We do this by choosing conditional prediction (which means use the observed values) in setx(). The sim() command does the imputation.
R> x.out1 <- setx(z.out1, data = match.data(m.out1, "treat"),
+    cond = TRUE)
R> s.out1 <- sim(z.out1, x = x.out1)

Finally, we obtain a summary of the results by

R> summary(s.out1)

Average treatment effect (overall)

To estimate the average treatment effect, we continue with the previous example and fit the linear model to the treatment group:

R> z.out2 <- zelig(re78 ~ age + educ + black + hispan + nodegree +
+    married + re74 + re75, data = match.data(m.out1, "treat"),
+    model = "ls")

We then conduct the same simulation procedure in order to impute the counterfactual outcome for the control group:

R> x.out2 <- setx(z.out2, data = match.data(m.out1, "control"),
+    cond = TRUE)
R> s.out2 <- sim(z.out2, x = x.out2)

In this calculation, Zelig is computing the difference between the observed and the expected values. This means that the treatment effect for the control units is the effect of control (the observed control outcome minus the imputed outcome under treatment from the model). Hence, to combine the treatment effects, we simply reverse the signs of the estimated treatment effects for the controls.
R> ate.all <- c(s.out1$qi$att.ev, -s.out2$qi$att.ev)

The point estimate, its standard error, and the 95% confidence interval are given by

R> mean(ate.all)
R> sd(ate.all)
R> quantile(ate.all, c(0.025, 0.975))

Subclassification

In subclassification, average treatment effect estimates are obtained separately for each subclass, and then aggregated into an overall estimate. Estimating the treatment effects separately for each subclass, and then aggregating across subclasses, can increase the robustness of the ultimate results, since the parametric analysis within each subclass requires only local rather than global assumptions. However, fewer observations are obviously available within each subclass, so this option is normally chosen for larger data sets.
We begin this example by conducting subclassification with four subclasses:

R> m.out2 <- matchit(treat ~ age + educ + black + hispan + nodegree +
+    married + re74 + re75, data = lalonde, method = "subclass",
+    subclass = 4)

When balance is as good as we can get it, we then fit a linear regression within each subclass by controlling for the estimated propensity score (called distance) and other covariates. In most software, this would involve running four separate regressions and then combining the results. In Zelig, however, all we need to do is use the by option:

R> z.out3 <- zelig(re78 ~ re74 + re75 + distance, data =
+    match.data(m.out2, "control"), model = "ls", by = "subclass")

The same set of commands as in the first example is used to do the imputation of the counterfactual outcomes for the treated units:

R> x.out3 <- setx(z.out3, data = match.data(m.out2, "treat"), fn = NULL,
+    cond = TRUE)
R> s.out3 <- sim(z.out3, x = x.out3)
R> summary(s.out3)

It is also possible to get the summary result for each subclass. For example, the following command summarizes the result for the second subclass.

R> summary(s.out3, subset = 2)
How adjustment after exact matching has no effect

Regression adjustment after one-to-one exact matching gives the identical answer as a simple, unadjusted difference in means. General exact matching, as implemented in MatchIt, allows one-to-many matches, so to see the same result we must weight when adjusting. In other words, weighted regression adjustment after general exact matching gives the identical answer as a simple, unadjusted weighted difference in means.

Arguments for all matching methods

formula: the formula used to calculate the distance measure for matching (e.g., the propensity score model). It takes the usual syntax of R formulas, treat ~ x1 + x2, where treat is a binary treatment indicator, and x1 and x2 are the pre-treatment covariates. Both the treatment indicator and the pre-treatment covariates must be contained in the same data frame, which is specified as data (see below). All of the usual R formula syntax works here. For example, x1:x2 represents the first-order interaction term between x1 and x2, and I(x1^2) represents the square of x1. See help("formula") for details.
data: the data frame containing the variables called in formula.
verbose: a logical value indicating whether to print the status of the matching algorithm (default = FALSE).
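The weighted-adjustment equivalence described under "How adjustment after exact matching has no effect" above can be sketched in a few lines. This is a hypothetical illustration using the lalonde data shipped with MatchIt; it assumes the MatchIt package is installed, and the helper wmean() is ours, not part of MatchIt:

```r
library(MatchIt)
data("lalonde")

## General (one-to-many) exact matching on a few discrete covariates
m.ex <- matchit(treat ~ black + hispan + nodegree, data = lalonde,
                method = "exact")
md <- match.data(m.ex)

## Simple weighted difference in means of the outcome
wmean <- function(y, w) sum(w * y) / sum(w)
wdiff <- wmean(md$re78[md$treat == 1], md$weights[md$treat == 1]) -
         wmean(md$re78[md$treat == 0], md$weights[md$treat == 0])

## Weighted regression adjustment: because the weighted covariate
## distributions are exactly balanced, the coefficient on treat should
## equal the weighted difference in means computed above
fit <- lm(re78 ~ treat + black + hispan + nodegree, data = md,
          weights = weights)
coef(fit)["treat"] - wdiff  # should be numerically zero
```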

Additional arguments for specification of distance measures
The following arguments specify distance measures that are used for matching methods. These arguments apply to all matching methods except exact matching.
distance: the method used to estimate the distance measure (default = "logit", logistic regression), or a numerical vector containing the user's own distance measure. Before using any of these techniques, it is best to understand their theoretical groundings and to evaluate the results. Most of these methods (such as logistic or probit regression) define the distance by first estimating the propensity score, defined as the probability of receiving treatment, conditional on the covariates. Available methods include:

-"mahalanobis": the Mahalanobis distance measure.
distance.options: optional arguments for estimating the distance measure. The input to this argument should be a list. For example, if the distance measure is estimated with a logistic regression, users can increase the maximum number of IWLS iterations with distance.options = list(maxit = 5000). Find additional options for generalized linear models using help("glm") or help("family"), for generalized additive models using help("gam"), for neural network models using help("nnet"), and for classification trees using help("rpart").

discard: specifies whether to discard units that fall outside some measure of support of the distance measure (default = "none", discard no units). Discarding units may change the quantity of interest being estimated by changing the observations left in the analysis. Enter a logical vector indicating which units should be discarded, or choose from the following options:

-"none": no units will be discarded before matching. Use this option when the units to be matched are substantially similar, such as in the case of matching treatment and control units from a field experiment that was close to (but not fully) randomized (e.g., Imai 2005), when caliper matching will restrict the donor pool, or when you do not wish to change the quantity of interest and the parametric methods to be used post-matching can be trusted to extrapolate.
-"hull.both": all units that are not within the convex hull will be discarded. See Zeng (2006, 2007) for information about the convex hull in this context and as a measure of model dependence.
-"both": all units (treated and control) that are outside the support of the distance measure will be discarded.
-"hull.control": only control units that are not within the convex hull of the treated units will be discarded.
-"control": only control units outside the support of the distance measure of the treated units will be discarded. Use this option when the average treatment effect on the treated is of most interest and when you are unwilling to discard non-overlapping treatment units (which would change the quantity of interest).
-"hull.treat": only treated units that are not within the convex hull of the control units will be discarded.
-"treat": only treated units outside the support of the distance measure of the control units will be discarded. Use this option when the average treatment effect on the control units is of most interest and when unwilling to discard control units.
reestimate: If FALSE (default), the model for the distance measure will not be reestimated after units are discarded. The input must be a logical value. Re-estimation may be desirable for efficiency reasons, especially if many units were discarded and so the post-discard samples are quite different from the original samples.
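As a hypothetical sketch of the distance argument (assuming MatchIt is installed), a propensity score estimated by hand can be supplied as a numeric vector in place of a method name:

```r
library(MatchIt)
data("lalonde")

## Estimate a propensity score outside matchit() ...
ps <- glm(treat ~ age + educ + re74 + re75, data = lalonde,
          family = binomial)$fitted.values

## ... and pass it as the distance measure (one entry per unit)
m.ps <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                method = "nearest", distance = ps)
```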
Additional arguments for subclassification

sub.by: criteria for subclassification. Choose from: "treat" (default), the number of treatment units; "control", the number of control units; or "all", the total number of units. Changing the default will likely also signal a change in your quantity of interest from the average treatment effect on the treated to other quantities.
subclass: either a scalar specifying the number of subclasses, or a vector of probabilities bounded between 0 and 1, which create quantiles of the distance measure using the units in the group specified by sub.by (default = subclass = 6).
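A hypothetical sketch of the two forms of the subclass argument, following the description above (assumes MatchIt is installed):

```r
library(MatchIt)
data("lalonde")

## Scalar form: six subclasses (the default number)
m.sub <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                 method = "subclass", subclass = 6)

## Vector form: cut the distance measure at user-chosen quantiles
m.sub2 <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                  method = "subclass",
                  subclass = c(0, 0.25, 0.5, 0.75, 1))
```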
Additional arguments for nearest neighbor matching

m.order: the order in which to match treatment units with control units.
-"largest" (default): matches from the largest value of the distance measure to the smallest.
-"smallest": matches from the smallest value of the distance measure to the largest.
replace: logical value indicating whether each control unit can be matched to more than one treated unit (default = replace = FALSE, each control unit is used at most once -i.e., sampling without replacement). For matching with replacement, use replace = TRUE. After matching with replacement, the weights can be used to reflect the frequency with which each control unit was matched.
ratio: the number of control units to match to each treated unit (default = 1). If matching is done without replacement and there are fewer control units than ratio times the number of eligible treated units (i.e., there are not enough control units for the specified method), then the higher ratios will have NA in place of the matching unit number in match.matrix.

exact: variables on which to perform exact matching within the nearest neighbor matching (default = NULL, no exact matching). If exact is specified, only matches that exactly match on the covariates in exact will be allowed. Within the matches that match on the variables in exact, the match with the closest distance measure will be chosen. exact should be entered as a vector of variable names (e.g., exact = c("X1", "X2")).
caliper: the number of standard deviations of the distance measure within which to draw control units (default = 0, no caliper matching). If a caliper is specified, a control unit within the caliper for a treated unit is randomly selected as the match for that treated unit. If caliper != 0, there are two additional options:

calclosest: whether to take the nearest available match if no matches are available within the caliper (default = FALSE).
mahvars: variables on which to perform Mahalanobis-metric matching within each caliper (default = NULL). Variables should be entered as a vector of variable names (e.g., mahvars = c("X1", "X2")). If mahvars is specified without caliper, the caliper is set to 0.25.
subclass and sub.by: See the options for subclassification for more details on these options. If a subclass is specified within method = "nearest", the matched units will be placed into subclasses after the nearest neighbor matching is completed.
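A hypothetical sketch combining several of these nearest neighbor options, using the lalonde data shipped with MatchIt (assumes MatchIt is installed):

```r
library(MatchIt)
data("lalonde")

## Matching with replacement: control units may be reused, and the
## resulting weights reflect how often each control was matched
m.rep <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                 method = "nearest", replace = TRUE)
md <- match.data(m.rep)
table(round(md$weights[md$treat == 0], 2))

## A 0.25 standard-deviation caliper on the distance measure, with
## Mahalanobis-metric matching on age and educ within each caliper
m.cal <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                 method = "nearest", caliper = 0.25,
                 mahvars = c("age", "educ"))
```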
Additional arguments for optimal matching

ratio: the number of control units to be matched to each treatment unit (default = 1).
min.controls: The minimum ratio of controls to treatments that is to be permitted within a matched set: should be nonnegative and finite. If min.controls is not a whole number, the reciprocal of a whole number, or zero, then it is rounded down to the nearest whole number or reciprocal of a whole number.
max.controls: The maximum ratio of controls to treatments that is to be permitted within a matched set: should be positive and numeric. If max.controls is not a whole number, the reciprocal of a whole number, or Inf, then it is rounded up to the nearest whole number or reciprocal of a whole number.
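A hypothetical sketch of a 2:1 optimal match (assumes MatchIt is installed, along with the optmatch package that MatchIt calls for optimal matching):

```r
library(MatchIt)  # optimal matching additionally requires optmatch
data("lalonde")

## 2:1 optimal matching on the estimated propensity score
m.opt <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                 method = "optimal", ratio = 2)
```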

Output values
Regardless of the type of matching performed, the matchit output object contains the following elements:

call: the original matchit() call.
formula: the formula used to specify the model for estimating the distance measure.
model: the output of the model used to estimate the distance measure. summary(m.out$model) will give a summary of the model, where m.out is the output object from matchit().
match.matrix: an n1 × ratio matrix, where n1 is the number of treated units:

-the row names represent the names of the treatment units (which match the row names of the data frame specified in data).

-each column stores the name(s) of the control unit(s) matched to the treatment unit of that row. For example, when the ratio input for nearest neighbor or optimal matching is specified as 3, the three columns of match.matrix represent the three control units matched to each treatment unit.

-NA indicates that the treatment unit was not matched.
discarded: a vector of length n that displays whether the units were ineligible for matching due to common support restrictions. It equals TRUE if unit i was discarded, and it is set to FALSE otherwise.
distance: a vector of length n with the estimated distance measure for each unit.
weights: a vector of length n with the weights assigned to each unit in the matching process. Unmatched units have weights equal to 0. Matched treated units have weight 1. Each matched control unit has weight proportional to the number of treatment units to which it was matched, and the sum of the control weights is equal to the number of uniquely matched control units.
subclass: the subclass index, on an ordinal scale from 1 to the total number of subclasses as specified in subclass (or the total number of subclasses from full or exact matching). Unmatched units have NA.
q.cut: the subclass cut-points that classify the distance measure.
treat: the treatment indicator from data (the left-hand side of formula).
X: the covariates used for estimating the distance measure (the right-hand side of formula). When applicable, X is augmented by covariates contained in mahvars and exact.

summary: Numerical summaries of balance
The summary() command returns numerical summaries of balance diagnostics.

addlvariables: a data frame of additional variables on which to calculate the diagnostic statistics, in addition to the variables included in the matching procedure (default = NULL). The data frame should contain the same number of units, in the same order, as the data set used in matchit().

Output values
The output from the summary() command includes the following elements, when applicable:

call: the original assignment model call.
sum.all: a data frame that contains variable names and interactions down the row names, and summary statistics on all observations in each of the columns. The columns in sum.all contain:

-the means of all covariates X for treated and control units,

-the standard deviation in the control group for all covariates X, where applicable, and

-balance statistics for the original data (before matching), which compare the treated and control covariate distributions. If standardize = FALSE, balance measures are presented on the original scale: mean differences (Mean Diff.), as well as the median, mean, and maximum differences in the empirical quantile functions for each covariate (eQQ Med, eQQ Mean, and eQQ Max, respectively). If standardize = TRUE, the balance measures are standardized: standardized mean differences (Std. Mean Diff.), i.e., mean differences scaled by the standard deviation in the original treated group, as well as the median, mean, and maximum differences in the empirical cumulative distribution functions for each covariate (eCDF Med, eCDF Mean, and eCDF Max, respectively).
sum.matched: a data frame that contains variable names down the row names, and summary statistics on only the matched observations in each of the columns. Specifically, the columns in sum.matched contain:

-the weighted means of all covariates X and their interactions for matched treated and matched control units, where w represents the vector of matching weights,

-the weighted standard deviations in the matched control group for all covariates X, where applicable, computed as the square root of the weighted mean of the squared deviations of X from X*, where X* is the weighted mean of X in the matched control group, and

-balance statistics for the matched data (after matching), which compare the treated and control covariate distributions. If standardize = FALSE, balance measures are presented on the original scale: mean differences (Mean Diff.), as well as the median, mean, and maximum differences in the empirical quantile functions for each covariate (eQQ Med, eQQ Mean, and eQQ Max, respectively). If standardize = TRUE, the balance measures are standardized: standardized mean differences (Std. Mean Diff.), as well as the median, mean, and maximum differences in the empirical cumulative distribution functions for each covariate (eCDF Med, eCDF Mean, and eCDF Max, respectively).
reduction: the percent reduction in the difference in means achieved by each of the balance measures in sum.all and sum.matched, defined as 100(|a| − |b|)/|a|, where a is the value of the balance measure before matching and b is the value after matching.
nn: the sample sizes in the full and matched samples and the number of discarded units, by treatment and control.
q.table: an array that contains the same information as sum.matched by subclass.
qn: the sample sizes in the full and matched samples and the number of discarded units, by subclass and by treatment and control.
match.matrix: the same object is contained in the output of matchit().
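The percent-reduction measure defined above can be sketched directly; the function name here is hypothetical, for illustration only, and is not part of MatchIt:

```r
## Percent reduction in imbalance: 100 * (|a| - |b|) / |a|, where a is a
## balance measure before matching and b the same measure after matching
pct.reduction <- function(a, b) 100 * (abs(a) - abs(b)) / abs(a)

pct.reduction(0.50, 0.10)  # a drop from 0.50 to 0.10 is an 80% reduction
```

Note that the measure can be negative when matching worsens balance (|b| > |a|).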

plot: Graphical summaries of balance
Plot options for the matchit object

The plot() command allows you to check the distributions of the propensity score and covariates in the assignment model, their squares and interactions, and within each subclass if specified.

Syntax

plot(m.out, discrete.cutoff = 5, type = "QQ", numdraws = 5000,
     interactive = TRUE, which.xs = NULL, ...)

Arguments

type: the type of output graph. type = "QQ" (default) outputs empirical quantile-quantile plots of each covariate to check the balance of the marginal distributions. Alternatively, type = "jitter" outputs jitter plots of the propensity score for treated and control units. Finally, type = "hist" outputs histograms of the propensity score in the original treated and control groups and weighted histograms of the propensity score in the matched treated and control groups.
discrete.cutoff: For quantile-quantile plots, discrete covariates that take 5 or fewer values are jittered for visibility. This may be changed by setting this argument to any other positive integer.
interactive: If TRUE (default), users can identify individual units by clicking on the graph with the left mouse button, and (when applicable) choose subclasses to plot.
which.xs: For quantile-quantile plots, specifies particular covariate names in a character vector to plot only a subset of the covariates.
subclass: if interactive = FALSE, users can specify which subclass to plot.

Histograms: this graph plots histograms of the estimated propensity scores in the original treated and control groups and weighted histograms of the estimated propensity scores in the matched treated and control groups. The plots can be compared vertically to quickly check the balance before and after matching.
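A hypothetical sketch of these plot options, using a nearest neighbor match on the lalonde data (assumes MatchIt is installed):

```r
library(MatchIt)
data("lalonde")

m.out1 <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                  method = "nearest")

## Quantile-quantile plots for a subset of covariates, skipping the
## interactive point-identification step
plot(m.out1, type = "QQ", interactive = FALSE,
     which.xs = c("age", "re74"))

## Propensity score histograms before and after matching
plot(m.out1, type = "hist")
```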

Plot options for the matchit summary object
You can also send a matchit summary object to the plot() command, to obtain a summary of the balance on each covariate before and after matching. The summary() object must have been created using the option standardize = TRUE. The idea for this plot came from the twang package by McCaffrey, Ridgeway, and Morral.
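A minimal sketch of this usage (assumes MatchIt is installed; m.out1 is a hypothetical matchit output as in the earlier examples):

```r
library(MatchIt)
data("lalonde")

m.out1 <- matchit(treat ~ age + educ + re74 + re75, data = lalonde,
                  method = "nearest")

## The summary object must be created with standardize = TRUE
s.out <- summary(m.out1, standardize = TRUE)
plot(s.out)
```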

Output values
Line plot of standardized differences in means before and after matching. Numbers plotted are those output by the summary() command in the sum.all and sum.matched objects.

Usage
To extract the matched data set for subsequent analyses from the output object (as described in Section 3.3), we provide the function match.data(). It is used as follows:

m.data <- match.data(object, group = "all", distance = "distance",
                     weights = "weights", subclass = "subclass")

The output of match.data() is the original data frame with additional information about the matching (i.e., the distance measure, as well as the resulting weights and subclasses) added, restricted to units that were matched.

Arguments
match.data() takes the following inputs: 1. object is the output object from matchit(). This is a required input.
2. group specifies for which matched group the user wants to extract the data. Available options are "all" (all matched units), "treat" (matched units in the treatment group), and "control" (matched units in the control group). The default is "all".
3. distance specifies the variable name used to store the distance measure. The default is "distance".
4. weights specifies the variable name used to store the resulting weights from matching. The default is "weights". See Section 3.3.2 for more details on the weights.

5. subclass specifies the variable name used to store the subclass indicator. The default is "subclass".

Examples
Here, we present examples of using match.data(). Users can run these commands by typing demo("match.data") at the R prompt. First, we load the Lalonde data:

R> data("lalonde")

The next line performs nearest neighbor matching based on the propensity score estimated by logistic regression:

R> m.out1 <- matchit(treat ~ re74 + re75 + age + educ, data = lalonde,
+    method = "nearest", distance = "logit")

To obtain the matched data, type the following command:

R> m.data1 <- match.data(m.out1)

It is easy to summarize the resulting matched data:

R> summary(m.data1)

To obtain the matched data for the treatment or control group alone, specify the option group as follows:

R> m.data2 <- match.data(m.out1, group = "treat")
R> summary(m.data2)
R> m.data3 <- match.data(m.out1, group = "control")
R> summary(m.data3)

We can also use the function to return the unmatched data:

R> unmatched.data <-
+    lalonde[!row.names(lalonde) %in% row.names(match.data(m.out1)), ]

We can also specify different names for the subclass indicator, the weight variable, and the estimated distance measure. The following example first performs subclassification, obtains the matched data with user-specified names for those three variables, and then prints out the names of all variables in the resulting matched data.