Abstract
As a substitute for randomized controlled trials (RCTs), researchers increasingly rely on propensity score modeling (PSM) to estimate causal effects. However, some warn about the dangers of placing too much blind faith in the abilities of PSM. This study tests the reliability and validity of seven common PSM methods by assessing their ability to remove an artificial selection bias and replicate the results of several RCTs in criminal justice data. Meta-analyses reveal that the average differences between PSM and RCT estimates were relatively small. Ultimately, our findings suggest that PSM can be an effective means of simulating an RCT while also harboring reason for concern. Researchers and policy-makers should approach the use and interpretation of PSM with cautious optimism, as it appears to provide a reliable and valid estimate of the treatment effect most of the time.
Notes
The NACJD is a subsection of the Inter-university Consortium for Political and Social Research (ICPSR), which is a data-sharing, online repository. ICPSR and NACJD partner with the federal government and public funding institutions to ensure that any data collected under the auspices of such funding are publicly available.
Many of these records were duplicates given the similarities in the keyword search terms.
Although seemingly inherent in the term RCT, there were many studies identified by this keyword that did not involve true random assignment over the course of the project. For instance, some researchers may have been required to stop random assignment in the middle of an evaluation due to ethical issues, such as upon evidence that the rehabilitation program under study was effective in improving behavior. These studies may have introduced bias into the treatment effects, and therefore, we opted not to include them in the current investigation.
This number is based on what was needed to prepare the data for PSM. A power analysis indicated that to detect a true difference of a medium effect size (using Cohen’s d = .5), with approximately .80 power, required at least 65 cases per group (Cohen, 1988). To ensure there are at least twice as many comparison cases available for matching once we introduced the 50% selection bias to the treatment group, our analysis required a minimum of 130 cases per group in the original study.
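For readers who wish to reproduce the rough arithmetic, the standard normal-approximation version of this power calculation can be sketched in Python (this is our illustration, not the authors' code; Cohen's (1988) tables, which the note relies on and which correct for the t distribution, give the slightly larger 65 per group):

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for detecting a standardized mean
    difference d, using the normal approximation:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)           # ~0.84 for power = .80
    return 2 * ((z_alpha + z_power) / d) ** 2

# Medium effect (Cohen's d = .5) at .80 power
n = math.ceil(n_per_group(0.5))  # ~63 by this approximation
```

Doubling the per-group requirement to retain enough comparison cases after the 50% selection bias yields the 130-case minimum described in the note.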
We ran our analyses with and without this study, and there were no substantive or statistical differences in the results.
We estimated the standardized percent bias using Austin’s (2011) two formulas: for continuous measures, \(d=\frac{\bar{x}_{treatment}-\bar{x}_{control}}{\sqrt{\frac{s_{treatment}^{2}+s_{control}^{2}}{2}}}\), where \(\bar{x}\) denotes the mean of the respective group (treatment or control) and \(s^{2}\) denotes the sample variance; and for dichotomous measures, \(d=\frac{\hat{P}_{treatment}-\hat{P}_{control}}{\sqrt{\frac{\hat{P}_{treatment}\left(1-\hat{P}_{treatment}\right)+\hat{P}_{control}\left(1-\hat{P}_{control}\right)}{2}}}\), where \(\hat{P}\) denotes the proportion in the measure’s respective group.
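These two formulas translate directly into code; the following Python sketch (our illustration, with hypothetical input values) mirrors Austin's (2011) definitions:

```python
import math

def std_bias_continuous(mean_t, mean_c, var_t, var_c):
    """Austin's (2011) standardized difference for a continuous covariate."""
    return (mean_t - mean_c) / math.sqrt((var_t + var_c) / 2)

def std_bias_dichotomous(p_t, p_c):
    """Austin's (2011) standardized difference for a dichotomous covariate."""
    return (p_t - p_c) / math.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)

# Hypothetical covariate: treatment mean 3.2 (variance 1.1) vs.
# control mean 2.8 (variance 0.9) -> d = 0.40
d_cont = std_bias_continuous(3.2, 2.8, 1.1, 0.9)

# Hypothetical prevalence: 40% in treatment vs. 30% in control
d_dich = std_bias_dichotomous(0.40, 0.30)
```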
The remaining PSM studies used a different form of estimation (covariate balancing propensity scores or machine learning), and nine studies did not mention the technique used to condition the score at all.
We refer readers to Guo and Fraser (2014) for a more detailed description of these techniques and their assumptions.
The minimum number of matched controls was given by \(minimum\;n=\frac{\left(1-\frac{t}{t+c}\right)/2}{\frac{t}{t+c}}\), where t is the number of treatment cases in the biased sample and c is the number of comparison cases from which to draw matches. Similarly, the maximum number of controls to match to each treatment case was given by \(maximum\;n=\frac{2\left(1-\frac{t}{t+c}\right)}{\frac{t}{t+c}}\).
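As a sketch (our illustration, not the authors' code), both bounds reduce algebraically to simple ratios, minimum n = c/2t and maximum n = 2c/t:

```python
def matching_bounds(t, c):
    """Minimum and maximum matched controls per treated case, given
    t treated cases in the biased sample and c available controls.
    Algebraically these reduce to c / (2t) and 2c / t."""
    ratio = t / (t + c)
    minimum_n = ((1 - ratio) / 2) / ratio
    maximum_n = 2 * (1 - ratio) / ratio
    return minimum_n, maximum_n

# Hypothetical sample: 100 treated cases, 300 potential controls
lo, hi = matching_bounds(100, 300)  # 1.5 and 6.0
```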
This is an important distinction because some 1-to-many matching schemes force a matched set to achieve a certain number of cases. If the researcher determines that each treatment case should have three matched controls, then each matched set will invariably have four cases. Any treatment case that cannot achieve three matched controls is either lost (when a caliper is employed) or is forced to match with an otherwise incompatible control (when a caliper is not used; see Ming & Rosenbaum, 2000).
It is still possible that adequate matches are not found for both control and treatment cases.
As its name suggests, the IPTW approach applies a weight to control cases equal to the case’s odds of being in the treatment group (i.e., the inverse of its odds of being in the control group). This weight (\(\omega\)) for each case (\(x\)) is calculated as \(\omega\left(t,x\right)=t+\left(1-t\right)\frac{Pr}{1-Pr}\), where t is the treatment indicator (1 for treated cases, 0 for untreated) and Pr is the propensity score (see Guo & Fraser, 2014, citing Hirano & Imbens, 2001; Hirano, Imbens, & Ridder, 2003).
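As an illustration of this weighting formula (not the authors' code; the propensity scores in the example are hypothetical):

```python
def iptw_weight(t, pr):
    """Weight from the note's formula: treated cases (t = 1) keep a
    weight of 1; control cases (t = 0) are weighted by pr / (1 - pr),
    the odds of treatment given the propensity score pr."""
    return t + (1 - t) * pr / (1 - pr)

w_treated = iptw_weight(1, 0.80)   # always 1 for treated cases
w_control = iptw_weight(0, 0.25)   # 0.25 / 0.75 = 1/3
```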
After accounting for some degree of common support, the control weight is calculated as \(\omega_{s}=\frac{n_{z,s}}{n_{z',s}}\), where \(n_{z,s}\) is the number of units assigned to the treatment group within stratum \(s\) and \(n_{z',s}\) is the number of units in the control group within stratum \(s\) (see Hong, 2010, p. 519). All treatment units receive a weight of 1 (i.e., no reweighting).
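A minimal sketch of this stratum weighting (the counts in the example are hypothetical):

```python
def mmws_control_weight(n_treat_s, n_control_s):
    """Control weight within stratum s (Hong, 2010): the ratio of
    treated to control units in that stratum; treated units keep 1."""
    return n_treat_s / n_control_s

# Hypothetical stratum with 30 treated and 90 control units:
# each control counts one-third toward the weighted comparison
w_s = mmws_control_weight(30, 90)
```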
It is reasonable to expect the original RCT samples to possess relatively low AUC values (e.g., < .600), whereas the biased samples should yield much higher AUC values (e.g., > .800). The closer the AUC value of a PSM sample gets to .500, the less the propensity score can differentiate between the treatment and control cases (i.e., the more balanced the two groups are). To calculate an AUC for the unbiased, experimental data, we fit a logistic regression model to the original dataset with the same measures used in the biased samples’ propensity score. All AUC statistics were calculated using the DeLong et al. (1988) approach and compared using Hanley and McNeil’s (1982) test of significance for independent sample curves. While the AUC does not address misspecification of the logistic regression used to condition the propensity score, we assessed the fit of each logit model individually.
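The AUC point estimate has a simple rank-based interpretation: the probability that a randomly chosen treated case has a higher propensity score than a randomly chosen control. The following Python sketch (our illustration) computes that point estimate only; it does not implement the DeLong et al. (1988) variance estimator or the Hanley-McNeil comparison:

```python
def auc_from_scores(scores_treated, scores_control):
    """AUC point estimate: the probability that a randomly chosen
    treated case scores higher than a randomly chosen control
    (ties count one-half); equivalent to the Mann-Whitney U statistic."""
    wins = 0.0
    for s_t in scores_treated:
        for s_c in scores_control:
            if s_t > s_c:
                wins += 1.0
            elif s_t == s_c:
                wins += 0.5
    return wins / (len(scores_treated) * len(scores_control))

# Perfect separation -> 1.0 (heavily biased sample);
# fully overlapping scores -> 0.5 (balanced groups)
auc_biased = auc_from_scores([0.8, 0.9], [0.1, 0.2])
auc_balanced = auc_from_scores([0.2, 0.8], [0.4, 0.6])
```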
Of equal importance to the reduction of observed bias is the estimation of hidden bias. Because PSM is a quasi-experimental design, there is always the potential that an unobserved covariate would have impacted the findings had it been observed. To test for this, we use the sensitivity analysis proposed by Rosenbaum (2002, 2005), which focuses on the difference between the matched and unmatched outcomes. Specifically, we used the user-written Stata commands mhbounds for dichotomous outcomes and rbounds for continuous outcomes. These tests assess how sensitive the findings are to potential hidden bias by simulating the ability of an unobserved covariate to predict assignment to the treatment condition, expressed as gamma. Gamma is essentially a measure of the degree to which an unobserved measure would have to improve the prediction of treatment assignment relative to the current propensity score models. The larger the value of gamma at which the findings hold, the more robust they are to hidden bias.
Cohen’s d is the standardized difference between two means, and it is calculated as the difference in means between two groups divided by their pooled standard deviation.
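For concreteness, Cohen's d with a pooled standard deviation can be sketched as follows (our illustration; the group values are hypothetical):

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d: difference in group means divided by the pooled SD."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Hypothetical groups of 65 cases each, means 10 vs. 8 with equal
# SDs of 4 -> d = .5, a medium effect by Cohen's (1988) benchmarks
d = cohens_d(10.0, 8.0, 4.0, 4.0, 65, 65)
```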
If the two 95% CIs did not overlap, we considered them statistically different from one another (Cumming & Calin-Jageman, 2016).
We interpreted r values of .1, .3, and .5 as indicative of small, medium, and large correlations (Cohen, 1988).
For this analysis, we averaged the outcomes within study to produce only one ES per unique sample. This process ensured that our meta-analysis would be weighted by sample size and not by the number of outcomes included.
The random-effects model was selected a priori on conceptual grounds because this method can be used to extend the results of the meta-analysis to a wider population of studies when it cannot be determined with any degree of certainty that the current population of studies is functionally similar (see Bornstein et al., 2009).
Although it is common for meta-analysts to test study heterogeneity with the Q test, the Q statistic only informs about the presence or absence of heterogeneity, not its extent. In contrast, the I² statistic quantifies the degree of heterogeneity between studies and is presented in easily comparable percentage terms. We interpreted I² according to Higgins and Thompson’s (2002) guidelines, where values of around 25%, 50%, and 75% are indicative of low, medium, and high levels of heterogeneity among the ESs, respectively.
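Given Q and its degrees of freedom (the number of studies minus 1), the I² computation is a one-liner; a sketch with hypothetical values (our illustration):

```python
def i_squared(q, df):
    """I-squared (Higgins & Thompson, 2002): the percentage of total
    variation across studies attributable to heterogeneity rather than
    chance, floored at zero when Q <= df."""
    if q <= df:
        return 0.0
    return 100.0 * (q - df) / q

# Hypothetical meta-analysis of 11 studies (df = 10) with Q = 20
i2 = i_squared(20.0, 10)  # 50.0, a medium level of heterogeneity
```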
We also conducted the same set of analyses by averaging the ESs within each study first and then making comparisons (n = 11). This process yielded similar findings to those presented here.
The .24 difference in d was established in the education literature and is specific to educational performance in relation to various intervening practices. That said, it is relevant to many, if not all, of the outcomes we used, in that educational measures are typically behavioral and/or scaled attitudinal tests, and some of the study samples involved educational settings for crime prevention programs.
References
Apel, R. J., & Sweeten, G. (2010). Propensity score matching in criminology and criminal justice. In A. R. Piquero & D. Weisburd (Eds.), Handbook of quantitative criminology (pp. 543–562). Springer.
Austin, P. C. (2008). A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Statistics in Medicine, 27(12), 2037–2049.
Austin, P. C. (2009). Some methods of propensity-score matching had superior performance to others: Results of an empirical investigation and Monte Carlo simulations. Biometrical Journal, 51(1), 171–184.
Austin, P. C. (2010). Statistical criteria for selecting the optimal number of untreated subjects matched to each treated subject when using many-to-one matching on the propensity score. American Journal of Epidemiology, 172(9), 1092–1097.
Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3), 399–424.
Austin, P. C., & Stuart, E. A. (2015). Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Statistics in Medicine, 34(28), 3661–3679.
Braga, A. A., Piehl, A. M., & Hureau, D. (2009). Controlling violent offenders released to the community: An evaluation of the Boston Reentry Initiative. Journal of Research in Crime and Delinquency, 46(4), 411–436.
Bornstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons.
Campbell, C. M., Labrecque, R. M., Mohler, M. E., & Christmann, M. J. (2022). Gender and community supervision: Examining differences in violations, sanctions, and recidivism outcomes. Crime & Delinquency, 68(2), 284–325.
Campbell, C. M., Abboud, M. J., Hamilton, Z. K., vanWormer, J., & Posey, B. (2019). Evidence-based or just promising? Lessons learned in taking inventory of state correctional programming. Justice Evaluation Journal, 1(2), 188–214.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
Cole, S. R., Platt, R. W., Schisterman, E. F., Chu, H., Westreich, D., Richardson, D., & Poole, C. (2010). Illustrating bias due to conditioning on a collider. International Journal of Epidemiology, 39(2), 417–420.
Cumming, G., & Calin-Jageman, R. (2016). Introduction to the new statistics: Estimation, open science, and beyond. Routledge.
Dehejia, R. H., & Wahba, S. (1999). Causal effects in nonexperimental studies: Reevaluating the evaluation of training programs. Journal of the American Statistical Association, 94(448), 1053–1062.
Dehejia, R. H., & Wahba, S. (2002). Propensity score-matching methods for nonexperimental causal studies. Review of Economics and Statistics, 84(1), 151–161.
DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics, 44(3), 837–845. https://doi.org/10.2307/2531595
Diamond, A., & Sekhon, J. S. (2012). Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. The Review of Economics and Statistics, 95(3), 932–945.
Dong, N., & Lipsey, M. W. (2018). Can propensity score analysis approximate randomized experiments using pretest and demographic information in pre-k intervention research? Evaluation Review, 42, 34–70.
Freedman, D. A., & Berk, R. A. (2008). Weighting regressions by propensity scores. Evaluation Review, 32(4), 392–409.
Gaes, G. G., Bales, W. D., & Scaggs, S. J. A. (2016). The effect of imprisonment on recommitment: An analysis using exact, coarsened exact, and radius matching with the propensity score. Journal of Experimental Criminology, 12, 143–158.
Gottfredson, D. C., Cook, T. D., Gardner, F. E., Gorman-Smith, D., Howe, G. W., Sandler, I. N., & Zafft, K. M. (2015). Standards of evidence for efficacy, effectiveness, and scale-up research in prevention science: Next generation. Prevention Science, 16(7), 893–926.
Guo, S., & Fraser, M. W. (2014). Propensity score analysis: Statistical methods and applications (2nd ed.). SAGE Publications Inc.
Hamilton, Z. K., Campbell, C. M., van Wormer, J., Kigerl, A., & Posey, B. (2016). The impact of swift and certain sanctions: An evaluation of Washington State’s policy for offenders on community supervision. Criminology & Public Policy, 15(4), 1009–1072.
Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29–36. https://doi.org/10.1148/radiology.143.1.7063747
Hansen, B. B. (2004). Full matching in an observational study of coaching for the SAT. Journal of the American Statistical Association, 99(467), 609–618.
Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21(11), 1539–1558.
Hill, J. (2008). Discussion of research using propensity-score matching: Comments on ‘A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003’ by Peter Austin. Statistics in Medicine, 27(12), 2055–2061.
Hirano, K., & Imbens, G. W. (2001). Estimation of causal effects using propensity score weighting: An application to data on right heart catheterization. Health Services and Outcomes Research Methodology, 2(3–4), 259–278.
Hirano, K., Imbens, G. W., & Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71(4), 1161–1189.
Hong, G. (2010). Marginal mean weighting through stratification: Adjustment for selection bias in multilevel data. Journal of Educational and Behavioral Statistics, 35(5), 499–531.
Hong, G. (2012). Marginal mean weighting through stratification: A generalized method for evaluating multivalued and multiple treatments with nonexperimental data. Psychological Methods, 17(1), 44.
Hong, H., Aaby, D. A., Siddique, J., & Stuart, E. A. (2019). Propensity score-based estimators with multiple error-prone covariates. American Journal of Epidemiology, 188(1), 222–230.
Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (statistical Methodology), 76(1), 243–263.
Kim, R. H., & Clark, D. (2013). The effect of prison-based college education programs on recidivism: Propensity Score Matching approach. Journal of Criminal Justice, 41(3), 196–204.
King, G., & Nielsen, R. (2016). Why propensity scores should not be used for matching. Political Analysis, 27(4), 435–454.
Labrecque, R. M., Mears, D., & Smith, P. (2019). Gender and the effect of disciplinary segregation on prison misconduct. Advance online publication.
LaLonde, R. J. (1986). Evaluating the econometric evaluations of training programs with experimental data. The American Economic Review, 76(4), 604–620.
Loughran, T. A., Wilson, T., Nagin, D. S., & Piquero, A. R. (2015). Evolutionary regression? Assessing the problem of hidden biases in criminal justice applications using propensity scores. Journal of Experimental Criminology, 11(4), 631–652. https://doi.org/10.1007/s11292-015-9242-y
Luellen, J. K., Shadish, W. R., & Clark, M. H. (2005). Propensity scores: An introduction and experimental test. Evaluation Review, 29(6), 530–558.
Lunt, M. (2014). Selecting an appropriate caliper can be essential for achieving good balance with propensity score matching. American Journal of Epidemiology, 179(2), 226–235.
MacDonald, J., Stokes, R. J., Ridgeway, G., & Riley, K. J. (2007). Race, neighbourhood context and perceptions of injustice by the police in Cincinnati. Urban Studies, 44(13), 2567–2585.
McCaffrey, D., Ridgeway, G., & Morral, A. (2004). Propensity score estimation with boosted regression for evaluating adolescent substance abuse treatment. Psychological Methods, 9(4), 403–425.
McNiel, D. E., & Binder, R. L. (2007). Effectiveness of a mental health court in reducing criminal recidivism and violence. American Journal of Psychiatry, 164(9), 1395–1403.
Ming, K., & Rosenbaum, P. R. (2000). Substantial gains in bias reduction from matching with a variable number of controls. Biometrics, 56(1), 118–124.
Nagin, D. S., & Sampson, R. J. (2019). The real gold standard: Measuring counterfactual worlds that matter most to social science and policy. Annual Review of Criminology, 2(1), 123–145.
Peikes, D. N., Moreno, L., & Orzol, S. M. (2008). Propensity score matching: A note of caution for evaluators of social programs. The American Statistician, 62(3), 222–231.
Ridgeway, G., & McCaffrey, D. F. (2007). Comment: Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22(4), 540–543.
Rosenbaum, P. R. (1984). From association to causation in observational studies: The role of tests of strongly ignorable treatment assignment. Journal of the American Statistical Association, 79(385), 41–48.
Rosenbaum, P. R. (2002). Observational studies. Springer.
Rosenbaum, P. R. (2005). Heterogeneity and causality. The American Statistician, 59(2), 147–152. https://doi.org/10.1198/000313005X42831
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55.
Rosenbaum, P. R., & Rubin, D. B. (1985). Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician, 39(1), 33–38.
Rubin, D. B. (2006). Matched sampling for causal effects. Cambridge University Press.
Shadish, W. R. (2013). Propensity score analysis: Promise, reality and irrational exuberance. Journal of Experimental Criminology, 9(2), 129–144.
Shadish, W. R., Clark, M. H., Steiner, P. M., & Hill, J. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. Journal of the American Statistical Association, 103(484), 1334–1350.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.
Smith, J. A., & Todd, P. E. (2005). Does matching overcome LaLonde’s critique of nonexperimental estimators? Journal of Econometrics, 125(1–2), 305–353.
Smith, J., & Todd, P. (2001). Reconciling conflicting evidence on the performance of propensity-score matching methods. American Economic Review, 91(2), 112–118.
Steiner, P. M., Cook, T. D., Shadish, W. R., & Clark, M. H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods, 15(3), 250–267.
Stuart, E. A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science : A Review Journal of the Institute of Mathematical Statistics, 25(1), 1–21.
Stuart, E. A., Lee, B. K., & Leacy, F. P. (2013). Prognostic score-based balance measures can be a useful diagnostic for propensity score methods in comparative effectiveness research. Journal of Clinical Epidemiology, 66(8), S84-S90.e1.
ten Bensel, T., Gibbs, B., & Lytle, R. (2014). A propensity score approach towards assessing neighborhood risk of parole revocation. American Journal of Criminal Justice, 40(2), 377–398.
Ury, H. K. (1975). Efficiency of case-control studies with multiple controls per case: Continuous or dichotomous data. Biometrics, 31(3), 643–649.
van Wormer, J. G., & Campbell, C. (2016). Developing an alternative juvenile programming effort to reduce detention overreliance. Journal of Juvenile Justice, 5(2), 12.
Vito, G. F., Higgins, G. E., & Tewksbury, R. (2017). The effectiveness of parole supervision: Use of propensity score matching to analyze reincarceration rates in Kentucky. Criminal Justice Policy Review, 28(7), 627–640.
Wooldridge, J. M. (2005). Violating ignorability of treatment by controlling for too many factors. Econometric Theory, 21(5), 1026–1028.
Acknowledgements
The authors would like to thank Shenyang Guo, Zachary Hamilton, Stephen Vaisey, and Ozcan Tunalilar for their valuable feedback during this process.
Funding
This project was supported by a grant from the National Institute of Justice (Award #2016-R2-CX-0030). The opinions, findings, and conclusions expressed in this article are those of the authors and do not necessarily reflect those of the Department of Justice.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
ESM 1
(DOCX 15.5 kb)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Campbell, C.M., Labrecque, R.M. Panacea or poison: Assessing how well basic propensity score modeling can replicate results from randomized controlled trials in criminal justice research. J Exp Criminol 20, 229–253 (2024). https://doi.org/10.1007/s11292-022-09532-y