Measurement errors in control risk regression: A comparison of correction techniques

Abstract
Control risk regression is a widely used approach for meta‐analysis of the effectiveness of a treatment, relating the measure of risk with which the outcome occurs in the treated group to that in the control group. The severity of illness is a source of between‐study heterogeneity that can be difficult to measure. It can be approximated by the rate of events in the control group. Since this estimate is a surrogate for the underlying risk, it is prone to measurement error, and correction methods are necessary to provide reliable inference. This article illustrates the extent of measurement error effects under different scenarios, including departures from the classical normality assumption for the control risk distribution. The performance of different measurement error corrections is examined. Attention is paid to likelihood‐based structural methods, which assume a distribution for the control risk measure, and to functional methods, which avoid that assumption, namely, a simulation‐based method and two score function methods. Advantages and limits of the approaches are evaluated through simulation. In the case of large heterogeneity, structural approaches are preferable to score methods, while score methods perform better for small heterogeneity and small sample size. The simulation‐based approach behaves satisfactorily in every scenario examined, with no convergence issues. The methods are applied to a meta‐analysis of the association between diabetes and risk of Parkinson disease. The study intends to make researchers aware of the measurement error problem occurring in control risk regression and to encourage the use of appropriate correction techniques to prevent fallacious conclusions.
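To make the idea of a simulation‐based correction concrete, the following is a minimal Python sketch of SIMEX in a deliberately simplified setting. It assumes a simple linear regression with one error‐prone covariate, a known and constant measurement error standard deviation `sigma_u`, and a quadratic extrapolant; none of these choices, nor the function names `naive_slope`, `simex_slope`, and `simulate`, are taken from the manuscript, and the sketch is not the control risk regression model studied there.

```python
import numpy as np


def naive_slope(w, y):
    """Least-squares slope of y on the error-prone covariate w."""
    return np.polyfit(w, y, 1)[0]


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_sim=100, seed=0):
    """SIMEX estimate of the slope (illustrative assumptions only).

    Simulation step: for each lambda, add extra noise with variance
    lambda * sigma_u**2 to w and average the naive slope over n_sim draws.
    Extrapolation step: fit a quadratic in lambda and evaluate it at -1,
    the value corresponding to no measurement error.
    """
    rng = np.random.default_rng(seed)
    grid = [0.0]
    means = [naive_slope(w, y)]
    for lam in lambdas:
        slopes = [
            naive_slope(w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.size), y)
            for _ in range(n_sim)
        ]
        grid.append(lam)
        means.append(np.mean(slopes))
    coef = np.polyfit(grid, means, 2)  # quadratic extrapolant (an assumption)
    return float(np.polyval(coef, -1.0))


def simulate(n=2000, beta0=0.5, beta1=1.0, sigma_u=0.5, seed=1):
    """Hypothetical data: true covariate xi observed with additive error."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(0.0, 1.0, n)
    w = xi + rng.normal(0.0, sigma_u, n)
    y = beta0 + beta1 * xi + rng.normal(0.0, 0.3, n)
    return w, y
```

Calling `simulate()` and comparing `naive_slope(w, y)` with `simex_slope(w, y, sigma_u=0.5)` shows the naive slope attenuated towards zero and the SIMEX estimate closer to the true value of 1. The quadratic extrapolant typically removes most, but not all, of the bias.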

: Bias and standard deviation (SD) of the estimates of β₀, and average of the estimated standard errors (SE) obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk normally distributed.
Figure S1: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk normally distributed.
Figure S2: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S3: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S4: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario i). Underlying risk normally distributed.
Figure S5: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S6: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S9: Mean squared error of the estimators of β₁ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S11: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S12: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Table S9: Bias and standard deviation (SD) of the estimates of β₀, β₁, τ², and average of the estimated standard errors (SE) obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i), with n = 100. Underlying risk normally distributed.
Figure S14: Mean squared error of the estimators of β₁ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i), with n = 100. Underlying risk normally distributed.
Figure S15: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i), with n = 100. Underlying risk normally distributed.
Figure S16: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario i), with n = 100. Underlying risk normally distributed.
Figure S17: Empirical coverage probabilities of confidence intervals for β₁ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario i), with n = 100. Underlying risk normally distributed.
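
The summary measures named in these captions (bias, SD, average estimated SE, and mean squared error) can be obtained from replicate-level output along the lines of the following minimal Python sketch. The function name `summarize_replicates`, its arguments, and the choice of the sample SD with an n − 1 denominator are assumptions made for illustration; they are not taken from the manuscript.

```python
import numpy as np


def summarize_replicates(estimates, std_errors, true_value):
    """Summarize one estimator over simulation replicates.

    estimates   : point estimates, one per replicate
    std_errors  : estimated standard errors, one per replicate
    true_value  : parameter value used to generate the data
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    return {
        "bias": est.mean() - true_value,            # mean estimate minus truth
        "sd": est.std(ddof=1),                      # empirical SD (assumed n - 1)
        "avg_se": se.mean(),                        # average estimated SE
        "mse": np.mean((est - true_value) ** 2),    # mean squared error
    }
```

When the estimated standard errors are reliable, `avg_se` should be close to `sd`; a marked gap between the two indicates that the standard errors under- or overstate the actual variability of the estimator.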

Web Appendix B: Simulation results for the log event rate case
This web appendix reports a portion of the results of the simulation study carried out to compare the performance of the competing approaches in control rate regression, as described in Section 4 of the main manuscript. The results refer to simulation scenario ii) in the main manuscript.
Figure S18: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk normally distributed.
Figure S19: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S20: Mean squared error of the estimators of β₀ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S22: Mean squared error of the estimators of β₁ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S23: Mean squared error of the estimators of β₁ obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S24: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk normally distributed.
Figure S25: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a mixture of Normals.
Figure S26: Mean squared error of the estimators of τ² obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario i). Underlying risk distributed as a Skew-Normal.
Figure S27: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk normally distributed.
Figure S28: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a mixture of Normals.
Figure S29: Empirical coverage probabilities of confidence intervals for β₀ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a Skew-Normal.
Table S19: Empirical coverage probabilities (multiplied by 1,000) of confidence intervals for β₀ and associated Monte Carlo standard error (in parentheses, multiplied by 1,000) obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a Normal, Skew-Normal, or Mixture of Normals.
Figure S30: Empirical coverage probabilities of confidence intervals for β₁ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk normally distributed.
Figure S31: Empirical coverage probabilities of confidence intervals for β₁ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a mixture of Normals.
Figure S32: Empirical coverage probabilities of confidence intervals for β₁ from uncorrected approach (NAIVE), likelihood approach under a Normal specification (LIKELIHOOD) or a Skew-Normal specification (SKEW-NORMAL) for the underlying risk distribution, SIMEX, corrected score and conditional score, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a Skew-Normal.
Table S20: Empirical coverage probabilities (multiplied by 1,000) of confidence intervals for β₁ and associated Monte Carlo standard error (in parentheses, multiplied by 1,000) obtained from naive analysis, likelihood analysis under a Normal or a Skew-Normal specification of the distribution of ξ, corrected score, conditional score, SIMEX, on the basis of 1,000 replicates of simulation scenario ii). Underlying risk distributed as a Normal, Skew-Normal, or Mixture of Normals.
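
As an illustration of how an empirical coverage probability and its Monte Carlo standard error can be computed from replicate-level output, a minimal Python sketch follows. It assumes Wald-type intervals (estimate ± z × SE) and the binomial formula sqrt(p(1 − p)/R) for the Monte Carlo standard error over R replicates; the function name `coverage_with_mcse` and the interval construction are assumptions for illustration, not a description of how the intervals in the manuscript were built.

```python
import numpy as np


def coverage_with_mcse(estimates, std_errors, true_value, z=1.959963984540054):
    """Empirical coverage of Wald-type intervals and its Monte Carlo SE.

    z defaults to the 0.975 standard normal quantile (nominal 95% level).
    Returns (coverage, mcse), both on the probability scale.
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower = est - z * se
    upper = est + z * se
    covered = (lower <= true_value) & (true_value <= upper)
    p = covered.mean()                           # share of intervals covering truth
    mcse = np.sqrt(p * (1.0 - p) / covered.size)  # binomial Monte Carlo SE
    return float(p), float(mcse)
```

With 1,000 replicates and coverage equal to the nominal 0.95, the formula gives a Monte Carlo standard error of about 0.0069, that is, about 6.9 on the scale multiplied by 1,000 used in Tables S19 and S20.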