A Framework for Leveraging “Big Data” to Advance Epidemiology and Improve Quality: Design of the VA Colonoscopy Collaborative

Objective: To describe a framework for leveraging big data for research and quality improvement purposes and demonstrate implementation of the framework for design of the Department of Veterans Affairs (VA) Colonoscopy Collaborative. Methods: We propose that research utilizing large-scale electronic health records (EHRs) can be approached in a 4-step framework: 1) Identify the data sources required to answer the research question; 2) Determine whether variables are available as structured or free-text data; 3) Utilize a rigorous approach to refine variables and assess data quality; 4) Create the analytic dataset and perform analyses. We describe implementation of the framework as part of the VA Colonoscopy Collaborative, which aims to leverage big data to 1) prospectively measure and report colonoscopy quality and 2) develop and validate a risk prediction model for colorectal cancer (CRC) and high-risk polyps. Results: Examples of implementation of the 4-step framework are provided. To date, we have identified 2,337,171 Veterans who underwent colonoscopy between 1999 and 2014. Median age was 62 years, and 4.6 percent (n = 106,860) were female. We estimated that 2.6 percent (n = 60,517) had CRC diagnosed at baseline. An additional 1 percent (n = 24,483) had a new ICD-9 code-based diagnosis of CRC on follow-up. Conclusion: We hope our framework may contribute to the dialogue on best practices to ensure high-quality epidemiologic and quality improvement work. As a result of implementation of the framework, the VA Colonoscopy Collaborative holds great promise for 1) quantifying and providing novel understandings of colonoscopy outcomes, and 2) building a robust approach for nationwide VA colonoscopy quality reporting.


Appendix C. Structured Variable Development and Validation Process
We implemented a stepwise approach to estimate the sample size needed for manual chart review, based on one-sided lower confidence bounds for positive predictive value (PPV) and negative predictive value (NPV). A Bonferroni correction was used to adjust for multiple comparisons: to ensure overall 95 percent confidence, a one-sided 97.5 percent lower confidence bound was calculated for each of PPV and NPV. We performed a sensitivity analysis over a range of sample sizes (100-250 potential cases and 100-250 potential controls) and a range of estimated PPV/NPV (0.85-0.95), and adopted the following validation process:

1. Take a random sample of 100 putative cases and 100 putative controls for the predictor, exposure, or outcome of interest. If the estimated PPV (based on the 100 putative cases) and NPV (based on the 100 putative controls) are 0.95 or greater, the lower confidence bounds for PPV and NPV will both be greater than 0.90, and we can therefore claim that the true PPV and NPV are greater than 0.90 with 95 percent confidence. An application of Bayes' theorem shows that, with the above estimated PPV and NPV, the sensitivity and specificity will be at least 0.68 if the prevalence of the outcome is 0.10-0.90.

2. If the estimated PPV or NPV in Step 1 is lower than 0.95, we will assess the source of errors, modify the algorithm to improve the PPV and NPV, and manually review a random sample of 150 putative cases and 150 putative controls. If the estimated PPV and NPV are 0.90 or greater, the lower confidence bounds for PPV and NPV will be greater than 0.85, and we will claim that the true PPV and NPV are greater than 0.85 with 95 percent confidence. With the above estimated PPV and NPV, the estimated sensitivity and specificity will be at least 0.69 if the prevalence is 0.20-0.80.

3. If the estimated PPV or NPV in Step 2 is lower than 0.90, we will assess the source of errors, modify the algorithm again, manually review another random sample of 150 putative cases and 150 putative controls, and estimate PPV, NPV, sensitivity, and specificity. The validation process is then complete.

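The lower confidence bounds quoted in the validation steps can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval, which the text does not specify; the function name and parameters are illustrative only. Under that assumption, an estimate of 0.95 from 100 reviewed charts and an estimate of 0.90 from 150 reviewed charts give lower bounds just above 0.90 and 0.85, respectively.

```python
import math
from statistics import NormalDist


def lower_bound(estimate: float, n: int,
                overall_conf: float = 0.95, n_comparisons: int = 2) -> float:
    """One-sided lower confidence bound for a proportion (PPV or NPV).

    Assumption: Wald (normal-approximation) interval. The Bonferroni
    correction splits the overall error rate across the two bounds,
    so each bound is computed at 1 - 0.05 / 2 = 97.5 percent.
    """
    alpha = (1 - overall_conf) / n_comparisons
    z = NormalDist().inv_cdf(1 - alpha)
    standard_error = math.sqrt(estimate * (1 - estimate) / n)
    return estimate - z * standard_error


# Step 1: 100 charts with estimated PPV/NPV of 0.95
# Steps 2 and 3: 150 charts with estimated PPV/NPV of 0.90
for n, estimate in [(100, 0.95), (150, 0.90)]:
    print(n, estimate, round(lower_bound(estimate, n), 3))
```

An exact (Clopper-Pearson) bound would be somewhat lower for the same inputs, so the choice of interval method matters when the bound sits close to the threshold.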
Importance: Bowel preparation refers to the quality with which the colon was cleansed, as observed at the time of a colonoscopy procedure. Quality of bowel preparation affects the ability of the colonoscopist to see polyps and cancers, and affects recommendations for follow-up. For example, a suboptimal bowel preparation might prompt a recommendation for an earlier repeat colonoscopy at 5 years instead of 10 years in a person with an otherwise normal examination.
Variation: There is variation in the terminology used to define bowel preparation. Sometimes, more than one description is provided, e.g., bowel prep was excellent and adequate; OR bowel prep was good except in ascending colon, where the prep was fair.

Variable | Values | Example text
... | 1=yes, NULL=not found | entire colon was clean
prep_fairly clean | 1=yes, NULL=not found | colon was fairly clean
prep_optimal | 1=yes, NULL=not found | quality of bowel prep was optimal
prep_suboptimal | 1=yes, NULL=not found | suboptimal prep
prep_boston | 0-9; NULL=not found | Boston bowel prep score equal to 9
prep_ottawa | 0-14; NULL=not found | Ottawa bowel prep score: 14
prep_unspecified | varchar(100) | none of the above, value other than above

* It is possible for the same report to have multiple values; for example, the same report might say that the prep was good and adequate, or good except for fair in the ascending colon. Analytically, we will note these based on procedures that are associated with a "1" coded for more than one bowel prep variable.
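Identifying reports with more than one bowel prep description amounts to counting how many binary prep flags are coded "1" for a procedure. The following is a minimal sketch of that check, not the study's actual implementation. The flag names prep_optimal and prep_suboptimal come from the table; prep_good and prep_adequate are hypothetical names used only for illustration, as is the dictionary representation of a report.

```python
from typing import Dict, List, Optional

# Binary flags to inspect. The last two names are hypothetical placeholders;
# substitute the full list of binary bowel prep variables in practice.
BINARY_PREP_FLAGS: List[str] = [
    "prep_optimal",
    "prep_suboptimal",
    "prep_good",      # hypothetical
    "prep_adequate",  # hypothetical
]


def coded_flags(report: Dict[str, Optional[int]],
                flags: List[str] = BINARY_PREP_FLAGS) -> List[str]:
    """Return the prep flags coded 1 for this report (NULL/None = not found)."""
    return [name for name in flags if report.get(name) == 1]


def has_multiple_descriptions(report: Dict[str, Optional[int]],
                              flags: List[str] = BINARY_PREP_FLAGS) -> bool:
    """True when more than one bowel prep variable is coded 1."""
    return len(coded_flags(report, flags)) > 1


# A report stating that the prep was "good and adequate"
example = {"prep_good": 1, "prep_adequate": 1, "prep_optimal": None}
print(coded_flags(example), has_multiple_descriptions(example))
```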

Desired output: norm_inadequateprep, norm_adequateprep, norm_unknown
norm_inadequateprep: Presence of any normalized inadequate criteria, including at least one of the following:
norm_adequateprep: Absence of all normalized inadequate criteria, plus at least one of the following:

The study cohort will be randomly split into training and validation sets in a 2:1 ratio. Model development will be conducted using the training set. Risk factors that are significantly (P<0.15) associated with the outcome in univariate analysis will be considered as potential predictors for CRC and high-risk polyps in a multivariable logistic regression model. We will use an L1-regularized logistic regression model [1][2] and Bayesian model averaging [3] for variable selection, which are considered superior to traditional stepwise model selection approaches. [4][5] Discrimination and calibration of the selected models will be assessed using the Area under the Receiver Operating Characteristic Curve (AUC) and the Hosmer-Lemeshow goodness-of-fit test. [6] We will then use the predicted probability of CRC and high-risk polyps from the selected best model to determine a cut-point above which a patient would be identified as at high risk for CRC and high-risk polyps. We will make an a priori plan to identify two risk stratification cut-points that improve on the sensitivity and specificity of current US Multi-Society Task Force on Colorectal Cancer guidelines, as previously described. [7] Defining sensitivity as the proportion of individuals with subsequent CRC or high-risk polyps who were classified as high risk at baseline, we will target the first cut-point to improve the sensitivity of US Multi-Society Task Force guidelines by 10 percentage points. Defining specificity as the proportion of individuals without subsequent CRC or high-risk polyps who were classified as low risk at baseline, we will target the second cut-point to improve the specificity of US Multi-Society Task Force guidelines by 10 percentage points.
The population sensitivity and specificity of US Multi-Society Task Force guidelines were estimated at 68 percent and 54 percent, respectively, the median sensitivity and specificity observed in published literature. [7][8][9][10][11]
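To illustrate how the two cut-points might be chosen, the sketch below scans candidate thresholds on predicted probabilities. It assumes targets of 0.78 for sensitivity and 0.64 for specificity (the guideline estimates of 68 and 54 percent plus 10 percentage points), treats a patient as high risk when the predicted probability is at or above the cut-point, and uses made-up toy data. The function names are illustrative, and the code assumes both outcome classes are present.

```python
from typing import List, Optional, Tuple


def sens_spec(probs: List[float], outcomes: List[int],
              cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity when 'high risk' means prob >= cut."""
    pairs = list(zip(probs, outcomes))
    tp = sum(1 for p, y in pairs if y == 1 and p >= cut)
    fn = sum(1 for p, y in pairs if y == 1 and p < cut)
    tn = sum(1 for p, y in pairs if y == 0 and p < cut)
    fp = sum(1 for p, y in pairs if y == 0 and p >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def cut_for_sensitivity(probs: List[float], outcomes: List[int],
                        target: float) -> Optional[float]:
    """Highest observed cut-point whose sensitivity reaches the target."""
    for cut in sorted(set(probs), reverse=True):
        if sens_spec(probs, outcomes, cut)[0] >= target:
            return cut
    return None


def cut_for_specificity(probs: List[float], outcomes: List[int],
                        target: float) -> Optional[float]:
    """Lowest observed cut-point whose specificity reaches the target."""
    for cut in sorted(set(probs)):
        if sens_spec(probs, outcomes, cut)[1] >= target:
            return cut
    return None


# Toy predicted probabilities and outcomes (1 = CRC or high-risk polyp)
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
outcomes = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

print(cut_for_sensitivity(probs, outcomes, target=0.78))  # 0.7
print(cut_for_specificity(probs, outcomes, target=0.64))  # 0.6
```

Taking the highest cut-point that meets the sensitivity target keeps specificity as high as possible at that target, and taking the lowest cut-point that meets the specificity target does the same for sensitivity.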

Model Validation and Comparison of Estimated Clinical Benefit in Validation Set
Model validation will be conducted using the validation set. Model discrimination will be assessed by the AUC. Model calibration will be assessed by the Hosmer-Lemeshow goodness-of-fit test, as well as by comparing the predicted and observed risk of CRC and high-risk polyps across deciles of predicted risk. Potential clinical benefit will be assessed in the validation data using the model coefficients and cut-points identified in the training set, by estimated sensitivity and specificity for CRC and high-risk polyps and estimated rates of overuse and underuse of colonoscopy. Overuse of surveillance colonoscopy will be defined as the proportion of individuals classified as high risk at baseline who did not develop CRC or high-risk polyps on follow-up. [7] Underuse of surveillance colonoscopy will be defined as the proportion of individuals classified as low risk at baseline who developed CRC or high-risk polyps on follow-up. [7] Improvement in specificity and sensitivity using the predictive model on the validation set will be assessed by the McNemar test, and improvement in overuse and underuse will be assessed by 95 percent confidence intervals. Clinical benefit of using the predictive model over current guidelines will also be assessed using net reclassification improvement. [11]
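As an illustration of the paired comparison, the sketch below computes a two-sided exact McNemar p-value from discordant pairs. The exact binomial form is an assumption, since the text does not say which version of the test will be used, and the function names and toy data are illustrative. For a sensitivity comparison, the pairs would be restricted to individuals who developed CRC or high-risk polyps, with "correct" meaning classified as high risk at baseline.

```python
from math import comb
from typing import List, Tuple


def discordant_counts(model_correct: List[bool],
                      guideline_correct: List[bool]) -> Tuple[int, int]:
    """Count pairs where exactly one of the two classifications is correct.

    b: model correct, guideline wrong; c: guideline correct, model wrong.
    """
    pairs = list(zip(model_correct, guideline_correct))
    b = sum(1 for m, g in pairs if m and not g)
    c = sum(1 for m, g in pairs if g and not m)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value (binomial test on discordant pairs)."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy example: 10 pairs favour the model, 2 favour the guidelines
model = [True] * 10 + [False] * 2 + [True] * 5
guideline = [False] * 10 + [True] * 2 + [True] * 5
b, c = discordant_counts(model, guideline)
print(b, c, round(mcnemar_exact(b, c), 4))
```

Concordant pairs (both correct or both wrong) do not enter the test statistic, which is why only the discordant counts are passed to the p-value function.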