The National Patient-Centered Clinical Research Network (PCORnet) Bariatric Study Cohort: Rationale, Methods, and Baseline Characteristics

Background: Although bariatric procedures are commonly performed in clinical practice, long-term data on the comparative effectiveness and safety of different procedures on sustained weight loss, comorbidities, and adverse effects are limited, especially in important patient subgroups (eg, individuals with diabetes, older patients, adolescents, and minority patients).
Objective: The objective of this study was to create a population-based cohort of patients who underwent 3 commonly performed bariatric procedures, namely adjustable gastric band (AGB), Roux-en-Y gastric bypass (RYGB), and sleeve gastrectomy (SG), to examine the long-term comparative effectiveness and safety of these procedures in both adults and adolescents.
Methods: We identified adults (20 to 79 years old) and adolescents (12 to 19 years old) who underwent a primary (first observed) AGB, RYGB, or SG procedure between January 1, 2005 and September 30, 2015 from 42 health systems participating in the Clinical Data Research Networks within the National Patient-Centered Clinical Research Network (PCORnet). We extracted information on patient demographics, encounters with healthcare providers, diagnoses recorded and procedures performed during these encounters, vital signs, and laboratory test results from patients' electronic health records (EHRs). The outcomes of interest
Toh et al. JMIR Res Protoc 2017;6(12):e222. http://www.researchprotocols.org/2017/12/e222/


Introduction
As severe obesity has increased in prevalence, the use of bariatric surgery has expanded considerably over the past 20 years. Because of this expansion and the rapid shifts in the types of bariatric procedures performed in recent years, from predominantly Roux-en-Y gastric bypass (RYGB) in the early 2000s, to greater use of adjustable gastric banding (AGB) by the early 2010s, and then to predominantly sleeve gastrectomy (SG) currently [1][2][3], long-term data comparing the effectiveness and safety of different procedures on sustained weight loss, comorbidities, and adverse effects are limited. In addition, prior studies have included too few patients to examine differential outcomes within important patient subgroups. More data are needed from larger, more broadly representative samples with long-term follow-up to help inform clinical decisions about bariatric procedure selection in various patient subpopulations (eg, individuals with diabetes, older patients, adolescents, and minority patients).
In 2014, the Patient-Centered Outcomes Research Institute (PCORI) launched the National Patient-Centered Clinical Research Network (PCORnet) to support studies that address questions important to patients [4]. PCORnet is a distributed data network that includes 13 Clinical Data Research Networks (CDRNs) and 20 Patient-Powered Research Networks, making it one of the largest research consortia in the United States. It currently includes electronic health record (EHR) or administrative claims data from more than 100 million individuals and has access to over 40 million patients who could be recruited into pragmatic clinical trials. PCORnet data are stored at individual participating sites in a common data format [5].
Initiated in 2016, the PCORnet Bariatric Study (PBS) is one of the first 2 multi-CDRN observational studies conducted within the network [6]. A group of patients, clinicians, and researchers developed the study aims [7]. The cohort was set up with 2 major goals. The first was to evaluate the comparative effectiveness and safety of AGB, RYGB, and SG, the 3 most commonly performed bariatric procedures in contemporary clinical practice. The second goal was to demonstrate PCORnet's potential as a national resource for evidence generation. Here, we describe the design and early descriptive results of the study.

Data Sources
A total of 42 health systems from 11 CDRNs participated in this descriptive study (Textbox 1). Of the 2 non-participating CDRNs, 1 outpatient-focused network deferred because it had too few bariatric patients, and the other had not yet been founded when the PBS was proposed. The participating health systems are geographically diverse and provide care to demographically heterogeneous populations.
As part of its efforts to facilitate rapid and efficient studies drawing from multiple data sources, PCORnet standardized the EHR data from the participating health systems by implementing a common data model (CDM). Table 1 describes the specific data domains extracted from the EHRs. These domains include patient demographics, encounters with healthcare providers, diagnoses recorded and procedures performed during these encounters, vital signs, laboratory test results, and mortality (obtained from other sources in some CDRNs).

Textbox 1. Eleven participating PCORnet Clinical Data Research Networks and data-contributing sites in the PCORnet Bariatric Study. Johns Hopkins University and Health System, UPMC Health Plan, and Boston HealthNet did not contribute data for this paper but will for future analyses.

Table 1. Data domains extracted from the electronic health records.
Encounter. Content: Contains 1 record for each time a patient sees a provider in an ambulatory setting or is hospitalized; multiple encounters per day are possible if they occur with different providers or in different care settings. Use: Encounter types are used to identify initial bariatric procedures and all subsequent complications and procedures during the follow-up period. We have captured data from all encounter types, including inpatient, outpatient, and emergency room visits.
Diagnosis. Content: Contains all uniquely recorded diagnoses for all encounters. Each diagnosis is associated with a specific patient and encounter. Use: Diagnosis codes and associated encounter dates are used to establish medical history prior to surgery (footnote a).
Procedure. Content: Contains all uniquely recorded procedures for all encounters. Each procedure is associated with a specific patient and encounter. Use: Procedure codes and associated encounter dates are used to establish bariatric surgery dates and any re-operations, revisions, or operative complications.
Vitals. Content: Contains 1 record per height or weight result. Multiple measurements per encounter are recorded as separate measures. Use: Height and weight are captured for body mass index; blood pressure and tobacco use information is also available.
Lab Results. Content: Contains 1 record per laboratory result. Use: The common data model currently contains a limited number of laboratory tests; glycated hemoglobin (HbA1c) is being collected and is required to identify diabetes outcomes in an ongoing analysis.
Death. Content: Contains 1 record per patient for those who died. Use: Some health systems have existing linkages to state and national death indices; others will be funded to conduct linkages.
Footnote a: We focused on extracting data on obesity-associated comorbidities and health conditions used to calculate the Charlson-Elixhauser combined comorbidity score.

Cohort Identification
We identified adults (20 to 79 years old) and adolescents (12 to 19 years old) who underwent a primary (first observed) AGB, RYGB, or SG procedure between January 1, 2005 and September 30, 2015 in any of the 42 participating health systems. To be eligible for cohort inclusion, patients must have (1) at least 1 body mass index (BMI) measurement of 35 kg/m² or more recorded in their EHRs in the year prior to the surgery (ie, baseline); (2) no prior revision bariatric procedure code during baseline; (3) no recorded gastrointestinal cancer diagnosis or fundoplasty procedure during baseline; (4) no multiple conflicting bariatric procedure codes on the same day; and (5) no emergency room encounter on the day of the index procedure (Figure 1). We excluded patients with missing baseline BMI, insufficient height or weight data to calculate baseline BMI, or baseline BMI less than 35 kg/m² because guidelines recommend consideration of bariatric surgery for adult patients with severe obesity (BMI 40 kg/m² or greater) or BMI 35 kg/m² or greater plus comorbidity [8]. We identified bariatric procedures using International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes, Current Procedural Terminology (CPT-4) codes, and Healthcare Common Procedure Coding System (HCPCS) codes (list available from the authors on request). We applied relatively few additional eligibility criteria (Figure 1) in order to maximize the representativeness of the cohort.
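The eligibility rules above can be illustrated with a minimal Python sketch. It is not the study's query program: the `Candidate` fields, the function names, and the order in which the checks run are assumptions made for illustration, and the boolean flags stand in for the code-based lookups (ICD-9-CM, CPT-4, HCPCS) that the study performed against the common data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

STUDY_START = date(2005, 1, 1)
STUDY_END = date(2015, 9, 30)
BMI_THRESHOLD = 35.0  # kg/m^2, from the inclusion criteria
PROCEDURES = {"AGB", "RYGB", "SG"}


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / (height_m ** 2)


@dataclass
class Candidate:
    # Hypothetical, simplified patient record (not the PCORnet CDM layout).
    age: int
    procedure: str
    procedure_date: date
    baseline_bmis: List[float] = field(default_factory=list)  # year before surgery
    prior_revision: bool = False
    gi_cancer_or_fundoplasty: bool = False
    conflicting_codes_same_day: bool = False
    er_encounter_on_index_day: bool = False


def subcohort(age: int) -> Optional[str]:
    """Return 'adolescent' (12 to 19), 'adult' (20 to 79), or None."""
    if 12 <= age <= 19:
        return "adolescent"
    if 20 <= age <= 79:
        return "adult"
    return None


def exclusion_reason(c: Candidate) -> Optional[str]:
    """Return None if the candidate is eligible, else the first failed rule."""
    if subcohort(c.age) is None:
        return "age outside 12 to 79 years"
    if c.procedure not in PROCEDURES:
        return "not AGB, RYGB, or SG"
    if not (STUDY_START <= c.procedure_date <= STUDY_END):
        return "procedure outside study period"
    if not c.baseline_bmis:
        return "missing baseline BMI"
    # Criterion (1): at least 1 baseline BMI of 35 kg/m^2 or more.
    if max(c.baseline_bmis) < BMI_THRESHOLD:
        return "baseline BMI below 35 kg/m^2"
    if c.prior_revision:
        return "prior revision procedure"
    if c.gi_cancer_or_fundoplasty:
        return "gastrointestinal cancer or fundoplasty"
    if c.conflicting_codes_same_day:
        return "conflicting procedure codes on the same day"
    if c.er_encounter_on_index_day:
        return "emergency room encounter on index day"
    return None
```

For example, a 45-year-old with baseline BMI values of 33.0 and 41.2 who underwent SG in 2012 passes every check, whereas the same patient with no recorded baseline BMI is returned with the reason "missing baseline BMI".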
We extracted information on patient demographics (eg, age, sex, race/ethnicity), height, weight, BMI, blood pressure, and select comorbidities (eg, diabetes, sleep apnea) from the standardized data domains described in Table 1. Comorbidities were identified by ICD-9-CM diagnosis and procedure codes and the Systematized Nomenclature of Medicine (SNOMED) codes. We also calculated a combined comorbidity score that merges the Charlson and Elixhauser comorbidity scores [9]. The score, calculated based on 20 conditions identified by ICD-9-CM and SNOMED codes in the year prior to surgery, was initially developed to predict mortality. It has been shown to be a good proxy for general health status and has been used in a prior analysis of bariatric patients [10].

Follow-Up
Patients were followed as part of routine clinical care in each participating health system. We used BMI measurements after the index bariatric procedure as a proxy for follow-up. Because the United States transitioned from the ICD-9-CM coding system to the ICD-10-CM system on October 1, 2015, we ended follow-up on September 30, 2015 to avoid changes related to coding of diagnoses and procedures.

Analysis
The baseline characteristics of the study cohort were compared by procedure type. The temporal trends in bariatric procedures during the study period were also assessed. Within the study cohort, the characteristics of patients with and without a BMI measurement during follow-up were further compared. We also examined the length of follow-up by procedure type. Finally, we compared the study cohort with patients who were excluded from the study due to missing baseline BMI measurement. We performed all the comparisons separately for the adult and adolescent subcohorts.
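As a rough illustration of the descriptive comparisons listed above, the following Python sketch summarizes baseline characteristics by procedure type and computes the share of each procedure by calendar year. The function names and the six example records are invented for illustration; they are not study data, and the study's actual analysis programs are not described in this paper.

```python
from collections import Counter, defaultdict
from statistics import mean

# Made-up example records (NOT study data); field names are hypothetical.
patients = [
    {"procedure": "RYGB", "year": 2006, "age": 47, "sex": "F"},
    {"procedure": "RYGB", "year": 2006, "age": 51, "sex": "M"},
    {"procedure": "AGB", "year": 2010, "age": 40, "sex": "F"},
    {"procedure": "SG", "year": 2014, "age": 38, "sex": "F"},
    {"procedure": "SG", "year": 2014, "age": 42, "sex": "F"},
    {"procedure": "RYGB", "year": 2014, "age": 50, "sex": "F"},
]


def summarize_by_procedure(rows):
    """Count, mean age, and percent female for each procedure type."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["procedure"]].append(row)
    return {
        proc: {
            "n": len(group),
            "mean_age": mean(r["age"] for r in group),
            "pct_female": 100 * sum(r["sex"] == "F" for r in group) / len(group),
        }
        for proc, group in groups.items()
    }


def procedure_share_by_year(rows):
    """Percentage of procedures of each type within each calendar year."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["year"]][row["procedure"]] += 1
    return {
        year: {proc: 100 * n / sum(c.values()) for proc, n in c.items()}
        for year, c in sorted(counts.items())
    }


summary = summarize_by_procedure(patients)
shares = procedure_share_by_year(patients)
```

Running the same two functions separately on adult and adolescent records would mirror the subcohort-specific comparisons described in this section.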

Stakeholder Engagement
In addition to the extensive clinical data and research infrastructure necessary to collect the data for the study, a unique aspect of the PCORnet Bariatric Study is the engagement of a broad range of stakeholders. As part of our initial work to formulate the proposal, we identified 4 key stakeholder groups that would be critical to the success of our project: patients and caregivers, healthcare providers, healthcare system or organizational leaders, and community and advocacy groups. Each participating network was asked to engage a stakeholder as part of its research team. Representatives from each of these groups formed an Executive Stakeholder Advisory Board, which advised the scientific investigators on all aspects of the conduct of the study.

Adult Subcohort
There were 65,093 adult bariatric patients in the PBS cohort, more than 10 times the size of the well-established Longitudinal Assessment of Bariatric Surgery study cohort [11]. These adult patients had a mean age of 45 years and were predominantly female (79.30%, 51,619/65,093). RYGB was the most common bariatric procedure in this subcohort (49.48%, 32,208/65,093), followed by SG (45.62%, 29,693/65,093) and AGB (4.90%, 3192/65,093) (Table 2). The SG patients appeared to be slightly younger at the time of surgery, while the AGB patients had a lower mean maximum baseline BMI. The frequency of numerous comorbidities differed by procedure type, with RYGB patients typically showing a higher prevalence of pre-operative comorbidity than patients with other procedure types. There was racial and ethnic variation in the type of procedure received: the proportion of Black patients ranged from 15.65% (4515/28,845) in the RYGB group to 27.05% (6894/25,485) in the SG group, and the proportion of Hispanic patients ranged from 11.53% (352/3054) in the AGB group to 24.58% (7144/29,059) in the RYGB group.
Table footnote h: The combined comorbidity score merges the Charlson and Elixhauser comorbidity scores [9]. It is calculated based on 20 conditions identified by ICD-9-CM and SNOMED codes in the year prior to surgery. The score ranges from -2 to 26, with a higher score generally indicating poorer health status.
Table footnote i: Identified by one or more ICD-9-CM or SNOMED diagnosis codes in the year prior to surgery.

Adolescent Subcohort
The PBS cohort also included 777 adolescent bariatric patients, more than twice the size of the largest published bariatric study of an adolescent population. The mean age and baseline BMI were quite similar across the 3 treatment groups. The RYGB patients appeared to have more comorbid conditions recorded in their EHRs than the other 2 groups, but these prevalence estimates may be less reliable than in the adult subcohort.

Temporal Trends in Bariatric Procedures Performed in the Study Cohort
We observed dramatic shifts in the type of procedures performed in adults between 2005 and 2015 (Figure 2). Almost all of the bariatric procedures performed in 2005 in the health systems contributing to the dataset were RYGB. SG became increasingly popular starting in 2010 and was the most commonly performed bariatric procedure by 2013. Although the shifts in the type of procedures performed in the adult subcohort are consistent with other studies, it is worth noting that while all 11 participating CDRNs contribute data to all study years, not all 42 participating health systems within these CDRNs have data in all years. We also found substantial variability in the type of procedures performed in adults across CDRNs during the study period (Figure 3). RYGB was the most commonly performed procedure in 3 CDRNs, while SG was the primary procedure in 8 CDRNs. The proportion of RYGB procedures ranged from 16% to 69% across CDRNs.

Adult Subcohort
We excluded 12,510 adult bariatric patients who met the other eligibility criteria of the study but had no baseline BMI in the EHR, and an additional 1918 patients whose baseline BMI was less than 35 kg/m². Not surprisingly, patients without a baseline BMI measurement also had a much higher proportion of missing blood pressure measurements (85.82%, 10,736/12,510 versus 6.15%, 4006/65,093). Ongoing and future analyses will account for the differences in the patient characteristics.

Adolescent Subcohort
We excluded 127 adolescent patients who met the other eligibility criteria of the study but had no baseline BMI in the EHR, and an additional 54 patients whose baseline BMI was less than 35 kg/m². Compared to the adolescent patients in the PBS cohort, patients with missing baseline BMI information were more likely to have undergone an AGB or RYGB procedure and more likely to have had their bariatric procedures performed in earlier study years.

Adult Subcohort
Within the adult subcohort, 71.45% (46,510/65,093) of patients had one or more BMI measurements beyond 6 months of post-operative follow-up. However, follow-up ended on September 30, 2015, so not all patients were eligible to be followed for 1, 3, or 5 full years. For example, only patients who had a bariatric procedure on October 1, 2010 or earlier could be followed for 5 complete years during the study's timeframe. The proportion of eligible patients with at least one BMI measurement in the follow-up windows of interest was 84.31% (44,978/53,351) at 6 to 18 months, 68.09% (20,783/30,521) at 30 to 42 months, and 68.56% (7159/10,442) at 54 to 66 months after surgery (Table 6). Long-term follow-up varied by treatment group, with SG patients being most likely to have a BMI measurement at years 3 and 5, followed by RYGB patients and AGB patients.
Table 6 footnote: For example, only patients who had a bariatric procedure on October 1, 2014 or earlier would be eligible for one complete year of follow-up information. However, the number of eligible patients was an estimate because, for privacy reasons, we did not request actual dates for the analysis: all patients who had their procedure performed in 2013 or earlier and three-quarters of patients who had their procedure performed in 2014 were considered eligible for at least one year of follow-up.
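The eligibility logic and follow-up windows described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: it uses exact procedure dates (which the study did not request), treats a patient as eligible for N years when the last day of the N-th year falls on or before September 30, 2015, and treats the window boundaries as inclusive. The function names are hypothetical.

```python
import calendar
from datetime import date, timedelta

FOLLOW_UP_END = date(2015, 9, 30)
# Follow-up windows in months after surgery for the 1-, 3-, and 5-year time points.
WINDOWS_MONTHS = {1: (6, 18), 3: (30, 42), 5: (54, 66)}


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the end of short months."""
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def eligible_for_years(procedure_date: date, years: int) -> bool:
    """True if `years` complete years fit before the end of follow-up."""
    last_day = add_months(procedure_date, 12 * years) - timedelta(days=1)
    return last_day <= FOLLOW_UP_END


def has_bmi_in_window(procedure_date: date, bmi_dates, years: int) -> bool:
    """True if any BMI measurement date falls in the window for this time point."""
    lo, hi = WINDOWS_MONTHS[years]
    start = add_months(procedure_date, lo)
    end = add_months(procedure_date, hi)
    return any(start <= d <= end for d in bmi_dates)
```

Under these assumptions, a procedure on October 1, 2010 is eligible for 5 complete years and one on October 2, 2010 is not, which matches the example given in the text. The reported proportions are then the number of eligible patients with a measurement in the window divided by the number of eligible patients, for example 7159/10,442 at 54 to 66 months.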

Principal Findings
In this large, population-based, retrospective cohort study using the national PCORnet data infrastructure, we have identified 65,093 adults and 777 adolescents who underwent 1 of the 3 most common bariatric procedures (AGB, RYGB, and SG) in 42 geographically diverse health systems. Over the time frame of the study (2005 to 2015), we observed a dramatic shift in bariatric procedure use (Figure 2), with a sharp decline in the proportions of patients undergoing RYGB and AGB and an increase in the proportion undergoing SG. In particular, the large number of SG patients in this cohort (29,693 adults and 469 adolescents) makes this a valuable resource for comparative effectiveness research. We also observed heterogeneity in bariatric procedure preferences across the 11 participating CDRNs (Figure 3), which underscores the need for better comparative effectiveness research evidence to inform patient and provider decisions about bariatric surgery.

Strengths
The ongoing PBS is one of the largest cohorts of patients with bariatric procedures in the United States. Patients are geographically and demographically diverse, which improves the generalizability of the research findings and allows examination of treatment effect heterogeneity. This, in turn, may result in findings that can more easily be applied to clinical decision-making. The ability to use real-world data collected as part of healthcare delivery not only allows us to collect long-term follow-up data efficiently and at a lower cost but also to learn from the routine practice of medicine.
A unique strength of the PBS is the depth and diversity of its stakeholder involvement, which includes not only several patients as study team members, but also multiple pediatric and adult bariatric surgeons from different institutions, primary care and specialty physicians, researchers, and leaders of patient-level policy and advocacy organizations. Stakeholders are fully engaged in all stages of protocol development, including formulating the research questions and the study aims, selecting outcomes that are of interest to the patients, and identifying methods to study these outcomes (eg, prioritization of variables for heterogeneity of treatment effect analyses). They are also actively involved in monitoring study conduct, interpreting data in the context of local patient populations and coding practices, and designing and implementing dissemination plans. This robust engagement strategy helps ensure that the products of this research study are meaningful to patients, clinicians, and policy makers.
By having sites translate source data into the CDM in PCORnet, researchers can distribute one query to all sites and receive back standardized output (eg, identical variable names and categories) from disparate data sources. Using the CDM avoids much of the redundant preparatory work that would otherwise be needed to assemble cohorts or count potential events and other endpoints. Code lists and query programs developed as part of this study can also be used for future studies that leverage the PCORnet CDM. The CDM and distributed data network framework have been shown to improve the efficiency of the conduct of multi-database studies [13][14][15][16][17].
The PBS employs an efficient ethical review process. Adherence to human subjects protections and regulations was addressed at the CDRN level. Some participating networks obtained Institutional Review Board (IRB) approval for the study's protocol using an IRB reliance agreement across their sites; others created and relied on a central IRB [18]. At some CDRNs, individual sites' IRBs determined that these analyses of de-identified data did not qualify as human subjects research. The Kaiser Permanente Washington Health Research Institute, the lead site of the PBS, obtained IRB approval for overseeing data collection and leading analyses.
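The idea of distributing one query to every site and receiving identically structured output can be sketched in Python as follows. This is a toy illustration, not the PCORnet query tooling: the site names, records, and function names are all hypothetical, and a simple loop stands in for each site running the query locally against its own copy of the common data model.

```python
from collections import Counter
from typing import Callable, Dict, List

Record = Dict[str, object]

# Hypothetical site datasets that already share field names and categories.
SITES: Dict[str, List[Record]] = {
    "site_a": [
        {"procedure": "RYGB", "year": 2006},
        {"procedure": "SG", "year": 2013},
    ],
    "site_b": [
        {"procedure": "SG", "year": 2014},
        {"procedure": "AGB", "year": 2010},
        {"procedure": "SG", "year": 2014},
    ],
}


def count_procedures(records: List[Record]) -> Counter:
    """The single query: count procedures by type at one site."""
    return Counter(str(r["procedure"]) for r in records)


def run_distributed(
    query: Callable[[List[Record]], Counter],
    sites: Dict[str, List[Record]],
) -> Dict[str, Counter]:
    """Run the same query at every site and collect the per-site output."""
    return {name: query(records) for name, records in sites.items()}


def combine(site_results: Dict[str, Counter]) -> Counter:
    """Add up the standardized per-site counts at the coordinating center."""
    total: Counter = Counter()
    for counts in site_results.values():
        total.update(counts)
    return total


results = run_distributed(count_procedures, SITES)
overall = combine(results)
```

Because every site returns the same structure, the coordinating step needs no site-specific handling. In this sketch only aggregate counts leave each site; whether a given analysis shares aggregates or individual-level records is a separate design decision.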

Ongoing and Planned Activities
Ongoing and planned investigations in the PBS include head-to-head comparisons of these procedures on long-term changes in weight, rates of diabetes remission and relapse, and incidences of major surgery-related adverse events. These comparisons will be conducted separately in adults and adolescents. Additional evaluations will examine the heterogeneity of treatment effects for important covariates such as age, sex, race, and comorbidities. Furthermore, selected analyses will compare pooled individual-level data analysis with more privacy-protecting analytic approaches that share less granular information [19,20].
Examination of mortality after bariatric surgery is challenging using only EHR data. Deaths are not typically captured in EHRs unless they occur during hospitalization or in the emergency room, or a primary care provider becomes aware of a patient's death and the information is entered manually into the EHR. Some sites within the participating CDRNs have linked to state or national death indices. The PBS plans to perform additional linkages to these death registries for a subset of the study population to increase the accuracy and completeness of death information. In addition, a number of pre-specified surgery-related adverse events, including re-hospitalization and re-operation after bariatric surgery, may be incompletely captured in EHRs because patients may get a portion of their care outside of the data-contributing health systems. The PBS will link the EHR data from select health systems to insurance claims data to improve capture of these events.

Limitations
This study has several limitations. A non-negligible number of bariatric patients had missing BMI data either at baseline or in follow-up, and the reasons for missing measurements were generally not well recorded during the study period. Because the PCORnet CDM typically reflects data stored as discrete data elements, it is possible that some EHR data (eg, BMI recorded in a clinician's note instead of in the vital signs table) were not represented in our analyses. Long-term follow-up information (eg, at 5 years) was not available for some patients.
Relying primarily on routinely collected health data means our data collection process might not be as systematic as in other prospective cohort studies (eg, the Longitudinal Assessment of Bariatric Surgery study [11] and the Teen-Longitudinal Assessment of Bariatric Surgery study [12]). However, it does represent the information that informs patient and provider decisions in routine clinical care. There was also variability in data capture and documentation across health systems during the study period.
We did not validate the algorithms used to identify the comorbidities of interest (eg, sleep apnea). It is possible that these conditions were under-recorded or over-recorded in certain EHRs. However, the implementation of the PCORnet CDM helps standardize a core set of variables expected to be commonly used in research studies. There is currently no plan to conduct analyses using data beyond September 30, 2015.
Although the PBS will perform linkages with additional data sources to improve the completeness and accuracy of certain information, these linkages will not be performed in the entire cohort.

Conclusion
Using the data and research infrastructure created by PCORnet, we have created one of the largest cohorts of patients with bariatric procedures in the United States. The diversity of the patients and the active engagement of the stakeholders enhance the generalizability and relevance of the research findings. The study will produce real-world evidence on the long-term benefits and risks of the most commonly used bariatric procedures in current clinical practice.