Prediction of Pancreatic Cancer in Diabetes Patients with Worsening Glycemic Control

Abstract Background: Worsening glycemic control indicates elevated risk of pancreatic ductal adenocarcinoma (PDAC). We developed prediction models for PDAC among those with worsening glycemic control after diabetes diagnosis. Methods: In 2000–2016 records within the Veterans Affairs Health System (VA), we identified three cohorts with progression of diabetes: (i) insulin initiation (n = 449,685), (ii) initiation of combination oral hypoglycemic medication (n = 414,460), and (iii) hemoglobin A1c (HbA1c) ≥8% with ≥Δ1% within 15 months (n = 593,401). We computed 12-, 36-, and 60-month incidence of PDAC and developed prediction models separately for males and females, with consideration of >30 demographic, behavioral, clinical, and laboratory variables. Models were selected to optimize Akaike's Information Criterion, and performance for predicting 12-, 36-, and 60-month incident PDAC was evaluated by bootstrap. Results: Incidence of PDAC was highest for insulin initiators and greater in males than in females. Optimism-corrected c-indices of the models for predicting 36-month incidence of PDAC in the male population were: (i) 0.72, (ii) 0.70, and (iii) 0.71, respectively. Models performed better for predicting 12-month incident PDAC [c-index (i) 0.78, (ii) 0.73, (iii) 0.76 for males], and worse for predicting 60-month incident PDAC [c-index (i) 0.69, (ii) 0.67, (iii) 0.68 for males]. Model performance was lower among females. For subjects whose model-predicted 36-month PDAC risks were ≥1%, the observed incidences were (i) 1.9%, (ii) 2.2%, and (iii) 1.8%. Conclusions: Sex-specific models for PDAC can estimate risk of PDAC at the time of progression of diabetes. Impact: Our models can identify diabetes patients who would benefit from PDAC screening.


Introduction
Only 10% of patients with pancreatic ductal adenocarcinoma (PDAC) survive beyond five years (1). Fewer than 25% of PDAC cases in the United States are diagnosed at a resectable stage (1,2). Currently, no early detection strategy exists for the general population, and biomarkers such as CA19-9 and CA-125 have not translated to meaningful gains in early detection due to insufficient diagnostic accuracy (3)(4)(5).
Many nonspecific symptoms and comorbidities commonly develop in parallel with PDAC development and are likely indicators of early disease (6)(7)(8)(9)(10)(11). A notable PDAC indicator is new-onset diabetes, which is present in 15% to 35% of PDAC patients (6,10,12,13) and is associated with a 4-fold increased risk of PDAC (6)(7)(8). Initiation of insulin is even more strongly associated with risk of PDAC, with a relative risk of 5.6, and 45% of PDAC cases with diabetes have been treated with insulin (6). Although new-onset diabetes is a well-recognized risk factor for PDAC, worsening glycemic control among people with a known diagnosis of diabetes has not received similar attention as a PDAC risk factor. Worsening glucose control could prompt a regimen change, such as adding a second or third hypoglycemic agent or starting insulin, at which point the risk of PDAC may also be assessed.
A number of prospective PDAC prediction models have been reported for new-onset diabetes in the literature (14)(15)(16), but none has focused on progression of diabetes. In our study, we built a refined prediction model to estimate the risk of PDAC from the time of progression of diabetes, considering predictors from models for new-onset diabetes and utilizing longitudinal electronic medical records (EMR) from the Veterans Affairs Health System (VA; ref. 17). We also incorporated duration of clinical risk factors, such as use of proton-pump inhibitors (PPI), which has been shown to improve model performance (18). We also developed sex-specific models, given that risk of PDAC is higher in males than in females. VA was specifically chosen given that veterans represent a large proportion of the U.S. population and that the VA electronic health system is one of the oldest nationwide EMR systems in the United States.

Data source
Our study data originate from the Department of Veterans Affairs Corporate Data Warehouse (CDW), a nationwide VA database that collates metadata on electronic health and administrative information on patients at all regional VA healthcare centers. The CDW contains demographic data, inpatient and outpatient clinical data, laboratory test results, and diagnostic and procedure codes.

Cohort definitions
To select persons with progression of diabetes after diagnosis of diabetes, we first identified a population ages 50 or above with diabetes by diagnostic codes, laboratory results of glucose and HbA1c, and pharmaceutical records using definitions for diabetes previously used in the VA population (19): (i) at least two outpatient visits with a VA primary care provider with ICD-9 code of 250.xx, or ICD-10 code of E11.x, and (ii) HbA1c of ≥6.5%, fasting glucose of ≥126 mg/dL, or random blood glucose of ≥200 mg/dL. Laboratory tests on blood collected during hospitalization or emergency department visit were excluded, given that acute conditions could temporarily elevate glucose levels (20). Among this pool of diabetes patients, we identified three nonindependent populations representing different stages of diabetes: (i) initiation of insulin, (ii) initiation of combination (two or more) oral hypoglycemic treatment from monotherapy, or (iii) ≥1% increase in hemoglobin A1c (HbA1c) with the last HbA1c measuring ≥8%. These were chosen in consultation with expert endocrinologists as populations in whom a prediction model for PDAC would be useful for differentiating the origin of worsening glycemic control. Cohort entry was defined as the date of first prescription for insulin in population (i), as the date of simultaneous prescription for two oral hypoglycemic drugs in population (ii), and as the first date on which HbA1c measured ≥8% with a prior HbA1c value within 15 months that was lower by 1% or more, in population (iii). We excluded from these cohorts patients with apparent onset of progression within 90 days after first-ever evidence of diabetes, because although they were newly discovered to have diabetes, they likely had diabetes for some time and the progression timing is unclear. Sex-specific models were developed in each of the non-mutually exclusive populations with diabetes.
Patients who experienced more than one definition of progression were allowed to contribute to multiple cohorts. As a comparison, we also estimated the age-adjusted risk of PDAC among patients with diabetes who did not meet any of the above definitions of progression (nonprogressors). Of note, the focus of our study was to build prediction models for PDAC among diabetes patients with progression, and not to estimate the relative risks of PDAC attributable to progression of diabetes.
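The entry rule for cohort (iii) can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the average month length, and the reading that any earlier value within the 15-month window may serve as the comparison value are assumptions made for illustration, and the 90-day exclusion after first evidence of diabetes is not implemented.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average month length


def months_between(earlier: date, later: date) -> float:
    """Approximate number of months elapsed between two dates."""
    return (later - earlier).days / DAYS_PER_MONTH


def hba1c_progression_date(measurements, min_level=8.0, min_rise=1.0,
                           window_months=15):
    """Return the first date on which HbA1c is >= min_level and at least
    min_rise percentage points above an earlier value measured within
    window_months; return None if the criterion is never met.

    measurements: list of (date, hba1c_percent) tuples (hypothetical layout).
    """
    ordered = sorted(measurements)  # oldest first
    for i, (when, value) in enumerate(ordered):
        if value < min_level:
            continue
        for prior_when, prior_value in ordered[:i]:
            in_window = (prior_when < when and
                         months_between(prior_when, when) <= window_months)
            if in_window and value - prior_value >= min_rise:
                return when
    return None
```

For example, a patient with values of 7.0% in January 2010 and 8.6% in February 2011 would enter the cohort on the date of the second measurement, whereas the same pair of values measured two years apart would not qualify.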

Definition of incident PDAC
We ascertained incident PDAC through cancer registries in the VA, through EMR, and through Medicare claims. We first identified cases with primary pancreatic adenocarcinoma as recorded in the VA Central Cancer Registry. Because the VA Cancer Registry does not capture all cancer cases that are diagnosed within the VA (21), we additionally identified persons who had at least two encounters at the VA with an ICD diagnosis of PDAC (ICD-9, 157.0, 157.1, 157.2, 157.3, 157.9, or ICD-10 diagnoses of C25.0, C25.1, C25.2, C25.3, C25.8, C25.9). Finally, patients with at least two independent Medicare claims for PDAC (same ICD codes as above) were added, given that VA patients may have been diagnosed with PDAC outside the VA setting. Patients who did not develop PDAC were censored at the last known vital status date as recorded in the VA CDW, or December 31, 2017.
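The two-encounter rule for cases identified outside the registry can be sketched as follows. This is an illustrative sketch only: the function name and record layout are hypothetical, and counting encounters by distinct dates is an assumption about how "at least two encounters" might be operationalized.

```python
# ICD codes for PDAC as listed in the text above.
PDAC_ICD9 = {"157.0", "157.1", "157.2", "157.3", "157.9"}
PDAC_ICD10 = {"C25.0", "C25.1", "C25.2", "C25.3", "C25.8", "C25.9"}
PDAC_CODES = PDAC_ICD9 | PDAC_ICD10


def meets_two_encounter_rule(encounters, min_encounters=2):
    """Return True if PDAC codes appear on at least min_encounters distinct dates.

    encounters: iterable of (encounter_date, icd_code) pairs (hypothetical layout).
    Assumption: two records on the same date count as a single encounter.
    """
    coded_dates = {when for when, code in encounters if code in PDAC_CODES}
    return len(coded_dates) >= min_encounters
```

Requiring two separate coded encounters trades some sensitivity for specificity, since a single code may reflect a rule-out diagnosis rather than confirmed disease.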

Covariates
We extracted data on the following risk factors in persons with progressing diabetes. All covariate data were assessed based on records available on or prior to the index date in each cohort. Demographic data: Age at the time of progression of diabetes in each respective cohort, sex, race (five census categories), and Hispanic ethnicity.
Smoking status was categorized as never, former, or current smoker, assessed up to the respective time of progression of diabetes. Alcohol consumption: Heavy drinking was determined by the highest AUDIT-C score prior to progression of diabetes. AUDIT-C is a validated screening tool for alcohol use disorders (22), utilized throughout the VA since 2008. A score of ≥4 for men and ≥3 for women is indicative of hazardous drinking. Comorbidity: We identified patients with acute or chronic pancreatitis, dyspepsia/gastritis/peptic ulcer disease, abdominal pain, nonalcoholic fatty liver disease (NAFLD), heart disease, jaundice, or alcoholism by ICD-9/10 codes. Patients with a history of both acute and chronic pancreatitis were classified as having chronic pancreatitis. Medications: Because the use of PPIs can indicate upper abdominal discomfort related to the pancreas, we extracted data on prescriptions for the following commonly used PPIs: pantoprazole, omeprazole, esomeprazole, lansoprazole, and rabeprazole. Given that PDAC risk varies by exposure to metabolic agents such as statins and metformin (23)(24)(25), we extracted data on prescriptions for statins and diabetes medications including biguanides, sulfonylureas, thiazolidinediones, alpha-glucosidase inhibitors, DPP-4 inhibitors, and GLP-1 receptor agonists. Metabolic parameters: Obesity and recent weight loss are risk factors for PDAC (15,26). We therefore extracted data on the highest weight and height ever recorded for a patient prior to cohort entry to determine the peak BMI. Weight values in the range of 75 to 500 lbs and height values in the range of 48 to 84 inches were considered (27). Change in weight was assessed by percentage of loss in weight compared with prior weight measured ≥12 months before, within a 3- to 15-month window.
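As a rough illustration of two of the covariate rules above, the following Python sketch derives peak BMI from the plausible weight and height ranges and flags hazardous drinking from the highest AUDIT-C score. The function names, argument layout, and sex coding ("M"/"F") are hypothetical; only the ranges and score cutoffs come from the text.

```python
LB_TO_KG = 0.45359237
IN_TO_M = 0.0254


def peak_bmi(weights_lb, heights_in):
    """Peak BMI (kg/m^2) from the highest plausible weight and height.

    Values outside 75-500 lb or 48-84 in are ignored, following the
    ranges given in the text. Returns None if no plausible value remains.
    """
    valid_weights = [w for w in weights_lb if 75 <= w <= 500]
    valid_heights = [h for h in heights_in if 48 <= h <= 84]
    if not valid_weights or not valid_heights:
        return None
    weight_kg = max(valid_weights) * LB_TO_KG
    height_m = max(valid_heights) * IN_TO_M
    return weight_kg / height_m ** 2


def hazardous_drinking(highest_audit_c, sex):
    """AUDIT-C cutoffs from the text: >=4 for men, >=3 for women."""
    cutoff = 4 if sex == "M" else 3
    return highest_audit_c >= cutoff
```

For instance, a highest recorded weight of 200 lb and height of 70 in correspond to a peak BMI of about 28.7 kg/m^2.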
We also extracted data on laboratory tests: HbA1c, creatinine, cholesterol, bilirubin, hemoglobin, and red blood cell (RBC) count, which have been significantly associated with PDAC development in a previous study (14). Laboratory test values most proximal to the "onset" of progression of diabetes within 12 months prior to the index date were entered in the model. Because physiologic changes could indicate elevated risk of PDAC, for each laboratory parameter, we computed the percentage change in lab values from a test value closest to 12 months (within a 3- to 15-month window) prior to the last test. Patients with incomplete data on the continuous parameters were excluded. These comprised less than 10% of the data.
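The percentage-change calculation used for weight and laboratory values can be sketched in Python as below. This is a sketch under stated assumptions rather than the study's implementation: the function name, the (date, value) layout, and the average month length are hypothetical choices.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average month length


def percent_change(measurements, target_months=12, window=(3, 15)):
    """Percent change of the latest value relative to an earlier baseline.

    The baseline is the measurement taken closest to target_months before
    the latest one, restricted to window (in months) before the latest one.
    Returns None when no baseline is available.

    measurements: list of (date, value) tuples (hypothetical layout).
    """
    ordered = sorted(measurements)
    if len(ordered) < 2:
        return None
    last_date, last_value = ordered[-1]
    candidates = []
    for when, value in ordered[:-1]:
        gap = (last_date - when).days / DAYS_PER_MONTH
        if window[0] <= gap <= window[1]:
            candidates.append((abs(gap - target_months), value))
    if not candidates:
        return None
    _, baseline = min(candidates)  # candidate nearest to the 12-month mark
    if baseline == 0:
        return None
    return 100.0 * (last_value - baseline) / baseline
```

A weight of 180 lb at the latest visit against 200 lb roughly 12 months earlier yields a change of -10%, that is, a 10% weight loss.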

Statistical analysis
All analyses were performed separately for each population (six populations in total from three diabetes cohorts stratified by sex, male/female). Patient characteristics are presented as number of patients (%) or median (IQR, interquartile range) overall and stratified by incident PDAC. The primary outcome is incident PDAC defined as time from progression of diabetes to incident PDAC. Median follow-up time was calculated using the reverse Kaplan-Meier method (28). Age-adjusted cumulative incidences of PDAC were estimated standardized to age 60. Univariate and multivariable analyses were conducted to examine associations between incident PDAC and its potential predictors using Cox proportional hazards regression models (29). The proportional hazards assumption was assessed with scaled Schoenfeld residuals (30).
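The reverse Kaplan-Meier idea for median follow-up can be illustrated with a short pure-Python sketch: the event indicator is flipped so that censoring becomes the "event", and the median is the first time at which the resulting Kaplan-Meier curve falls to 0.5 or below. The function name and input layout are hypothetical, and the study itself used SAS and R rather than this code.

```python
from itertools import groupby


def reverse_km_median_followup(times, events):
    """Median follow-up by the reverse Kaplan-Meier method.

    times:  follow-up durations (e.g., months).
    events: 1 if the outcome (PDAC) occurred, 0 if the patient was censored.
    Returns None if the reversed curve never reaches 0.5.
    """
    # Flip the indicator: censoring is treated as the event of interest.
    flipped = sorted((t, 1 - e) for t, e in zip(times, events))
    at_risk = len(flipped)
    survival = 1.0
    for t, group in groupby(flipped, key=lambda pair: pair[0]):
        group = list(group)
        n_censored = sum(flag for _, flag in group)
        if n_censored:
            survival *= 1.0 - n_censored / at_risk
        if survival <= 0.5:
            return t
        at_risk -= len(group)
    return None
```

With follow-up times of 10, 20, 30, 40, and 50 months and no outcome events, the function returns 30, the ordinary median, as expected when every patient is censored.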
Because more recent diagnoses of comorbid conditions or prescription drugs are more strongly associated with incident PDAC than diagnoses or prescriptions made in the distant past (18), comorbid conditions (e.g., acute pancreatitis) and prescription drugs (e.g., statins) were modeled as exponential decay of the log hazard ratio according to the number of months in the past when the diagnosis or prescription occurred using the iterative linearization method (31).
Each of the conditions or drugs was included in models as an interaction term in the form of $\beta_1 \times I \times \exp(-\beta_2 \times t)$, where $\beta_1$ is the parameter estimate of the diagnosis of the condition or prescription of drug, $I$ denotes an indication of the diagnosis of the condition or prescription of drug (0 or 1), $\beta_2$ is the parameter estimate of the time in the past before the diagnosis or prescription occurred, and $t$ is months in the past before cohort entry date. Corresponding confidence intervals for the variables are presented as a function of time before cohort entry using the delta method according to estimated standard errors.
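To make the time-decaying term concrete, the sketch below evaluates a contribution to the log hazard ratio of the form beta1 × I × exp(-beta2 × t), which is our reading of the term described above. The function names and the numeric values in the usage note are hypothetical and are not estimates from the study.

```python
import math


def decayed_log_hazard(beta1, beta2, indicator, months_ago):
    """Log hazard ratio contribution: beta1 * I * exp(-beta2 * t).

    indicator:  1 if the diagnosis or prescription is present, else 0.
    months_ago: months before cohort entry at which it occurred.
    """
    return beta1 * indicator * math.exp(-beta2 * months_ago)


def decayed_hazard_ratio(beta1, beta2, indicator, months_ago):
    """Hazard ratio implied by the decayed log hazard contribution."""
    return math.exp(decayed_log_hazard(beta1, beta2, indicator, months_ago))
```

With hypothetical values beta1 = ln 3 and beta2 = ln 2 / 12, a diagnosis at cohort entry carries a hazard ratio of 3, while the same diagnosis 12 months earlier carries half the log hazard, that is, a hazard ratio of about 1.73.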
Model selection was performed using a stepwise variable selection procedure based on Akaike information criterion (AIC) considering all but duration (32). Duration variables were added to models regardless of their significance in relation to incident PDAC if corresponding diagnosis or prescription variables were retained in the model. In multivariable analyses, the possibility of collinearity was reduced through the careful initial assessment of correlations among study covariates.
The performance of multivariable models predicting incident PDAC was assessed with measures of discrimination and calibration (33). Discriminative ability of models was measured at 12, 36, and 60 months using time-dependent area under the receiver operating characteristic curves (c-statistic) with the use of cumulative sensitivity/dynamic specificity (34). Calibration of the prediction models was evaluated with calibration slope. Internal validation of the models was performed by estimating and correcting possible overfitting and optimism in the model performance estimates using the bootstrap method with 100 to 300 replicates (35), which provides stable estimates with lower bias than split-sample procedures (36,37). Estimated optimism-corrected performance measures were reported. The predicted risks for incident PDAC in 12, 36, and 60 months were estimated for each study population. Then, sensitivity, specificity, and positive predictive value (PPV) were estimated at predicted risk thresholds of 0.5%, 1%, and 2% (test positive if the estimated predicted risk ≥ threshold of interest; negative, otherwise) along with 95% exact confidence intervals. Sensitivity and specificity of the models vary by threshold: higher thresholds are more specific at the cost of lower sensitivity, and lower thresholds are more sensitive at the cost of reduced specificity.
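The threshold-based accuracy measures described above can be sketched as follows. The sketch assumes a simple binary outcome known for every patient at the horizon of interest, which ignores censoring and is therefore a simplification of the time-dependent analysis in the study; the function name and return layout are hypothetical, and exact confidence intervals are omitted.

```python
def threshold_metrics(predicted_risks, outcomes, threshold):
    """Sensitivity, specificity, and PPV at a predicted-risk threshold.

    A patient tests positive when predicted risk >= threshold.
    outcomes: 1 if PDAC occurred within the horizon, else 0 (assumed known).
    Any measure with a zero denominator is returned as None.
    """
    tp = fp = tn = fn = 0
    for risk, outcome in zip(predicted_risks, outcomes):
        positive = risk >= threshold
        if positive and outcome:
            tp += 1
        elif positive:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }
```

Evaluating the same predictions at 0.5%, 1%, and 2% shows the trade-off directly: raising the threshold moves patients from the test-positive to the test-negative group, which can only lower sensitivity and raise specificity.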
All analyses were performed using SAS 9.4 (SAS Institute, Inc.) and R version 4.0.2 (R Foundation) with two-sided tests at a significance level of 0.05. The IRB of VA Greater Los Angeles (Pro#1615788) and Cedars-Sinai Medical Center (Pro#51233) approved the study.

Data availability statement
Individual-level data are not available for the public, per VA data use guidelines for research. Aggregate-level data and model specifications are reported in the manuscript.

Study population
Of 1,546,101 patients with diabetes, 799,529 experienced one of the three definitions of progression: (i) 449,685 patients with new insulin treatment (438,816 male; 10,869 female); (ii) 414,460 with new combination oral hypoglycemic treatment (404,858 male; 9,602 female); (iii) 593,401 patients with ≥1% increase in HbA1c, with the recent HbA1c ≥8% (579,384 male; 14,017 female). Of 12,412 PDAC cases identified among diabetes patients, 6,300 cases (51%) occurred after progression of diabetes. In each cohort, (i) 3,675, (ii) 3,150, (iii) 4,606 male patients and (i) 54, (ii) 48, (iii) 66 female patients developed PDAC. The distribution of the three populations and their overlap, and distribution of the PDAC cases between the three cohorts are presented in Supplementary Figs. S1 and S2. PDAC was identified through the VA Central Cancer Registry in 45% to 46% of the PDAC patients, through EMR in 30% to 31% of the patients, and through Medicare claims in 23% to 24% of the patients in the three cohorts. Median follow-up was 73.5 months in cohort (i), 93.8 months in cohort (ii), and 77.7 months in cohort (iii). Tables 1 and 2 describe the distribution of selected model parameters in male and female populations with diabetes progression, respectively. Among males, median age at time of progression of those who remained PDAC-free ranged from 64.0 to 64.7 years, and median age of those who developed PDAC ranged from 65.2 to 66.2. Black and Hispanic patients comprised 16% to 18% and 6% to 7% of the male populations, respectively. Among females, median age at time of progression of those who remained PDAC-free ranged from 58.3 to 58.8 years, and median age of those who developed PDAC ranged from 60.5 to 61.2. Black and Hispanic patients comprised 26% and 5% to 6% of the female populations, respectively.
A unique feature of our models is the inclusion of time since onset of clinical predictors of PDAC. Supplementary Table S1 summarizes the number of days from first indication of each clinical predictor to the respective onset of progression of diabetes. Time since onset of predictor was shorter in patients who developed PDAC than in those who remained PDAC-free for several indicators: acute pancreatitis, chronic pancreatitis, abdominal pain, and jaundice.

Incidence of PDAC
Age-adjusted cumulative incidence of PDAC at 12, 24, 36, 48, and 60 months standardized to 60 years of age at progression of diabetes is presented in Supplementary Table S2 and illustrated in Fig. 1. Incidence of PDAC was highest for the insulin-initiating male cohort, rising more steeply in the first 12 months to an incidence of 0.18% [180 per 100,000 person-years (p-y)] and reaching 0.52% over 60 months from the time of the first prescription for insulin. The incidence was lower in female patients initiating insulin, with 12- and 60-month PDAC incidence of 0.13% and 0.30%, respectively. The incidence of PDAC in diabetes patients with increasing HbA1c also rose more steeply in the first 12 months to an incidence of 0.13% in males and 0.08% in females. The 60-month incidence of PDAC in the elevated A1c cohort was 0.44% in males and 0.31% in females. In comparison with insulin users and those with increasing HbA1c, the incidence of PDAC in diabetes patients initiating combination oral hypoglycemic treatment or those who did not meet any of the progression criteria was lower (Supplementary Table S2).

Multivariable model and independent risk factors of PDAC
Results of the selected multivariable model are presented in Table 3. In all models, increase in age was associated with PDAC. In models developed among men, Hispanic ethnicity was inversely associated with PDAC, whereas current smoking was positively associated with PDAC risk. Acute pancreatitis, abdominal pain, jaundice, and alcoholism were uniformly selected in all three models among male diabetes patients, with current abdominal pain and current jaundice showing stronger associations with PDAC than diagnoses made in the more distant past. Among continuous variables, a 20% weight increase was consistently and strongly associated with lower risk of PDAC (HR = 0.56-0.70), whereas higher levels of HbA1c were associated with higher risk (2%-5% increased risk of PDAC per 1% higher level of the most recent HbA1c). Higher levels of creatinine and cholesterol were associated with decreased risk of PDAC. Chronic pancreatitis was not selected in the model for cohort (i), but was selected in models for cohorts (ii) and (iii). Weight was associated with decreased risk of PDAC independent of weight change in cohorts (i) and (iii) and was not selected in cohort (ii). Change in HbA1c was associated with PDAC in cohorts (i) and (iii), independent of the most recent HbA1c levels. Similarly, percent change in bilirubin was associated with PDAC risk in cohorts (i) and (iii). Higher level of RBC was associated with reduced risk of PDAC in models for cohorts (ii) and (iii). Fewer parameters were selected in the female cohorts, which were significantly smaller than the male cohorts (Table 3). Each year of increase in age was associated with 2% to 4% increased risk of PDAC. Other than age, no variable was selected in all three female cohorts, but several variables had individual significance in one or two cohorts. Of note, current NAFLD was associated with increased hazard of PDAC in cohorts (i) and (iii), as compared with no NAFLD.

Model performance
C-index values for the original data and the optimism-corrected values for each model are presented in Table 4. Corresponding receiver operating characteristic (ROC) curves of models of 12-, 36-, and 60-month risk of PDAC are presented in Fig. 2. Several trends are noticeable: (i) the models performed best for predicting the 12-month risk of PDAC (solid curve), as compared with predicting 36-month (dotted curve) or 60-month (dot-dashed curve) risk of PDAC; (ii) the models performed better for the insulin-initiating cohort and those with increasing HbA1c levels than for patients initiating combination oral hypoglycemic agents; (iii) the models performed better among males than among females. Among males initiating insulin, the optimism-corrected c-statistic for the 12-month incidence was 0.78, and for the 36-month incidence was 0.72. Among males with ≥1% increase in HbA1c over 8%, the model predicted 12-month incident PDAC with c = 0.76 and 36-month incident PDAC with c = 0.71. Given that jaundice is a potential late-stage indicator of PDAC, we performed a sensitivity analysis excluding jaundice from the models for prediction of 12-month incidence of PDAC. No change in the performance of the models was noted, except among males with insulin initiation, in whom the optimism-corrected c-statistic for the 12-month model decreased from 0.777 to 0.776. Among females initiating insulin, the optimism-corrected c-statistic for the 12-month incident PDAC was 0.68, and for the 36-month incidence was 0.65. Among females with ≥1% increase in HbA1c over 8%, the model predicted 12-month incident PDAC with c = 0.68 and 36-month incident PDAC with c = 0.63. Model performance among those initiating oral combination hypoglycemic treatment was lower than in cohorts (i) and (iii) in both males and females. Optimism-corrected calibration slopes for prediction models ranged between 0.953 and 0.977 in males, and 0.802 and 0.870 in females.
Model sensitivity, specificity, and PPV for predicting PDAC at ≥0.5%, ≥1%, and ≥2% predicted risk thresholds for the male diabetes populations are presented in Table 5. At a predicted PDAC risk threshold of ≥0.5% over 12 months, the PPV of the model for male insulin initiators was 1.41%, with a sensitivity of 35.7% and specificity of 94.3%. At a predicted risk threshold of ≥1% over 36 months, the PPV of the model for male insulin initiators was 1.89%, with a sensitivity of 26.4% and specificity of 94.1%. At a predicted risk threshold of ≥1% over 60 months, the PPV of the model for male insulin initiators was 1.35%, with a sensitivity of 39.3% and specificity of 83.8%. At a predicted risk threshold of ≥1% over 36 months, the PPV of the model for the male population initiating combination oral hypoglycemic treatment reached 2.2%, and that for the male population with increasing A1c reached 1.77%. However, in both models the sensitivities fell below 20% at the predicted PDAC risk threshold of ≥1% over 36 months. Accuracy measures and PPV were lower for the female diabetes populations (Supplementary Table S3).

Discussion
We estimated the incidence of PDAC and developed and evaluated sex-specific models for prediction of PDAC in three populations with progression of diabetes in a nationwide sample of veterans. Cumulative incidence over 36 months from the time of progression of diabetes varied by definition of progression and by sex, with insulin-initiating males showing the highest incidence of PDAC diagnosis (0.37%). The models can predict the 12- and 36-month risk of PDAC with moderate accuracy among male veterans initiating insulin for diabetes and male veterans with increasing A1c levels. Male diabetes patients whose model-predicted 12-month risk of PDAC was ≥0.5% in these cohorts experienced actual PDAC incidence of 1.4%-1.5% over 12 months. This demonstrates that our models can identify high-risk patients in a substantial proportion of diabetes patients in whom earlier detection of pancreatic cancer may be feasible and warranted.
Beyond accuracy of prediction models, it is important to quantify the risk of PDAC in diabetes populations from various stages of diabetes for consideration of PDAC screening feasibility, risks, and benefits. The range of average annual incidence of PDAC observed in the male and female VA populations with insulin initiation (0.13%-0.18%) is substantially higher than that of the general adult population (20 per 100,000, or 0.02%) in the United States (38). The three-year risks estimated for insulin-dependent males and females (0.37% and 0.24%) are higher than estimated for veterans with new-onset diabetes defined by diagnostic codes (0.25%) and higher than the general veteran population (0.11%; ref. 39). The observed incidence in insulin initiators is similar to that of new-onset diabetes defined by ICD codes. The pressing question of whether or not to screen for PDAC in patients with 1% to 2% model-predicted risk of PDAC over 1 to 3 years is a matter of weighing the costs, risks, and benefits of screening, which would generally involve magnetic resonance imaging or endoscopic ultrasound. Cost-analysis studies for screening diabetes patients for PDAC are under way and preliminary results point to potentially cost-effective strategies with <$100,000 per quality-adjusted life year gained (42). Screening for pancreatic cancer is now recommended for those with a family history of pancreatic cancer or individuals with high-risk germline mutations who face a lifetime PDAC risk of 4% to 40% (43). Annual surveillance by upper endoscopy is considered cost-effective for early detection of pancreatic cancer in these populations (44)(45)(46). Given the substantially high risk for PDAC among insulin-initiating populations with predicted probabilities ≥0.5% over 12 months and ≥1% over 36 months, screening may be cost-effective to implement at the time of insulin initiation, and further studies are warranted.
Moreover, the model may help to improve the predictive performance of biomarkers by identifying high-risk individuals with higher prior probability of PDAC.
Our models identified consistent predictors of PDAC among male patients with progression of diabetes. These included current smoking, non-Hispanic ethnicity, acute pancreatitis, abdominal pain, jaundice, alcoholism, weight loss, increase in HbA1c, and lower levels of cholesterol. With the exception of Hispanic ethnicity, the identified risk factors have also been selected for in prior models of PDAC in diabetes patients (14,15,39). Our study of over 3,000 PDAC cases was well powered to detect these multiple predictors as independent risk factors of PDAC among males. Of note, weight loss and increasing glucose levels have received particular attention as potential early detection markers of PDAC (15,16,40,47,48). That these metabolic factors also predict PDAC in the three different stages of progression of diabetes confirms their predictive utility across the spectrum of diabetes. Novel factors identified in at least two of our models included Hispanic ethnicity, NAFLD, and change in bilirubin levels.
Our nationwide veteran male population included a large number of Hispanic veterans, in whom the risk of PDAC was 21% to 24% lower as compared with non-Hispanic white patients in all three diabetes cohorts considered. Our observation is consistent with the ethnicity-specific trends reported among new-onset diabetes patients in an independent health system in Southern California (40). The mechanism by which Hispanic patients face lower risk of PDAC conditioning on diabetes status is unknown. Hispanic patients face higher risk for diabetes in part due to NAFLD (49). It is possible that liver fat-related metabolic dysfunction could be a potential mediator of PDAC risk in the Hispanic population. Epidemiologic data support the link between hepatic fat and PDAC. Pan-cancer studies comparing incidence rates of cancer in persons with and without NAFLD demonstrate that NAFLD is associated with increased risks of multiple cancer types, including PDAC (50)(51)(52). Epidemiologic investigations comparing liver fat content in PDAC cases and controls (53) also demonstrate a positive association between NAFLD and PDAC. Our study demonstrates the temporal relationship between diagnosed NAFLD and PDAC in persons with progression of diabetes with fine control for other metabolic parameters, therefore suggesting a potential impact of organ-specific fat on pancreatic cancer, independent of glucose control and obesity.
A novel aspect of our model is the consideration of duration of binary risk factors. Prior studies have shown that recent use of PPIs (18), and recent development of pancreatitis (54) are associated with greater risk of PDAC than risk factors of more distant past. Our own analyses of Medicare demonstrate that several medical diagnoses are more frequently diagnosed closer to the onset of PDAC (55). Recent health changes due to the development of PDAC were evident in several risk factors we investigated: acute and chronic pancreatitis, abdominal pain, jaundice, DPP-IV inhibitor use, the recent onset of which was more strongly associated than more distant diagnosis/use. Of note, DPP-IV inhibitor has an immunomodulatory effect (56) and its use has been associated with PDAC in humans (57). Greater risk associated with more recent use, rather than more distant use, suggests a diabetes prescription change in response to suboptimal glucose control.
Despite many strengths, our study has a few limitations with regard to generalizability and outcome ascertainment. The female population comprised less than 10% of the veteran population we analyzed. Risk of cancer, other than breast cancer (58)(59)(60)(61)(62), is poorly understood in the female veteran population (63). By year 2043, the number of female veterans is projected to double (64), given broadening opportunities for women in the military. Although the prediction models for female veterans were limited in power, the PDAC risk and the associated risk factors we determined will have important implications for the millions of female veterans who will use VA services in the future.
Our study was also limited in that a considerable proportion of patients were not identified through the tumor registry (21), but through coded diagnoses in the medical records or in Medicare claims. The use of at least two encounters with ICD diagnosis of PDAC increases the specificity of the case identification, and the use of Medicare claims overcomes the limitations of identifying PDAC cases occurring beyond VA services. If non-PDAC cases were counted as PDAC cases, this would have biased our incidence estimates higher, while potentially diluting the model performance. Lastly, the nonprogressing diabetes comparison group, for which age-adjusted incidence of PDAC was estimated, could have differed from the progressing diabetes population by factors unaccounted for.
In conclusion, we estimated 12- to 60-month risks of PDAC in men and women at different stages of diabetes progression and found that risk is substantially higher in the diabetes populations than in the general population, especially in those with insulin initiation. The prediction models reach moderate accuracy for identifying the male population at high risk for PDAC, in whom surveillance studies may be warranted. External validation studies for evaluating the performance of our prediction models in an independent setting are needed.