Risk Prediction of Pancreatic Cancer in Patients With Abnormal Morphologic Findings Related to Chronic Pancreatitis: A Machine Learning Approach

Background and Aims A significant factor contributing to poor survival in pancreatic cancer is the often late stage at diagnosis. We sought to develop and validate a risk prediction model to facilitate the distinction between chronic pancreatitis–related vs potential early pancreatic ductal adenocarcinoma (PDAC)-associated changes on pancreatic imaging. Methods In this retrospective cohort study, patients aged 18–84 years whose abdominal computed tomography/magnetic resonance imaging reports indicated duct dilatation, atrophy, calcification, cyst, or pseudocyst between January 2008 and November 2019 were identified. The outcome of interest was PDAC within 3 years. More than 100 potential predictors were extracted. A random survival forests approach was used to develop and validate risk models. A multivariable Cox proportional hazards model was applied to estimate the effect of the covariates on the risk of PDAC. Results The cohort consisted of 46,041 patients (mean age 66.4 years). The 3-year incidence rate was 4.0 (95% confidence interval [CI] 3.6–4.4)/1000 person-years of follow-up. The final models containing age, weight change, duct dilatation, and either alkaline phosphatase or total bilirubin had good discrimination and calibration (c-indices 0.81). Patients with pancreas duct dilatation and at least one other morphological feature in the absence of calcification had the highest risk (adjusted hazard ratio [aHR] = 14.15, 95% CI 8.7–22.6), followed by patients with calcification and duct dilatation (aHR = 7.28, 95% CI 4.09–12.96), and patients with duct dilatation only (aHR = 6.22, 95% CI 3.86–10.03), compared with patients with calcifications alone as the reference group. Conclusion The study characterized the risk of pancreatic cancer among patients with 5 abnormal morphologic findings based on radiology reports and demonstrated the ability of prediction algorithms to provide improved risk stratification of pancreatic cancer in these patients.


Introduction
Pancreatic cancer is the third leading cause of cancer-related death in the United States among cancers that afflict both men and women. 1 A significant factor contributing to poor 5-year survival in pancreatic cancer is the often late stage at diagnosis, with more than 50% of patients harboring metastases at the time of presentation. 2,3 However, the United States Preventive Services Task Force recently reissued guidance against widespread population-based screening, citing several key gaps in current knowledge related to early detection. 4 One of the key areas highlighted was the need for a better understanding of the natural history of precursor lesions in pancreatic cancer.
Chronic pancreatitis (CP) is a chronic inflammatory condition of the pancreas, which manifests clinically with chronic or recurrent episodes of abdominal pain, exocrine as well as endocrine insufficiency. Imaging plays a key role in the diagnosis of CP and frequently involves a multimodality approach including computed tomography (CT), typically with one or more contrast enhancement phases, magnetic resonance imaging (MRI) with or without magnetic resonance cholangiopancreatography, ultrasound, and endoscopic ultrasound all having a role. 5,6 Imaging features include dystrophic calcifications, glandular atrophy, pancreatic duct dilatation, and cyst/pseudocyst development.
CP manifests histopathologically with loss of acinar cells, fibrosis, and chronic inflammatory cells. This dense stromal response resembles the desmoplasia often seen in the setting of pancreatic ductal adenocarcinoma (PDAC), which is thought to be mediated by activated myofibroblasts known as pancreatic stellate cells. 7 CP is an established risk factor for pancreatic cancer: a recent meta-analysis by Kirkegard et al 8 showed that 5 years after diagnosis, patients with CP have a nearly 8-fold increased risk of pancreatic cancer. In addition, up to 5.5% of patients with suspected CP based on imaging are actually diagnosed with pancreatic cancer within 1 year of follow-up, indicating underlying malignancy at the time of CP diagnosis. 9 In this study, we focused on PDAC, a common type of pancreatic cancer. We hypothesized that some of the characteristic imaging features associated with CP may represent early changes associated with PDAC-related desmoplasia in the appropriate clinical setting. We therefore sought to perform a comprehensive assessment of the natural history of common imaging-related morphologic changes of the pancreas and to develop and validate a risk prediction model to facilitate the distinction between CP-related vs potential early PDAC-associated changes on pancreatic imaging.

Study Design and Setting
This is a retrospective cohort study conducted based on multiethnic health plan enrollees of Kaiser Permanente Southern California (KPSC). KPSC is an integrated health care system that provides comprehensive health care services for more than 4.8 million enrollees across 15 medical centers and 250+ medical offices throughout the Southern California region. The study data elements were extracted from the Research Data Warehouse, which integrates the data from electronic health records (EHRs) and legacy systems dating back to the 1980s and is supplemented by radiology reports obtained from the data repository of the KPSC EHR. The race/ethnicity distribution, demographics, and socioeconomic status of KPSC health plan enrollees are comparable to those of the residents in the Southern California region. 10 The study protocol was approved by KPSC's Institutional Review Board.

Cohort Identification and Follow-Up
Patients aged 18-84 years whose abdominal CT or MRI reports indicated duct dilatation, atrophy, calcification, cyst, or pseudocyst between January 1, 2008, and November 30, 2019, were identified using the natural language processing (NLP) algorithms previously reported. 11 For patients who had more than one qualifying imaging study during the study period, one was randomly selected. The selection of a random image was performed to gain a representation of the extent of imaging-based pancreatic morphologic changes, given the cumulative nature of potential findings over time, while mitigating potential immortal time bias. The randomly selected imaging procedure was referred to as the index scan, and the date of the index scan was referred to as the index date (t0). Exclusion criteria included a reported mass in the pancreas >2 cm, a history of pancreatic cancer, and enrollment in the health plan for less than 12 continuous months before or 30 days after t0. The requirement of continuous enrollment allowed adequate data to define study variables. For each patient in the cohort, follow-up started at t0 and ended with the earliest of the following events: disenrollment from the health plan, end of the study (December 31, 2019), reaching the maximum length of follow-up (3 years), non-PDAC-related death, or PDAC diagnosis or death (outcome).
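The follow-up rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, and the 3-year maximum is approximated as 1095 days, which the text does not specify.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

STUDY_END = date(2019, 12, 31)            # end of study, as stated in the text
MAX_FOLLOW_UP = timedelta(days=3 * 365)   # assumption: 3 years approximated as 1095 days


def follow_up(
    index_date: date,
    disenrollment: Optional[date] = None,
    non_pdac_death: Optional[date] = None,
    pdac_event: Optional[date] = None,
) -> Tuple[int, bool]:
    """Return (days of follow-up, whether the PDAC outcome was observed)."""
    candidates = [STUDY_END, index_date + MAX_FOLLOW_UP]
    candidates += [d for d in (disenrollment, non_pdac_death, pdac_event) if d is not None]
    end = min(candidates)  # follow-up ends at the earliest qualifying event
    # The outcome counts only when the PDAC event is the earliest end point.
    observed = pdac_event is not None and pdac_event == end
    return (end - index_date).days, observed
```

For example, `follow_up(date(2015, 1, 1), pdac_event=date(2015, 4, 7))` yields 96 days with the outcome observed, whereas a patient with no recorded events is censored at the study end or at the 3-year limit, whichever comes first.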

Outcome Identification
The primary outcome was defined as the diagnosis of PDAC or death in the setting of pancreatic cancer in the 3 years after the index date. PDAC was captured from the KPSC Cancer Registry using the Tenth Revision of International Classification of Diseases, Clinical Modification (ICD-10-CM) code C25.x and histology codes listed in Supporting Document 1. The KPSC Cancer Registry is part of the Surveillance, Epidemiology, and End Results program. The pancreatic cancer deaths were derived from the linkage with the California State Death Master files and captured using ICD-10-CM codes C25.x. The utilization of the State files allowed the identification of pancreatic cancer cases that were not otherwise captured in the registry. 12 However, the cases identified through the death files did not contain information on histology.

Patient Demographic and Clinical Features at Baseline
A complete list of features included in the analyses is presented in Table 1. Diabetes was defined by International Classification of Diseases, Ninth Revision (ICD-9) or Tenth Revision (ICD-10) codes for diabetes (ICD-9: 250.x and ICD-10: E08-E13) or KPSC internal code (1200, 1201, 1202, 1203, 1204, 1839, 3141, 3186, 3639, 4124, or 5782), any prior glycated hemoglobin level >7.0%, or any dispensing record for insulin or an oral hypoglycemic medication (not including metformin; Table A1). Because the laboratory values, the weight measures, and the changes in these values were not complete for all patients, "missRanger" was applied to impute the missing values if the frequency of missing for a feature was <60%. 13 We used the predictive mean matching method 14 with k = 5. Laboratory measures with 60% or more missingness or change/change rate measures with 80% or more missingness were not included in the model development process. The missing values of weight-related features were handled in the same manner. Ten imputed data sets were generated.
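The predictive mean matching step can be illustrated with a short standard-library sketch. It assumes predictions are already available for both observed and missing cases (in missRanger these come from random forests; here they are simply passed in), and the function name and arguments are hypothetical rather than the package's own interface.

```python
import random


def pmm_impute(pred_obs, y_obs, pred_mis, k=5, seed=None):
    """Impute each missing case with the observed value of one of its k nearest donors.

    pred_obs: model predictions for cases whose value is observed
    y_obs:    the observed values for those same cases
    pred_mis: model predictions for cases whose value is missing
    """
    rng = random.Random(seed)
    k = min(k, len(y_obs))
    imputed = []
    for p in pred_mis:
        # Rank observed cases by how close their prediction is to this case's prediction.
        order = sorted(range(len(pred_obs)), key=lambda j: abs(pred_obs[j] - p))
        donor = rng.choice(order[:k])   # draw one donor at random from the k closest
        imputed.append(y_obs[donor])    # impute the donor's actually observed value
    return imputed
```

Because the imputed value is always copied from a real observation, imputations stay within the range of plausible measurements. Repeating the draw with different seeds is one way to obtain multiple imputed data sets.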

Imaging Features
For each patient, we defined the presence/absence of each feature (duct dilatation, atrophy, calcification, cyst, or pseudocyst) using NLP based on the index scans and all the abdominal scans available in the KPSC system between January 1, 2004, and t0. The NLP algorithms to extract the 5 features were previously described. 11

Model Training and Validation Based on Machine Learning
A machine learning method, random survival forests (RSF), 15-17 was used to preselect features and train/validate risk prediction models. The learning process of RSF involves randomly drawn bootstrap samples that are used to grow trees and randomly selected predictors that are used to split nodes. The results are averaged among trees. Compared with the Cox proportional hazards regression model, RSF has the advantages of handling nonlinear effects and interactions among predictors and of not requiring a test of the proportionality assumption.
Feature Selection.-For each of the 10 imputed data sets, we ran RSF to preselect the most influential features. Those with an average minimum depth of <6.5 (first round) and <5.4 (second round) were identified. To avoid overfitting, we applied a 5-fold cross-validation method. 18 We randomly divided each imputed data set into 5 folds and used 4 folds of data for model development and the remaining fold for validation. The process was repeated 4 more times until each of the 5 folds had been left out once for validation.
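As an illustration of the 5-fold scheme, the following sketch generates the training/validation index pairs for one data set. The function name, the shuffling seed, and the round-robin fold assignment are assumptions made for the example.

```python
import random


def five_fold_splits(n, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the n_folds rounds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)                    # random assignment of records to folds
    folds = [idx[i::n_folds] for i in range(n_folds)]   # near-equal fold sizes
    for held_out in range(n_folds):
        valid = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, valid
```

Applied to each of the 10 imputed data sets, this produces 10 x 5 = 50 training/validation pairs, in which every record is held out for validation exactly once per data set.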
Based on the preselected features, the following steps were repeated 5 times for each of the 10 imputed data sets to select the most important features.

1. Preselected features that were not in the model were added, one at a time. Each time, the feature that yielded the maximum improvement of c-index was selected.
2. This iterative process continued until the increase of c-index was <0.005.
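The two steps above amount to a greedy forward selection with a stopping threshold. The sketch below shows the control flow only: `score` is a hypothetical callback standing in for "fit an RSF on these features and return its c-index", and starting from a baseline c-index of 0.5 with an empty model is an assumption.

```python
def forward_select(candidates, score, min_gain=0.005):
    """Greedily add the feature that most improves the score until the gain is < min_gain.

    candidates: iterable of preselected feature names
    score:      callable taking a list of feature names and returning a c-index
    """
    selected = []
    best = 0.5                      # assumption: baseline c-index of an uninformative model
    remaining = list(candidates)
    while remaining:
        # Step 1: try each feature not yet in the model, one at a time.
        trial = {f: score(selected + [f]) for f in remaining}
        top = max(trial, key=trial.get)
        # Step 2: stop once the best achievable improvement falls below the threshold.
        if trial[top] - best < min_gain:
            break
        selected.append(top)
        remaining.remove(top)
        best = trial[top]
    return selected, best
```

With a toy scoring function in which each feature adds a fixed amount to the c-index, a feature contributing only 0.001 is left out, whereas those contributing 0.05 or more are kept.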
Hyperparameter Setup.-The number of trees and depth of trees were set to 100 and 7, respectively. The number of covariates available for splitting at each node (termed "mtry") was set to be an integer that is close to the square root of the number of covariates.
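A minimal sketch of these settings follows. The dictionary keys and the helper name are hypothetical, and rounding to the nearest integer is one reading of "an integer that is close to the square root"; the text does not state the rounding direction.

```python
import math

# Values stated in the text; the key names are illustrative only.
RSF_SETTINGS = {"n_trees": 100, "max_depth": 7}


def default_mtry(n_covariates):
    """Number of covariates tried at each split: an integer near sqrt(n_covariates)."""
    return max(1, round(math.sqrt(n_covariates)))
```

For a model with 4 covariates this gives 2 candidate covariates per split, and for 21 covariates it gives 5.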
Model Selection.-Of the 50 models derived from the 50 training data sets, the ones that appeared the most often were selected as the final models.
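Counting how often each feature set was selected across the training data sets can be sketched as follows. The function name is hypothetical, and treating a model as an unordered set of feature names is an assumption.

```python
from collections import Counter


def rank_models(models, top_n=2):
    """Return the top_n most frequently selected feature sets with their counts.

    models: one list of feature names per training data set
    """
    # Sort the names so that the same features in a different order count as one model.
    counts = Counter(tuple(sorted(m)) for m in models)
    return counts.most_common(top_n)
```

Passing the 50 selected feature lists to `rank_models` would return the most frequent combinations, which correspond to the "winning" models referred to below.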
Model Validation.-The algorithms of the winning models were applied to the corresponding validation data sets that were left out for validation. By design, the validation data sets did not include any observations of the training data sets from which the winning models were developed.
Performance Measures.-The discriminative power for each of the winning models was evaluated by c-index, a concordance measure, pooled across all the relevant validation data sets for cohort members using Rubin's rule implemented in the mi.meld function within the R package Amelia. 19-21 Calibration was assessed by calibration plots with 5 risk groups (<50th, 50-74th, 75-89th, 90-94th, and 95-100th percentiles). 22 The calibration plot was produced for the best model.
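For readers unfamiliar with the c-index, the sketch below implements the generic Harrell definition for right-censored data. It is not the RSF software's internal calculation; tied event times are ignored for simplicity, and the function name is an assumption.

```python
def harrell_c_index(times, events, risk):
    """Fraction of comparable pairs in which the earlier event has the higher predicted risk.

    times:  follow-up time for each subject
    events: 1/True if the outcome was observed, 0/False if censored
    risk:   predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each pair is anchored on a subject who had the event
        for j in range(n):
            # j is comparable with i if j was still event-free at i's event time.
            if times[j] > times[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, so the reported values near 0.81 indicate that the higher-risk patient was ranked correctly in roughly four of five comparable pairs.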

Statistical Analysis
Patient demographic, clinical, and imaging features are reported as n (%), mean (standard deviation), or median (interquartile range) as appropriate. A Kaplan-Meier plot was generated to present PDAC-free survival in patients with the presence of one or more imaging features.
Overall and risk factor-stratified crude event rates were calculated using log-linear (Poisson) regression with a generalized estimating equations approach and are reported as per 1000 person-years of follow-up. To estimate the effect of the covariates on the risk of PDAC, multivariable Cox proportional hazard model was applied, and hazard ratios (HRs) were reported with 95% confidence intervals (CIs). All the continuous variables were normalized based on z-score standardization before they were applied to the Cox model. To estimate the pooled HR, we combined the HR derived from each of the 10 imputed data sets using Rubin's rule implemented in PROC MIANALYZE in SAS. All the analyses were performed using SAS (Version 9.4 for Unix; SAS Institute, Cary, NC) except for the R packages mentioned previously. All computations and analyses carried out in R were based on R Version 3.6.0 (R Foundation, Vienna, Austria).
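Rubin's rule for combining estimates across imputed data sets can be sketched as follows. The pooling is done on the log hazard ratio scale, and the confidence interval uses a normal approximation, which is a simplification: PROC MIANALYZE bases its intervals on a t distribution with adjusted degrees of freedom. Function names are hypothetical.

```python
import math


def pool_rubin(estimates, variances):
    """Combine m estimates and their variances; return (pooled estimate, total variance)."""
    m = len(estimates)
    q_bar = sum(estimates) / m                                   # pooled point estimate
    within = sum(variances) / m                                  # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1) # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total


def pooled_hazard_ratio(log_hrs, std_errors, z=1.96):
    """Pool log hazard ratios and return (HR, lower, upper) on the original scale."""
    q, t = pool_rubin(log_hrs, [s ** 2 for s in std_errors])
    se = math.sqrt(t)
    return math.exp(q), math.exp(q - z * se), math.exp(q + z * se)
```

The between-imputation term widens the interval when the imputed data sets disagree, which is how uncertainty due to missing data is carried into the pooled HR.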

Characteristics of the Study Cohort
A total of 46,041 patients/examinations met the eligibility criteria (Figure A2; mean age 66.4 years, 55.8% female, 51.2% non-Hispanic White, 27.2% Hispanic, 11.2% African American, and 9.2% Asian and Pacific Islanders), with an average follow-up time of 1.9 years. Patient characteristics are presented in Table 1. Overall, 48.5% of patients were current or ever smokers. Alcohol abuse was reported in 6.8% in the past year and in 11.5% any time in the past. More than 3% had a family history of pancreatic cancer. One-third of study subjects had diabetes, 23.9% had gallstone disorders, and 28.9% had biliary tract disorders. In addition, 4.1% of patients had CP, and 12.6% had acute pancreatitis. The percentage of patients who were hospitalized in the past 12 months for pancreatic-related conditions was 8.5%. The median HbA1c was 6.2% (IQR: 5.7%, 7.1%). The 2 most common gastrointestinal symptoms were abdominal pain and back pain (33.6% and 21.3% in the 6 months before the index scan, respectively).
In terms of the imaging findings, 6753 (14.7%) patients were identified based on MRI, and 39,288 (85.3%) were identified based on CT scan (Table 2). A majority (77.8%) of the scans were performed in an outpatient or emergency department setting. Atrophy (31.2%) and cyst (31.8%) were the most common imaging abnormalities, followed by calcification (27.4%) and duct dilatation (22.6%). Overall, 17.4% of patients had more than one abnormal morphologic feature. Abdominal pain was the most common indication for the index scan, accounting for 25.2% of study subjects. Other common indications included gastrointestinal problems (13.1%), other pain (10.9%), and concern raised by laboratory test results (9.9%).

Incidence of PDAC
Of 46,041 eligible patients, 370 developed PDAC within 3 years, with an incidence rate of 4.0/1000 (95% CI 3.6-4.4/1000) person-years of follow-up. The median follow-up time for PDAC cases was 96 days (interquartile range, 49-294 days). Of the 370 PDAC cases, 296 (80%) were captured from the KPSC Cancer Registry, and the rest (74 or 20%) died of pancreatic cancer according to the California State death files. The total follow-up time in years, mean follow-up time per patient, number (and incidence rate) of PDAC, and time to PDAC diagnosis or death are reported in Table 3. In terms of individual findings, main duct dilatation was associated with the highest incidence of PDAC (Table 3).
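The form of the incidence rate calculation can be illustrated with a simple sketch. It uses a Wald interval on the log scale under a Poisson assumption, which is a simplification of the regression-based approach described in the Statistical Analysis section. The person-years figure in the usage note is a placeholder chosen for illustration, not a value taken from the study; the actual total follow-up time is reported in Table 3.

```python
import math


def incidence_rate_per_1000(events, person_years, z=1.96):
    """Return (rate, lower, upper) per 1000 person-years with a log-scale Wald interval."""
    rate = 1000.0 * events / person_years
    half_width = z / math.sqrt(events)   # z times the standard error of log(rate) under a Poisson model
    return rate, rate * math.exp(-half_width), rate * math.exp(half_width)
```

With 370 events and a hypothetical 92,500 person-years, the function returns a rate of 4.0 per 1000 person-years with an interval of roughly 3.6 to 4.4.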
The cumulative incidence of PDAC in 3 years by imaging feature is displayed in Figure 1. The observed incidence of PDAC was further elevated among patients with main duct dilatation combined with additional findings, particularly in the absence of calcification (Table 3). Patients without calcification but with pancreas duct dilatation and one or more other features had the highest incidence rate, followed by patients with both calcification and pancreas duct dilatation and patients with only duct dilatation (Table 3, Figure 1).

Demographic and Clinical Parameters Associated With Increased Risk of PDAC
In addition to imaging-based risk, various demographic and clinical parameters were associated with increased risk of PDAC (Table 3). Older age, male sex, and African American race were each associated with higher risk of cancer. Family history of PDAC was also associated with increased risk. In terms of clinical parameters, weight loss in the past year and elevated alkaline phosphatase (ALP), lipase, bilirubin, or glycated hemoglobin value at the time of the index scan were associated with increased PDAC incidence. In addition, an increased extent of alanine transaminase change within the past 1 year was associated with a higher PDAC incidence.

Risk Factors Associated With the Risk of PDAC Based on Cox Regression Analysis
The adjusted HRs and 95% CIs for risk of PDAC from a multivariable model incorporating the aforementioned risk factors for PDAC are reported in Table 4.

Risk Prediction Models Based on RSF Analysis
The preselection process identified 14-21 potential predictors from the 10 imputed data sets. Of the 50 training data sets, the models with age, weight change, duct dilatation, and either ALP or total bilirubin appeared most often (Table 5). A summary of training and validation data sets can be found in Table A2 of the Online Document.
The mean and standard deviation of c-index based on the validation data sets for each winning model are reported in Table 5. The c-indices were high for both models (0.811 for the model with ALP and 0.805 for the model with total bilirubin). The calibration plot based on age, weight change, duct dilatation, and ALP is displayed in Figure A2. The differences between the average predicted and average observed risks were small for the 3 lowest risk groups (Figure A2). Although the differences appeared to be somewhat larger in the 2 highest risk groups (Figure A2), the absolute differences between the predicted and the observed risks ranged from only 0.07% to 0.22% (data not shown). The calibration plot for the model with bilirubin was similar (data not shown).
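The grouping behind the calibration comparison can be sketched as follows, using the 5 percentile-based risk groups defined in the Performance Measures paragraph. This is a simplified illustration: the observed outcome is treated as a binary indicator, which ignores censoring, and all names are hypothetical.

```python
import bisect

CUTS = [50, 75, 90, 95]   # percentile boundaries between the 5 risk groups
LABELS = ["<50th", "50-74th", "75-89th", "90-94th", "95-100th"]


def risk_group(percentile):
    """Map a percentile rank in [0, 100) to its risk group label."""
    return LABELS[bisect.bisect_right(CUTS, percentile)]


def calibration_table(predicted, observed):
    """Return {group: (n, mean predicted risk, observed event fraction)}."""
    n = len(predicted)
    order = sorted(range(n), key=lambda i: predicted[i])
    members = {label: [] for label in LABELS}
    for rank, i in enumerate(order):
        members[risk_group(100.0 * rank / n)].append(i)
    table = {}
    for label, idx in members.items():
        if idx:  # skip groups that received no subjects in very small samples
            mean_pred = sum(predicted[i] for i in idx) / len(idx)
            mean_obs = sum(observed[i] for i in idx) / len(idx)
            table[label] = (len(idx), mean_pred, mean_obs)
    return table
```

Plotting the mean predicted risk against the observed fraction for each group gives a calibration plot; the finer split at the top (90-94th and 95-100th percentiles) focuses attention on the patients at highest predicted risk.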

Discussion
In this study, we performed a comprehensive assessment of the relationship between common parenchymal and ductal abnormalities of the pancreas on cross-sectional imaging and the risk of pancreatic cancer. Specifically, we applied NLP to identify a large cohort of patients with the presence of at least one feature commonly associated with CP: main duct dilatation, atrophy, cyst/pseudocyst, or calcification. The implementation of NLP makes the information extraction feasible for a large cohort of patients. We then performed traditional Cox regression analysis to assess the relative risk of developing PDAC based on individual as well as combinations of imaging findings in addition to patient demographic and clinical parameters. Finally, we developed and validated risk prediction models using an empiric machine learning-based approach (RSF) to optimize the use of patient demographic, clinical, as well as imaging data for the prediction of 3-year risk of PDAC. The final models were able to achieve a high level of discrimination (c-index of 0.81) with acceptable calibration (absolute risk difference, predicted vs observed, 0.07%-0.22%) for 3-year risk of PDAC.
Of the 5 morphological features we studied, the associations between main duct dilatation, 23-26 pancreatic parenchymal atrophy, 27-29 chronic calcific pancreatitis, 30,31 and pancreatic cyst 26,32-34 with pancreatic cancer have been previously reported in smaller case-control studies. In the present study, we developed and validated risk prediction models based on these morphological features using a much larger data set, including additional patient demographic and clinical features. We also reported the absolute risks and the relative risks of the individual morphological features.
Pancreatic cancer is a devastating disease and represents the third leading cause of cancer-related death among cancers that afflict men and women in the United States. 1 A major factor contributing to the lethal nature of PDAC is the advanced stage at presentation, with more than 50% of patients having distant metastases at the time of diagnosis. 2,3 Therefore, approaches for early detection are urgently needed to improve patient outcomes. However, due in part to the relatively rare nature of PDAC (incidence 14 in 100,000), the United States Preventive Services Task Force recently reissued guidance against widespread population-based screening for PDAC. 4 Another key barrier to early detection in PDAC has been the inability to identify precursor lesions on conventional imaging.
We hypothesized that changes related to early cancer-related desmoplasia might be visible on cross-sectional imaging and could share the appearance of features typically associated with CP. A hallmark of PDAC is a dense surrounding stromal response consisting of extracellular matrix proteins, activated myofibroblasts (stellate cells), and inflammatory cells described as desmoplasia. 7 This stroma can constitute up to 90% of tumor volume. 35 The tumor microenvironment also plays a key role in early tumor progression. 36-39 Although the precursor lesion to PDAC, Pancreatic Intraepithelial Neoplasia type III (PanIN III) or high-grade dysplasia, 40 is a microscopic lesion that is not visible on cross-sectional imaging, it is conceivable that changes in pancreas morphology related to early cancer-related desmoplasia can be identified before tumor diagnosis. In particular, we assessed features commonly associated with CP, given shared mechanistic pathways with activated pancreatic stellate cells playing a key role in mediating extracellular matrix deposition. 41 Our hypothesis was supported by the low proportion of patients with a clinical diagnosis of either acute pancreatitis or CP in the imaging-based study cohort, 12.6% and 4.1%, respectively, as well as the relatively short interval to cancer diagnosis (median 96 days).
Understanding the relationship between individual and combinations of imaging findings with the risk of PDAC can help develop a profile for imaging changes during early cancer development. Of the 5 morphological features included in the study, pancreas duct dilatation, either alone or in combination with one or more other morphological features, significantly increased the risk of PDAC. This finding is consistent with previous studies associating early findings of pancreas duct dilatation with the development of pancreatic cancer. 23,24 In the study of Singh et al, 24 abrupt pancreas duct cut-off/duct dilatation was seen on CT images 12.8 months before cancer diagnosis. A review by Gangi et al 23 revealed that definite or suspicious findings (predominantly duct dilatation) based on CT studies were present in 50% of the CTs obtained in the 6-18 months before the diagnosis of pancreatic cancer. However, the median time to cancer diagnosis among patients with duct dilatation was only 74 days, indicating this is likely a very late event in tumor development. In contrast, other findings such as parenchymal atrophy were associated with a longer interval before cancer diagnosis. This observation, combined with the finding that duct dilatation in conjunction with other imaging abnormalities conferred the greatest risk and the most rapid onset of PDAC, argues for a sequential accumulation of imaging findings potentially corresponding with stages of early tumorigenesis, as illustrated in Figure 2.
As the imaging findings included in the present study can also be seen in the setting of age-related changes or conditions other than PDAC, we set about determining additional clinical parameters that would enhance specificity for early cancer-related morphologic changes. In addition to established risk factors such as advancing age and family history, 42 weight loss, and elevated A1c, 43,44 elevation in lipase level and alterations in liver tests were also associated with the development of PDAC. Among these clinical parameters, weight loss was associated with the longest interval to cancer diagnosis, consistent with previous studies among patients with new-onset diabetes. 45 Weight loss in the setting of one of the aforementioned imaging abnormalities would raise suspicion for cancer-related changes. This also supports previous observations that cancer-related cachexia in PDAC can begin before tumor diagnosis, potentially mediated by alterations in body fat composition. 46,47 The empiric machine learning-based prediction models were developed to enhance the specificity of imaging findings for the identification of cancer-related changes as well as to demonstrate the potential accuracy of combining data from imaging reports with clinical parameters from the EHR. The final models selected by the algorithm were parsimonious, containing only 4 parameters: age, duct dilatation, weight loss, and a measure of cholestasis (ALP or bilirubin). These models could have several future applications in terms of research, including integration with emerging blood-based biomarkers for early detection of PDAC. In addition, such a model could be included as an automated algorithm for enhanced radiology reporting of PDAC risk when pancreatic abnormalities are identified in the context of routine clinical care.
Although malaise/fatigue is a known risk factor of PDAC, it was found to be a protective factor in the present study. This could be at least partially attributed to non-PDAC cancers, which also cause malaise and fatigue. Overall, 7.6% of the study subjects had active cancer other than PDAC, and the risk of PDAC in this group of patients was lower.
There were several limitations in the present study. First, the images used for analysis were acquired in the context of routine clinical care, and as a result, there was variation in types of studies and imaging protocols used. This may have caused inconsistency in the interpretation of the imaging reports. Second, the study population was heterogeneous with respect to the indications for imaging. It is therefore unclear how the present study findings would extend to an asymptomatic population undergoing screening. However, the findings do reflect conditions in real-world practice. Third, it is possible that some of the desired features may not have been reported by radiologists as part of a clinical reading for a nonpancreas-related indication. Thus, the prevalence of the abnormalities may be higher than what was reported. A direct imaging analysis in the future to extract pancreas morphological features could minimize the issue.
Also, the current analysis looked only at morphologic imaging features on CT and MRI. The analysis did not include evaluation of newer oncologic imaging techniques in MRI, such as diffusion weighted imaging, or quantitative measures such as differential contrast enhancement on both single and dual-energy CT (delta). Studies have shown diffusion weighted imaging is helpful in distinguishing pancreatic cancer from acute pancreatitis or CP. 48 Differential contrast enhancement (high delta) has been shown to aid in the identification of pancreatic neoplasms 49 as well as correlate with prognosis. 50 It is possible that some of these features could be even more predictive, and assessment of other features provides an opportunity for future research.
Despite the aforementioned limitations, the present study has some key strengths that have enabled us to glean new insights into the relationship between specific imaging findings and early pancreatic cancer. First, by scaling up a previously developed automated natural language algorithm for pancreas findings on the free text of radiology reports, we were able to identify a large cohort of patients with the features of interest on cross-sectional imaging. Second, by combining this approach with comprehensive data from a robust electronic health system within an integrated care system, we were able to reliably ascertain both patient-related clinical characteristics and cancer diagnoses. Finally, by incorporating state-of-the-art machine learning approaches to predictive modeling, we were able to achieve a high degree of accuracy for discrimination of findings suggestive of early cancer by combining structured data from the EHRs with unstructured data from radiology reports acquired in the context of routine clinical care.
In conclusion, we have characterized the risk of pancreatic cancer among patients with 5 abnormal morphologic findings based on radiology reports and demonstrated the ability of prediction algorithms to provide improved risk stratification of pancreatic cancer in these patients. We have further mapped the temporal development of imaging abnormalities in relation to cancer diagnosis, which suggests an accumulation of derangements that may parallel early tumorigenesis with main duct dilatation representing one of the last developments in this sequence. Based on our initial hypothesis, the overlap of morphologic changes seen before PDAC diagnosis with classic features of CP likely represents macroscopic changes associated with the stromal response in early tumorigenesis seen in PDAC rather than the tumor itself. Although much additional investigation is needed, these findings suggest that features associated with cancer-related desmoplasia may be visualized before cancer development and therefore provide a suitable target for early detection as well as provide a critical window for potential intervention or perhaps even prevention by applying therapy directed at altering the tumor microenvironment before frank tumor development.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Acknowledgments:
The authors thank Sole Cardoso for the assistance with formatting the manuscript.

Funding:
Research reported in this publication was supported by a grant from the National Cancer Institute (5U01CA200468-05). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agency.

Abbreviations used in this paper:
ALP alkaline phosphatase

Figure 1. The cumulative incidence of PDAC in 3 years by imaging feature. The order of the descriptions in the legend and the order of the curves match.

Figure 2. Pancreas with duct dilatation and atrophy involving the body as well as tail of pancreas 2 years before cancer diagnosis.