A systematic review and network meta-analysis of single nucleotide polymorphisms associated with pancreatic cancer risk

In this meta-analysis, we systematically investigated the correlation between single nucleotide polymorphisms (SNPs) and pancreatic cancer (PC) risk. We searched PubMed, Web of Science, EMBASE, Cochrane Library, China National Knowledge Infrastructure (CNKI), China Science and Technology Periodical Database (VIP), and Wanfang databases up to January 2020 for studies on PC risk-associated SNPs. We identified 45 case-control studies (36,360 PC patients and 54,752 non-cancer individuals) covering 27 genes and 54 SNPs for this meta-analysis. Direct meta-analysis followed by network meta-analysis and Thakkinstian algorithm analysis showed that the homozygous genetic models for CTLA-4 rs231775 (OR = 0.326; 95% CI: 0.218-0.488) and VDR rs2228570 (OR = 1.976; 95% CI: 1.496-2.611) and the additive genetic model for TP53 rs9895829 (OR = 1.231; 95% CI: 1.143-1.326) were significantly associated with PC risk. TP53 rs9895829 was the optimal SNP for diagnosing PC susceptibility, with a false positive report probability < 0.2 at a stringent prior probability of 0.00001. This systematic review and meta-analysis suggests that TP53 rs9895829, VDR rs2228570, and CTLA-4 rs231775 are significantly associated with PC risk. We also demonstrate that TP53 rs9895829 is a potential diagnostic biomarker for estimating PC risk.


INTRODUCTION
TERT, UGT2B4, XRCC4, XPC, SLC22A3, NR5A2, ABO and XPD genes are associated with susceptibility to pancreatic cancer [3][4][5][6][7][8][9][10]. Single nucleotide polymorphisms (SNPs) in several genes correlate with increased risk of pancreatic cancer [11]. SNPs in protein-coding genes and non-coding RNAs are the most common type of gene mutation implicated in several human diseases. Specific SNPs are associated with increased or decreased risk of multiple cancer types because of a genetic phenomenon called linkage disequilibrium (LD) [12]. Variants in insulin-like growth factor, platelet-derived growth factor subunit B, atopy-related immunologic candidate genes, taste-related genes, and inflammatory genes are associated with pancreatic cancer (PC) risk [5,11,13-16]. However, the results of many SNP-related studies are inconclusive because of small sample sizes. Therefore, a systematic review of multiple studies is required to analyze the relationship between pancreatic cancer and SNPs [17][18][19][20]. There are very few systematic reviews of the relationship between SNPs and pancreatic cancer. We therefore performed this meta-analysis to identify prominent SNPs associated with greater PC risk. We then selected the most suitable genetic model by comparing data for these PC-related SNPs from network meta-analysis and the Thakkinstian algorithm. Finally, we evaluated the reliability of the meta-analysis results using the false positive report probability (FPRP) to determine the SNPs most strongly associated with pancreatic cancer susceptibility.

Description of included studies
This study included 45 studies with 36,360 PC patients and 54,752 non-cancer controls. Supplementary Table 2 shows the data characteristics of the meta-analysis. Initial screening identified 178 genes and 419 SNPs in the included studies, but only 27 genes and 54 SNPs met the final selection criteria. The genes and SNPs identified in the 45 studies are shown in Supplementary Table 2. See Table 1 for more details. A total of 45 articles were included [2,3,5,9,. The results of the quality evaluation of the included studies are shown in Supplementary Table 1. The quality evaluation covered the following nine aspects: (1) whether genotyping methods were described; (2) whether the population stratification method was described; (3) whether the genotype inference method was described; (4) whether the genotype distribution of the control group conformed to Hardy-Weinberg equilibrium (HWE); (5) whether the repeatability of the research was emphasized; (6) whether the inclusion and exclusion criteria and matching methods for the study subjects were described; (7) whether the statistical methods and software version were explained; (8) the method used to judge correlation; and (9) whether the data were sufficient.

Pairwise meta-analysis
Supplementary Table 3 shows the results of the direct meta-analysis to determine the correlation between the 54 SNPs and PC risk. We evaluated 6 genetic models for all SNPs to determine the optimal genetic model showing correlation with PC risk. The GG and AA genotypes of TP53 rs9895829 showed a significant correlation with higher PC risk compared to the GA genotype (GG+AA vs. GA). The direct meta-analysis results for the other significant models are shown in Table 2.

Network meta-analysis and Thakkinstian algorithm analysis of the most appropriate genetic models for SNPs associated with PC risk
We performed network meta-analysis with the consistency model to compare the genetic models of different SNPs that show significant correlation with PC risk in order to select the most suitable genetic model for studying PC susceptibility. The results showed that some SNPs were linked to a network, whereas the others were linked only with their genetic models (Figure 1). There were 14 networks in total: 6 were formed across different SNPs, whereas the other 8 were formed within a single SNP, with its different genetic models as nodes, because of insufficient data. After comparing the genetic models with network meta-analysis and pairwise meta-analysis (Supplementary Tables 1-14 rs25487, XPC rs2607775, MORC4 rs12837024, VEGF +405G/C rs2010963, MTHFR rs1801133) and their most suitable gene models. Based on the rank probabilities, the optimal models for most genes were either additive or dominant (Figure 1). The results of Thakkinstian analysis showed that the co-dominant model was optimal for these 14 SNPs (Table 2). The prior probability FPRP values for these 14 SNPs are summarized in Table 2.
Network meta-analysis and Thakkinstian's criteria indicated different optimal gene models for the 14 SNPs; the model that showed a significant correlation with an FPRP value below 0.2 was selected as the optimal one for determining PC risk (Supplementary Table 3

Diagnostic meta-analysis
We performed diagnostic meta-analysis of the additive gene model of TP53 rs9895829 to evaluate the efficacy of this SNP in diagnosing pancreatic cancer. As shown in Figure 2, the results of the diagnostic meta-analysis for the additive gene model of TP53 rs9895829 based on the random-effects model were

DISCUSSION
Several studies have investigated genetic susceptibility in PC, but the relationship between PC and SNPs is not conclusive. In this meta-analysis, we combined the results of several published studies to evaluate the association between PC and SNPs. We performed network meta-analysis, which is similar to pairwise meta-analysis in that its validity depends on the quality of evidence. The ranking probability was obtained by combining direct and indirect evidence with a Bayesian approach. A previous study successfully applied this approach to select the best genetic model for detecting the risk of hepatocellular carcinoma [61].
In our study, TP53 rs9895829 showed a stronger association with PC risk than COX-2 -765, HIF-1α rs11549467, VDR rs2228570, CTLA-4 rs231775, and MTHFR rs1801133. However, we were unable to conduct further subgroup analysis on TP53 rs9895829 to explore its specific association with PC because of the small sample size.
In this study, we aimed to identify the optimal genetic model among the six genetic models for the 30 SNPs associated with PC risk by using network meta-analysis and the Thakkinstian algorithm. Network meta-analysis is an extension of pairwise meta-analysis and, like pairwise meta-analysis, its validity depends on the quality of evidence. The ranking probability was obtained from a combination of direct and indirect evidence with a Bayesian approach. In our study, we analyzed only the available data and did not consider any extrinsic factors that may affect the results.
We then used FPRP values to determine the most plausible genetic model for the genes listed in Table 1.
The three determinants of FPRP are the prior probability, the observed P value or α level, and the statistical power. Wacholder et al. suggested that large studies or pooled analyses should use a stringent FPRP threshold of 0.2, a prior probability that is high (≈0.1), moderate (≈0.01), or low (≈0.001), and the statistical power to detect an OR of 1.5 for alleles associated with higher cancer risk to obtain meaningful results [61,62]. We chose a moderate prior probability of 0.01 for FPRP and analyzed the 14 SNPs associated with pancreatic cancer. Our analysis showed that TP53 rs9895829 was the best susceptibility SNP for PC because it had an FPRP value of less than 0.2 even when the prior probability was 0.00001. The remaining 13 candidates showed FPRP values above 0.2 at a prior probability of 0.00001.
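The FPRP calculation described above can be illustrated with a short sketch. It is not the authors' implementation: the function name `fprp`, the recovery of the standard error from the reported 95% CI, and the normal approximation used for the P value and the power are assumptions made here for illustration. With prior probability π, significance level α and power 1 − β, the quantity computed is FPRP = α(1 − π) / [α(1 − π) + (1 − β)π].

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def fprp(odds_ratio, ci_low, ci_high, prior, target_or=1.5):
    """False positive report probability for one pooled association.

    Assumptions (not taken from the source): the standard error of ln(OR)
    is recovered from the 95% CI, the observed two-sided P value plays the
    role of alpha, and power is the probability of reaching that
    significance level if the true OR equals target_or (or its reciprocal).
    """
    z975 = _STD_NORMAL.inv_cdf(0.975)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z975)
    z_obs = abs(math.log(odds_ratio)) / se
    alpha = 2 * _STD_NORMAL.cdf(-z_obs)      # observed two-sided P value
    z_alt = abs(math.log(target_or)) / se    # expected z under the target OR
    power = _STD_NORMAL.cdf(z_alt - z_obs)   # one-tail normal approximation
    return alpha * (1 - prior) / (alpha * (1 - prior) + power * prior)


# Additive model of TP53 rs9895829 as reported in the abstract.
for prior in (0.1, 0.01, 0.001, 0.00001):
    print(prior, fprp(1.231, 1.143, 1.326, prior))
```

Under these assumptions, an association is considered noteworthy when the returned value is below the 0.2 threshold; lowering the prior probability always raises the FPRP, which is why the 0.00001 prior is the most stringent setting.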
Pancreatic cancer (PC) is a highly malignant cancer and is caused by a variety of factors of unknown etiology. The death rate of PC ranks eighth among all cancers worldwide and fourth in developed countries, with more than 260,000 deaths reported each year [63]. Most patients survive for less than a year after diagnosis and only 5% of PC patients survive for more than 5 years [64,65]. The incidence of pancreatic cancer varies with population structure and the lifestyle of individuals [66,67]. Identification of risk genes is critical for decreasing the high mortality rates in PC. Our meta-analysis identifies the most relevant model for PC risk by collating the results of already published case-control studies related to PC. However, further high-quality studies with larger sample sizes and detailed PC risk factor data are necessary to confirm our findings.
Van et al. showed that vitamin D deficiency is very common in patients diagnosed with advanced pancreatic cancer [68]. Several studies have shown that 1,25(OH)2 vitamin D regulates cellular proliferation, differentiation, apoptosis, and angiogenesis [64]. Colston et al. demonstrated that 1,25(OH)2 vitamin D and its synthetic analogues inhibited the proliferation of PC cell lines [69]. Vitamin D receptor (VDR) is expressed in the stroma of pancreatic tumors and mediates interstitial reprogramming to inhibit pancreatitis and pancreatic cancer [70]. VDR gene polymorphisms are associated with colon, breast, kidney, and prostate cancers [71][72][73][74][75]. Moreover, VDR gene polymorphisms affect immune response in immune-related diseases such as Graves' disease [76] and SLE [77]. The VDR rs2228570 T/C allele is 10 base pairs upstream of the translation initiation codon, with the rs2228570 C allele variant generating a shorter protein with higher activity than the rs2228570 T variant [78,79].
Alimirah et al. demonstrated that the T allele of rs2228570 increases breast tumor aggressiveness by upregulating the expression of epidermal growth factor receptor (EGFR) [30,80]. Li et al. showed that the VDR rs2228570 polymorphism is associated with PC risk in the Northern China population [30]. The VDR rs2228570 polymorphism also significantly correlates with pathological differentiation and TNM stage, and is a potential prognostic biomarker for PC [81].
The cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) gene is located on chromosome 2q33 and encodes a crucial immune checkpoint protein on T lymphocytes; it consists of four exons that encode the leader sequence, the extracellular domain, the transmembrane domain, and the cytoplasmic domain [82]. Injection of anti-CTLA-4 antibody promotes antitumor immunity by enhancing the activation of T cells, thereby demonstrating its importance in tumorigenesis [83,84]. The +49G>A allele in CTLA-4 rs231775 changes the amino acid at position 17 from alanine to threonine and is associated with high expression of CTLA-4, which inhibits T cell activation and proliferation [82,85,86]. In the Chinese population, the CTLA-4 +49A allele is associated with increased risk of lung, breast and cervical cancers [83,85,86]. Another meta-analysis showed that the CTLA-4 +49A allele is associated with increased risk of pancreatic cancer in Caucasian and Chinese populations compared to the +49G allele [87]. In general, CTLA-4 is highly expressed in human pancreatic cancer cells [88]. A phase 2 trial of the anti-CTLA-4 antibody ipilimumab showed delayed progression in some patients with advanced pancreatic cancer [89].
The p53 protein, encoded by the TP53 gene, plays a significant role in the cellular response to DNA damage, hypoxia, and metabolic stress, and inhibits tumorigenesis by regulating the cell cycle and apoptosis [90]. Somatic mutations in the TP53 gene have been reported in nearly 50% of human cancers, including pancreatic cancer [91,92]. Morton et al. demonstrated that TP53 mutations promote PC metastasis [93]. TP53 gene mutations have been reported in 60% of sporadic pancreatic cancer cases and 33% of familial pancreatic cancer cases [94]. Biankin et al. demonstrated that TP53 mutations are associated with susceptibility to pancreatic cancer [95]. Feng et al. showed that the TP53 rs9895829 SNP was related to increased expression and activation of p53 in 373 lymphoblast cell lines [31].
Overall, our data suggest that several SNPs are potential candidates for diagnosing PC because they are related to greater PC risk. However, because of small sample sizes and insufficient information about extrinsic factors, we could not conduct subgroup analysis or an optimal diagnostic meta-analysis of the relevant indicators. A single SNP may therefore not be a sufficient indicator of PC risk, but we postulate that analyzing multiple genes and SNPs together may provide a relevant diagnostic index for determining PC risk.
There are several limitations in our study. Firstly, we lacked sufficient data to perform subgroup analysis and calculate heterogeneity. Secondly, we did not consider the potential impact of many extrinsic factors because these data were not available in the included studies. Thirdly, some of the included studies were of poor quality, which limited our ability to validate the combined results and perform subgroup analysis. Fourthly, it is plausible that, because of our inclusion criteria, we excluded studies with relevant information about the SNPs. Hence, further analysis with large sample sizes and high-quality data is necessary to confirm our findings.
Nevertheless, to our knowledge, this is the first systematic review and meta-analysis to comprehensively assess several SNPs associated with PC through network meta-analysis and the Thakkinstian algorithm. We also measured the reliability of the meta-analysis results by FPRP to identify SNPs strongly associated with pancreatic cancer susceptibility. Our data suggest that some of the SNPs may be used in the future, either alone or in combination, for early screening of pancreatic cancer.
In conclusion, our data suggest that the additive gene models of COX-2 -765, HIF-1α rs11549467, and TP53 rs9895829, as well as the dominant gene models of VDR rs2228570, CTLA-4 rs231775 and MTHFR rs1801133, are associated with PC risk. The additive genetic model for TP53 rs9895829 is the optimal one for diagnosing PC risk. Future high-quality studies with large samples and detailed data on PC risk factors are required to further validate the role of these PC risk-related SNPs.

MATERIALS AND METHODS
This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and registered in the INPLASY database (INPLASY202040023).

Criteria for included studies
We included case-control studies on SNPs related to PC risk for this meta-analysis. We excluded duplicate reports, conference reports, review reports, news articles, animal studies, studies without sufficient data to calculate genotype distribution, and studies regarding SNPs that deviate from Hardy-Weinberg equilibrium (HWE). In the included studies, the experimental group consisted of serum samples from PC patients who had not received any chemotherapy, whereas the control group included healthy individuals, patients with non-malignant diseases, and non-cancer patients of different ages, gender, country, and tumor stage.
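The HWE exclusion criterion can be illustrated with a chi-square goodness-of-fit test on control genotype counts. This is a minimal sketch, not the procedure used in the included studies: the function name `hwe_chi_square` and the 0.05 cutoff in the usage lines are hypothetical, since the text does not state which test or threshold was applied.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Arguments are the observed genotype counts AA, AB and BB in controls.
    Returns the chi-square statistic and its P value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("monomorphic locus: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value


# Hypothetical control counts; 0.05 is an assumed cutoff, not from the source.
chi2, p_value = hwe_chi_square(298, 489, 213)
print(chi2, p_value, "consistent with HWE" if p_value > 0.05 else "deviates")
```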

Study search, selection and data extraction
We used terms such as single nucleotide polymorphism, SNP, pancreatic cancer and pancreatic tumor to search PubMed, Web of Science, Embase, Cochrane Library, China National Knowledge Infrastructure (CNKI), China Science and Technology Periodical Database (VIP), and Wanfang databases for studies published until January 2020 without any language restrictions. The search criteria for the PubMed database are shown in Supplementary Information 1.
Data selection was performed independently by two reviewers (ZY and LL). In the case of disagreements, a third independent reviewer (JZ) was involved to reach consensus. The strategy used for study selection is shown in Figure 3. We extracted data including author names, year of publication, country, sample size of men and women, Hardy-Weinberg equilibrium values, genotyping methods and genotype frequencies. The data were methodically evaluated by two independent reviewers (ZY and LL) according to the guidelines of the STREGA statement [23]. The third reviewer (JZ) was involved in resolving any issues between the two reviewers. The corresponding authors were contacted if any data were missing, insufficient, or vague. If the relevant data could not be obtained, those studies were excluded.

Statistical analysis
Statistical analyses were performed using Stata/MP 14.0 software (https://www.stata.com/). Pooled odds ratios (ORs) with 95% confidence intervals (CIs) were calculated for the pairwise meta-analysis using fixed-effects or random-effects models, depending on the heterogeneity of the genetic models. We then conducted a network meta-analysis to determine the most suitable genetic model for each SNP.
Heterogeneity between studies was assessed using the I² statistic and the P value. A fixed-effects model was used when heterogeneity was low, as indicated by an I² value less than 50% and a P value greater than 0.1; otherwise, a random-effects model was used. When sufficient data were available for SNPs with heterogeneity, we performed subgroup analysis to identify the source of heterogeneity and generate an optimal genetic model that can be used to predict PC susceptibility.
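The model-selection rule above can be sketched as follows. This is an illustration rather than the Stata procedure used in the study: it assumes inverse-variance weighting on the ln(OR) scale, Cochran's Q test for the heterogeneity P value, and the DerSimonian-Laird estimator for the random-effects model; the function names and the input format are hypothetical.

```python
import math

Z975 = 1.959963984540054  # 97.5th percentile of the standard normal


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total, term = 0.0, math.sqrt(2.0 * x / math.pi)
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def pool_odds_ratios(studies):
    """Pool per-study (OR, CI low, CI high) tuples on the ln(OR) scale."""
    y = [math.log(o) for o, _, _ in studies]
    v = [((math.log(hi) - math.log(lo)) / (2 * Z975)) ** 2 for _, lo, hi in studies]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    p_het = chi2_sf(q, df) if df > 0 else 1.0
    if i2 < 50 and p_het > 0.1:                # rule stated in the text
        model, est, var = "fixed", fixed, 1 / sum(w)
    else:                                      # DerSimonian-Laird (assumed)
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w_re = [1 / (vi + tau2) for vi in v]
        est = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
        model, var = "random", 1 / sum(w_re)
    half_width = Z975 * math.sqrt(var)
    return {"model": model, "or": math.exp(est), "i2": i2, "p_het": p_het,
            "ci": (math.exp(est - half_width), math.exp(est + half_width))}


# Hypothetical per-study estimates for one genetic model of one SNP.
print(pool_odds_ratios([(1.20, 1.05, 1.37), (1.28, 1.10, 1.49), (1.15, 0.95, 1.39)]))
```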

Network meta-analysis
We used the ADDIS software (1.14), which is based on the Bayesian framework and Markov Chain Monte Carlo (MCMC) theory, to generate the network relationship diagram between genes related to PC risk. Four parallel MCMC simulations underwent a burn-in phase of 20,000 simulations followed by an additional phase of 50,000 simulations. The outcomes were evaluated by OR and 95% CI under the random-effects model; the consistency model was applied if the 95% CI of log(OR) included 0, and the inconsistency model was used otherwise. The potential scale reduction factor (PSRF) was used to determine convergence. The model was considered convergent if the PSRF value was close to 1.0. This Bayesian method was used to rank each genetic model regarding the probability of PC risk, and the corresponding rank probability map was automatically generated.
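The convergence criterion can be illustrated with the basic Gelman-Rubin form of the PSRF for one parameter sampled by several parallel chains. This is a sketch under stated assumptions: the function name `psrf` is hypothetical, the chains are simulated toy draws rather than ADDIS output, and ADDIS may apply a corrected variant of the statistic.

```python
import random
from statistics import mean, variance


def psrf(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])     # W: mean within-chain variance
    between = n * variance(chain_means)              # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return (pooled / within) ** 0.5


# Toy example: four chains (as in the text) drawn from the same distribution,
# standing in for post-burn-in samples of a log(OR) parameter.
random.seed(0)
chains = [[random.gauss(0.2, 0.05) for _ in range(5000)] for _ in range(4)]
print(psrf(chains))
```

Chains that sample the same posterior give a value close to 1.0, whereas chains stuck in different regions inflate the between-chain term and push the value well above 1.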

Diagnostic meta-analysis
We performed diagnostic meta-analysis using the Meta-DiSc software [22] and evaluated sensitivity, specificity, likelihood ratios (LRs), diagnostic odds ratios (DORs), and summary receiver operating characteristic curves (SROC) of the SNPs to predict PC risk.
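The per-study quantities named above can be derived from a 2x2 table, as in the following sketch. It is an illustration only: the function name `diagnostic_summary` is hypothetical, treating carriers of the risk genotype as "test positive" is an assumption about how an SNP is cast as a diagnostic test, and the optional 0.5 continuity correction is a common convention that is not confirmed by the source.

```python
def diagnostic_summary(tp, fp, fn, tn, correct_zero_cells=True):
    """Sensitivity, specificity, likelihood ratios and DOR from a 2x2 table.

    tp: cases carrying the risk genotype      fp: controls carrying it
    fn: cases without the risk genotype       tn: controls without it
    """
    if correct_zero_cells and 0 in (tp, fp, fn, tn):
        # Assumed convention: add 0.5 to every cell when any cell is empty.
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
    }


# Hypothetical counts for a single study.
print(diagnostic_summary(tp=90, fp=20, fn=10, tn=80))
```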

False positive report probability (FPRP)
In the network meta-analysis, we used Thakkinstian's algorithm [23] to evaluate the best genetic model for each SNP. A SNP consists of a dominant allele (G) and a recessive allele (g).
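The description of the algorithm is incomplete in this section, so the following sketch relies on the decision rules as they are commonly stated for Thakkinstian's approach in genetic association meta-analyses, using three pairwise comparisons: OR1 (GG vs. gg), OR2 (Gg vs. gg) and OR3 (GG vs. Gg). The function names, the use of the 95% CI to decide whether an OR "equals 1", and the simplified overdominant rule are assumptions made for illustration.

```python
def differs_from_one(estimate):
    """True if the 95% CI of an (OR, low, high) tuple excludes 1."""
    _, low, high = estimate
    return low > 1 or high < 1


def thakkinstian_model(or1, or2, or3):
    """Suggest a genetic model from three pairwise ORs.

    or1: GG vs. gg, or2: Gg vs. gg, or3: GG vs. Gg; each is (OR, low, high).
    """
    s1, s2, s3 = (differs_from_one(x) for x in (or1, or2, or3))
    if s1 and s3 and not s2:
        return "recessive"       # OR1 = OR3 != 1 and OR2 = 1
    if s1 and s2 and not s3:
        return "dominant"        # OR1 = OR2 != 1 and OR3 = 1
    if s2 and s3 and not s1:
        return "overdominant"    # OR2 = 1/OR3 != 1 and OR1 = 1 (simplified)
    if s1 and s2 and s3:
        a, b, c = or1[0], or2[0], or3[0]
        if (a > b > 1 and a > c > 1) or (a < b < 1 and a < c < 1):
            return "codominant"  # ordered effects across the three genotypes
    return "undetermined"


# Hypothetical pooled estimates for one SNP.
print(thakkinstian_model((2.0, 1.5, 2.7), (1.9, 1.4, 2.6), (1.05, 0.8, 1.4)))
```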

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.