Genetic Variants Associated with Thyroid Cancer Risk: Comprehensive Research Synopsis, Meta-Analysis, and Cumulative Epidemiological Evidence

Purpose. With the increasing incidence of thyroid cancer (TC), associations between genetic polymorphisms and TC risk have attracted considerable attention. Because published results on the associations of genetic variants with TC have often been inconsistent, we attempted to comprehensively evaluate the evidence for associations between single nucleotide polymorphisms (SNPs) and TC risk. Method. We performed meta-analyses on 36 SNPs in 23 genes associated with TC susceptibility based on data from 99 articles and evaluated the epidemiological evidence of significant associations using the Venice criteria and the false-positive report probability (FPRP) test. ORs and P values were also calculated for 19 SNPs in 13 genes for which the data from 22 articles were insufficient for meta-analysis. Results. Nineteen SNPs were significantly associated with TC susceptibility. Of these, strong epidemiological evidence of association was identified for the following seven SNPs: POU5F1B rs6983267, FOXE1 rs966423, TERT rs2736100, NKX2-1 rs944289, FOXE1 rs1867277, FOXE1 rs2439302, and RET rs1799939; moderate associations were found for four SNPs and weak associations for eight SNPs. In addition, probable significant associations with TC were found for nine SNPs. Conclusion. Our study systematically evaluated associations between SNPs and TC risk and offers reference information for further understanding of polymorphisms and TC susceptibility.


Introduction
Thyroid cancer (TC) is the most common endocrine malignancy, and its incidence is increasing worldwide. Besides radiation exposure, TC is also closely related to family inheritance and genetic variant risk [1]. As early as 2009, Gudmundsson et al. first reported that variants on 9q22.33 (FOXE1) and 14q13.3 (NK2 homeobox 1 (NKX2-1)) might increase the risk of papillary thyroid cancer and follicular thyroid cancer [2]. The BRAF V600E mutation is comparatively common and widely used in the detection of papillary thyroid cancer [3]. However, the relationship of most genetic variation to TC susceptibility remains uncharacterized.
Research on associations between genetic variation and cancer risk has received a lot of attention. Quite a few pooled studies and reviews have described the relationship between TC and genetic variation [4][5][6], but inconsistent results for the same variants make the findings difficult to interpret. A study with a small sample size may lack sufficient power to detect true associations. Meta-analyses pool the usable data from individual studies, which can increase statistical power and the reliability of the estimated associations [7,8]. However, the meta-analyses published to date still report inconsistent results, which indicates the existence of false-positive reports caused by unnecessary overlap. Moreover, the Venice criteria, first proposed by Ioannidis et al. [9], have been used to systematically grade the cumulative evidence of genetic associations and thereby help clarify associations between genetic variants and disease [10,11]. Herein, we collected the data available to date and performed meta-analyses to comprehensively evaluate the evidence and further the understanding of associations between genetic variation and TC risk.

Materials and Methods
Our study was performed based on the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement and the Human Genome Epidemiology Network for the systematic review of genetic association studies [12,13].
We searched PubMed, MEDLINE, Web of Science, and CNKI for publications on genetic variation and TC risk published before December 31, 2020, using the following keywords: ("thyroid"), ("cancer" or "carcinoma"), ("genetic" or "single nucleotide polymorphism (SNP)" or "SNP" or "polymorphism" or "genotype" or "variation" or "variant" or "mutation" or "susceptibility"), ("association" or "associate"), with the keyword groups combined using "and". A total of 3887 records were retrieved, together with 157 records from the reference lists of relevant publications. As a result, 99 relevant publications with available data were included in our study. The articles included in our study had to meet the following inclusion criteria: (i) the subject of the study was thyroid cancer, (ii) the study examined associations between genetic variants and the etiology of TC using a human case-control, cohort, or cross-sectional design, and (iii) the study offered sufficient data to perform meta-analyses. Duplicate and unrelated articles were excluded by browsing titles and abstracts. Articles were excluded (i) if they did not focus on variants and TC risk, (ii) if they lacked necessary data, or (iii) if they were letters to the editor.

Data Extraction.
Data extraction was carried out by two people independently, who then exchanged and checked each other's results. Any inconsistency was rechecked and discussed with the corresponding author until agreement was reached. For the variants reported in articles, we extracted data as follows: PMID of articles, first author, year of publication, country or region, ethnicity, name of variants, polymorphisms, study design, genotyping method, case and control, Hardy-Weinberg equilibrium (HWE) status, genotype counts, and minor allelic frequency (MAF). According to the results of previous meta-analyses, we divided the major ethnic groups into three categories: Caucasians, Asians, and African Americans. The overall population was defined as two or more of the above populations. Because a SNP often has several different names, we selected the most common and well-known name as the representative by querying NCBI.
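The HWE status and MAF listed above can be derived from the extracted genotype counts. The following is a minimal sketch, assuming a Pearson chi-square test with one degree of freedom; the article does not state which HWE test was used, and the function names are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are genotype counts (two homozygotes and the heterozygote),
    typically taken from the control group. Assumes both alleles are present.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the less common allele."""
    p = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(p, 1 - p)
```

A study would then be flagged as deviating from HWE when the P value in controls falls below a chosen threshold (0.05 is a common choice, assumed here rather than taken from the article).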

Statistical Analysis.
For each SNP, we sorted out allelic, dominant, and recessive models according to the included ethnicities.
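To illustrate how the three genetic models regroup the same genotype counts, here is a minimal sketch. The function names and the (MM, Mm, mm) ordering, with m as the minor allele, are assumptions for illustration, and the confidence interval uses the standard Woolf (log) method, which the article does not specify.

```python
import math

def model_tables(case, control):
    """Build 2x2 tables for the allelic, dominant, and recessive models.

    case/control: genotype counts (n_MM, n_Mm, n_mm), m = minor allele.
    Returns {model: (a, b, c, d)} where a, b are exposed/unexposed cases
    and c, d are exposed/unexposed controls.
    """
    def split(genotypes):
        mm_major, het, mm_minor = genotypes
        return {
            "allelic": (2 * mm_minor + het, 2 * mm_major + het),  # m vs M alleles
            "dominant": (mm_minor + het, mm_major),               # mm+Mm vs MM
            "recessive": (mm_minor, mm_major + het),              # mm vs MM+Mm
        }
    ca, co = split(case), split(control)
    return {model: ca[model] + co[model] for model in ca}

def crude_or(a, b, c, d):
    """Crude odds ratio with a 95% CI from the log-OR standard error."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_) - 1.96 * se)
    high = math.exp(math.log(or_) + 1.96 * se)
    return or_, low, high
```

For example, `crude_or(*model_tables((30, 50, 20), (50, 40, 10))["dominant"])` gives the dominant-model OR for one hypothetical study.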
Then, meta-analyses based on models and ethnicities were performed using STATA, version 12 (Stata, College Station, TX, USA) only if two or more studies were included. Crude ORs with the corresponding 95% CIs were used to assess the strength of the association between SNPs and TC risk. The I² test was performed to quantitatively assess possible heterogeneity in the combined studies as follows: I² ≤ 25% indicated no or mild heterogeneity, 25% < I² < 50% indicated moderate heterogeneity, and I² ≥ 50% indicated large heterogeneity [14]. Sensitivity analysis was performed by removing the first published study, or the studies that deviated from HWE in the controls, and reanalyzing the remainder. In addition, publication bias was assessed by Egger's test, with P > 0.05 indicating no publication bias [15,16].
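The pooling and heterogeneity steps can be sketched as follows. This is an illustration only: it assumes a fixed-effect inverse-variance model (the article reports using STATA and does not state the pooling model), and the function names are hypothetical. The I² cutoffs are those given in the text.

```python
import math

def pool_fixed_effect(log_ors, ses):
    """Inverse-variance pooled OR with Cochran's Q and I² (in percent)."""
    if len(log_ors) < 2:
        raise ValueError("meta-analysis requires two or more studies")
    weights = [1 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    se_pooled = math.sqrt(1 / total_w)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "i2": i2,
    }

def heterogeneity_label(i2):
    """Classify I² (percent) using the cutoffs stated in the text."""
    if i2 <= 25:
        return "no or mild"
    if i2 < 50:
        return "moderate"
    return "large"
```

Each study contributes its log OR and standard error; the standard error can be recovered from a reported 95% CI as (ln(upper) − ln(lower)) / (2 × 1.96).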

Evaluation of Epidemiological Evidence.
We first evaluated the evidence of significant associations between SNPs and TC using the Venice guidelines, based on three criteria: amount of evidence, replication of association, and protection from bias [17,18]. The amount of evidence was related to the number of alleles or genotypes and graded as A (N > 1000), B (100 ≤ N ≤ 1000), or C (N < 100). The replication of association was graded as A (I² ≤ 25%), B (25% < I² < 50%), or C (I² ≥ 50%) based on heterogeneity statistics. The protection from bias was determined by various potential sources of bias, including sensitivity analysis, publication bias, and small study bias, as well as an excess of significant findings. Grade A was assigned when there was no demonstrable bias or the bias was unlikely to invalidate the association. Grade B was assigned when there was no obvious bias but the information for identifying evidence was insufficient. Grade C was assigned when there was obvious bias or the bias was likely to explain the presence of the association. Furthermore, grade C was assigned in any one of the following situations: (1) the association was lost with exclusion of the first study or of studies deviating from HWE in the sensitivity analysis; (2) a low magnitude of the association (0.87 < OR < 1.15) only if the association had been identified by GWAS or several studies with no evidence of publication bias; and (3) evidence of obvious publication bias (P value in Egger's test < 0.05). In summary, the cumulative epidemiological evidence of significant associations was graded as follows: strong associations (all three grades were A), weak associations (any grade was C), and moderate associations (all other conditions).
Furthermore, we used the false-positive report probability (FPRP) test suggested by Wacholder et al. [17], with a prior probability of 0.05 and an FPRP cutoff value of 0.2, to detect potential false-positive results among significant associations and to confirm whether there was a real association between SNPs and TC risk. The FPRP evidence was graded as strong (FPRP < 0.05), moderate (0.05 ≤ FPRP ≤ 0.2), or weak (FPRP > 0.2), which led to upgrading the cumulative evidence by one level (from moderate to strong or from weak to moderate), maintaining the original level, or downgrading the cumulative evidence by one level (from strong to moderate or from moderate to weak), respectively.
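The grading rules above can be written as a small decision function. This is a sketch with hypothetical function names; the protection-from-bias grade depends on the judgments described in the text, so it is passed in as an input rather than computed.

```python
def grade_amount(n):
    """Amount of evidence: A (N > 1000), B (100 <= N <= 1000), C (N < 100)."""
    if n > 1000:
        return "A"
    if n >= 100:
        return "B"
    return "C"

def grade_replication(i2):
    """Replication: A (I² <= 25%), B (25% < I² < 50%), C (I² >= 50%)."""
    if i2 <= 25:
        return "A"
    if i2 < 50:
        return "B"
    return "C"

def cumulative_evidence(amount, replication, bias):
    """Combine the three Venice grades into strong / moderate / weak."""
    grades = (amount, replication, bias)
    if "C" in grades:
        return "weak"
    if all(g == "A" for g in grades):
        return "strong"
    return "moderate"
```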
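As a rough illustration of the FPRP step, the sketch below uses the general form FPRP = α(1 − π) / (α(1 − π) + (1 − β)π), where π is the prior probability, α the observed P value, and 1 − β the power to detect a chosen alternative OR. The alternative OR of 1.5 is an assumption not stated in the article, as are the function names and the choice to cap the adjustment at the strong and weak ends of the scale.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF

def fprp(or_, ci_low, ci_high, prior=0.05, or_alt=1.5):
    """False-positive report probability from a pooled OR and its 95% CI.

    or_alt is the assumed true effect size used for the power term
    (hypothetical default, not taken from the article).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = abs(math.log(or_)) / se
    p_value = 2 * (1 - _PHI(z))
    power = _PHI((abs(math.log(or_alt)) - abs(math.log(or_))) / se)
    false_pos = p_value * (1 - prior)
    return false_pos / (false_pos + power * prior)

_LEVELS = ["weak", "moderate", "strong"]

def adjust_evidence(level, fprp_value):
    """Shift the Venice level by one step according to the FPRP grade."""
    i = _LEVELS.index(level)
    if fprp_value < 0.05:        # strong FPRP evidence: upgrade
        i = min(i + 1, len(_LEVELS) - 1)
    elif fprp_value > 0.2:       # weak FPRP evidence: downgrade
        i = max(i - 1, 0)
    return _LEVELS[i]            # 0.05 <= FPRP <= 0.2: unchanged
```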

Results
As presented in Figure 1, a total of 3887 records were retrieved, together with 157 records from the reference lists of relevant publications. Of these, 1821 duplicate records were removed, and 1931 irrelevant records were excluded by scanning the title or abstract. Of the 292 publications assessed for eligibility, 121 were excluded because they did not address the etiology of TC, 34 because they did not address genetic polymorphisms, 13 because they were not case-control, cohort, or cross-sectional studies, 16 because they lacked necessary data, and 9 because they were letters to the editor. Finally, a total of 99 eligible publications were included in our study. The data from those publications, involving 36 SNPs in 23 genes, were used to perform meta-analyses and evaluate the cumulative epidemiological evidence with the Venice criteria and FPRP test. Additionally, 22 publications including 19 SNPs in 13 genes with insufficient data were used to calculate ORs and P values.
Sensitivity analysis was performed for all significantly associated SNPs and for SNPs significant in subgroup analysis by removing the first published study or the studies that deviated from HWE in the controls. After removal of the first published study, RET rs1800858 was no longer significantly associated with TC under any model, nor was RET rs1800862 under the allelic model. In addition, only XRCC3 rs1799794 lost its significant association with TC when studies deviating from HWE were removed. Publication bias was assessed by Egger's test. Obvious publication bias was shown for FOXE1 rs965513 under the recessive model in the overall population, RET rs1799939 under the recessive model in the Asian population, RET rs1800862 under the dominant model in the Caucasian population, XRCC1 rs1799782 under the recessive model in the Asian population, and XRCC3 rs861539 under the recessive model in the overall population and the Caucasian population.
In addition, in the subgroup analysis, 7 SNPs were graded as strong associations with TC after calculating the FPRP value (POU5F1B rs6983267, NKX2-1 rs944289, DIRC3 rs966423, FOXE1 rs966423, MTHFR rs1801133, XRCC3 rs861539, and FOXE1 rs965513); among these, the cumulative epidemiological evidence for MTHFR rs1801133, XRCC3 rs861539, and FOXE1 rs965513 was upgraded from moderate to strong. Two SNPs were graded as moderate associations with TC (FOXE1 rs2439302 and RET rs1799939), and the association of FOXE1 rs2439302 was upgraded from weak to moderate. Two SNPs remained weak associations with TC based on the FPRP value (RET rs1800858 and XRCC1 rs1799782).
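The selection counts reported above are internally consistent, which can be checked with a few lines of arithmetic. The variable names below are illustrative only.

```python
# Records identified: database search plus reference lists
identified = 3887 + 157

# Screening: remove duplicates, then irrelevant titles/abstracts
after_duplicates = identified - 1821
assessed_for_eligibility = after_duplicates - 1931

# Full-text exclusions, by reason
exclusions = {
    "no etiology of TC": 121,
    "no genetic polymorphism": 34,
    "not case-control, cohort, or cross-sectional": 13,
    "lack of necessary data": 16,
    "letter to the editor": 9,
}
included = assessed_for_eligibility - sum(exclusions.values())

print(assessed_for_eligibility, included)  # prints: 292 99
```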

Discussion
In this study, we collected data on associations between polymorphisms and TC from publications, performed meta-analyses, and evaluated the cumulative epidemiological evidence of associations using the Venice criteria and FPRP test, which extended our understanding of true associations between SNPs and TC etiology.
DIRC3, first identified as a fusion transcript in familial renal carcinoma as early as 2003, has been reported to affect thyroid-stimulating hormone levels and promote TC development by decreasing thyroid epithelium differentiation [18,19]. The SNP rs966423, located at 2q35 in the DIRC3 gene, within a lncRNA, was graded as having strong evidence for association with TC risk in our study. The C allele increased TC risk in the overall population and the Caucasian population compared with the wild-type allele T (OR = 1.227 and OR = 1.214, respectively). However, a lack of data resulted in ambiguous associations for the Asian population. As susceptibility loci in DIRC3 were also commonly found in GWAS in the Korean population [20], further investigation of SNPs in DIRC3 in the Asian population is necessary.
The TERT gene encodes the catalytic subunit of telomerase and plays an essential part in cellular immortality by maintaining telomere length at the ends of chromosomes; it shows low or no expression in normal cells but is highly expressed in 85%-90% of tumor cells and stem cells [21][22][23]. The SNP rs2736100 is located in intron 2 of the TERT gene and has a genotype-specific impact on TERT expression [24]. In our meta-analyses, 5 studies with a combined sample size of over 10000 subjects provided strong evidence that it increases TC risk in the Asian population, especially in the Chinese population. GWAS [25].
The SNP rs6983267 is located on chromosome 8q24 and has been associated with several cancers, including prostate, ovarian, colon, and several other carcinomas [26,27]. POU5F1B (also known as POU5F1P1) is the gene nearest to rs6983267 and can probably encode a functional protein that contributes to carcinogenesis by acting as a weak transcriptional activator [28]. We found strong epidemiological evidence of increased TC risk in the overall population, especially in the Caucasian population. Our study showed a higher TC risk among Caucasians than among Asians, consistent with the meta-analyses performed by Zhu et al. [26], which may be related to the lower frequency of the risk allele G among Asians than among Caucasians.
The SNP rs944289 is located in a 249 kb LD region near the NK2 homeobox 1 (NKX2-1) gene, which plays a vital role in regulating thyroid morphogenesis by encoding thyroid transcription factor 1 (TTF1) [27]. Previous studies found that it is significantly associated with TC risk in the Japanese and Icelandic populations but not in the Belarusian population [27,29]. Our meta-analyses, with over 10000 subjects, confirmed strong evidence of increased TC risk among three populations. A previous publication suggested a probable relationship between rs944289 and TC susceptibility in females, based on a higher prevalence of allele T in female TC patients [30].
In our study, RET rs1799939 was found to significantly increase TC risk, by 1.535-fold, with strong epidemiological evidence in the overall population. A change from allele G to allele A of rs1799939 leads to an amino acid change from glycine to serine and may thereby activate RET, which plays a vital role in thyroid carcinogenesis [31,32].
Forkhead box E1 (FOXE1), also called TTF2 for thyroid transcription factor 2, was first isolated from mouse cDNA and influences the development of the thyroid gland and expression in thyroid tumors by encoding a thyroid-specific transcription factor [33,34]. For both rs1867277 and rs2439302, strong cumulative epidemiological evidence of increased TC risk was demonstrated among Caucasians in our study. Previous publications have reported that allele A of rs1867277 was significantly related to TC risk in Poles [35]. As one of the most specific thyroid transcription factors, FOXE1 can recognize thyroperoxidase and thyroglobulin, which contributes substantially to tumor transformation [36]. However, a lack of sufficient data resulted in an ambiguous association in the Asian population, so further data and investigation in other ethnicities are needed.
Four SNPs in our study showed moderate associations with TC risk and eight SNPs showed weak associations. For SNPs such as FOXE1 rs965513, MTHFR rs1801133, XRCC3 rs861539, and XRCC3 rs1799794, the epidemiological evidence for association differed across ethnicities or genetic models. In addition to ethnic heterogeneity, the influence of diverse genetic behaviors and multiple environments should also be considered in further well-designed studies. Due to insufficient data, only ORs and P values were calculated for 19 SNPs, of which 7 SNPs probably increased TC risk, while 2 SNPs might decrease it. Further large studies are needed to identify the actual associations for these SNPs.
A total of 17 SNPs showed no association with TC risk in meta-analyses. A similar result was found in the meta-analysis of Kang et al., which suggested that ATM variants might not be important determinants of TC susceptibility [37]. In addition, 5 SNPs had a sample size of more than 6000 subjects, with MAFs ranging from 10% to 30%. With the detectable OR set at 1.15 in the additive model, the meta-analyses provide about 86% power with a MAF of 10% and about 97% power with a MAF of 20%. Therefore, significant results are unlikely to emerge for these five SNPs in future TC susceptibility investigations with a similar sample size. This study has certain unavoidable limitations: (i) despite careful balancing of the inclusion and exclusion criteria, some articles may have been missed; (ii) owing to insufficient data, meta-analyses could not be performed for every included SNP in each ethnicity and genetic model; (iii) the study was designed only to address associations between SNPs and TC susceptibility and did not cover tumor progression, metastasis, or prognosis of TC; and (iv) the only factors included in this study were ethnicity and genetic model, and other factors such as pathological types of TC and radiation exposure should be considered to further assess the associations. Despite these limitations, our study provides an updated and comprehensive evaluation of TC susceptibility and a reference for further genetic research.
In conclusion, our study comprehensively assessed the cumulative epidemiological evidence of significant associations between SNPs and TC susceptibility based on the Venice criteria and FPRP test. Seven SNPs were identified as having strong evidence of association with TC risk, and four SNPs as having moderate evidence. Our results provide an updated understanding of TC susceptibility and may inspire further investigation into gene polymorphisms and clinical strategies for TC.

Data Availability
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Disclosure
Ran Ran and Gang Tu are co-first authors.

Conflicts of Interest
The authors confirm that there are no conflicts of interest.

Authors' Contributions
Ran Ran and Gang Tu contributed equally to this work.
