Genetic variation of long non-coding RNA TINCR contributes to the susceptibility and progression of colorectal cancer

Colorectal cancer (CRC) is one of the leading causes of cancer-related morbidity and mortality. However, a large part of its heritability remains to be explained. Long non-coding RNAs (lncRNAs) play critical roles in cancer development and progression. Herein, we explored the effect of genetic variants of tissue differentiation-inducing non-protein coding RNA (TINCR), a key lncRNA required for somatic tissue differentiation and tumor progression, on the risk and progression of CRC. Three tagSNPs, rs2288947, rs8105637, and rs12610531, were evaluated in a two-stage, case-control study. Two SNPs, rs2288947 and rs8105637, were significantly associated with susceptibility to CRC in both stages. When the two stages were pooled, allele G of rs2288947 was significantly associated with a 23% decreased risk of CRC (OR=0.77; 95% CI=0.67-0.88; P value = 1.2×10−4), whereas allele A of rs8105637 was significantly associated with a 22% increased risk of CRC (OR=1.22; 95% CI=1.09-1.37; P value = 6.2×10−4). The two SNPs were also significantly associated with the occurrence of lymph node metastasis of CRC: carriers of allele G of rs2288947 were less likely to develop lymph node metastasis (OR=0.77; 95% CI=0.63-0.94; P value = 0.011), and carriers of allele A of rs8105637 were more likely to develop lymph node metastasis (OR=1.22; 95% CI=1.03-1.43; P value = 0.019). These results suggest that lncRNA TINCR polymorphisms may be implicated in the development and progression of CRC.


INTRODUCTION
Colorectal cancer (CRC) is one of the leading causes of cancer-related morbidity and mortality [1]. In addition to advanced age, family history, male sex, and lifestyle factors, which contribute to an increased risk of CRC, many genetic factors have been identified as associated with susceptibility [1][2][3][4][5][6]. High-penetrance germline mutations, mismatch repair genes, together with loci identified in genome-wide association studies (GWAS), account for about 14% of the familial risk of CRC [7]. However, a large part of the heritable risk remains to be explored [7,8]. Further exploration of the interactions between genes and environment would help specific diagnosis, screening, and personalized treatment [9,10].
With innovations in sequencing technologies, long noncoding RNAs (lncRNAs) are being identified and characterized in successive steps of cancer development, including tumor initiation, growth, and metastasis [11][12][13][14][15][16][17][18]. Previously, we found that the allele del of lncRNA GAS5 rs145204276 was significantly associated with a 21% decreased risk of CRC [19]. Carriers of allele del were also less likely to develop lymph node metastasis, which showed that GAS5 rs145204276 was significantly associated with the susceptibility and progression of CRC [19]. Here, we explored the effect of genetic variants of another lncRNA, tissue differentiation-inducing non-protein coding RNA (TINCR), a key lncRNA required for somatic tissue differentiation and tumor progression [20,21], on CRC risk in a case-control study. Loss of TINCR expression has been reported to promote proliferation and metastasis by activating EpCAM cleavage in colorectal cancer [22].

Demographic characteristics
As shown in Table 1, the characteristics of the subjects were generally comparable in the two stages, as no significant differences were detected in age group, gender, alcohol status, or smoking status between CRC cases and healthy controls (all P values > 0.05). Figure 1 shows the selection of tagSNPs for the TINCR gene, including rs2288947, rs8105637, and rs12610531. The genotype distributions of all three tagSNPs in healthy controls in both stages were in accordance with Hardy-Weinberg equilibrium (HWE, P > 0.05). As shown in Table 2, two SNPs, rs2288947 and rs8105637, were significantly associated with susceptibility to CRC in stage I (P=0.004 and 0.022, respectively). We therefore replicated the associations of the two SNPs in an independent population (stage II, Table 3), which also showed statistically significant associations in the same direction (P=0.007 and 0.009, respectively). When the two stages were pooled, allele G of rs2288947 was significantly associated with a 23% decreased risk of CRC (OR=0.77; 95% CI=0.67-0.88; P value = 1.2×10−4), whereas allele A of rs8105637 was significantly associated with a 22% increased risk of CRC (OR=1.22; 95% CI=1.09-1.37; P value = 6.2×10−4).
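The pooled estimates can be sanity-checked from the published numbers alone. The following is a minimal sketch, not the analysis used in this study: it assumes the reported 95% CIs are Wald intervals on the log-odds scale, and the function names (`z_and_p_from_or`, `percent_change_in_odds`) are hypothetical helpers introduced only for illustration.

```python
import math

def z_and_p_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE, Wald z, and two-sided P from an OR and its 95% CI.

    Assumption: the CI is a Wald interval, i.e. symmetric on the log scale.
    """
    log_or = math.log(odds_ratio)
    # CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided standard normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

def percent_change_in_odds(odds_ratio):
    """Express an OR as a percent change in odds (negative = decreased)."""
    return (odds_ratio - 1.0) * 100.0

# Pooled per-allele estimates as reported in the text.
reported = [
    ("rs2288947 (allele G)", 0.77, 0.67, 0.88),
    ("rs8105637 (allele A)", 1.22, 1.09, 1.37),
]

for label, odds_ratio, low, high in reported:
    se, z, p = z_and_p_from_or(odds_ratio, low, high)
    change = percent_change_in_odds(odds_ratio)
    print(f"{label}: {change:+.0f}% odds, SE={se:.3f}, z={z:.2f}, P={p:.1e}")
```

Because the published ORs and CIs are rounded to two decimals, the back-calculated P values are expected to agree with the reported ones only in order of magnitude, not digit for digit.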

Associations between TINCR polymorphisms and CRC susceptibility stratified by tumor site
The associations of rs2288947 and rs8105637 with CRC susceptibility were analyzed by tumor site (Table 4). In both colon and rectal cancers, the trend was not materially changed.

Associations between TINCR polymorphisms and lymph node metastasis and distant metastasis of CRC
We also investigated the associations of rs2288947 and rs8105637 with lymph node metastasis and distant metastasis of CRC. As shown in Table 5, carriers of allele G of rs2288947 were less likely to develop lymph node metastasis (OR=0.77; 95% CI=0.63-0.94; P value = 0.011), and carriers of allele A of rs8105637 were more likely to develop lymph node metastasis (OR=1.22; 95% CI=1.03-1.43; P value = 0.019). The associations with distant metastasis of CRC were not significant (P>0.05), possibly because of the limited sample size and statistical power.

DISCUSSION
The current study systematically explored the potential associations between three tagSNPs of lncRNA TINCR (rs2288947, rs8105637, and rs12610531) and the risk and progression of CRC in a two-stage, case-control study in a Chinese population. To the best of our knowledge, this is the first study to evaluate the associations between genetic variation of lncRNA TINCR and the susceptibility and progression of CRC.
In the stratified analyses, we observed a difference in the association between rs2288947 genotype and CRC risk according to tumor site. The association was more significant for colon cancer and not significant for rectal cancer, although the exact mechanisms underlying this difference are currently unclear. We also did not detect significant associations between lncRNA TINCR rs2288947 or rs8105637 and distant metastasis of CRC. This might be due to the limited number of cases with the event and the resulting insufficient statistical power.
Our study has several strengths. First, we implemented a two-stage, case-control study design, which is recommended for genetic association studies [41,42]. Second, we had sufficient statistical power to detect such associations. Using QUANTO software (http://biostats.usc.edu/Quanto.html/), we found that the statistical power for the log-additive model was 98% for rs2288947 and 92% for rs8105637. The current study also has limitations, such as the lack of independent replication in populations of different ethnic backgrounds and the lack of mechanistic research. Further investigations are required to gain insight into the mechanisms by which TINCR regulates the occurrence and progression of CRC. Taken together, this is the first study demonstrating potential associations between genetic variation of lncRNA TINCR and the susceptibility and progression of CRC in a Chinese population. Our results indicate for the first time that SNPs rs2288947 and rs8105637 may act as independent biomarkers associated with the occurrence and progression of CRC. This study provides valuable clues for better understanding the contribution of genetic variation of lncRNA TINCR to the carcinogenesis of CRC. Future functional studies should build on these epidemiological findings to further explore the role of lncRNA TINCR in the development and progression of CRC.

Study subjects
In this two-stage, case-control study, we recruited a total of 1400 CRC cases and 1400 healthy controls between 2010 and 2015, matched by age group, gender, alcohol status, and smoking status. These subjects have been described in a previous study that evaluated the function of lncRNA GAS5 in the development and progression of CRC [19]. Five milliliters of peripheral blood were collected from each subject, and demographic information was collected in face-to-face interviews by the project staff. The study was approved by the Research Ethics Committee (REC) of Renmin Hospital of Wuhan University, and written informed consent was obtained from all participants.

TagSNP selection, DNA extraction and genotyping
TagSNP selection was conducted using SNPinfo (https://snpinfo.niehs.nih.gov/). A Qiagen genomic DNA purification kit was used to extract genomic DNA from blood samples. Genotyping was performed using the TaqMan allelic discrimination assay on the ABI PRISM 7900HT Sequence Detection System. The genotyping results were determined using the SDS 2.3 Allelic Discrimination Software (Applied Biosystems, Carlsbad, CA). Quality control was conducted by blinded direct sequencing of 5% duplicate samples, with a concordance rate of 100%. Furthermore, a randomly selected 5% of samples were genotyped in duplicate by different persons, and the concordance rate was 100%.

Statistical analysis
An unconditional logistic regression model was used to calculate the odds ratios (ORs) and 95% confidence intervals (95% CIs) for the associations between TINCR polymorphisms and the risk of CRC, lymph node metastasis, and distant metastasis, adjusted for age group, gender, alcohol status, and smoking status. Hardy-Weinberg equilibrium was tested with a goodness-of-fit χ2 test with one degree of freedom, comparing the observed genotype frequencies among the subjects with the expected genotype frequencies. All statistical analyses were performed using SPSS software 19.0 (SPSS Inc., Chicago, IL, USA), and P values were two-sided, with P < 0.05 as the criterion for statistical significance throughout the study.
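The Hardy-Weinberg goodness-of-fit test described above can be illustrated with a short sketch. This is an illustrative implementation, not the SPSS procedure used in the study; the function name `hwe_chi_square` and the genotype counts in the example are hypothetical and are not taken from the study tables.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the
    heterozygote at a biallelic SNP. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    # Allele frequency of A estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("SNP is monomorphic; the HWE test is undefined")
    # Expected genotype counts under HWE: n*p^2, 2*n*p*q, n*q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(360, 480, 160)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under the criterion stated in this section, a control group with P > 0.05 from this test would be considered consistent with HWE.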