Introduction

Colorectal cancer (CRC) is the third most common cancer type in men and the second most common in women worldwide, and it is the fourth leading cause of cancer-related deaths in the world1. The mechanisms underlying the development of CRC are not comprehensively understood; however, various risk factors are known to contribute to carcinogenesis in CRC, including age, presence of colon polyps, high consumption of red meat, obesity, smoking, and alcohol consumption2. Furthermore, the response to these risk factors seems to differ between ethnicities and geographical regions, which may affect the prevalence and prognosis of CRC.

Recently, the gut microbiota has been added to the list of CRC risk factors, as it has been implicated in the development of CRC and might contribute to CRC progression, as suggested by the dynamic “driver-passenger” model3. In the past few decades, it has been shown that among the diverse bacterial species that are a part of the gut microbiome, the presence of specific bacteria can play a role in colorectal carcinogenesis4. Bacteria-driven oncogenic mechanisms in CRC have been proposed to include the activation of Wnt signaling (enterotoxigenic Bacteroides fragilis (ETBF)5 and Fusobacterium species6), proinflammatory signaling (Enterococcus faecalis7 and Streptococcus gallolyticus8), and genotoxicity (colibactin-producing Escherichia coli, colB + E. coli9). These carcinogenic effects can occur from very early stages and over the multistep processes of CRC carcinogenesis and can be accompanied by shifts in the gut microbiome and metabolome10. Furthermore, a recent study showed that the gut microbiome may promote the progression of CRC11. Therefore, it is crucial to identify bacteria that may be associated with the development of CRC and to assess which other factors can elicit transformation from a healthy gut microbiome to a tumorigenic microenvironment.

Diet is the best-known factor that influences the gut microbiome, and differences in diet can substantially affect the entire gut microenvironment12. In South Korea, CRC is one of the most prevalent types of cancer; however, few metagenomic studies have been conducted to determine the association between the gut microbiome and CRC development13. The composition of the gut microbiome may be influenced by factors other than diet, including alcohol consumption14. Numerous epidemiological studies have shown that alcohol consumption is strongly associated with the incidence of CRC15. A retrospective propensity score matching analysis found that high alcohol consumption was associated with an increased risk of developing CRC (adjusted hazard ratio: 1.86)16. Alcohol consumption is considerably high in South Korea17 and may be a contributing factor underlying the high prevalence of CRC in Korea18. However, to our knowledge, the effect of alcohol consumption on the gut microbiome has not been investigated thoroughly in South Korea.

In this study, we aimed to determine the prevalence of five CRC-associated bacteria (ETBF, E. faecalis, colB + E. coli, F. nucleatum, and S. gallolyticus) in the biopsied tissues of patients with CRC using quantitative real-time PCR (qPCR) and the relative abundance of operational taxonomic units (OTUs) by 16S rRNA analysis. Furthermore, we examined how the gut microbiome is associated with the epidemiological characteristics of CRC patients, including alcohol consumption.

Results

Comparison of demographics and clinical characteristics of cases and controls

A total of 39 CRC samples and 21 control samples were examined (Table 1). No significant differences between the groups were observed with respect to epidemiological characteristics, apart from BMI and hypertension. BMI was lower in CRC patients (22.9 ± 2.9) than in controls (24.8 ± 2.7), and hypertension was more prevalent in controls (76.2%, n = 16/21) than in CRC patients (46.2%, n = 18/39). The CRC cases consisted of 71.8% colon cancers and 28.2% rectal cancers, with right- and left-sided cancers accounting for 28.2% and 71.8% of the cases, respectively.

Table 1 Comparison of demographics and clinical characteristics of CRC patients and controls.

Prevalence of CRC-associated bacteria as assessed by qPCR

The prevalence of the five CRC-associated bacteria was examined in the carcinoma tissues (CTs) of CRC patients and in normal tissues (NTs) of controls using qPCR (Table 2). F. nucleatum was most frequently detected in CRC patients and was not detected in the NTs of controls (82.1% and 0%, respectively; P < 0.0001). However, no significant difference in the prevalence of colB + E. coli, ETBF, E. faecalis, or S. gallolyticus was observed between CRC patients and controls.

Table 2 Prevalence of CRC-associated bacteria by quantitative real-time PCR (qPCR).

Prevalence of F. nucleatum by tumor stage and tissue type

We analyzed the association of F. nucleatum positivity in CRC patients with that in controls based on clinicopathological features. Patients with CRC were classified by tumor stage (early stage [I/II] or late stage [III/IV]). F. nucleatum was detected very frequently in both early (92.3%) and late (76.9%) stages; the difference between the two stages was not significant (Fig. 1A). With respect to the tissue type, the prevalence of F. nucleatum was significantly higher in CTs (82.1%) than in adjacent normal tissues (ATs) (39.9%) as well as distal normal tissues (DTs) (33.3%) in CRC patients; no significant difference was observed between ATs and DTs (Fig. 1B). When the cycle threshold (Ct) values obtained from qPCR were compared among tissue types in F. nucleatum-positive cases, they were found to be significantly lower in CTs (Mann–Whitney U test, P < 0.0001). F. nucleatum was markedly enriched in CTs compared to ATs (Supplementary Fig. S1). The fadA gene, which encodes a virulence factor implicated in adhesion and invasion, was significantly more frequent in the CTs of CRC patients (69.2%) than in the NTs of controls (9.5%) (Fig. 1C).

Figure 1

F. nucleatum statuses of CRC patients and controls. (A) Patients with CRC were categorized according to tumor stage (I/II: early stage, III/IV: late stage). F. nucleatum prevalence was significantly higher in CRC patients at early and late stages than in controls (92% and 76.9% vs. 0%, respectively; Fisher’s exact test, P < 0.0001, each). (B) Tissues of CRC patients were collected from carcinoma tissues (CTs), adjacent normal tissues (ATs) and/or normal tissues (NTs) from non-CRC sites. F. nucleatum prevalence was significantly higher in CTs (82.1%) than in ATs (38.9%) and NTs (33.3%) of CRC patients (CT vs. AT, CT vs. NT, AT vs. T, and NT vs. T; Fisher’s exact test, P = 0.0002, < 0.0001, 0.0008, and 0.003, respectively) and was significantly higher in CTs than in tissues (Ts) of controls (P < 0.0001). (C) Comparison of the prevalence of the fadA gene revealed significant differences between the CTs of patients and tissues (Ts) of controls (69.2% vs. 9.5%; Fisher’s exact test, P < 0.0001). Four asterisks (****) indicate P < 0.0001, three asterisks (***) indicate P < 0.001, and two asterisks (**) indicate P < 0.01; Fisher’s exact test was performed using GraphPad Prism version 5.0 for Windows.

16S rRNA analysis

Sequencing of the V3 and V4 regions in the 16S rRNA gene produced 171,988 ± 81,412 sequencing reads, on average. After quality filtration, clustering and taxonomy assignment were performed at the genus level by QIIME software19. A total of 698 OTUs were generated and used for further statistical analysis. In a diversity analysis, no significant difference in α- or β-diversity was observed between CRC cases and controls (Supplementary Fig. S2).

When comparing CRC cases with controls, the proportions of six OTUs, namely, Peptostreptococcus, Collinsella, Prevotella, Parvimonas, Fusobacterium, and Gemella, were significantly different after false discovery rate correction for multiple testing (Fig. 2). Among the epidemiological characteristics with continuous values, Fusobacterium-positive CRC patients were significantly younger than Fusobacterium-negative patients (64.0 ± 10.5 vs. 73.1 ± 9.3; P = 0.034), which was also true for Parvimonas-positive patients (60.4 ± 8.5 vs. 69.0 ± 10.9; P = 0.011; Table 3). With respect to epidemiological characteristics with binomial values, Fusobacterium was associated with alcohol consumption and KRAS mutation (P = 0.088 and P = 0.094, respectively), and Parvimonas was associated with tumor location and KRAS mutation, although these associations were only marginally significant (P = 0.070, Table 4). Fusobacterium was observed in 14 of the 15 heavy drinkers (93.3%) but in only 11 of the 17 non/light drinkers (64.7%). The proportion of Fusobacterium OTUs was significantly higher in heavy drinkers than in non/light drinkers (P = 0.003); no corresponding pattern was observed in controls (Fig. 3). With respect to tumor location, the proportion of the Parvimonas OTU was significantly higher in the descending colon and in the rectum (P < 0.001; Supplementary Fig. S3) than in other parts of the colon. Tumor stage and metastasis were not significantly correlated with the presence of specific bacteria. When the Fusobacterium OTU proportions were compared according to the F. nucleatum qPCR results (Ct values), qPCR-positive cases showed higher OTU abundance values than qPCR-negative cases, and the significant positive correlation between these values is shown in Fig. 4.
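The 2 × 2 comparisons reported in this section can be checked by hand. The following is a minimal Python sketch, assuming a two-sided Fisher's exact test (the text names this test for the qPCR comparisons but does not state which test was used for Table 4). The function name `fisher_exact_two_sided` and the table layout are illustrative and are not taken from the GraphPad Prism or R workflow of the study. The counts come from the text: Fusobacterium in 14 of 15 heavy drinkers and 11 of 17 non/light drinkers, and F. nucleatum in 82.1% of 39 CTs (32 samples) versus 0 of 21 control tissues.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are positive/negative counts. The p-value is
    the summed probability of all tables (with fixed margins) that are no
    more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # All hypergeometric probabilities share the denominator comb(n, col1),
    # so the integer numerators can be compared exactly.
    weights = {
        x: comb(row1, x) * comb(row2, col1 - x)
        for x in range(lowest, highest + 1)
    }
    observed = weights[a]
    as_or_more_extreme = sum(w for w in weights.values() if w <= observed)
    return as_or_more_extreme / comb(n, col1)


# Fusobacterium positivity: heavy drinkers (14+, 1-) vs. non/light (11+, 6-)
p_alcohol = fisher_exact_two_sided(14, 1, 11, 6)

# F. nucleatum by qPCR: CTs of cases (32+, 7-) vs. control tissues (0+, 21-)
p_prevalence = fisher_exact_two_sided(32, 7, 0, 21)

print(f"alcohol: P = {p_alcohol:.3f}")        # alcohol: P = 0.088
print(f"prevalence: P = {p_prevalence:.2e}")
```

Under this assumption the alcohol comparison reproduces the marginal P value of 0.088 given in the text, and the prevalence comparison falls far below 0.0001.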

Figure 2

Relative abundances of the six OTUs that differed significantly between cases (CRC patients) and controls. The comparison was performed using R software.

Table 3 Differences in epidemiological characteristics with continuous values based on the association of CRC with the six OTUs.
Table 4 Differences in epidemiological characteristics with binomial values based on the association of CRC with the six OTUs.
Figure 3

Differences in the proportion of Fusobacterium between heavy drinkers and non/light drinkers by cases (CRC patients) and controls. The proportion of Fusobacterium OTUs was significantly higher in heavy drinkers than in non/light drinkers (P = 0.003); no corresponding pattern was observed in controls. We used R software to perform the comparison and generate the figure.

Figure 4

Box and whisker plots of the relative abundance of Fusobacterium by 16S rRNA analysis based on F. nucleatum qPCR positivity. Relative abundance (expressed as OTU percentages) was log-transformed for plotting on the Y-axis. Analysis was conducted with R software.

Discussion

CRC is a major health threat in many countries; however, its etiology and underlying mechanisms are still not comprehensively understood. The determination of a cancer’s etiology can result in the development of preventive or therapeutic measures. Several studies have demonstrated enrichment of fecal or tissue samples of CRC patients with specific bacterial pathogens.

ETBF is a well-known pathogen of the gastrointestinal tract that can trigger a carcinogenic multistep process through B. fragilis toxin3,4,5,20. However, we observed no significant difference in the prevalence of ETBF between CRC cases and controls, and its prevalence (35.9%) was considerably lower than that in a previous study (88.5%)20. In a recent study21, it was shown by qPCR that only 36.4% of CRC cases were positive for ETBF, and an equal distribution of B. fragilis was found in tumors, paired normal tissue and diverticula. Thus, the prevalence and distribution of ETBF should be determined by follow-up investigations. It has been shown that pathogenic E. coli can synthesize colibactin, which is genotoxic22. The prevalence of colB + E. coli in CRC patients might be the link between E. coli and CRC. In our study, the prevalence of colB + E. coli in CRC cases was twice as high as that in controls, but the difference was not statistically significant, probably due to an insufficient sample size. S. gallolyticus and E. faecalis were also regarded as CRC-associated bacteria in previous reports23,24; however, recent studies showed that the roles of S. gallolyticus and E. faecalis in CRC carcinogenesis were controversial due to their low prevalence21,25,26. In our study, neither species was frequently detected in cancer patients (Table 2).

In recent decades, F. nucleatum has attracted attention as a potential cause of CRC4,6,27. The role of F. nucleatum in CRC pathogenesis is not yet comprehensively understood, but at least four mechanisms have been proposed28: (1) cell proliferation through Wnt signaling induced by an interaction between FadA and E-cadherin, (2) evasion of antitumor immunity via the interaction of Fap2 with the T cell immunoreceptor with Ig and immunoreceptor tyrosine-based inhibitory motif domains (TIGIT), (3) tumor binding and enrichment through the interaction of Fap2 with Gal-GalNAc, and (4) chemoresistance mediated by lipopolysaccharide and Toll-like receptor 4.

Fusobacterium species are strictly anaerobic and are difficult to isolate by culture. In this study, the prevalence of F. nucleatum in the tissues of CRC patients as tested by qPCR (82.1%) was notably higher than that obtained using anaerobic culture (7.7%; data not shown). Therefore, non-culture-dependent detection methods such as qPCR or 16S rRNA analysis may be essential for screening Fusobacterium species or for studying its epidemiology in a population during the development of CRC.

The presence of F. nucleatum was significantly correlated with both the early and late stages of CRC (Fig. 1A). This finding suggests that F. nucleatum may be involved in CRC carcinogenesis from an early stage, and one review indicated an association of F. nucleatum with carcinomas throughout the different stages of CRC progression29. Analysis of the abundance of F. nucleatum by tissue type showed that the bacteria seemed to be more prevalent in CTs than in ATs and NTs (Fig. 1B). This finding is consistent with previous studies showing that a high level of Fusobacterium colonization is associated with CRC27,30.

FadA is a well-known oncogenic virulence factor that can induce oncogenic gene expression and promote the growth of CRC cells6. Therefore, FadA is regarded as a major virulence factor of F. nucleatum in CRC, and 69.2% of CT samples were positive for the fadA gene. However, some discrepancies between F. nucleatum presence and fadA gene frequency were observed in our study (Fig. 1C). Thus, it may be necessary to evaluate both F. nucleatum presence and fadA gene frequency to accurately determine the presence and virulence of F. nucleatum31. Interestingly, a recent study suggested that the fadA gene is not specific to F. nucleatum and that it is also present in other Fusobacterium species32. This finding indicates the need for further research regarding the role of other Fusobacterium species in CRC carcinogenesis.

The gut microbiome comprises numerous bacterial species that interact with each other, and it is possible that other bacteria in addition to the five known CRC-associated bacteria that were examined in the present study may affect the carcinogenesis of CRC. Thus, CRC metagenomic analysis is now commonly used to analyze this complex bacterial community. According to our 16S rRNA analysis, six genera, namely, Peptostreptococcus, Collinsella, Prevotella, Parvimonas, Gemella, and Fusobacterium were significantly abundant in CRC patients. Kwong et al. reported that P. anaerobius promoted CRC carcinogenesis in ApcMin/+ mouse models, and a correlation between Peptostreptococcus species and human CRC was reported previously33. The role of Collinsella species in CRC has rarely been investigated; however, Ai et al. reported that Collinsella species were specifically correlated with CRC in the Chinese population using metagenomic analysis34. The association of Prevotella species with CRC was reported in a previous study35, and our results are compatible with this report. Parvimonas and Gemella species were also reported as potential causative agents that may be responsible for CRC carcinogenesis36. However, it is difficult to determine which bacteria are the primary cause of CRC, as the gut microbiome is too complex and can vary by population37.

Based on an integrated analysis of 16S rRNA data and epidemiologic characteristics, we found a marginally significant correlation between Fusobacterium occurrence and a history of alcohol consumption. It has been shown that alcohol consumption is one of the major contributors to CRC carcinogenesis38. The generation of acetaldehyde and other metabolites from alcohol activates cancer-promoting signals39. Moreover, it was suggested that ethanol oxidation by intestinal anaerobes, including Fusobacterium species, under aerobic conditions in the colon and rectum may also play an important role in the pathogenesis of ethanol-related CRC40. To our knowledge, the present study was the first to determine the relationship between alcohol consumption and the presence of Fusobacterium using 16S rRNA analysis. In the integrated statistical analysis, a cutoff of 1% was used to determine the positivity of each OTU41. It is interesting to note that 87.5% (28/32) of qPCR-positive cases and 76.2% (16/21) of qPCR-negative controls showed the same 16S rRNA analysis results when a cutoff of 1% was used. This cutoff seems to be the appropriate threshold to determine abundance in 16S rRNA analysis in CRC tissues, though more investigations are required to confirm the cutoff.
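The agreement figures quoted above follow directly from the reported counts. The short Python sketch below recomputes them and adds the pooled agreement across both groups; the pooled value is a derived illustration and is not reported in the study, and the helper name `percent_agreement` is hypothetical.

```python
def percent_agreement(concordant, total):
    """Percentage of samples in which qPCR and 16S rRNA analysis agree."""
    return 100.0 * concordant / total


# Counts reported in the text for the 1% OTU cutoff
positive_cases = percent_agreement(28, 32)     # qPCR-positive cases
negative_controls = percent_agreement(16, 21)  # qPCR-negative controls

# Pooled agreement over both groups (derived here, not reported in the study)
pooled = percent_agreement(28 + 16, 32 + 21)

print(f"{positive_cases:.1f}% {negative_controls:.1f}% {pooled:.1f}%")
# 87.5% 76.2% 83.0%
```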

The limitations of this study were the relatively small sample size, which was associated with low statistical significance, and the fact that tubular adenoma could not be analyzed as an independent group due to the small sample size. However, the simultaneous analysis of the presence of specific bacterial pathogens, 16S rRNA analysis, patient demographics and clinical characteristics may be an advantage of this study. The detection of F. nucleatum in stool can be helpful for diagnosing CRC in patients as a biological tumor marker if used concurrently with other tumor markers, and an appropriate testing method should be developed and validated.

In conclusion, F. nucleatum was significantly prevalent in the CTs of patients with CRC and frequently present in both early and late stages. These data support a strong association between this bacterium and the development of CRC. The correlation between alcohol consumption and the abundance of Fusobacterium OTUs in cancer tissue discovered using 16S rRNA analysis suggests a possible link between alcohol metabolism and changes in the gut microbiome and subsequent tumorigenesis caused by F. nucleatum.

Methods

Study population

A total of 38 CRC patients and 21 controls who were diagnosed using colonoscopy at a tertiary-care hospital between June 2015 and November 2016 were enrolled. Patients who received preoperative radiation, chemotherapy, and/or antibiotic treatment within 4 weeks were excluded. One patient had two primary adenocarcinoma lesions; thus, 39 CRC samples were analyzed in this study. Among the 21 controls, 12 individuals had tubular adenomas removed during colonoscopy, and 9 individuals showed no specific findings in colonoscopy. Patients with tubular adenomas were analyzed as controls because most of the polyps could be regarded as low risk due to their pathologic findings and small size (< 10 mm), and the resection margin was clear in all samples. A power analysis was conducted with G*Power 3.1.9.7 using the chi-square test to assess the adequacy of the sample size; at a significance level of 0.05, the power was 0.81. Patient demographics and clinical characteristics were investigated by reviewing the medical records and interviews. Total alcohol consumption was calculated based on the history of alcohol intake (determined based on factors such as beverage type, amount, and frequency of consumption), and patients were classified as non/light drinkers (0–1 drink/day) or heavy drinkers (≥ 2 drinks/day). All patients were classified as smokers (persons who are/were smokers) or nonsmokers (persons who never smoked) based on their smoking history.
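The drinking categories described above can be expressed as a simple rule. The following Python sketch is illustrative only: the function names, the conversion from weekly intake to a daily average, and the handling of values between 1 and 2 drinks/day (treated here as non/light, because the heavy category is stated as ≥ 2 drinks/day) are assumptions and are not specified in the study.

```python
def average_drinks_per_day(drinks_per_occasion, occasions_per_week):
    """Convert a reported drinking pattern to an average daily intake.

    Assumption: intake is averaged over a 7-day week.
    """
    if drinks_per_occasion < 0 or occasions_per_week < 0:
        raise ValueError("intake values must be non-negative")
    return drinks_per_occasion * occasions_per_week / 7.0


def classify_drinker(drinks_per_day):
    """Return 'heavy' for >= 2 drinks/day, otherwise 'non/light'.

    Assumption: values between 1 and 2 drinks/day count as non/light.
    """
    if drinks_per_day < 0:
        raise ValueError("drinks_per_day must be non-negative")
    return "heavy" if drinks_per_day >= 2 else "non/light"


# Hypothetical participants, for illustration only
print(classify_drinker(average_drinks_per_day(4, 5)))  # heavy
print(classify_drinker(average_drinks_per_day(2, 1)))  # non/light
```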

Ethics approval and consent to participate

This study was approved by the Institutional Review Board at Severance Hospital, Yonsei University, Republic of Korea (approval number: 2014-2768-004). All methods were performed following the relevant guidelines and regulations. Documented informed consent was obtained from each study participant.

Colon tissue collection

Colonoscopy biopsies were obtained from the carcinoma tissues (CTs), adjacent normal tissues (ATs), and/or distant normal tissues (DTs) of CRC patients and from the right (from the cecum to transverse) and left (from descending to the rectum) colons of controls. Tissue samples were categorized as 'normal tissue' (NT) if no macroscopic or pathological evidence of carcinoma was observed. Tissue biopsies were collected in Transystem tubes (Copan, Brescia, Italy) containing anaerobic transport medium. Culture and DNA extraction for qPCR were performed within 30 min after tissue collection, and the remaining tissues were stored in a deep freezer (−70 °C) for 16S rRNA analysis. Detailed characteristics of CRC patients and controls are summarized in Supplementary Tables S1A,B.

Detection of CRC-associated bacteria using qPCR

DNA was directly extracted from colon tissue using the QIAamp DNA Micro Kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. DNA concentration and purity were recorded using a NanoDrop spectrophotometer (NanoDrop Technologies, Montchanin, U.S.A.). The specific 16S ribosomal RNA (rRNA) genes were amplified by qPCR using the CFX96 Real-Time PCR System (Bio-Rad, Hercules, CA) to detect five CRC-associated bacteria: ETBF42, colB + E. coli43, E. faecalis44, F. nucleatum27, and S. gallolyticus45. The reaction mixture consisted of 10 µL of Ex Taq (TaKaRa, Kusatsu, Japan), 0.4 µL of each specific primer pair (10 µM), and 25 ng/µL DNA template in a total reaction volume of 20 µL. E. faecalis ATCC 29212, F. nucleatum subsp. nucleatum ATCC 25586 and three clinical isolates (ETBF, colB + E. coli and S. gallolyticus) were used for qPCR quality control. Identification of the clinical isolates was performed using MALDI-TOF mass spectrometry and 16S rRNA sequencing. For S. gallolyticus, the rpoB gene was also sequenced. The presence of the fadA gene in F. nucleatum was determined using the conventional PCR method in accordance with a previous study31, and the 211 bp size of the amplicon was confirmed by sequencing.

16S rRNA analysis

The V3 and V4 regions of the 16S rRNA gene (400–450 bp) were PCR amplified using target-specific primers and adapters for Illumina sequencing. Next-generation sequencing was performed on a MiSeq platform (Illumina, San Diego, USA) using the MiSeq Reagent Kit V2 with 500 cycles. Paired-end sequences were merged using PANDASEQ software46, and clustering of operational taxonomic units (OTUs) and taxonomy assignment were performed using QIIME software19. The observed diversity and Chao1 and Shannon indexes for α-diversity were calculated using QIIME. Principal components were calculated and plotted for β-diversity estimation. The relative proportion of each OTU was compared between cases and controls as well as with respect to epidemiological characteristics. OTUs with a relative abundance of > 1% were regarded as positive16. The mock community was analyzed before clinical specimen testing to validate the 16S rRNA analysis results.
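To make the quantities in this subsection concrete, the following Python sketch computes relative abundance, the > 1% positivity call, and the Shannon and Chao1 indexes from a table of OTU read counts. It is not the QIIME implementation used in the study: the function names are hypothetical, the read counts are invented for illustration, the Shannon index is computed with a base-2 logarithm, and Chao1 uses the bias-corrected form S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. Both choices are assumptions about the defaults.

```python
from math import log


def relative_abundance(counts):
    """Map each OTU to its share of all reads in the sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: n / total for otu, n in counts.items()}


def positive_otus(counts, cutoff=0.01):
    """OTUs whose relative abundance exceeds the cutoff (1% by default)."""
    return [otu for otu, p in relative_abundance(counts).items() if p > cutoff]


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p * log(p)) over OTUs with reads."""
    shares = [p for p in relative_abundance(counts).values() if p > 0]
    return -sum(p * log(p, base) for p in shares)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = sum(1 for n in counts.values() if n > 0)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Invented read counts for one sample (illustration only)
sample = {
    "Prevotella": 941,
    "Fusobacterium": 50,
    "Parvimonas": 5,
    "Peptostreptococcus": 2,
    "Gemella": 1,
    "Collinsella": 1,
}

print(positive_otus(sample))   # ['Prevotella', 'Fusobacterium']
print(chao1(sample))           # 6.5
print(round(shannon_index(sample), 3))
```

In this invented sample, Parvimonas is detected (0.5% of reads) but is not called positive, which illustrates how the 1% cutoff separates presence from positivity.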

Statistical analyses

For 16S rRNA analysis, correction for multiple testing was performed using the false discovery rate (FDR) method. All statistical analyses were performed using GraphPad Prism version 5.0 (https://www.graphpad.com/scientific-software/prism/, GraphPad Software, La Jolla, CA, USA; Tables 1, 2, Fig. 1) and R software (https://www.R-project.org, The R Foundation, version 3.5.3; Tables 3, 4, Figs. 2, 3, 4)47. A value of P < 0.05 was defined as statistically significant, and a P-value between 0.05 and 0.10 was considered marginally significant.
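The FDR correction can be sketched as follows, assuming the Benjamini-Hochberg procedure (the text does not name the specific FDR method). The function name `benjamini_hochberg` and the P values are hypothetical and are not taken from the study; the last lines apply the significance thresholds defined above to the adjusted values.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Each sorted P value p_(k) is scaled by n / k, and a running minimum
    taken from the largest rank downward keeps the result monotone.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def significance_label(p):
    """Thresholds from the text: < 0.05 significant, 0.05-0.10 marginal."""
    if p < 0.05:
        return "significant"
    if p <= 0.10:
        return "marginally significant"
    return "not significant"


# Hypothetical raw P values for four OTUs (illustration only)
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
labels = [significance_label(p) for p in adjusted]
print([round(p, 3) for p in adjusted])  # [0.02, 0.04, 0.04, 0.02]
```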