Risk Factors Associated with Colorectal Cancer in a Subset of Patients with Mutations in MLH1 and MSH2 in Taiwan Fulfilling the Amsterdam II Criteria for Lynch Syndrome

Background and Aim Lynch syndrome, caused by germline mutations in mismatch repair genes, is a predisposing factor for colorectal cancer (CRC). This retrospective cohort study investigated the risk factors associated with the development of CRC in patients with MLH1 and MSH2 germline mutations. Methods In total, 301 MLH1 and MSH2 germline mutation carriers were identified from the Amsterdam criteria family registry provided by the Taiwan Hereditary Nonpolyposis Colorectal Cancer Consortium. A Cox proportional hazard model was used to calculate the hazard ratios (HRs) and 95% confidence intervals (CIs) to determine the association between the risk factors and CRC development. A robust sandwich covariance estimation model was used to evaluate family dependence. Results Among the total cohort, subjects of the Hakka ethnicity exhibited an increased CRC risk (HR = 1.62, 95% CI = 1.09–2.34); however, those who performed regular physical activity exhibited a decreased CRC risk (HR = 0.62, 95% CI = 0.41–0.88). The CRC risk was enhanced in MLH1 germline mutation carriers, with corresponding HRs of 1.72 (95% CI = 1.16–2.55) and 0.54 (95% CI = 0.34–0.83) among subjects of the Hakka ethnicity and those who performed regular physical activity, respectively. In addition, the total cohort with a manual occupation had a 1.56 times higher CRC risk (95% CI = 1.07–2.27) than did that with a skilled occupation. Moreover, MSH2 germline mutation carriers with blood group type B exhibited an increased risk of CRC development (HR = 2.64, 95% CI = 1.06–6.58) compared with those with blood group type O. Conclusion The present study revealed that Hakka ethnicity, manual occupation, and blood group type B were associated with an increased CRC risk, whereas regular physical activity was associated with a decreased CRC risk in MLH1 and MSH2 germline mutation carriers.



Introduction
Lynch syndrome is a hereditary disorder caused by a mutation in mismatch repair (MMR) genes, and patients affected by this syndrome are at a higher risk of developing colorectal cancer (CRC) early in life [1][2][3]. In Southeastern Asia, Taiwan has the second highest incidence of CRC after Japan; moreover, this incidence is higher in men (42/100,000) than in women (31/100,000) [4]. In addition, CRC is the second and third leading cause of cancer death in Taiwanese women and men, respectively [5]. The cumulative prevalence of MLH1, MSH2, and MSH6 germline mutations in Lynch syndrome is approximately 5.6%, and 3% of all CRCs are attributed to Lynch syndrome [6,7]. According to the NCBI database, MLH1 and MSH2 germline mutations contribute to approximately 90% of all mutations associated with Lynch syndrome; MSH6 contributes to 7%-10%, and PMS2 contributes to less than 5% of these alterations [8]. In addition, deletions in the EPCAM gene have been reported in approximately 1%-3% of Lynch syndrome patients within Dutch and German populations [9].
Several epidemiological studies have consistently demonstrated that modifiable lifestyle and dietary factors, such as smoking, alcohol consumption, meat intake, and increased body mass index, increase the CRC risk in patients with Lynch syndrome [10][11][12][13]. However, these studies have been performed only in Western populations. Moreover, Asian and Western populations differ in culture, lifestyle factors, and dietary intake patterns [14]. Identifying risk factors associated with CRC in Lynch syndrome patients is crucial because they are at a higher risk of developing CRC in early life.
To date, only one case-control study has investigated this association in Taiwan and reported that decreased physical activity, low or moderate coffee consumption, cigarette smoking, and alcohol intake were associated with CRC; however, this case-control study excluded Lynch syndrome carriers [15]. Another case-control study examining a multiethnic study population reported a significant association between regular physical activity (1 hour/week) and a lower risk of colon polyps and adenomas [16]. Studies have consistently shown that smoking, alcohol consumption, and meat intake increase the risk of CRC development [10][11][12]. Other studies have reported that regular physical activity and vegetable and fruit consumption are associated with a decreased CRC risk [15][16][17][18]. Similarly, most studies investigating risk factors associated with CRC development have focused only on sporadic CRC or asymptomatic cases [15,19]. This retrospective cohort study investigated possible risk factors associated with CRC development in patients with MLH1 and MSH2 germline mutations. To our knowledge, no study has investigated this association in patients with Lynch syndrome in the Chinese population.

Participant recruitment
The Amsterdam II criteria were adopted for patient enrollment in the Amsterdam criteria family registry provided by the Hereditary Nonpolyposis Colorectal Cancer Consortium of the National Health Research Institutes in Taiwan. Clinical data, including those on histological characteristics and molecular genetic analysis, were collected from all index patients at seven hospitals and medical centers around Taiwan from May 2002 onwards. Each index patient was screened for germline mutations in the MLH1 and MSH2 genes. The probands were requested to contact their relatives to seek permission for their enrollment in the registry; details of the participants were described previously [20].
Written informed consent was obtained from all study participants, and the study protocol was approved by the Institutional Review Board of Taipei Medical University and the National Health Research Institutes. As of February 2012, 135 Amsterdam II criteria families, comprising 1,014 family members, had been registered and their genetic analyses completed. Pathogenic mutations in the MLH1 or MSH2 gene were identified in 82 of 135 families (60.7%). Information on the mutated genes, germline mutations, and family IDs is shown in S1 Table. We identified 303 carriers harboring a germline mutation in MLH1 or MSH2. Two of these patients carried mutations in both genes and were excluded, leaving 301 carriers for analysis. One of the two patients had the c.1846_1848delAAG mutation in MLH1 and the c.2595_2597delCAT mutation in MSH2; the other had the c.793C>T mutation in MLH1 and the c.2516A>G mutation in MSH2.

Data collection
Professional nurses in the colorectal surgery department were trained to interview all probands and their relatives at the hospitals. The nurses had no prior knowledge of the study hypotheses regarding diet, lifestyle, and CRC, and all interviews were administered uniformly in the wards. After informed consent was obtained, a standardized interview was conducted using a structured questionnaire covering sociodemographic characteristics, lifestyle factors (physical activities, cigarette smoking, alcohol intake, and tea and coffee intake), dietary factors, and medical history. The reliability (standardized Cronbach's alpha) of this questionnaire was 0.92, based on the lifestyle and dietary variables [15].
Information on the participants' usual dietary intake of 14 food items was obtained for the 5 years preceding the date of entry into the study registry. The intake frequency was categorized into six levels: never, less than once a month, one to three times a month, once a week, two to three times a week, and almost daily. Physical activity in the last year was assessed on the basis of vigorous leisure time activities, including jogging 16 km/week, swimming more than 3.2 km/week, and participation in racket sports, martial arts, and other sports more than 5 hours/week [21]. Subjects practicing any one of these activities were classified as "Yes," and all others as "No." Cigarette smoking and alcohol, tea, and coffee consumption were categorized into never, former, and current, based on the participants' status of use. Because few of the subjects were former cigarette smokers or former alcohol, tea, or coffee consumers (6.9%, 3%, 0.66%, and 1.6%, respectively), we combined the "former" and "current" categories to form the "ever" category. The participants were followed up biennially from May 2002 to February 2012 to obtain updates on their morbidity status. Reports of cancer diagnoses and the age at diagnosis were confirmed, where possible, on the basis of pathology reports, medical records, cancer registry reports, and death certificates.
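The exposure coding described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual data-processing code: the function names `collapse_status` and `is_physically_active` are hypothetical, and we assume that jogging counts at 16 km/week or more, because the text states "more than" only for swimming and for the other sports.

```python
def collapse_status(status: str) -> str:
    """Collapse never/former/current into the two-level never/ever coding."""
    status = status.strip().lower()
    if status == "never":
        return "never"
    if status in ("former", "current"):
        return "ever"
    raise ValueError(f"unexpected status: {status!r}")


def is_physically_active(jog_km_week: float = 0.0,
                         swim_km_week: float = 0.0,
                         sport_hours_week: float = 0.0) -> str:
    """Return "Yes" if any one vigorous-activity criterion is met, else "No".

    Assumption: the jogging threshold is inclusive (>= 16 km/week); the
    swimming and sports thresholds are strict, as worded in the text.
    """
    active = (jog_km_week >= 16.0
              or swim_km_week > 3.2
              or sport_hours_week > 5.0)
    return "Yes" if active else "No"
```

A subject reporting only 6 hours/week of racket sports would be coded "Yes", and a former smoker would be coded "ever".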

Germline mutation screening
DNA was extracted from white blood cells by following standard procedures involving sodium dodecyl sulfate proteinase K-RNase digestion and phenol-chloroform extraction. MLH1 and MSH2 germline mutations were tested in all probands who had a colorectal tumor exhibiting evidence of impaired MMR function, as indicated by microsatellite instability or a lack of MMR protein expression in immunohistochemical analysis. Mutations were tested by performing denaturing high-performance liquid chromatography (WAVE DNA Fragment Analysis System, Omaha, NE, USA), followed by confirmatory DNA sequencing. Large insertion and deletion mutations were detected by multiplex ligation-dependent probe amplification (MLPA) analysis using the SALSA MLPA kit P003 according to the manufacturer's instructions (MRC-Holland, Amsterdam, Netherlands). All participants who donated blood samples and who were related to a proband with a pathogenic mutation were tested for the same mutation identified in that proband.

Immunohistochemistry
Immunoperoxidase staining was performed on formalin-fixed tissues. Sections (5 μm) from representative tumor blocks were deparaffinized in xylene and absolute alcohol, retrieved with heat, and treated with 3% hydrogen peroxide to eliminate endogenous peroxidase activity. Immunohistochemical staining was performed using specific mouse monoclonal antibodies for MLH1 (clone ES05, 1:100; Novocastra, Newcastle Upon Tyne, UK) and MSH2 (clone 25D12, 1:80; Novocastra), respectively, and the NovoLink Polymer Detection System (Novocastra). Slides were then counterstained with hematoxylin, mounted with Permount (Fisher Chemical, Pittsburgh, PA, USA), and examined for the extent and intensity of nuclear and cytoplasmic staining in tumor cells and for background staining in the stroma. For negative controls, these primary antibodies were replaced with phosphate-buffered saline.

Microsatellite instability analysis
Microsatellite analysis was performed as described previously [22]. The primer panel comprised the reference panel of markers (BAT25, BAT26, D2S123, D5S346, and D17S250). Tumors with a high frequency of microsatellite instability were defined as tumors having instability in two or more markers from the reference panel [22].
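The classification rule above amounts to a simple count over the five reference markers. The following Python sketch is illustrative only; the function name `is_msi_high` is hypothetical and is not taken from the cited protocol.

```python
# Reference panel of markers named in the text.
REFERENCE_PANEL = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")


def is_msi_high(unstable_markers) -> bool:
    """Return True if two or more reference-panel markers are unstable."""
    unstable = set(unstable_markers)
    unknown = unstable - set(REFERENCE_PANEL)
    if unknown:
        # Guard against markers that are not part of the reference panel.
        raise ValueError(f"markers not in reference panel: {sorted(unknown)}")
    return len(unstable) >= 2
```

For example, a tumor unstable at BAT25 and BAT26 would be classified as having a high frequency of microsatellite instability, whereas a tumor unstable at BAT26 alone would not.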

Statistical analysis
Food items were categorized into five major groups: staple, meat, vegetable, seafood, and fruit. The consumption of each food item was scored from 1 for "ate almost daily" to 6 for "never ate." A group score was the sum of scores for the intake frequency of the individual food items. These dietary factors were tertile stratified into "low," "medium," and "high," and the lowest tertile of the intake of each food item was set as a reference category.
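As a rough sketch of this scoring scheme, the Python below sums item scores into a group score and splits the group scores into tertiles. Several details are assumptions on our part: the text gives only the two endpoint scores (1 and 6), so the intermediate scores are assigned in order of frequency; the tertile cut points use the default method of the standard library's `statistics.quantiles`; and because a larger score means less frequent intake, the highest-scoring tertile is labeled "low" intake. The names `FREQ_SCORE`, `group_score`, and `intake_tertiles` are hypothetical.

```python
from statistics import quantiles

# Score per intake-frequency level: 1 = "almost daily" ... 6 = "never".
# Only the endpoints are stated in the text; the rest are assumed in order.
FREQ_SCORE = {
    "almost daily": 1,
    "two to three times a week": 2,
    "once a week": 3,
    "one to three times a month": 4,
    "less than once a month": 5,
    "never": 6,
}


def group_score(frequencies):
    """Sum the item scores of all food items in one food group."""
    return sum(FREQ_SCORE[f] for f in frequencies)


def intake_tertiles(scores):
    """Label each group score as "low", "medium", or "high" intake."""
    lower_cut, upper_cut = quantiles(scores, n=3)
    labels = []
    for s in scores:
        if s <= lower_cut:
            labels.append("high")    # small score = frequent intake
        elif s <= upper_cut:
            labels.append("medium")
        else:
            labels.append("low")     # large score = infrequent intake
    return labels
```

Under these assumptions, the "low" category would serve as the reference level in the regression models.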
Descriptive statistics were used to describe the sociodemographic characteristics, lifestyle factors, dietary factors, family history of cancer, and medical history. The risk was considered to begin at birth and end at the diagnosis of CRC, death, or loss to follow-up. Carriers who were not diagnosed with CRC, did not die, and were not lost to follow-up were censored at the date of their last known contact or in February 2012.
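The time-at-risk definition above can be illustrated with a short Python sketch. It is a simplified illustration, not the study's code: the function name `time_at_risk` is hypothetical, the administrative censoring date is assumed to be the last day of February 2012, and years are approximated as days divided by 365.25.

```python
from datetime import date

# Assumption: administrative censoring at the end of February 2012.
STUDY_END = date(2012, 2, 29)


def time_at_risk(birth, crc_diagnosis=None, death=None,
                 last_contact=None, study_end=STUDY_END):
    """Return (years_at_risk, event) for one carrier.

    Risk starts at birth. Follow-up ends at the earliest of death, last
    known contact, or study end, unless CRC is diagnosed on or before
    that date, in which case it ends at diagnosis and event is True.
    """
    candidates = [d for d in (death, last_contact, study_end) if d is not None]
    end = min(candidates)
    event = crc_diagnosis is not None and crc_diagnosis <= end
    if event:
        end = crc_diagnosis
    years = (end - birth).days / 365.25
    return years, event
```

A carrier born in 1960 and diagnosed with CRC in 2002 would contribute about 42 years with an event, whereas an unaffected carrier last contacted in 2010 would be censored at that date.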
Pearson's chi-square test was used to compare the intergroup distributions of the clinical characteristics. A Cox proportional hazard model was used to estimate the hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between risk factors and CRC development. The mutated MMR gene (MLH1 or MSH2) and year of birth (<1940, 1940-1949, 1950-1959, 1960-1969, 1970-1979, and >1980) were adjusted for as potential confounding factors. A robust sandwich covariance estimator was used to adjust for within-cluster and within-family correlations of the age of onset; that is, observations were treated as dependent within groups but independent between groups [23].
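For orientation, the model and the robust variance can be written in a standard textbook form. The notation below is ours and is not taken from the cited reference: x_i is the covariate vector of carrier i, h_0(t) is the unspecified baseline hazard, \hat{I} is the observed information matrix, \hat{U}_i is the score residual of carrier i, and families g = 1, ..., G are the clusters.

```latex
% Cox proportional hazards model and Wald-type confidence interval
h_i(t) = h_0(t)\,\exp\!\bigl(\beta^{\top} x_i\bigr), \qquad
\mathrm{HR}_k = \exp(\hat{\beta}_k), \qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat{\beta}_k \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_k)\bigr)

% Cluster-robust (sandwich) covariance with families as clusters
\widehat{V}_{R} = \hat{I}^{-1}
  \Bigl(\sum_{g=1}^{G} \hat{U}_g \hat{U}_g^{\top}\Bigr)
  \hat{I}^{-1}, \qquad
\hat{U}_g = \sum_{i \in g} \hat{U}_i
```

Summing the score residuals within each family before forming the middle term is what allows relatives to be correlated while families remain independent; the standard errors in the confidence interval are then taken from the diagonal of \widehat{V}_{R}.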
The following variables were regarded as potential risk factors associated with CRC in patients with Lynch syndrome: sex, education, ethnicity, occupation, blood group, physical activity, cigarette smoking, alcohol drinking, tea consumption, coffee consumption, and intake of meat, vegetables, fruits, seafood, and staple foods. A stepwise selection procedure with default selection criteria was applied to determine whether covariates were included in the multivariate models, beginning with all potential risk factors. Moreover, a stratified analysis of mutations in MMR genes, namely MLH1 and MSH2, was performed. A P value of <0.05 was considered statistically significant, and all analyses were performed using the Statistical Analysis Software (SAS) package (Version 9.4 for Windows; SAS Institute, Inc., Cary, NC, USA). All statistical tests were two-sided.

Results
The study population comprised 209 MLH1 and 92 MSH2 germline mutation carriers from 75 families, accounting for a total person-time of 12,529 years. During the follow-up period, 147 (48.8%) carriers developed histologically confirmed CRC, and 109 (74.1%) of them were diagnosed with adenocarcinoma. Of the 301 carriers, 145 (48.2%) were men; moreover, approximately 33.2% of the carriers attended school up to the elementary level, 35.2% attended school up to the high school level, and 31.6% attended school up to the college level (Table 1). In addition, 78.1% of the carriers were Taiwanese, 19.6% were Hakka, and the remaining 2.3% were mainland Chinese or aborigines. The median age at CRC diagnosis was 42 and 35.5 years for MLH1 and MSH2 germline mutation carriers, respectively. However, the proportion of MLH1 germline mutation carriers who were Hakka and developed CRC was higher than that of MSH2 germline mutation carriers.
Table 2 shows the association of the risk factors with CRC development. The MMR gene germline mutation carriers of Hakka ethnicity exhibited a 1.65 times higher CRC risk (95% CI = 1.12-2.43) than did their Taiwanese counterparts. In addition, the carriers with a manual occupation exhibited a 1.75 times higher CRC risk (95% CI = 1.20-2.55) than did those with a skilled occupation. Furthermore, regular physical activity, ever tea consumption, and high-level fruit intake were associated with a decreased CRC risk (HR = 0.58, 95% CI = 0.40-0.86; HR = 0.68, 95% CI = 0.48-0.96; and HR = 0.60, 95% CI = 0.38-0.94, respectively).
Among MSH2 germline mutation carriers, those with blood group B exhibited an approximately 2.64 times higher CRC risk (95% CI = 1.06-6.58) than did those with blood group O (Table 4). Those who consumed alcohol exhibited an approximately 2.33 times higher CRC risk (95% CI = 1.04-5.21) than did those who never drank alcohol. However, no significant association was observed between the other risk factors and CRC risk.
Table 5 lists the results of the multivariate analysis of the association between the risk factors and CRC development. Among the total cohort, subjects of the Hakka ethnicity exhibited a significantly increased CRC risk (HR = 1.62, 95% CI = 1.09-2.34); however, those who performed regular physical activity exhibited a significantly decreased CRC risk (HR = 0.62, 95% CI = 0.41-0.88). These associations were more pronounced in MLH1 germline mutation carriers, with corresponding HRs of 1.72 (95% CI = 1.16-2.55) and 0.54 (95% CI = 0.34-0.83) among subjects of the Hakka ethnicity and those who performed regular physical activity, respectively. In addition, subjects in the total cohort with a manual occupation had a 1.56 times higher CRC risk (95% CI = 1.07-2.27) than did those with a skilled occupation. After stepwise selection among MSH2 germline mutation carriers, only blood group remained significant, and the results were the same as those in Table 4.
We further analyzed the data without probands by using a multivariate Cox proportional hazard model, which revealed that Hakka ethnicity was associated with an increased CRC risk in both the total cohort (HR = 2.23, 95% CI = 1.36-3.64) and MLH1 germline mutation carriers (HR = 2.50, 95% CI = 1.42-4.39) (S2 Table). MSH2 germline mutation carriers with blood group type B had a 3.5 times higher risk of CRC (95% CI = 1.11-10.9) than did those with blood group type O. In contrast, regular physical activity was associated with a decreased CRC risk among MLH1 germline mutation carriers (HR = 0.52, 95% CI = 0.29-0.92). Although manual occupation was not significantly associated with CRC risk in this analysis, the direction of the association remained positive.
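Readers who wish to check a reported HR against its confidence interval can back-calculate an approximate standard error and Wald statistic on the log scale. The Python sketch below is an approximation under stated assumptions, not a reanalysis: it assumes a Wald-type interval that is symmetric on the log scale with a critical value of 1.96, the rounded figures in the text limit its precision, and the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE of log(HR), Wald z, and two-sided P from a 95% CI."""
    # Width of the interval on the log scale is 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


if __name__ == "__main__":
    # Multivariate estimates quoted in the text.
    for label, hr, lo, hi in [
        ("Hakka, total cohort", 1.62, 1.09, 2.34),
        ("Physical activity, total cohort", 0.62, 0.41, 0.88),
        ("Blood group B, MSH2 carriers", 2.64, 1.06, 6.58),
    ]:
        se, z, p = wald_from_ci(hr, lo, hi)
        print(f"{label}: SE={se:.3f}, z={z:.2f}, P={p:.3f}")
```

Because each of these intervals excludes 1, the approximate two-sided P values fall below the 0.05 threshold used in the analysis.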

Discussion
This study revealed that Hakka ethnicity, manual occupation, and blood group type B were associated with an increased CRC risk, whereas regular physical activity was associated with a decreased CRC risk. The risk factors associated with CRC in patients with Lynch syndrome varied between people with MLH1 and MSH2 germline mutations. In MLH1 mutation carriers, CRC risk was associated with Hakka ethnicity and regular physical activity, whereas in MSH2 mutation carriers, it was associated with blood group type B. The multivariate model revealed that, relative to Taiwanese ethnicity, Hakka ethnicity was associated with a higher CRC risk in the total cohort (HR = 1.62, 95% CI = 1.09-2.34) and in MLH1 germline mutation carriers (HR = 1.72, 95% CI = 1.16-2.55). No previous study has investigated this association; thus, the mechanism underlying the increased risk of CRC development in this ethnic group remains unclear. However, Pan et al. investigated the associations of genetic polymorphisms in glutathione S-transferase genes of the Hakka population of south China with family histories of certain chronic diseases and reported that the Hakka population exhibits distinct genetic structures, possibly because of factors related to long-distance migration that occurred centuries ago [24].
Manual occupation was significantly associated with an increased risk of CRC development (HR = 1.56, 95% CI = 1.07-2.27) in the total cohort compared with skilled occupation. Workers with a manual occupation (blue collar) were from farming, architectural, and labor-related industries. These industries have been previously reported to expose workers to harmful chemicals [25]. Manual workers in these industries are exposed to more potentially hazardous chemicals and physical factors than are skilled workers (white collar) in the ambient environment or people in nonoccupational settings, such as retirees [26].
The association between blood group type and CRC risk has rarely been investigated. To date, only one case-control study has reported an association between blood group type O and CRC risk [27]. However, our study revealed that only MSH2 germline mutation carriers with blood group B exhibited an increased CRC risk (HR = 2.64, 95% CI = 1.06-6.58). Moreover, a study reported that people with blood group type B exhibit an increased risk of cancer development, which is consistent with our findings, although the primary outcome of that study was the development of pancreatic cancer and not CRC [28].
The association between physical activity and CRC in Lynch syndrome patients has rarely been investigated because most studies focus on sporadic CRC cases. In this study, we found a risk reduction of 38% in the total cohort and 46% in the MLH1 germline mutation carriers, which is consistent with the results of two previous sporadic CRC studies [15,19]. Yeh et al. reported a risk reduction of approximately 50% in men performing vigorous leisure time activities [15]. Song et al. reported that increased physical activity reduces the prevalence of colorectal adenoma by 44% in Koreans [19]. The protective effect of regular physical activity against CRC has been suggested previously by several researchers. Sanchez et al. and Simons et al. reported that physical activity may reduce weight [16] and improve immune function and insulin sensitivity [29], both of which have been associated with a decreased CRC risk. Another study suggested that regular physical activity may increase peristaltic movements, thereby reducing the duration of mucosal exposure to carcinogens [30].
A major strength of our study was that all study participants were confirmed to have hereditary MMR gene germline mutations. A structured baseline questionnaire enabled us to explore several risk factors, which supports generalizing our findings to people with MLH1 and MSH2 germline mutations in clinical settings. However, a major concern is potential ascertainment bias. Because the Amsterdam II criteria were used for patient ascertainment, some families with MMR germline mutations were missed. Syngal et al. reported a sensitivity of 78% and a specificity of 61% for the Amsterdam II criteria; nevertheless, the Amsterdam II criteria are a better screening tool than the Amsterdam or modified Amsterdam criteria [31]. The ascertainment bias is likely minimal because the same risk factors for CRC development (Hakka ethnicity, blood group type B, and regular physical activity) were observed in our multivariate Cox regression models for data analyzed with and without probands. In addition, this study was limited by our inability to test for other germline mutations that are also associated with Lynch syndrome, such as those in MSH6, PMS2, and EPCAM. We found two patients with mutations in both MLH1 and MSH2. Because the risk of CRC development could be higher in patients with two different mutations than in those with one mutation, these two patients were excluded from this study. Further studies including more subjects with multiple MMR gene germline mutations are required to explore this relationship. In addition, approximately 54% of the mutation carriers were unwilling to be followed up for their recent morbidity status at least once after recruitment; hence, we likely failed to record some newly developed cancer cases. However, if the identified risk factors are associated with CRC risk in MLH1 or MSH2 germline mutation carriers, this misclassification would bias the results toward the null hypothesis and lead to underestimation of the risk.
In conclusion, this study revealed that Hakka ethnicity, manual occupation, and blood group type B were associated with an increased CRC risk, whereas regular physical activity was associated with a decreased CRC risk. We therefore suggest that all Lynch syndrome patients should be encouraged to adjust their lifestyle by engaging in regular physical activity. Moreover, regular colonoscopy surveillance should be encouraged among these patients, particularly among those of Hakka ethnicity or with blood group type B.
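The risk reductions quoted for physical activity follow directly from the multivariate hazard ratios, as (1 - HR) x 100. The short Python sketch below reproduces that arithmetic; the function name `risk_reduction_percent` is hypothetical and is used only for illustration.

```python
def risk_reduction_percent(hazard_ratio: float) -> int:
    """Convert a hazard ratio below 1 into a percentage risk reduction."""
    if not 0.0 < hazard_ratio <= 1.0:
        raise ValueError("expected a hazard ratio in the interval (0, 1]")
    return round((1.0 - hazard_ratio) * 100.0)


# Multivariate HRs for regular physical activity reported in the text.
total_cohort = risk_reduction_percent(0.62)   # 38
mlh1_carriers = risk_reduction_percent(0.54)  # 46
```

This conversion describes the relative change in the hazard, not an absolute change in the lifetime probability of CRC.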