A survey of the clinicopathological and molecular characteristics of patients with suspected Lynch syndrome in Latin America

Genetic counselling and testing for Lynch syndrome (LS) have recently been introduced in several Latin America countries. We aimed to characterize the clinical, molecular and mismatch repair (MMR) variants spectrum of patients with suspected LS in Latin America. Eleven LS hereditary cancer registries and 34 published LS databases were used to identify unrelated families that fulfilled the Amsterdam II (AMSII) criteria and/or the Bethesda guidelines or were suggestive of a dominant colorectal cancer (CRC) inheritance syndrome. We performed a thorough investigation of 15 countries and identified 6 countries where germline genetic testing for LS is available and 3 countries where tumor testing is used in the LS diagnosis. The spectrum of pathogenic MMR variants included MLH1 up to 54%, MSH2 up to 43%, MSH6 up to 10%, PMS2 up to 3% and EPCAM up to 0.8%. The Latin America MMR spectrum is broad with a total of 220 different variants, of which 80% were private and 20% were recurrent. Frequent regions included exon 11 of MLH1 (15%), exons 3 and 7 of MSH2 (17% and 15%, respectively), exon 4 of MSH6 (65%), and exons 11 and 13 of PMS2 (31% and 23%, respectively). Sixteen international founder variants in MLH1, MSH2 and MSH6 were identified and 41 (19%) variants have not previously been reported, thus representing novel genetic variants in the MMR genes. The AMSII criteria were the most used clinical criteria to identify pathogenic MMR carriers, although microsatellite instability, immunohistochemistry and family history are still the primary methods in several countries where no genetic testing for LS is available yet. The Latin America LS pathogenic MMR variants spectrum included new variants, frequently altered genetic regions and potential founder effects, emphasizing the relevance of implementing Lynch syndrome genetic testing and counseling in all Latin America countries.


Background
LS is caused by a defective mismatch repair (MMR) system, due to the presence of germline defects in at least one of the MMR genes, MLH1, MSH2, MSH6, PMS2, or to deletions of the 3′ portion of the EPCAM gene [1]. Such variants are here referred to as path_MMR and, when specifying one of the genes, as path_MLH1, path_MSH2, path_MSH6, path_PMS2 or path_EPCAM [2,3]. LS is clinically classified according to the Amsterdam (AMS) criteria and/or the Bethesda guidelines, both relying on clinical information and family history. The Bethesda guidelines also take into account the microsatellite instability (MSI) tumor marker, which is a signature characteristic of MMR-deficient tumors [4][5][6][7]. MSI or immunohistochemical (IHC) testing of tumors are strategies to select patients for subsequent germline diagnostic testing in blood [8].
LS patients have an increased lifetime risk of colorectal cancer (CRC) (70-80%), endometrial cancer (50-60%), stomach cancer (13-19%), ovarian cancer (9-14%), cancer of the small intestine, the biliary tract, brain as well as carcinoma of the ureters and renal pelvis [9]. The cumulative incidence of any cancer at 70 years of age is 72% for path_MLH1 and path_MSH2 carriers but lower in path_MSH6 (52%) and path_PMS2 (18%) carriers. Path_MSH6 and path_PMS2 carriers do not have an increased risk for cancer before 40 years of age [2,3]. The identification of LS patients is a goal because an early diagnosis and intensive screening may predict the disease and/or improve the disease prognosis [2].
The path_MMR variant spectrum of LS has been widely studied in CRC patients from North America, Europe, Australia and Asia. In the past decade, significant advances have been made in molecular testing and genetic counseling for LS in several Latin America countries.
Broadly defined, Latin America includes all countries of the Americas south of the United States: Mexico, Cuba, Puerto Rico and all the countries located in South America, as well as the Caribbean Islands. Latin America comprises genetically somewhat different populations shaped by European and African immigration: the Caucasian population is concentrated in the southern regions of the continent, whereas in the northern region the population is predominantly Mestizo (a mixture of European and Amerindian) [52].
The clinical, molecular and MMR variant spectrum of LS has not been fully studied in all Latin America countries. Our study aims to combine both unpublished register data and published data in order to better describe the LS molecular profile and to update the previously described South American path_MMR variant spectrum study [32].

Methods
Unpublished data from hereditary cancer registries and published data from patients with suspected LS from Latin America have been included in this work. Through research collaborations, data from the Latin America hereditary cancer registers are available following direct contact with the register. The data include results from germline DNA testing, tumor testing (based on MSI analysis and/or IHC) and family history (Fig. 1).

Hereditary cancer registries
Families that fulfilled the AMSII criteria [4,5], the Bethesda guidelines [6] and/or other criteria, i.e. families suggestive of a dominant CRC inheritance syndrome, were selected from 11 hereditary cancer registries. Patients were informed about their inclusion into the registries, which generally contained data on family history, clinical information, age at onset and results of DNA testing or tumor screening in the diagnosis of LS. Written informed consent was obtained from all participants during genetic counseling sessions.

LS databases
A systematic review was performed in order to identify published reports on MMR variants in LS or hereditary CRC by querying the PubMed, SciELO and Google databases using specific key words (focusing on clinical, tumor or genetic testing information associated with the MMR genes) and taking into account publications in three languages, namely Spanish, English and Portuguese, up to July 2016. The search terms were "Lynch syndrome", "hereditary colorectal cancer", "hereditary colorectal cancer and Latin America" and "Lynch syndrome and Latin America". We also used keywords in association with the names of Latin America countries (e.g., "Lynch syndrome and Colombia"). The results of the search were subsequently screened for the presence of path_MMR variants or tumor screening, clinical diagnosis and family history.

Germline DNA testing
Genetic testing was generally based on Sanger sequencing of MLH1, MSH2, MSH6 and/or PMS2 and/or EPCAM in 7 participating centers from Argentina (Hospital Italiano de Buenos Aires and Hospital Español de Rosario), Brazil (Barretos Cancer Hospital and Hospital de Clinicas de Porto Alegre), Chile (Clinica Las Condes), Colombia (Clinica del Country) and Uruguay (Hospital de Las Fuerzas Armadas). Multiplex Ligation-dependent Probe Amplification (MLPA) was used to analyze genomic rearrangements in MMR and EPCAM genes (SALSA kit P003, MRC-Holland, Amsterdam, Netherlands). For PMS2 analysis, especially for exons 12 to 15, a long-range PCR followed by a nested PCR strategy was adopted to ensure the correct analysis of PMS2 and to avoid pseudogene co-amplification. After amplification, sequencing was performed according to the manufacturer's instructions.

Tumor testing
Methods to assess tumor MMR status, e.g. MSI analysis and/or MMR protein staining, are currently used in Cordoba (Argentina), Lima (Peru), La Paz (Bolivia) and Mexico City (Mexico) as an approach to identify potential carriers of germline path_MMR variants. Germline MMR testing is then mandatory to confirm LS cases.
Families from Peru (Instituto Nacional de Enfermedades Neoplasicas) were evaluated for MSI using a 5-mononucleotide marker panel (BAT-25, BAT-26, D2S123, D17S250 and D5S346). Tumors were classified into three categories and defined as MSI high (MSI-H) when ≥2 markers were unstable, MSI low (MSI-L) when one marker was unstable and microsatellite stable (MSS) when none of the markers were unstable. In Bolivia (Centro de Enfermedades Neoplasicas Oncovida), MSI was evaluated with a 1-mononucleotide marker panel (BAT-26).
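The three-category rule described for the five-marker panel can be written as a short function. The following is a minimal sketch, not the registries' actual software: the function name, the constant name and the input format (a list of unstable marker names) are assumptions made for illustration.

```python
# Marker names as listed in the text for the Peruvian panel.
FIVE_MARKER_PANEL = ("BAT-25", "BAT-26", "D2S123", "D17S250", "D5S346")

def classify_msi(unstable_markers, panel=FIVE_MARKER_PANEL):
    """Return 'MSI-H', 'MSI-L' or 'MSS' from the unstable markers of one tumor.

    Hypothetical helper: >=2 unstable markers -> MSI-H, exactly 1 -> MSI-L,
    none -> MSS, as stated in the text.
    """
    unstable = set(unstable_markers)
    unknown = unstable - set(panel)
    if unknown:
        # Reject markers that are not part of the panel being scored.
        raise ValueError(f"markers not in panel: {sorted(unknown)}")
    if len(unstable) >= 2:
        return "MSI-H"
    if len(unstable) == 1:
        return "MSI-L"
    return "MSS"
```

For example, a tumor unstable at BAT-25 and BAT-26 would be scored MSI-H, and a tumor unstable only at D2S123 would be scored MSI-L.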

Family history
Available data of family history of patients with CRC included 4 published reports from Brazil [19], Mexico [49], Paraguay [50] and Peru [33].

MMR variants nomenclature and classification
The nomenclature guidelines of the Human Genome Variation Society (HGVS) were used to describe the detected MMR variants [54]. Variants were described by taking into account the following reference sequences: The MMR variants were classified according to the 5-tier classification system into the following categories: class 5 (pathogenic), class 4 (likely pathogenic), class 3 (uncertain variants), class 2 (likely not pathogenic) and class 1 (not pathogenic) [55]. Novel MMR variants were considered class 5 if they: a) introduced a premature stop codon in the protein sequence (nonsense or frameshift); b) occurred at the most conserved positions of donor or acceptor splice sites (i.e. IVS ± 1, IVS ± 2); or c) represented whole-exon deletions or duplications.
Well established polymorphisms, Class 1 variants and Class 2 variants were considered normal variants and not included in this study, except for the MSH6 c.733A > T, which has conflicting interpretations of pathogenicity. We focused on Class 3, Class 4 and Class 5 variants in this study.
In addition, we updated our previous South American LS study [32] according to the 5-tier classification system, with InSiGHT updates [55].
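The three conditions under which a novel MMR variant was considered class 5 can be expressed as a simple predicate. This is an illustrative sketch only: the `NovelVariant` record and its field names are hypothetical and are not taken from the study, and the function says nothing about how variants failing the rule were classified.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NovelVariant:
    # Hypothetical fields chosen for illustration.
    consequence: str = "other"            # e.g. "nonsense", "frameshift", "missense"
    splice_offset: Optional[int] = None   # intronic distance from the exon boundary (+1, -2, ...)
    whole_exon_cnv: bool = False          # whole-exon deletion or duplication

def meets_class5_rule(v: NovelVariant) -> bool:
    """True if a novel variant satisfies condition a), b) or c) from the text."""
    truncating = v.consequence in {"nonsense", "frameshift"}                        # a)
    canonical_splice = v.splice_offset is not None and abs(v.splice_offset) in (1, 2)  # b)
    return truncating or canonical_splice or v.whole_exon_cnv                       # c)
```

A variant at IVS + 5, for instance, does not meet condition b) and would need other evidence to be classified.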

Splicing-dedicated bioinformatics analysis
The potential impact on RNA splicing induced by the MMR variants was evaluated by focusing on alterations of donor and acceptor splice sites. We took into consideration both the potential impairment of reference splice sites and the possibility of creation of de novo splice sites. The analysis was performed by using the MaxEntScan algorithm [56] interrogated by using the Alamut software (Interactive Biosoftware, France) [57,58]. For stratification purposes, negative alterations of reference splice sites were deemed important when MaxEntScan scores showed a ≥15% decrease relative to the corresponding wild-type splice sites [57]. The possibility of variant-induced de novo splice sites was assessed by annotating all increments in local MaxEntScan scores and comparing their values with those of reference splice sites as well as of nearby cryptic splice sites. In this case and for exonic variants, only scores equal to or higher than those of the corresponding reference splice site within the same exon (as well as of local cryptic sites) were considered worth noting. In the case of intronic variants, only scores equal to or higher than those of the weakest corresponding reference splice site within the same gene (as well as of local cryptic splice sites) were considered as potentially creating de novo splice sites.
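The two decision rules above reduce to simple comparisons once the MaxEntScan scores are available. The sketch below assumes the scores have already been computed elsewhere (it does not implement MaxEntScan or call Alamut); the function names and the example scores are hypothetical.

```python
def relative_change_pct(wt_score: float, variant_score: float) -> float:
    """Percent change of the variant splice-site score relative to wild type."""
    if wt_score <= 0:
        # A relative change is only meaningful for a positive wild-type score.
        raise ValueError("wild-type score must be positive")
    return (variant_score - wt_score) / wt_score * 100.0

def weakens_reference_site(wt_score: float, variant_score: float,
                           threshold_pct: float = 15.0) -> bool:
    """True when the score drops by at least threshold_pct relative to wild type."""
    return relative_change_pct(wt_score, variant_score) <= -threshold_pct

def may_create_de_novo_site(new_site_score: float, reference_score: float,
                            cryptic_scores=()) -> bool:
    """True when the new site scores at least as high as the reference site and
    every nearby cryptic site. The caller supplies the appropriate reference:
    the site of the same exon for exonic variants, or the weakest corresponding
    site in the gene for intronic variants."""
    return (new_site_score >= reference_score
            and all(new_site_score >= s for s in cryptic_scores))
```

With made-up scores, a donor site falling from 8.0 to 4.0 changes by -50% and is flagged, whereas a fall from 10.0 to 9.0 (-10%) is not.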

Statistical analysis
Clinical characteristics were described using frequency distributions for categorical variables and summary measures for quantitative variables. To assess comparability of study groups, the chi-square test or Fisher's exact test was used for categorical variables and Student's t test or the Mann-Whitney test for quantitative variables.
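As an illustration of the categorical comparison, the sketch below implements a two-sided Fisher's exact test for a 2x2 table using only the Python standard library. The statistical software used in the study is not stated, so this is not the authors' implementation; the function name and the example table are assumptions.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    # (a small relative tolerance guards against floating-point ties).
    p = sum(pk for pk in (prob(k) for k in range(lo, hi + 1))
            if pk <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical table: 3 of 4 carriers vs 1 of 4 non-carriers with a given feature.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

For this small hypothetical table the p-value is 34/70, roughly 0.49. The same function accepts larger counts, since the binomial coefficients are computed with exact integers.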

Path_MMR variants
By combining data provided by 7 participating centers, we identified suspected LS in a total of 881 Latin America individuals belonging to 344 unrelated families (Table 1, Fig. 1). Path_MMR variants were identified in 47% (range 39-64% depending on the participating countries/registries) of the families that fulfilled the AMSII criteria and/or the Bethesda guidelines and/or other criteria (Table 1). When the AMSII criteria were considered, path_MMR variant detection rose to 64% (91/142), whereas it was 32% (54/170) and 23% (11/47) among families that fulfilled the Bethesda guidelines and other criteria, respectively. The range of the mean age at diagnosis was 32-45 years for CRC and 43-51 years for endometrial cancer depending on the countries/registries.
By the MaxEntScan algorithm, we found that 12% of the variants in our cohort are expected to have a negative impact on RNA splicing (Table 3). Indeed, for 27 out of the 220 variants, the MaxEntScan algorithm predicts a significant decrease in splice site strength (>15% decrease in MaxEntScan scores relative to corresponding wild-type splice sites). These include 23 intronic variants (7 within acceptor sites and 16 at donor sites) and 4 exonic variants (located either at the penultimate or at the last position of the exon). Among these variants, 24 are already considered pathogenic (either Class 4 or Class 5, with MaxEntScan scores ranging from −23% to −100% of WT), including 15 variants located at the most conserved positions of the consensus splice sites, i.e. IVS ± 1 or IVS ± 2, and a nonsense mutation located at the penultimate position of MLH1 exon 8. The three remaining potential splicing mutations are either currently considered as Class 3 (MLH1 c.588 + 5G > C and PMS2 c.1144G > C) or have not yet been reported (MLH1 c.588 + 5G > T). Further studies will be necessary to determine whether these three variants cause splicing alterations as predicted by MaxEntScan (decrease in donor splice site strength, MaxEntScan scores ranging from −27% to −55% of WT), and whether they are pathogenic.
Our in silico assessment of potential variant-induced de novo splice sites (data not shown) indicates that 3 out of the 220 variants analyzed in this study are likely to create new splice sites. More precisely, MLH1 c.117-1G > T is predicted to destroy the acceptor site of MLH1 exon 2 and to concomitantly create a potential new and stronger acceptor site 5 nucleotides downstream, within the exon; MSH2 c.645 + 1_645 + 10delins15 is expected to destroy the donor site of MSH2 exon 3 and to create a new donor site 14 nucleotides downstream of the reference site, within intron 3; and PMS2 c.804-1G > T is predicted to destroy the acceptor site of PMS2 exon 8 and to concurrently create a new and stronger acceptor site, 8 nucleotides downstream, within the exon. These in silico predictions support the classification of MLH1 c.117-1G > T, MSH2 c.645 + 1_645 + 10delins15 and PMS2 c.804-1G > T as pathogenic (Table 3).
We found that the Latin America LS variant spectrum was broad, with 80% (175/220) of the alterations being private, i.e. observed in a single family, 15% (33/220) observed in 2-3 families and 5% (12/220) observed in ≥4 families. Forty-one variants (19%) had not previously been reported in LS, and thus herein represent novel genetic variants in the MMR genes (including 10 in MLH1, 13 in MSH2, 11 in MSH6, 5 in PMS2 and 2 in EPCAM). The classification of the remaining 179 variants is indicated in Table 3, with 37 variants currently considered as Class 3, 10 as Class 4 and 131 as Class 5, and 1 with conflicting interpretations of pathogenicity (Table 3, Fig. 3). The variants have been submitted to the InSiGHT locus-specific database (https://www.insight-group.org).
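The proportions quoted for the variant spectrum follow directly from the counts given in this paragraph. The short sketch below recomputes them as a consistency check; the dictionary keys and variable names are illustrative, and only the counts come from the text.

```python
TOTAL_VARIANTS = 220

# Counts taken from the paragraph above.
family_counts = {"private (1 family)": 175, "2-3 families": 33, ">=4 families": 12}
novel_by_gene = {"MLH1": 10, "MSH2": 13, "MSH6": 11, "PMS2": 5, "EPCAM": 2}

def percentages(counts, total):
    """Percentage of the total for each category, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

recurrence_pct = percentages(family_counts, TOTAL_VARIANTS)
novel_total = sum(novel_by_gene.values())
novel_pct = round(100 * novel_total / TOTAL_VARIANTS, 1)
```

The recurrence categories sum to 220 and give 79.5%, 15.0% and 5.5%; the per-gene novel variants sum to 41, i.e. 18.6% of all variants.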

Differences between LS patients according to the path_MMR gene
The clinicopathological characteristics evaluated were similar between path_MLH1, path_MSH2, path_MSH6, path_PMS2 and path_EPCAM carriers, except for the mean age at CRC diagnosis for MLH1 (39.6 years) and MSH2 carriers (41.5 years) (p ≤ 0.05) (Table 5). For path_MLH1 carriers, we observed that the probands had more family history of CRC (56.4%) than of LS-associated cancers (20.1%) and 97% fulfilled the AMSII criteria. LS individuals with path_MSH2, path_MSH6 and path_PMS2 were mostly females (63.5%, 90% and 77.8%, respectively). Path_MSH2 carriers fulfilled the AMSII criteria (100%), while path_MSH6 and path_PMS2 carriers had more family history of CRC (30% and 75%, respectively) than of LS-associated cancers (10% and 25%, respectively). Path_EPCAM carriers had a lower number for each clinical characteristic (Table 5). Deviating distributions of the parameters discussed above for path_MSH6 and especially path_PMS2 carriers may have escaped significance due to the limited number of carriers included.

Tumor testing results
Tumor specimens from 83 individuals from Peru, 6 from Argentina, 61 from Bolivia, and 60 from Mexico were analyzed either by IHC and MSI testing, MSI testing only, or IHC only, respectively (Table 6). Of these, 69 (32.8%) were found to have MMR-deficient tumors as determined by IHC or MSI analysis (   and gender (Table 7). As shown in Table 7, family history of CRC was increased in MMR-deficient individuals compared to MMR-proficient individuals (P ≤ 0.05). Interestingly, the AMSII criteria were more frequently fulfilled among MMR-deficient (42.4%) than MMR-proficient (10.9%) individuals and this difference was statistically significant (P ≤ 0.05) (Table 7). Compilation of IHC and MSI data from reports on Latin America LS cases (published results and/or database entries) revealed that 21% had MMR deficiency based on IHC and/or MSI analysis (2.5%-60%). No information was available for the mean age at CRC and endometrial cancer diagnosis (Table 8). These data highlight the importance of genetic testing for LS in these populations.

Family history
Since there are no premonitory signs of susceptibility to LS, family history has been the primary method for   (Table 9).

Discussion
Progress has been achieved throughout the past years regarding a better molecular and clinical characterization of LS in Latin America, which is important for the surveillance and management of high-risk patients and their families [2]. Here, we present the first thorough LS investigation in Latin America by taking into account 15 different countries. We found that germline genetic testing for LS is already available in six of these countries (Argentina, Brazil, Chile, Colombia, Uruguay and Puerto Rico). Moreover, in three countries (Bolivia, Peru and Mexico), where genetic testing is not yet implemented, tumor analyses are already performed for identifying patients most likely to carry a path_MMR variant.
According to our data, the contribution from the different MMR genes is apparently slightly higher for MLH1 and MSH2 and lower for MSH6 and PMS2 compared with the InSiGHT database and international reports. It is possible that this pattern reflects the recent inclusion of MSH6, PMS2 and EPCAM in LS genetic testing in Latin America molecular diagnostic laboratories, but it could also reflect population structure [32,48,76,77]. Interestingly, the clinicopathological features of path_MMR carriers described in Latin America families are in accordance with other studies, e.g. the AMSII criteria were fulfilled by 64% of the path_MMR carriers [37,77].
This study revealed that the Latin America spectrum of MMR variants is broad with a total of 220 different variants, of which 80% are currently considered as private, whereas 20% are deemed as recurrent. Our data support evidence on a significant contribution from large deletions/duplications in EPCAM and frameshift variants in MLH1 and MSH2. Of the 220 MMR variants, 178 were already listed in the InSiGHT database or previous studies [78,79], whereas 41 have not been previously reported in LS [80]. In addition, we observed that MSH2 variants most frequently caused disease in Argentinean LS families. Further studies are needed to elucidate the ancestral origin of MMR variants in this population, which may increase the knowledge on the inheritance of LS among affected Latin America individuals [10,14,17,40].
Differences in the spectrum of path_MMR variants between populations could be due to differences in sample size, clinical criteria and selection bias, as well as in the genetic ancestry of the individual populations. For instance, Caribbean Hispanics have a higher percentage of African ancestry compared to Argentineans and Uruguay nationals [36]. Puerto Ricans are an admixed population of three ancestral populations: European, African and Taíno [36]. The South American population is ethnically mixed from American Indian, European, and other ancestries, but the proportions may vary between countries. For instance, European ancestry predominates in Uruguay and Argentina, whereas Brazil includes a more heterogeneous population, which is the result of interethnic crosses between the European colonizers (mainly Portuguese), African slaves, and the autochthonous Amerindians [15]. The Peruvian population is a multi-ethnic population with Amerindian (45%), Mestizo (37%), white Spanish influence (15%), as well as other minority ethnic groups, such as African-American, Japanese, and Chinese (3%) [24]. In Chile, Colombia and Bolivia, Spanish colonist and American Indian ancestry influence the populations [20,32].
It is well established that awareness of founder variants in a specific geographic area or population can be very helpful in designing cost-effective molecular diagnostic approaches [70,81,82]. Founder mutations provide molecular diagnostic centers with the benefit of unambiguous results and thereby do not demand highly skilled professional training.
The other aim of the study was to investigate whether the MMR variants previously identified in South American LS families [32] are in accordance with the 5-tier classification system [55]. We were able to refine the classification of 16 MLH1 and MSH2 variants.
When the tumor MMR data from original and published studies were combined, up to 33% of suspected  [34,83,84,85,86]. These differences could also reflect differences in the tumor testing methodologies across the countries, e.g. MSI analysis is not widely available in the majority of routine pathology service laboratories, the number of MSI mononucleotide markers varies between laboratories, and the number of MMR proteins analyzed by IHC is limited. Moreover, even if MMR deficiency is a good predictor of carrying a germline path_MMR variant, MMR deficiency can also result from somatic inactivation, most commonly due to methylation of the MLH1 promoter [86]. Combined, however, IHC and MSI testing will identify most LS patients with high sensitivity and specificity.
In Latin America, low budgets make the issue of integrating genetics into clinical practice a challenge, a situation in which the use of family history becomes important for patient care, as it is a low-cost strategy and a risk assessment tool [19]. In this scenario, published family history data from Paraguay, Peru, Brazil and Mexico suggest its use as a triage tool together with IHC and MSI to identify and stratify genetic risk in these populations [19]. However, awareness of hereditary cancer among clinicians involved in diagnosis and treatment of CRC is currently low, and families actually meeting the clinical criteria may not be identified [77]. In addition, the average life expectancy in Latin America and the Caribbean is 75 years and inequalities persist among and within the countries (www.paho.org). These countries are mainly represented by a young population, in which family history could be less informative and insensitive for assessing genetic screening for LS.
Limitations on genetic testing have an impact on the evaluation of patients at risk of hereditary cancer and their relatives, and ultimately increase the burden of cancer for this minority population [35]. As mentioned, in Latin America, genetic testing is not routinely available in the public health system, with the exception of a few studies conducted in research institutes or private institutions. For instance, until recently the coverage of oncogenetic services in Brazil was restricted to less than 5% of the population. However, a significant advance took place in 2012, when the coverage of genetic testing by private health care plans became mandatory in Brazil, currently covering around 20-30% of the population [19,87].
This work provides a snapshot view of the current LS-associated diagnostics practice/output in Latin America. The limitations of this study include the selection of patients recruited from selected reference centers and/or from a nation-wide public reference hospital for cancer patients, which cannot yield a representative sample. Furthermore, the diagnostic methodologies may vary between the countries regarding the coverage of the coding region of the genes tested and the clinical criteria for referral to genetic counseling and testing, thus causing an even larger knowledge gap. Finally, several countries are not represented; for instance, we could not find any reports from Venezuela, Honduras, Nicaragua or Ecuador. It will be important to pursue additional studies on LS in Latin America countries both to increase the knowledge of MMR variants in different populations and to bring additional awareness of this condition to medical professionals and public health leaders in Latin America.

Conclusions
The Latin America LS MMR variants spectrum included new MMR variants, frequently altered genetic regions and potential founder effects. The present study provides support to set up or improve LS genetic testing in these countries. Improving accessibility, including to tertiary care, is vital in low-income and middle-income countries that face an increasing burden of CRC. An early diagnosis and intensive screening may predict the disease and/or improve the disease prognosis. Low-cost approaches to reach these ends are discussed.