C6orf10 Low-Frequency and Rare Variants in Italian Multiple Sclerosis Patients

In light of the complex nature of multiple sclerosis (MS) and the recently estimated contribution of low-frequency variants into disease, decoding its genetic risk components requires novel variant prioritization strategies. We selected, by reviewing MS Genome Wide Association Studies (GWAS), 107 candidate loci marked by intragenic single nucleotide polymorphisms (SNPs) with a remarkable association (p-value ≤ 5 × 10-6). A whole exome sequencing (WES)-based pilot study of SNPs with minor allele frequency (MAF) ≤ 0.04, conducted in three Italian families, revealed 15 exonic low-frequency SNPs with affected parent-child transmission. These variants were detected in 65/120 Italian unrelated MS patients, also in combination (22 patients). Compared with databases (controls gnomAD, dbSNP150, ExAC, Tuscany-1000 Genome), the allelic frequencies of C6orf10 rs16870005 and IL2RA rs12722600 were significantly higher (i.e., controls gnomAD, p = 9.89 × 10-7 and p < 1 × 10-20). TET2 rs61744960 and TRAF3 rs138943371 frequencies were also significantly higher, except in Tuscany-1000 Genome. Interestingly, the association of C6orf10 rs16870005 (Ala431Thr) with MS did not depend on its linkage disequilibrium with the HLA-DRB1 locus. Sequencing in the MS cohort of the C6orf10 3′ region revealed 14 rare mutations (10 not previously reported). Four variants were null, and significantly more frequent than in the databases. Further, the C6orf10 rare variants were observed in combinations, both intra-locus and with other low-frequency SNPs. The C6orf10 Ser389Xfr was found homozygous in a patient with early onset of the MS. Taking into account the potentially functional impact of the identified exonic variants, their expression in combination at the protein level could provide functional insights in the heterogeneous pathogenetic mechanisms contributing to MS.


INTRODUCTION
Multiple Sclerosis (MS) is a chronic autoimmune disease of the central nervous system (CNS), involving inflammatorybased mechanisms and characterized by demyelination, neurodegeneration and progressive accumulation of neurological dysfunction (Ciccarelli et al., 2014). The heterogeneous manifestation and clinical course of MS are explained by its complex multi-factorial nature, where the interaction of genetic, lifestyle, and environmental factors confer the susceptibility (Morandi et al., 2015;Olsson et al., 2017). The heritable contribution to MS risk is supported by investigations on families (Patsopoulos, 2018).
To date, the majority of genetic studies on MS have been focused on susceptibility variants. In particular, several genomewide association studies (GWAS), and subsequent replication studies, have identified hundreds of variants within susceptibility gene loci (Bashinskaya et al., 2015).
The single nucleotide polymorphisms (SNPs) identified through GWAS are mainly located within non-coding regions of the genome, which could pinpoint the presence of diseaseassociated variants in linkage disequilibrium.
The very recent study, made by the International Multiple Sclerosis Genetics Consortium, provides for the first time the evidence that low-frequency variants (minor allele frequency, MAF < 5%) explain 11.34% of the observed difference between cases and controls (International Multiple Sclerosis Genetics Consortium, 2018). The majority of low-frequency variants which contributed to MS risk were not individually detectable at genome-wide thresholds and among those associated with MS, only 1/3 were in linkage disequilibrium with the common variants from the GWAS (International Multiple Sclerosis Genetics Consortium, 2018).
Genetic studies in MS have used whole exome sequencing (WES) to investigate somatic mutations (Kemppinen et al., 2014), to define the genetic contribution to MS clinical outcomes (Sadovnick et al., 2017) and to suggest new potential causative variants in families (Dyment et al., 2012;Garcia-Rosa et al., 2017;Maver et al., 2017;Mescheriakova et al., 2018) or unrelated patients (Bernales et al., 2018). WES data in families, suggesting monogenic disease forms caused by rare variants with strong functional impact (Ramagopalan et al., 2011;Wang et al., 2016), were not confirmed in the subsequent replication studies (Ban et al., 2013;International Multiple Sclerosis Genetics Consortium, 2016;Minikel and MacArthur, 2016). Based on the aforementioned findings, decoding the genetic risk components of MS still represents a challenge, and novel strategies are required to prioritize variants (Wang and Biernacka, 2015).
We set up a targeted WES-based pilot study in MS families, followed by low-frequency variants investigation in a cohort of unrelated patients in Italy, where a high prevalence and incidence of MS have been reported (Battaglia and Bezzini, 2017;Granieri et al., 2018).
The main aim of this study was to identify new exonic and potentially functional low-frequency variants for MS risk within genes marked by intragenic common variants from the GWAS.

Study Population
The study population included (i) three Italian families with MSaffected members (clinical characteristics and pedigree shown in Figure 1) and (ii) 120 Italian unrelated MS patients coming from the mainland, selected from previous studies (Gemmati et al., 2012;Marchetti et al., 2018;Ziliotto et al., 2018) for having age of MS onset under 52 and mean age of MS onset (34.5 ± 9.7) similar to that of the affected family members (34.1 ± 8.2; Student's t-test p-value = 0.371). MS diagnosis was assessed according to the 2010 revised McDonald criteria (Polman et al., 2011). All MS patients underwent neurological visits, MRI examinations and assessment of the Expanded Disability Status Scale (EDSS) (Kurtzke, 1983). All patients were over 18 years of age and the patients' medical records were examined in order to ascertain age of diagnosis and type of MS at the time of blood sample collection. Among the 120 unrelated MS patients, 44 patients were relapsing remitting MS (RR-MS), 53 were secondary progressive MS (SP-MS) and 23 had primary progressive (PP-MS) course of the disease. Written informed consent was obtained from all subjects, and the study was approved by the Ethical Committee of the S. Anna University-Hospital, Ferrara, Italy. Demographic and clinical characteristics of the unrelated MS patients are summarized in Table 1.

Search Strategy and Selection Criteria of Candidate Genes
A systematic review of the literature was performed for all years available through December 31, 2017. The primary source was the PubMed database 1 , for which search terms "GWAS" and "multiple sclerosis" were used. Further search included NHGRI-EBI GWAS catalog 2 . Variants identified by GWAS in MS were selected for being intragenic, non-HLA, and having p-value ≤ 5 × 10 −6 . On the basis of these selected common variants, we generated the final gene reference list for the present study (Supplementary Table 1). The study design is schematically described in Figure 2.

Whole-Exome Sequencing and Analysis
The genomic DNA (gDNA) was extracted from peripheral blood using the Wizard R Genomic DNA Purification Kit (Promega, Madison, WI, United States). WES was performed on eleven individuals, seven diagnosed with MS and four unaffected, from three independent families (Figure 1). Sequencing was performed by BGI (Shenzhen, China) using nanoarray-based short-read sequencing-by-ligation technology (cPAL TM ). Reads were mapped against the hg19 human reference sequence 3 using SOAPaligner 4 . Variants calling was performed FIGURE 1 | Pedigree of families (A-C) with multiple sclerosis and clinical characteristics of the affected family members, RR-MS, relapsing-remitting multiple sclerosis; SP-MS, secondary-progressive multiple sclerosis; PP-MS, primary progressive multiple sclerosis; EDSS, expanded disability status scale. EDSS ranges from 0 to 10 in 0.5-point increments; higher scores indicate more disability. Onset, mean y ± SD 34.5 ± 9.7 33.9 ± 9.3 32.9 ± 9.5 39.1 ± 9.8 0.032 EDSS at examination, median (IQR) 6 (2-6.5) 2 (1-2.5) 6.5 (6-6.5) 6 (6-6.5) <0.0001 with the Complete Genomics Small Variant Caller 5 . Genetic variations were verified in: (i) the database of Single-Nucleotide Polymorphisms (dbSNP, Build 150 6 ), (ii) the 1000 Genomes Project databases 7 , (iii) the Exome Aggregation Consortium (ExAC 8 ), and (iv) controls of Genome Aggregation database (gnomAD v2.1.1 controls 9 ). The coverage of the target region was 99.26%, the average depth on target region was 129X and the target coverage with at least 20× was 90.6%. The filtering performed on WES-data is schematically described in Figure 2. In order to remove systematic artifacts, we visually verified the filtered low-frequency variants with IGV 10 . We considered a variant as true heterozygous with a call of 40% of the total reads.
The effects of new coding variations on protein structure and function were predicted using Provean/SIFT 11 , while the effects of low-frequency coding variations, already reported on databases as prediction by SIFT/PolyPhen, were extrapolated from Ensembl 12 . The whole-exome sequencing dataset generated during the current study is available in the https://www.ncbi.nlm. nih.gov/sra (Accession Number: PRJNA544162).

Mutation Screening
Low-frequency variants, identified through the filtering in MS families, were confirmed by Sanger sequencing (n = 13) or restriction analysis (n = 2, ADAMTS3 and GC). Primers to amplify the coding sequences containing the identified low-frequency variants were designed with Primer3 software v0.04.0 13 . A total of 50 ng of gDNA was amplified by polymerase chain reaction (PCR) using a standard protocol with AmpliTaq Gold 360 DNA polymerase (Applied Biosystems, Foster City, FIGURE 2 | Study design. Multiple sclerosis (MS) GWAS and WES SNP filtering is schematically described. In light blue, the search strategy and selection criteria of candidate genes; in orange the Whole-exome sequencing and the analysis on MS families; in green the screening strategy in the unrelated MS patients.
CA, United States) (Balestra et al., 2016). PCR conditions were set up as follows: an initial denaturation at 94 • C for 5 minutes and then at 65 • C for 3 minutes, followed by 35 cycles at 94 • C for 30 seconds, specific temperatures for each couple of primers for 30 seconds, 72 • C for 30 second or 1 minute, and a final elongation at 72 • C for 7 minutes. Detailed primer sequences and PCR conditions used for Sanger sequencing are reported in Table 2. The PCR products were purified with CleanSweep TM PCR Purification (Applied Biosystems) prior to direct sequencing (Macrogen, Madrid, Spain). Sequences were analyzed using the software NovoSNP (Weckx et al., 2005). Size of PCR amplicons and restriction products were examined through agarose gel electrophoresis.
The presence of the selected low-frequency variants was investigated in a sample set of 120 Italian unrelated MS patients through Sanger sequencing ( Table 2) or restriction analysis using the primer sequences and the restriction enzymes reported in Table 3.

Bio-Informatics Analysis of Nucleotide Changes
The prediction of mi-RNA targets was conducted by using the tool at www.mirdb.org exploiting the support vector machines (SVMs) procedures. The computational prediction of splice sites and or splicing regulatory elements was conducted by using the www.umd.be/HSF/ online software.

Statistical Analysis
For populations comparison, we used MAFs obtained from: (i) the "dbSNP Build 150" (Homo sapiens Annotation Release 108) which combines all available frequencies from submitted SNPs clustered together into a reference SNP, (ii) the ExAC, which includes exome sequencing data from a wide variety of large-scale sequencing projects and in particular of European (not-Finnish) individuals, (iii) controls of gnomAD, which includes only samples from individuals who were not selected as a case in a case/control study of common disease, and (iv) 1000 Genome Project which contains allelic frequencies for a sample of 107 subjects from Tuscany, Italy, an optimal reference population for our MS individuals. The low prevalence (188 per 100'000 individuals) of MS in Tuscany (Bezzini et al., 2016) makes improbable the presence of individuals with MS in the Tuscany control sample.

RsaI
To verify the presence of the selected mutations, PCR products were digested with restriction enzymes. Nucleotides in bold and underlined are those modified to create specific restriction sites.
To test the difference in MAFs between reference populations and the allelic frequencies observed in the study population, a two-proportion z-test, with a 0.05 two-sided significance level, was applied. A threshold of p < 0.0042, assuming the Bonferroni correction for multiple testing, was used for significance.
The potential enrichment of exonic low-frequency variants in MS patients was evaluated using a permutation approach based on the observed exonic polymorphisms. We first generated the null distribution of the number of low-frequency variants in a random sample of 107 genes, considering the exons composing the longest isoform of each gene, as defined by the human genome annotation (GRCh37/hg19). We took into account both the number and the length of exons, dividing the number of low-frequency variants by the total exon length, for each gene set. Then, we repeated the permutation process 1,000 times and the empirical p-value was defined as the proportion of replicates showing a number of variants higher than the observed value.

Selection of Candidate Genes
Since GWAS and classical linkage studies have extensively investigated the HLA locus, harboring the greatest genetic risk for MS (reviewed in Hollenbach and Oksenberg, 2015), HLA genes were not included in this study.
The review of GWAS in MS literature, reporting polymorphisms associated with MS in case-control studies, identified 141 variants which were selected for being intragenic and for having a p-value ≤ 5 × 10 −6 , an arbitrary threshold potentially highlighting genes with remarkable disease association. These common variants established the list of 107 genes used for the purpose of this study. Variants and corresponding genes are listed in the Supplementary Table 1.

Search for Low-Frequency Variants in MS Families by WES
WES was performed in three independent Italian families with at least two affected members in each pedigree (Figure 1).
A targeted analysis within the 107 MS susceptibility genes was conducted in all pedigree members. SNPs with MAF ≤ 0.04 were taken into account when present in at least one affected family member. The selection of SNPs with MAF ≤ 0.04 was aimed at filtering low-frequency variants which did not emerge in GWAS.
These filtering criteria revealed 17 exonic mutations (ten missense and seven synonymous) and three in the UnTranslated Regions (UTRs), all in the heterozygous condition (Table 4).
The number of low-frequency exonic variants in these families was investigated by a permutation test. No significant difference was observed between our result and that expected by chance (p = 0.231).

Screening of WES-Selected Low Frequency Variants in Unrelated Multiple Sclerosis Patients
We focused our investigation on the 14 low-frequency variants with parent-child transmission and on the new variant of ADAMTS3 ( Table 5). The 15 candidate SNPs were explored in a sample set of 120 Italian unrelated MS patients ( Table 1).
The predicted effects of the 15 analyzed low-frequency variants, each on the main encoded transcript, are reported in Table 5. In addition to the mutation of the start codon (GC, Met1Ile), expected to reduce the amount of the translated protein, missense variants were predicted as damaging (ANKRD55 Ser376Pro; TET2 Gly355Asp) or potentially damaging (MMEL1 Pro368Thr). For the rs147248515 (MMEL1) the proline to threonine change produced discrepant predictions. Among the four SNPs predicted as benign, two cause noticeable changes in amino acid polarity and size (Arg to Gly in MALT1; Ala to Thr in C6orf10).
Aimed at prioritizing low-frequency variants that may contribute to the disease-risk, we compared the MAFs observed in the unrelated MS cohort with those reported in public databases ( Table 5). The new variants on STAT4 and ADAMTS3 genes were not found in the cohort of unrelated MS.
The allelic frequencies of C6orf10 rs16870005 and IL2RA rs12722600 resulted significantly higher in MS patients compared with all the databases (even after Bonferroni's correction). Based on frequencies in the Control gnomAD, the OR for the rs1687005 risk T-allele was 4.57 (95% CI 2.33-8.97) and the odds ratios for the rs12722600 risk T-allele was 9.88 (95% CI 5.71-17.09), both highly significant (pvalue < 0.001).
The TET2 rs61744960 and TRAF3 rs76781122 showed significant MAF differences between MS patients and public databases with the exception of a nominal borderline p-values with the Italian Tuscany population. On the other hand, TOP3A rs2230153 showed significant MAF differences between MS patients and public databases, with the exception of dbSNP150.
For two SNPs (CD86, rs11575853 and GC, rs76781122) significant differences were observed only in the comparison between MS patients and the dbSNP150 population. Of note, for the MALT1 rs74847855, and WWOX rs7201683 highly significant MAFs differences between MS and Tuscany subjects were observed, that may reflect increased frequency of lowfrequency alleles in MS Italian patients.
Within the unrelated MS cohort, 17 patients were carriers of two low-frequency variants and five patients were carrier of three variants. The combinations repeatedly included the C6orf10, TET2, and IL2RA variants ( Table 6). The variants detected in combination were always located on different chromosomes.
The IL2RA rs12722600 and the TRAF3 rs138943371 were detected in the homozygous condition. In bold the genes including the exonic variants present only in affected family members defined as in Figure 1.
Frontiers in Genetics | www.frontiersin.org    The heterozygous (Het) or homozygous (Hom * ) condition of the variants is specified. In bold, patients with combination of 3 low-frequency variants. § Age of onset below 50 years old.
Frontiers in Genetics | www.frontiersin.org The comparison of age of MS onset between patients with or without the 15 investigated low-frequency variants did not provide significant differences.

Detection of Null Mutations in the 3 Exon of C6orf10
The rs16870005 within C6orf10, a scarcely investigated locus, resulted the only missense variant with a significantly increased frequency in our cohort compared to dbSNP-Build150, Tuscany of 1000 Genome Project, ExAC -European (not Finnish)-and Control gnomAD -European (not Finnish)-( Table 5) and, in addition, it was frequently present in combination with other low-frequency variants ( Table 6). Further, the nucleotide change C > T (rs16870005) in the 3 region of the C6orf10 transcripts would substitute threonine for alanine in the carboxyl-terminal region of all the predicted proteins (reference transcript used for the study shown in Figure 3).
Based on these observations we sequenced the region chr6:32261295-32260757 in the 120 MS patients, which revealed the presence of 14 low-frequency mutations (MAF ≤ 0.04), 10 not previously reported ( Table 7).
Of note, two mutations predicted premature termination of translation and two translational frameshift. Inspection of C6orf10 variants (Figure 3) pointed out that the four null mutations affected all C6orf10 transcripts. Among these, the ENST0000442822.6, after splicing, is shorter and encodes a different 3 sequence. In the transcripts other than ENST0000442822.6, null mutations would remove a larger C-terminus portion (Figure 3) in which we detected several missense SNPs.
For missense changes the algorithms predicted discordant effects (Table 7), with the exception of the damaging Gly477Val (rs7751028).
Several databases were inspected for low-frequency (MAF ≤ 0.04) exonic variants within the full C6orf10 transcript ENST00000533191.5 (total length 80 Kb), and for the presence of null variants. In particular, (i) in the Control gnomAD database 301 variants were found, of which 24 were nonsense or frameshift, and (ii) in the ExAC database 271 variants were found, of which 20 were nonsense or frameshift. The proportion of null variants in the MS patients (4/14 within 538 bp) was higher than that in the Control gnomAD (p = 0.0254, Maximum Likelihood chi-square) and in the ExAC (p-value = 0.0184, Maximum Likelihood chi-square).
The distribution of the 14 low-frequency C6orf10 variants detected by Sanger sequencing in the unrelated MS patients is shown in Table 8. The frameshift mutations, of which the Ser389Xfr in the homozygous condition, were detected in two patients with young age of disease onset (21 and 25 years). Three patients were carriers of two/three missense C6orf10 mutations ( Table 8). None of the 14 low-frequency variants was associated with the presence of the C6orf10 rs16870005.
Seven patients over the 11 carriers of the C6orf10 variants were also carriers of SNPs detected in the family WES study (Supplementary Table 2).

DISCUSSION
Taking into account the complex multi-factorial nature of MS, and the recently estimated contribution of low-frequency variants into disease risk, we aimed at investigating genetic risk components through combination of variant prioritization strategies. We explored by WES 107 candidate loci for exonic low-frequency variants in three Italian MS families with two or three affected members and validated the results in 120 unrelated  The sequenced 3 exonic region spans chr6:32261295-32260757 (GRCh37/hg19). The reference transcript is ENST00000533191.5, and the reference protein is ENSP00000431199 (exon position 26/26). The Provean cut off: equal or below −2.5, deleterious; above −2.5, neutral. The SIFT score range prediction: 0.0 to 0.05 deleterious; 0.05 to 1.0 tolerated. All variants were detected in heterozygous condition with the exception of Ser389Xfr (homozygous condition, bold and black). * Position in repetitive regions.
Italian patients. This experimental approach brought to attention 15 exonic variants, firstly selected for parent-child transmission and further investigated for increased frequency as compared to public databases. Among these, STAT4 and ADAMTS3 variants were found to be private of the families under study, which might support the notion that a proportion of the unexplained MS heritability is accounted by additive effects of individual variants (International Multiple Sclerosis Genetics Consortium, 2018). Noticeably, our screening identified a number of significant differences in the observed allelic frequencies as compared with public databases. The observed MAFs, higher in the Italian MS patients under study, point toward an increase in frequency of the detected variants in patients as compared to healthy population. The C6orf10 rs16870005 and IL2RA rs12722600 are the main signals identified in our study, as confirmed by comparison with all databases.
Through different experimental approaches, focused on lowfrequency variants, we report several and novel findings in the C6orf10 locus. The 3 region, including the potentially damaging variant rs16870005 which resulted associated with MS in our cohort, was found to contain 14 low-frequency mutations, among which 10 not previously reported and four potentially null variants. Furthermore, three patients showed combination of the C6orf10 heterozygous low-frequency variants, one was found to be homozygous for the Ser389Xfr, and seven displayed combinations of C6orf10 with low-frequency variants in other candidate genes. Although anecdotal, finding the homozygous Ser389Xfr in a patient with early onset of the MS disease fosters further investigation in relation to the recent study suggesting that C6orf10 could be implicated in the age of onset of other neurodegenerative disease (Zhang et al., 2018).
The interpretation of potential functional consequences was hampered by the C6orf10 chromosomal location and structure. As a matter of fact, this ORF is located on chromosome 6p21.32, in the major histocompatibility complex region which contains the major MS-associated risk gene HLA-DRB1 (Bashinskaya et al., 2015). To evaluate if the signals of associations between the C6orf10 variants could reflect linkage disequilibrium (LD) with DRB1 SNPs reported in the literature (reviewed in Bashinskaya et al., 2015), the data from the 1000 Genome project were explored (Supplementary Table 3). The low LD of C6orf10 rs16870005 with C6orf10 rs3129934 (r 2 0.113), and with the HLA rs9271366 (r 2 0.055), suggests that the association between C6orf10 rs16870005 and MS does not simply reflect the LD with the HLA-DRB1 locus. The C6orf10 structure comprises several transcripts, including three isoforms of a validated, but not characterized, long noncoding RNA (NR_136244.1, NR_136245.1, and NR_136246.1) and a pseudogene hnRNP (HNRNPA1P2). Thus, the null mutations that we have found in the MS cohort would affect the C-terminal portion of several uncharacterized proteins expressed in brain and B cells 14 both tissues of interest for MS.
The small sample size and the statistical power derived from our population do not permit an informative evaluation of the possible impact of the numerous newly detected C6orf10 variants on the disease onset or clinical course within an integrated multiple variants model.
Finding of the IL2RA 3 UTR low-frequency variant (rs12722600) (i) in two MS families, (ii) with a significantly higher frequency in the unrelated MS cohort than in the public databases, and (iii) in several combinations with the other low-frequency SNPs, is particularly intriguing. As for the other IL2RA polymorphisms previously associated with the risk of developing the disease (Matiello et al., 2011;Wang and Chen, 2018), the functional consequence of the rs12722600 can be only proposed (Maier et al., 2009) as affecting the post-transcriptional regulation of the IL2RA mRNA, of which little is known (Techasintana et al., 2017). Bio-informatics prediction of miRNA binding sites in the IL2RA 3 UTR did not reveal creation or disruption of regulatory sites produced by the rs12722600 nucleotide change. Interestingly, therapies for MS have been already developed to avoid the formation of the interleukin-2 receptor complex, and particularly targeting CD25 (Bielekova, 2018), the α-subunit encoded by the IL2RA. Our findings support further studies aimed at characterizing the IL2RA 3 UTR in patients undergoing this therapeutic approach.
The TET2 gene codifies for an enzyme that catalyzes the conversion of 5-methylcytosine to 5-hydroxymethylcytosine, thus modifying the DNA methylation pattern. As functional partner, TET2 may participate in histone modification (O-GlcNAcylation, Chen et al., 2013). Noteworthy, demethylation by TET2 is also actively involved in T cells differentiation and their cytokines production (Ichiyama et al., 2015;Wang et al., 2017). The presence of the low-frequency TET2 rs61744960 variant in MS patients of all the three families, and its frequency in unrelated MS patients cohort higher than in three databases support further investigation of this finding in relation to MS. It is of note that the SNP rs61744960 has been previously reported in an Italian study focused on leukemia, and two of the six patients, who carried this variant, had MS as a primary disease (Ottone et al., 2012).
For the TRAF3 rs138943371 we observed in the four databases a frequency pattern similar to that of TET2 rs61744960. TRAF3 gene codifies for tumor necrosis factor receptor-associated factor 3, a major regulator of innate immune response through different transduction signal pathways (Cullell et al., 2017). By "Human Splicing Finder" analysis, the rs138943371 C to T change, disrupting an exonic splicing enhancer, might create in the TRAF3 transcript a new exonic splicing silencer. Further investigation is needed to evaluate potential effects on transcription and splicing processes, that are strictly related and rely on regulatory elements in the 3 gene region.
Although in the GWAS MS candidate genes we did not detect an increased number of low frequency variants in our MS families, those variants firstly found with parent-child transmission in MS families were detected in several combinations in unrelated patients, particularly in C6orf10, IL2RA and TET2. These observations further highlight the complexity of the MS genetic risk components.
Our study supports further investigation within genetics consortiums of multiple low-frequency risk variants in coding regions of MS candidate genes. Expression of specific protein variants, and their combinations, could provide functional insights in the heterogeneous pathogenetic mechanisms contributing to MS.

ETHICS STATEMENT
Written informed consent was obtained from all subjects, and the study was approved by the Ethical Committee of the S. Anna University-Hospital, Ferrara, Italy.

AUTHOR CONTRIBUTIONS
NZ, GM, CS, MB, PZ, and FB conceived and designed the study. NZ, GM, CS, and FB wrote the manuscript. NZ, SM, and LL performed the literature revision. NZ, SM, LL, BL, NB, and IG carried out the lab experiments. NZ, CS, AB, and DB performed bio-informatic analyses of the data. FS, SS, DG, and PZ selected and recruited the patients, and performed their clinical evaluation. NZ, MB, SM, EM, and DG collected the samples and evaluated the pre-analytical variables. All authors critically evaluated the final version of the manuscript.