Role of non‐coding variants in cardiovascular disease

Abstract Cardiovascular diseases (CVDs) constitute one of the significant causes of death worldwide. Different pathological states are linked to CVDs, which, despite interventions and treatments, still have poor prognoses. The genetic component, as a beneficial tool in the risk stratification of CVD development, plays a role in the pathogenesis of this group of diseases. The emergence of genome‐wide association studies (GWAS) has led to the identification of non‐coding regions associated with cardiovascular traits and disorders. Variants located in functional non‐coding regions, including promoters/enhancers, introns, miRNAs and 5′/3′ UTRs, account for 90% of all identified single‐nucleotide polymorphisms associated with CVDs. Here, for the first time, we conducted a comprehensive review on the reported non‐coding variants for different CVDs, including hypercholesterolemia, cardiomyopathies, congenital heart diseases, thoracic aortic aneurysms/dissections and coronary artery diseases. Additionally, we present the most commonly reported genes involved in each CVD. In total, 1469 non‐coding variants constitute most reports on familial hypercholesterolemia, hypertrophic cardiomyopathy and dilated cardiomyopathy. The application and identification of non‐coding variants are beneficial for the genetic diagnosis and better therapeutic management of CVDs.


| INTRODUCTION
Cardiovascular diseases (CVDs) are the leading cause of death and account for 31% of mortality, worldwide. 1 Some progressive pathologies linked to cardiovascular diseases are familial hypercholesterolemia, different types of cardiomyopathies, thoracic/aortic aneurysms, congenital heart diseases, coronary artery diseases, heart failure 2 and strokes. 3,4 Despite the promising results of conventional pharmacological treatments, cardiovascular diseases still have poor prognoses. 5 Many factors are associated with cardiovascular disease pathogenesis. Among them, the genetic component is a beneficial tool for the risk stratification of cardiovascular disease development.
Improvements in sequencing technologies have conferred not only better clinical management and diagnosis of genetic disorders but also a better understanding of genetic disorders with unknown mechanisms. 6 Before the completion of the Human Genome Project, genes associated with rare Mendelian forms of cardiovascular diseases had been identified. Recent years have seen the identification of hundreds of loci by cardiovascular genome-wide association studies (GWAS) and the emergence of a general concept that common genetic associations are frequently located in the non-coding parts of the genome. A significant portion of loci associated with cardiovascular traits and disorders is not in linkage disequilibrium (LD) with causative coding regions and elements. 7 The majority of non-coding GWAS variants that play a significant functional role in gene regulation occur within regions of open chromatin. 4,[7][8][9] Research has indicated that even rare variations contribute to the development of both arrhythmias and cardiomyopathies. 10,11 More recent GWAS studies have unravelled the genetic architecture of more prevalent forms of cardiovascular diseases such as coronary artery diseases and atrial fibrillation and contributed to a better understanding of pathophysiological pathways involved in cardiovascular diseases. 12 The era of cardiovascular genomics has ushered in two distinguished goals: understanding molecular pathways and implementing the knowledge to expand the field of personalized medicine. 13 Limitations in mapping methods create problems in identifying variants located in non-coding regions. Two main challenges of mapping are the amount of recombination and allelic diversity. Only allelic diversity within the recombinant inbred line (RIL) 14 population and F2 cross can be assayed through mapping. 15 A RIL is the result of sib-mating progeny or continuous selfing of the F2 population until complete homozygosity is reached. 16 The F2 cross is the offspring of two sister seedlings from the F1 hybrid, or the following generation.
Functional annotation also continues to present a challenge. 17 Accordingly, we conducted the present study to collect all non-coding variants associated with different forms of cardiovascular diseases through a comprehensive review. We herein discuss the necessity of considering not only coding variants but also non-coding variants in susceptibility to cardiovascular diseases.

| Variant region: coding and noncoding variants
Deoxyribonucleic acid (DNA) is composed of both genic and intergenic regions. Exonic regions encode amino acids and are generally well conserved. 18 Whole-exome sequencing and targeted sequencing of coding regions of the human genome have helped identify both causative frameshift mutations and nonsense and missense variants associated with human disorders. 19 GWAS studies have also indicated that single-nucleotide variants/polymorphisms (SNVs/SNPs) located in enhancer elements, DNase hypersensitivity regions and chromatin marks known as 'functional non-coding regions' are associated with complex diseases. 20,21

| SEARCH STRATEGY
In the present study, a comprehensive and systematic search of the literature and the ClinVar database was conducted to collect all reported non-coding variants of different cardiovascular diseases such as dyslipidaemia, familial hypercholesterolemia, different types of cardiomyopathies, congenital heart diseases, thoracic aortic aneurysms and dissections, coronary artery diseases, strokes and sudden cardiac death. All genes and non-coding variants involved in each disorder were checked separately in the ClinVar, dbSNP, Iranome, 1000 Genomes Project, gnomAD and TOPMed databases.
Nomenclature for variants was also confirmed according to the recommendations of the Human Genome Variation Society (HGVS) (http://varnomen.hgvs.org/). In addition, we conducted a comprehensive search of published articles on non-coding variants, and some variants were extracted through this method. After collecting all the variants, we reported the total number of variants and the most common variant for each gene (Figure 1).

| NON-CODING VARIANTS IN DYSLIPIDAEMIA AND FAMILIAL HYPERCHOLESTEROLEMIA
Two effective strategies for the detection of familial hypercholesterolemia cases are genetic testing and family cascade screening, which also assist in distinguishing monogenic forms from polygenic or sporadic hypercholesterolemia. Since 2008, the implementation of next-generation sequencing (NGS), as a high-throughput technology, has also yielded promising results for familial hypercholesterolemia patients. 36,37 Sequencing in patients and high-risk families affected with rare monogenic lipid diseases has revealed not only a remarkable number of rare coding mutations but also the necessary pathways involved in lipid metabolism. [38][39][40] GWAS studies have also shown that blood lipid traits such as high-density lipoprotein cholesterol (HDL-C), total cholesterol, low-density lipoprotein cholesterol (LDL-C) and triglycerides have heritability rates of between 40% and 60% among different populations. Although GWAS studies have identified more than 100 common lipid-associated variants, these variants explain only a small portion (10%-14%) of the variation in the lipid phenotype. 41,42 More recent studies on the rare variants of complex traits have focused mainly on the coding regions of the genome, owing to the ease of whole-exome sequencing (WES) and its data interpretation, 43 which explains why the effects of the non-coding part of the genome have remained unknown.
The implementation of WES in recent years has proffered better clinical management and diagnosis of less common genetic disorders.
Despite such promising improvements, however, WES is capable of examining only the protein-coding fraction of the genome (roughly 2%), which underscores the role of the other parts of the genome such as regulatory regions. 6,[44][45][46] There have also been reports of distal enhancers and alterations in the 3D genome structure. 17,47 Therefore, the next milestones in the interpretation of the human genome sequence will focus on the remaining 98% of genome regions. 48 A study by Igartua et al. in 2017 provided strong support for the association between rare non-coding variants and lipid traits. That GWAS study recruited 98 Hutterites (European descent), and imputation indicated that 660,238 SNPs that were either rare (frequency <1%) or absent in populations of European ancestry were more common in Hutterites (frequency >1%). The results also revealed 2 novel rare non-coding variants. The first identified variant (viz, rs17242388 in LDLR) was associated with LDL-C, and the second variant, located between GOT2 and APOOP5 (viz, rs189679427), had a robust association with HDL-C. A third variant was rs138326449, which was previously replicated as a splice variant in APOC3 and was associated with triglycerides and HDL-C. 49 Rare non-coding variants are sometimes a reasonable explanation for autosomal dominant (AD) familial hypercholesterolemia traits.
In another whole-genome sequencing (WGS) study on a large family clinically diagnosed with familial hypercholesterolemia, in whom no mutations were detected in the coding regions of LDLR, APOB and PCSK9, a novel LDLR deep intronic variant (viz, c.2140+103G>T) co-segregated with LDL-C and the familial hypercholesterolemia phenotype. 50 The impact of regulatory elements on lipid traits has also been reported. Weissglas-Volkov et al. 51 in 2009 identified the SNP rs1424032 in a highly conserved non-coding region of APOB that functioned as a regulatory element and contributed to serum apolipoprotein B levels. In the past 5 years, the main focus of genome analysis has been on three regions: cis- and trans-regulatory elements, enhancers or promoters, and regulatory transcribed non-coding regions. [52][53][54] Regulatory elements, which are mainly located in non-coding regions, play a role in the gene expression of various cell types and act through interactions with various transcription factors (TFs). 53 In our search, we found 408 variants associated with familial hypercholesterolemia. The majority of the reported variants were for LDLR (328/408, 80.39%). Moreover, LDLR overlaps with another gene, AS1, and 34 variants reside in this region (34/408, 8.33%). Three variants were reported for LDLRAP1, which is important in the rare forms of familial hypercholesterolemia (Table S1).

FIGURE 1 The image presents genes with the most frequently reported non-coding variants; the total percentage of non-coding variants in each type of cardiovascular disease is shown in parentheses. ARVC, arrhythmogenic right ventricular cardiomyopathy; CAD, coronary artery disease; CHD, congenital heart disease; DCM, dilated cardiomyopathy; HCM, hypertrophic cardiomyopathy; RCM, restrictive cardiomyopathy; TAA, thoracic aortic aneurysm.

are also thought to be involved in cardiomyopathy pathogenesis. 62

| Non-coding variants in hypertrophic cardiomyopathy
Hypertrophic cardiomyopathy (HCM), characterized by asymmetric hypertrophy in the ventricular wall, is the most prevalent Mendelian cardiomyopathy (≈1:500). 63,64 Despite variable clinical phenotypes such as shortness of breath, palpitations, syncope, chest pain, heart failure and arrhythmias, a significant number of patients with HCM are asymptomatic. 60,65 HCM is also associated with both sudden cardiac death in young adults, including athletes, and progressive heart failure. 2,66 Although HCM is inherited in the autosomal dominant (AD) mode, a few cases with autosomal recessive (AR) inheritance and de novo mutations have also been reported. 67 This disorder is denoted as the disease of sarcomeres, and mutations in two main genes (viz, MYBPC3 and MYH7) are responsible for almost 70% of the identified HCM mutations. Other genes involved in HCM pathogenesis, with a frequency of 1%-5%, are TNNI3, TNNT2, MYL2, TPM1, ACTC1 and MYL3. 68 Rarely, defects in muscle LIM protein (CSRP3) and α-actinin 2 (ACTN2), which encode proteins vital for sarcomere function and structure, lead to HCM. 69 Genetic analysis is helpful for 60%-70% of patients with familial HCM and 30% of those with the sporadic form. 70

resulting in haploinsufficiency. 76 The loss of the canonical splice site and the emergence of a new acceptor splice site may be a mechanism through which two MYBPC3 intronic variants (viz, c.506-2A>C and c.2308+3G>C) act in HCM pathogenesis. Moreover, c.393-5C>A, located in SCN5A, leads to exon skipping and one small in-frame deletion in the S1-D1 transmembrane segment of the α subunit of the cardiac sodium channel. Interactions between the S1-S3 and S4 segments are a vital factor in cardiac function, and this deletion leads to the loss of function of sodium channels. 77 Sodium channels are composed of several subunits; the functional one, however, is the α subunit. This channel consists of four internal homologous domains, each containing six transmembrane segments. 78
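The intronic variants named above are written in HGVS coding-DNA notation, in which the number after the `+` or `-` sign gives the distance into the intron from the nearest exon boundary. The following Python sketch shows one way to read that offset and attach a positional label. The function name `classify_intronic`, the pattern `INTRONIC_SUB` and the cut-offs (offsets 1-2 labelled as canonical splice sites, offsets 3-8 as splice region) are illustrative assumptions and are not taken from the reviewed studies. The label describes position only and says nothing about the functional effect reported for each variant.

```python
import re

# Simple intronic substitution in HGVS coding-DNA notation, e.g. "c.506-2A>C":
# exonic anchor position, sign, intronic offset, reference base > alternate base.
# Hypothetical helper pattern; it does not cover every valid HGVS expression.
INTRONIC_SUB = re.compile(r"^c\.(\d+)([+-])([1-9]\d*)([ACGT])>([ACGT])$")


def classify_intronic(hgvs: str) -> str:
    """Return a positional label for an intronic substitution.

    Assumed cut-offs (illustrative only): offsets 1-2 are labelled as
    canonical splice sites, offsets 3-8 as splice region, larger offsets
    as other intronic positions.
    """
    match = INTRONIC_SUB.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs!r}")
    sign = match.group(2)
    offset = int(match.group(3))
    # "+" counts forward from the end of an exon (donor side of the intron),
    # "-" counts backward from the start of the next exon (acceptor side).
    side = "donor" if sign == "+" else "acceptor"
    if offset <= 2:
        return f"canonical {side} splice site"
    if offset <= 8:
        return f"{side} splice region"
    return "other intronic"


if __name__ == "__main__":
    for variant in ("c.506-2A>C", "c.2308+3G>C", "c.393-5C>A"):
        print(variant, "->", classify_intronic(variant))
```

A positional label of this kind is only a first triage step; whether a given variant actually alters splicing still has to be established experimentally, as in the studies cited above.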

| Non-coding variants in dilated cardiomyopathy
Dilated cardiomyopathy (DCM) is the most common type of cardiomyopathy and is characterized by a left ventricular ejection fraction of less than 45%, systolic dysfunction and myocardial hypokinesia. 79,80 This disorder is an indication for heart transplantation, and it is also associated with increased risks of arrhythmia-related mortality. 81 The first investigations, between 1974 and 1985, reported a prevalence of 1:2500 for DCM, but recent epidemiological studies have estimated a prevalence of 1:250. 82,83 The clinical manifestations of DCM include sudden cardiac death, heart failure and thromboembolism. 84 Many factors such as alcohol and cocaine abuse, myocarditis, Coxsackieviruses, beriberi, haemochromatosis, Chagas disease, drugs and pregnancy play a significant role in DCM aetiology. 80,[84][85][86][87] In addition, between 15% and 35% of DCM cases are idiopathic, and several genes that affect cytoskeletal proteins, Z-disks, sarcomeres, desmosomes and extracellular matrices are involved in DCM pathology. 88

The current DCM genetic paradigm has nearly 50% sensitivity in mutation detection. This is due to the heterogeneity and low frequency (3%-5%) of the identified mutations compared with the large group of DCM patients with no characterized variants. 94 (Table S3).
Heart failure is a public health problem affecting 1%-2% of the adult population in developed countries. 106 This condition is a clinical syndrome manifesting itself in all types of cardiomyopathies. The estimation of the incidence of heart failure in DCM patients is challenging owing to a variety of factors that should be considered in patient selection. Indeed, only a few clinical trials and studies have been hitherto conducted on the aetiology of heart failure. In a study by Kubanek et al., 107 32% of the enrolled DCM patients presented with heart failure, and 66% had experienced hospitalization for heart failure once before recruitment. In another large cohort study on 881 DCM patients, the most prevalent clinical manifestation, with a higher incidence rate among women (64% vs. 54%), was heart failure. 108 The majority of HCM patients present with heart failure with a preserved ejection fraction rather than heart failure with a reduced ejection fraction (<40%). In a cohort consisting of 1000 HCM patients aged between 30 and 59 years, approximately 50% of the study population had heart failure and experienced mild-to-severe symptoms. 109 A large investigation on a European cardiomyopathy registry revealed that the prevalence of heart failure among restrictive cardiomyopathy (RCM) patients was high (83%). This high rate was inconsistent with another cohort by Ammash et al., 110

In contrast to rs1739843, which was associated with both ischemic and nonischemic heart failure, rs6787362 was associated only with ischemic heart failure. The results of one prospective meta-analysis pooling the data of four previously published cohorts of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium indicated that 2 of 14 high-signal SNPs were located in intronic regions: rs11118620 in LOC100129376 and rs11880198 in GNA15. 111

| Non-coding variants in other cardiomyopathies
Another type of cardiomyopathy is arrhythmogenic right ventricular dysplasia/cardiomyopathy (ARVD/ARVC), characterized by sudden death, syncope, heart failure, ventricular arrhythmias and palpitations. ARVC affects between 1/1000 and 1/5000 individuals in the general population, and it usually affects young people, especially athletes.
The first mutation reported for ARVC was a defect in plakoglobin.
Restrictive cardiomyopathy (RCM) is a rare cardiac disorder characterized by increased myocardial stiffness leading to impairment in ventricular filling. 110

affect Ca2+ affinity, and the interactions between these proteins lead to cardiomyopathy development. 136 Despite the development of high-throughput sequencing methods, the rate of successful genotyping in RCM patients is 30%. [133][134][135] The remaining patients may be carriers of putative variants in other parts of the genome. In our search, we identified 54 non-coding variants reported for RCM, and the most frequently reported variants among them were in ALMS1 (26/54 [48.15%]) (Table S5).

| NON-CODING VARIANTS IN CONGENITAL HEART DISEASES
Congenital heart diseases affect the outflow tract, the septum and the valves, and they are the most common birth defect with a prevalence of 0.8 to 1 child per 100 live births. 137,138 This group of diseases is classified based on haemodynamic and anatomic lesions into five major subtypes: outflow tract defects, 139 abnormal left-right relationships, conotruncal defects and impairments affecting the inflow.
Thirty percent of patients are affected by severe and lethal forms of congenital heart diseases, and surgical intervention in the first year of life is vital for them. 137 Progress in surgery has conferred a survival rate of 95%; still, in developing countries, these diseases remain the major cause of child mortality. 137,140 Congenital heart diseases are genetically heterogeneous, with many patients being affected by the isolated form. The isolated form is a condition in which there is only one heart defect and no additional abnormalities or syndromes are present. 141 Despite all the studies conducted, the aetiology and molecular mechanisms of these diseases have yet to be elucidated. 142 Up to now, more than 50 genes and point mutations associated with congenital heart diseases have been reported. Among them, genes related to cardiac development such as the TFs GATA4 and NKX2-5 constitute a considerable portion. Chromosomal copy number variants (CNVs), signalling pathways related to cardiac morphogenesis (the Notch and Jagged pathways) and chromosomal abnormalities (chromosome 21 trisomy) account for nearly 25% of cases. 143 In the majority of patients with congenital heart diseases, especially familial forms, no causative variants or single candidate genes have been identified, which highlights the role of de novo mutations and the polygenic inheritance mode in such patients. Research has indicated that 10% of patients with congenital heart diseases are carriers of de novo mutations. 144

Hedgehog developmental pathways and morphogenetic pathways; they are, therefore, candidate genes for major processes in heart development. 145,146 In patients with congenital heart diseases, similar to other unexplained genetic disorders, non-coding variants may contribute to pathogenesis. In a study by Reamon-Buettner et al., the 3′-UTR of TBX5, which is a TF expressed in the heart, was sequenced in patients with congenital heart diseases and 10 variants were identified. Among these variants, the prevalence of 1 variant, c.*1101C>T (rs6489956), was considerably different between the case group and the healthy controls. (Table S6).

| NON-CODING VARIANTS IN THORACIC AORTIC ANEURYSMS AND DISSECTIONS
Thoracic aortic aneurysms constitute a silent and asymptomatic pathological state. They are characterized by an enlarged thoracic aorta, and they affect 1 per 100,000 people in the general population. 151,152 The detection of thoracic aortic aneurysms is difficult before the occurrence of complications such as dissections and ruptures. 153 Thoracic aortic aneurysms comprise a multifactorial disorder, and they are associated with many risk factors such as genetic factors (e.g. congenital defects and hypertension) and environmental factors (e.g. smoking and aging). 154 Conventionally, thoracic aortic aneurysms are categorized into two main forms: syndromic and nonsyndromic. Many syndromes, including Ehlers-Danlos syndrome, 155 Marfan syndrome and Loeys-Dietz syndrome, are associated with thoracic aortic aneurysms. 156 Despite surgical intervention, syndromic thoracic aortic aneurysms have a poor prognosis by comparison with the nonsyndromic form. 157 Nonsyndromic thoracic aortic aneurysms are more prevalent; still, patient detection remains a challenge because some genes are involved in the pathogenesis of both forms. 158 Thoracic aortic disease is the consequence of a single mutated gene inherited in the AD mode in patients with a positive family history, and it constitutes 20% of cases with thoracic aortic aneurysms. Defects in genes, including TGFB2, TGFBR2, TGFBR1 and SMAD3, are responsible for 10% of familial nonsyndromic thoracic aortic aneurysms/dissections. Additionally, mutated ACTA2 accounts for 12%-21% of familial thoracic aortic aneurysms/dissections, and the remaining identified genes represent only 1%-2% of individuals with nonsyndromic thoracic aortic aneurysms/dissections. 159,160 In recent years, the remaining unex-

FBN1 encodes fibrillin-1, a glycoprotein that plays a role in maintaining fibre integrity. 161 Exon skipping is another mechanism whereby some variants affect the normal process.
In a previous study, targeted NGS revealed 1 novel splice-site variant, c.871+1G>A in SMAD3, in two patients with nonsyndromic familial thoracic aortic aneurysms/dissections. In that study, aortic tissue was subjected to mRNA extraction, followed by RT-PCR. Additionally, cDNA amplification on exons 5 to 8 revealed the skipping of exon 6, leading to a 213-nucleotide deletion. Sequence analysis was then conducted as the confirmation test; the result showed that the shorter fragment did not have the entire exon 6. Afterward, in silico analysis indicated that SMAD3 conformation was essential for the function of this protein and its interaction with other proteins. Smad family proteins are TFs binding to DNA sequences, and any changes and alterations may affect transcription. 162 Previous investigations have shown that the major portion of the candidate causal variants of SMAD3 is in the MH2 domain within exon 6 163 and that acceptor splice-site variants usually result in proteins with impaired function. 164 In addition, Aubart et al. 163 and Regalado et al. 165 categorized loss-of-function variants located in SMAD3 as candidate causal ones.
Moreover, SMAD3 encodes a protein that plays a role in the cellular TGFβ signalling pathway, and defects in this gene cause disorganization of the media layer, fragmentation of the elastic fibres and collagen accumulation, all of which are involved in aortic aneurysm development. [166][167][168] A functional study by Ying Wang revealed that a variant of SMAD4 increased the risk of thoracic aortic aneurysms/dissections. In that study, 202 thoracic aortic aneurysm/dissection cases were genotyped for five tagging SNPs of SMAD4, including rs12455792, which is located in the 5′-UTR of SMAD4, a binding site for TFs. A significant finding of that study indicated that rs12455792 might regulate the pathophysiological mechanisms related to smooth muscle cells such as proteoglycan degradation, apoptosis and accumulated fibre levels. 169 In our search, 150 variants were associated with thoracic aortic aneurysms. Among them, 107/150 (71.33%) were identified in FBN1 (Table S7).

| NON-CODING VARIANTS IN CORONARY ARTERY DISEASES AND STROKES
Coronary artery diseases are inflammatory, atherosclerotic cardiovascular diseases with various clinical manifestations such as sudden cardiac death, myocardial infarction, and both stable and unstable angina. Atherosclerotic coronary artery diseases account for more than 80% of sudden cardiac death cases. 170 Both genetic and environmental factors are responsible for coronary artery disease aetiology, and the heritability rate of this disorder is estimated to range between 40% and 60%. 171 Different medications such as statins, aspirin and β-blockers have been prescribed, conferring a better prognosis in some patients. 172 GWAS studies have indicated that 9p21.3, containing CDKN2A and CDKN2B, which regulate the cell cycle, is associated with coronary artery diseases. [173][174][175] Mutated genes such as ABCA1, LDLR, APOB100, ARH, PCSK9 and CYP7A1 in Tangier disease and familial hypercholesterolemia are responsible for premature coronary artery diseases. [176][177][178][179][180] In addition, 396 SNPs within nine chromosomal regions have been reported to be associated with coronary artery diseases. 181 This group of diseases has a complex genetic architecture. Indeed, although genes involved in many biological pathways such as vascular tone and remodelling, lipid metabolism and inflammation have been identified, the precise molecular mechanism is still unknown. 182,183 Coronary artery diseases and myocardial infarction were the first diseases targeted in GWAS studies. 184 A study by Huang revealed the association between 3′-UTR mutations of MEF2A and coronary artery diseases.
In total, 238 individuals affected with coronary artery diseases were recruited in that study, the results of which showed that carriers of the TA haplotype of rs325380 had a meaningful association with coronary artery disease development. Given that UTRs are involved in gene expression and all post-transcriptional processes, any defects in these areas may affect the normal process and lead to a pathological state. 185 In our study, we identified 19 variants, of which eight were located in LIPA (Table S8).
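The per-gene shares quoted across the disease sections (for example, 328/408 for LDLR and 107/150 for FBN1) can be checked with a few lines of Python. The sketch below is illustrative: the names `REPORTED_COUNTS` and `gene_share` are hypothetical, and the counts are copied from the text rather than recomputed from the supplementary tables. The share for LIPA is derived here from the stated counts (8 of 19) and is not a figure given in the review.

```python
# Counts quoted in the text: disease -> (gene, variants in that gene, total variants).
# Hypothetical container for illustration; values are copied from the review text.
REPORTED_COUNTS = {
    "familial hypercholesterolemia": ("LDLR", 328, 408),
    "restrictive cardiomyopathy": ("ALMS1", 26, 54),
    "thoracic aortic aneurysm": ("FBN1", 107, 150),
    "coronary artery disease": ("LIPA", 8, 19),
}


def gene_share(in_gene: int, total: int) -> float:
    """Return the percentage of a disease's variants found in one gene."""
    if total <= 0 or not 0 <= in_gene <= total:
        raise ValueError("expected total > 0 and 0 <= in_gene <= total")
    return round(100 * in_gene / total, 2)


if __name__ == "__main__":
    for disease, (gene, in_gene, total) in REPORTED_COUNTS.items():
        share = gene_share(in_gene, total)
        print(f"{disease}: {gene} {in_gene}/{total} = {share:.2f}%")
```

Keeping the raw counts next to the derived percentages makes it easy to spot transcription slips when the supplementary tables are updated.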

| NON-CODING VARIANTS IN OTHER CARDIOVASCULAR DISEASES
Strokes, defined as focal neurological defects, rank second after ischemic heart diseases in terms of mortality among cardiovascular diseases. 186 Strokes are categorized into two main types: ischemic and haemorrhagic. Ischemic strokes were reported to have an occurrence rate of 85% in a previous investigation. 187 Many genetic and nongenetic risk factors such as sex, ethnicity, age, smoking, obesity and diabetes play roles in stroke development. 188 Furthermore, strokes are the consequence of a considerable number of rare single-gene disorders. 189 Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) is categorized as the most frequent single-gene disorder that causes ischemic strokes. 190 Despite the identification of 160 defects in the NOTCH3 gene associated with severe cerebral small vessel diseases, the mechanism of strokes remains unclear and challenging. Indeed, despite all the investigations thus far performed on the issue, no strong replicable associations have been reported. 189,191 Several studies have implemented the exome approach to identify rare variants responsible for the development of complex diseases. A GWAS analysis by Söderholm et al. revealed an intronic variant, rs1842681 in LOC105372028, leading to the expression of protein phosphatase 1, which is involved in brain plasticity. 192 As was mentioned above, in recent years, GWAS studies have enhanced our understanding of cardiovascular disease genetics.
Another use of GWAS is to identify the genetic architecture of common complex and observational traits related to cardiovascular diseases such as hypertension. 193 Hypertension is considered the leading cause of morbidity and mortality the world over.
It leads to various pathological states such as heart failure, atrial fibrillation and coronary artery diseases. 194 The renin-angiotensin system is responsible for hypertension development; nonetheless, enzymes and proteins involved in this system have yet to be characterized fully. 195 (Table S9).

The definition of sudden cardiac death by the World Health Organization is death within 1 h after symptom manifestation or 24 h after categorization as an asymptomatic patient. 199,200 Given that out-of-hospital sudden cardiac death has a 60% occurrence rate, the precise mechanism involved in sudden cardiac death pathogenesis usually remains difficult to determine. 201 Myocardial infarction in patients aged between 45 and 50 years or older is the most common cause of sudden cardiac death. 202 Inherited disorders such as cardiomyopathies and channelopathies constitute 5%-10% of sudden cardiac death cases. 203 Sudden cardiac death due to inherited disorders is the consequence of defects in sarcomere/desmosome proteins, regulatory proteins and ion channels. 204

Cardiac arrhythmias are defined as any variations in the rate or rhythm of the normal heart. Abnormal impulse formation and disturbances in conduction are the two major causes of arrhythmias. Long QT syndrome, Brugada syndrome and short QT syndrome all belong to this disease entity. 206 The identification of genetic components underlying arrhythmias highlights the role of ion channels. Ion channels are protein complexes that are located in the cardiomyocyte sarcolemma, and they play a role in ion flow conduction. 207 In addition to the abovementioned disorders, atrial fibrillation is the most prevalent type of arrhythmia, affecting 33 million people worldwide. 208,209 Environmental and genetic factors are both involved in the pathogenesis of atrial fibrillation. 210 Defects in genes such as MYL4, NPPA and KCNQ1 are responsible for this cardiac condition. 211 GWAS studies on atrial fibrillation have revealed the role of non-coding loci. The first GWAS study on atrial fibrillation in 2007 indicated that individuals carrying the non-coding 4q25 locus near the PITX2 gene were 60% more susceptible to this abnormal heart rhythm. 212 In our analysis, we identified 233 variants reported for arrhythmias.

| OVERLAP IN LOCI ASSOCIATED WITH DIFFERENT CVDS
Previous studies revealed that there is a considerable association between CVDs, major depressive disorder (MDD), 213

The results indicated that MDD has a considerable genetic correlation with CAD, atrial fibrillation, pulse pressure and heart failure. 213 Mechanisms underlying vulnerability to CVD in SMDs have not yet been identified. People with SMDs struggle with loneliness. 216 Several genetic variants associated with this disorder have been identified in one recent GWAS. This study highlights the role of shared genetic architecture and polygenic overlap between SMDs and CVD. 215

| DISCUSSION
The current literature features a few non-coding candidate causal variants associated with Mendelian disorders. 217,218 The recent emergence of NGS (e.g. WGS and WES) has ushered in considerable advances in clinical genetics; however, 50% of patients remain without a definite diagnosis. 219 Further, despite the use of NGS in the identification of changes in different regions of the genome such as insertions or deletions (indels), SNVs, inversions and translocations, CNVs, and structural variants, variants within the non-coding parts of the genome and their effects have remained poorly understood. 220,221 Previous publications highlight the role of non-coding variants contributing to disease risk. However, each discussed only one specific disease. This is the first comprehensive review to collect evidence from published studies on non-coding genetic variants associated with various types of cardiovascular diseases.
They are also involved in different mechanisms such as transcription process regulation by promoters that are located 0.5 kb from the transcription start sites and recruit RNA polymerase II, TFs and enhancer elements. 2,[225][226][227][228] Previous GWAS studies have revealed that non-coding variants associated with disease impose risk by changing and affecting functional DNA elements related to gene expression regulation. In addition, these types of variants have a considerable heritability rate, and they are categorized as an effective determinant of susceptibility to disease. 226 Approximately 90% of all identified SNPs associated with a specific phenotype by GWAS are located within a non-coding region. 229

identified risk alleles, including non-coding variants, which should be considered in disease aetiology. 232 Previous studies have also indicated that the role of these non-coding variants is not limited to cardiovascular diseases inasmuch as they have also been reported for other complex disorders. A GWAS study suggested that non-coding variants were associated with obesity and type II diabetes. 233,234 The identification of non-coding variants and underlying molecular mechanisms is challenging, and it is investigated via quantitative trait loci (QTL)-mapping approaches. 236 In addition to QTL approaches, technologies based on the genome-wide detection of CNVs have assisted in identifying large causative genomic CNVs associated with disorders. 237 For instance, a common deletion, 1q21, associated with thrombocytopenia was detected by genome-wide CNV technology. 238 The identification of non-coding variants is not restricted to genetic diagnosis; these variants can also be therapeutic targets. For instance, variants in PCSK9, which has a role in regulating circulating LDL-C levels, can be a treatment target. 239

In conclusion, novel genetic approaches and technologies, data sets, and the results of GWAS studies can be drawn upon to unravel the complex genetic architecture of cardiovascular diseases. The ultimate goal in the identification of non-coding variants is to provide both a better understanding of the pathophysiological mechanisms involved in cardiovascular diseases and effective treatments.

ACKNOWLEDGEMENTS
We appreciate the support from the Cardiogenetic Research Center, Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no competing interests.

DATA AVAILABILITY STATEMENT
All data generated or analysed during this study are included in this published article (and its supplementary information files).