In silico Analysis of Common Autism Spectrum Disorder Genetic Risk Variations

Autism spectrum disorder (ASD) is a chronic neurological and developmental disability characterised by inability to develop social relationships, trouble expressing feelings, and repetitive behaviours (clinically defined as stereotyped behaviour) that affect how people interact, learn, and behave. Because of the wide range of types and severity of symptoms, it is classified as a "spectrum" disorder. Over the last two decades, the prevalence of ASD has progressively increased, and one out of every 160 children worldwide is estimated to have an ASD. Over 75 percent of ASD patients show psychiatric disorders such as depression, stress, bipolar disorder, Tourette syndrome, and attention deficit hyperactivity disorder (ADHD). In the present study, in silico analysis was performed to identify rare mutations in genes implicated in ASD. Single nucleotide polymorphisms (SNPs) in the ADNP, ARID1B, ASH1L, CHD2, CHD8, DYRK1A, POGZ, SHANK3, and SYNGAP1 genes were identified as being associated with ASD aetiology. A single mutation in these genes can result in defective chromatin remodeling, altering the function of several genes and potentially causing intellectual impairment and ASD. Understanding and analyzing these ASD-linked SNPs as risk factors can aid in the early detection and diagnosis of the disorder.

Autism spectrum disorder is a complex developmental disorder that involves persistent challenges with social interaction, restricted interests, repetitive behaviors, impaired communication ability, and deficits in social-emotional reciprocity 1 . According to the Global Burden of Disease (GBD) Study, autism affects 52.9 million children under the age of five 2 . According to a 2021 WHO report, the majority of young children with autism live in underdeveloped nations or low- and middle-income countries. In South Asia and Sub-Saharan Africa, more than one million children have ASD, and the Middle East, Central Asia, and North Africa have the highest rates of childhood autism 3 .
Children diagnosed with ASD at two to four years of age are reported to show accelerated growth of total brain volume in the frontal and temporal regions, and by the age of ten to fifteen years the brain's volumetric capacity declines 4 . ASD is referred to as a "spectrum" disorder because its types and the severity of its symptoms vary widely 5 . One of the first indicators of ASD is a delay in language development, and children with ASD commonly show impaired language development leading to a lack of social communication 1 . Autism is caused by a combination of inherited, nongenetic, and environmental factors, and the majority of them appear to influence key aspects of early brain development 6 . Some appear to affect how nerve cells interact with one another in the brain, while others appear to influence how various areas of the brain communicate with one another 7 .
Studies of gene and genome variation have identified an increasing number of SNPs associated with ASD over the past decade 8,9 . It has been estimated that SNPs explain between 17 and 60% of ASD, hence their contribution must not be overlooked 10,11 . The functions of the genetic variants responsible for the association with ASD are, however, poorly understood. A single nucleotide polymorphism (SNP) can lead to translation of the incorrect amino acid, resulting in defective or nonfunctional proteins. Such mutations in the lysine-specific methyltransferase 2H enzyme have been reported to alter histone methylation and affect brain development, increasing the risk of ASD and other neural disorders 12,13 . Abnormal regulation of gene expression, disruption of proper neural development, and chromatin deregulation result in aberrant chromatin remodeling. This interferes with the usual expression of genes essential in brain development, causes abnormalities in synaptic adaptability, and ultimately leads to ASD 14 .
In the present study, in silico analysis was carried out to identify rare mutations in different genes that are associated with ASD as risk factors. These SNPs have the potential to serve as important biomarkers for early detection of ASD, for monitoring the course of ASD, and for developing possible drug therapies for ASD [Figure 1].

METHODOLOGY
Online bioinformatics tools were used to extract the data. The Genetics Home Reference (GHR) database, part of the MedlinePlus trusted health information service provided by the National Library of Medicine, was used to collect data on the genes whose mutations lead to autism spectrum disorder; these are listed under the Genetics tab, followed by Genetic Conditions 15 . For each gene identified in the Genetics Home Reference database, pathogenic and nonpathogenic mutant alleles associated with ASD were studied using data retrieved from the Variation Viewer tab under the Related Information column of the National Center for Biotechnology Information (NCBI) site 16 . The location of each gene, the family to which it belongs, and the function of the protein it encodes were analyzed using the MedlinePlus database and documented. The NCBI Gene site was used to identify transcript mutations and reference SNP cluster IDs (rsIDs). The NCBI Single Nucleotide Polymorphism (SNP) database was used to examine and identify mutations in codon sequences that resulted in amino acid sequence changes. The variation and phenotype, the clinical significance, and the effect of each mutation on the amino acid sequence were recorded.
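The last step of this workflow, deciding how a codon change alters the amino acid sequence, can be illustrated with a minimal sketch. It is not the tooling used in the study (the data were read from NCBI web resources); it only assumes the standard genetic code, and the function name `classify_substitution` is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (ref_aa, alt_aa, consequence) for a single-codon substitution.

    Codons are DNA triplets on the coding strand; '*' denotes a stop codon.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        consequence = "synonymous"
    elif alt_aa == "*":
        consequence = "nonsense (stop-gain)"
    elif ref_aa == "*":
        consequence = "stop-loss"
    else:
        consequence = "missense"
    return ref_aa, alt_aa, consequence


# Illustrative codons only; these are not variants reported in the study.
for ref, alt in [("AAT", "AAA"), ("TAT", "TAA"), ("CGA", "CGG")]:
    print(ref, alt, classify_substitution(ref, alt))
```

The three illustrative pairs cover a missense change (asparagine to lysine), a stop-gain at a tyrosine codon, and a synonymous change in an arginine codon.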

Activity dependent neuroprotector homeobox (ADNP)
The ADNP gene produces a neuroprotective peptide that operates during early embryogenesis, particularly during neurulation, and promotes glia-derived, survival-promoting chemicals that protect impaired nerve cells from cell death 17, 18 . The ADNP gene is located on the long arm (q) of chromosome 20 at band 13, and the encoded protein controls the expression of genes such as BRG1 and CHD4, which are involved in normal brain development through chromatin remodeling 19 . The substantial neuroprotective activity of the ADNP protein can be attributed entirely to the NAP domain, the octapeptide Asn-Ala-Pro-Val-Ser-Ile-Pro-Gln (NAPVSIPQ) 20 . The ADNP protein binds to DNA and interacts with SWI/SNF complexes, which directly affect the shape of chromatin, by connecting through its C-terminus to three of their key components during the remodeling process 21, 22 . The majority of ADNP variants lead to the formation of a short ADNP protein that can bind to DNA but cannot interact with SWI/SNF complexes. Variations at the terminus of a protein are typically unlikely to influence protein function 23 . Decreased ADNP expression in haploinsufficient populations has been shown to effectively deregulate the feedback loop by blocking wild-type protein from interacting with the promoter binding domain or by occupying alternate target sequences 23 . These changes likely explain the intellectual disability seen in ASD. Around 24 mutations (small intragenic deletions, insertions, missense, and splice site variations) were found in the ADNP gene. Frameshift mutations in the ADNP gene convert codons for amino acids such as tyrosine and leucine into termination codons, which may result in shortened proteins and disordered chromatin remodeling. Disruptions of chromatin remodeling affect the activation of many genes, as well as the development and function of several human tissues and organs, including the brain.
Furthermore, by encoding different amino acids, some of these mutations may prevent the production of neurotransmitters such as epinephrine and norepinephrine, which are important for neural communication and mood regulation. One example is a mutation that replaces asparagine, which is required for neurotransmitter production, with lysine.
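The way a frameshift turns an ordinary codon into a termination codon, as described above for ADNP, can be shown with a short sketch. The 24-base sequence below is invented for illustration and is not taken from ADNP; the helper names `translate` and `delete_base` are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(cds, index):
    """Return the sequence with the base at 0-based `index` removed."""
    return cds[:index] + cds[index + 1:]


# Hypothetical coding sequence (not a real ADNP transcript).
REFERENCE = "ATGGCCTTACGATTGACCAAATAA"
# Deleting one base in the third codon shifts every downstream codon.
FRAMESHIFTED = delete_base(REFERENCE, 6)

print(translate(REFERENCE))     # MALRLTK
print(translate(FRAMESHIFTED))  # MAYD
```

After the deletion, the shifted reading frame encounters a TGA stop codon after only four residues, so the predicted product is shorter than the seven-residue reference peptide.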

AT-rich interaction domain 1B (ARID1B)
ARID1B plays a pivotal role in controlling gene activity by encoding a DNA-binding subunit of the Brahma-associated factor (BAF) chromatin remodeling complexes, a group of SWI/SNF protein complexes that modulate gene activity by chromatin remodeling 24,25 . ARID1B contains over 2,000 amino acid residues, yet it has only two characterized protein domains: an AT-rich interaction domain (ARID) and a domain of undefined function 3518 (DUF3518) 22 . Missense mutations in the ARID domain impair the DNA-binding ability of ARID1B and thereby jeopardize the function of the BAF complex 26 . In BAF complexes, the DUF3518 domain has been shown to interact with the BRG1 and BRM helicase subunits 24 . Because the ARID1B component can bind to DNA, it is thought to contribute to targeting SWI/SNF complexes to the chromatin that is to be remodeled 24 . Missense mutations in either of these domains can render BAF complexes non-functional. Nonsense-mediated mRNA decay (NMD) is primarily triggered by nonsense and frameshift mutations, which cause premature translation termination 27 . Such mutations therefore lead to NMD of the ARID1B mRNA rather than to expression of a mutant ARID1B protein. Truncating mutations that escape NMD typically result in a distinct and more severe phenotype, owing to the dominant-negative effects of the mutant protein 25 . Human learning and memory may be affected by frameshift mutations in the ARID1B gene that replace glutamine with arginine, glycine, or serine. Similarly, mutations that replace normally encoded amino acids such as threonine and alanine with the structurally distinct amino acid proline are likely to disrupt the structure of the ARID1B protein. A stop-gain mutation at a glycine codon may affect immunomodulation in the peripheral nervous system. ARID1B gene variants linked to ASD may lead to a reduction in ARID1B protein level or a disturbance in the protein's chromatin remodeling activity.
The ARID1B gene therefore plays an essential part in brain development. Further research is needed to determine whether individuals with ARID1B haploinsufficiency have altered brain development that contributes to the intellectual disability and speech difficulty that characterise ASD 28,29 .
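Whether a premature termination codon (PTC) is likely to trigger NMD or escape it is often estimated with a positional rule of thumb: a PTC located more than roughly 50 to 55 nucleotides upstream of the last exon-exon junction is expected to trigger NMD, whereas a PTC closer to that junction or in the last exon is expected to escape it. The sketch below applies that heuristic. It was not part of this study's analysis, the exon lengths are invented rather than taken from ARID1B, and the function names and the 50-nucleotide default are assumptions.

```python
def last_junction_position(exon_lengths):
    """Return the 0-based mRNA coordinate where the last exon begins.

    Returns None for single-exon transcripts, which have no junction.
    """
    if len(exon_lengths) < 2:
        return None
    return sum(exon_lengths[:-1])


def predicted_to_trigger_nmd(ptc_position, exon_lengths, threshold=50):
    """Apply the positional heuristic for nonsense-mediated decay.

    ptc_position: 0-based mRNA coordinate of the premature stop codon.
    exon_lengths: exon lengths in transcript order (hypothetical values).
    threshold:    minimum distance, in nucleotides, upstream of the last
                  exon-exon junction for NMD to be predicted (assumed 50).
    """
    junction = last_junction_position(exon_lengths)
    if junction is None:
        return False
    return junction - ptc_position > threshold


# Hypothetical three-exon transcript; the last junction is at position 350.
EXONS = [200, 150, 300]
print(predicted_to_trigger_nmd(100, EXONS))  # far upstream of the junction
print(predicted_to_trigger_nmd(320, EXONS))  # within 50 nt of the junction
print(predicted_to_trigger_nmd(400, EXONS))  # inside the last exon
```

Only the first call is predicted to trigger NMD under these assumptions; the other two positions would be expected to yield a truncated protein instead.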

ASH1 like histone lysine methyltransferase (ASH1L)
The lysine-specific methyltransferase 2H enzyme is produced by the ASH1L gene and is found in a range of organs and tissues throughout the body 31 . It functions as a histone methyltransferase, an enzyme that modifies histone proteins. Histone methyltransferases control the function of certain genes by methylating histones 27 . Additionally, certain genes involved in brain development are activated by lysine-specific methyltransferase 2H. Histone methylation is disrupted by the lack of a functioning lysine-specific methyltransferase 2H enzyme 32 . Loss of the ASH1L gene has been shown to impair embryonic and postnatal brain development, as well as the formation of neuronal networks in the developing hypothalamus, which is required for optimal feeding behaviors and initial postnatal growth 32, 33 . At least seven ASH1L gene mutations were found to be associated with ASD. Some ASH1L gene mutations linked to ASD induce a single amino acid alteration in the lysine-specific methyltransferase 2H enzyme, while others disrupt genetic material in the ASH1L gene sequence or produce an early stop signal, rendering the enzyme inactive and affecting neural connection. For example, when alanine-to-serine and arginine-to-glutamine missense mutations occur in the ASH1L gene, the production of alanine, a source of energy in the CNS, is impaired, and the body's metabolism is also affected because arginine production is hindered.

Chromodomain helicase DNA binding protein 2 (CHD2)
The CHD2 gene codes for chromodomain helicase DNA-binding protein 2 (CHD2), which controls gene activity through chromatin remodeling and is present in all cells of the body 34 . This protein plays a significant role in the function and growth of neurons in the brain. It is proposed that the protein translated from the mutant CHD2 gene enhances neuronal excitability, through its action on GABAergic neuron excitability or other electrophysiological channels, and causes convulsions, while having little or no effect on RNA expression levels 35 . CHD2-related neurodevelopmental disorders are autosomal dominant disorders caused by a de novo pathogenic variant. Twenty-seven CHD2 gene variants linked to ASD were investigated. By functionally disrupting the CHD2 protein, these mutations may cause moderate alterations in the expression of multiple genes, all of which affect brain development and increase the likelihood of ASD. For example, valine is used for energy production, while serine helps in the biosynthesis of purines and pyrimidines. Frameshift mutations at valine and serine codons in the CHD2 gene result in a stop codon that completely disrupts chromatin remodeling and can possibly lead to ASD.

Dual specificity tyrosine phosphorylation regulated kinase 1A (DYRK1A)
DYRK1A encodes a kinase enzyme that phosphorylates proteins, which helps regulate the proliferation and differentiation of cells and plays an important role in the development of the nervous system 31 .
Dendritic spines help convey nerve impulses and facilitate communication between neurons, and the DYRK1A enzyme is involved in their formation and maturation from dendrites in neurons 36 . A nonsense mutation results in the synthesis of a C-terminally truncated kinase domain protein (DYRK1A-E396ter), which is then processed by the proteasome into an inactive form, demonstrating overall loss of function of DYRK1A 37 . Anxiety, microcephaly (an abnormally small head), intellectual disabilities, and speech issues are other specific traits seen in ASD patients with DYRK1A gene mutations 38 .
These individuals may also experience chronic seizures (epilepsy), distinctive facial features, poor muscular tone (hypotonia), foot deformities, and walking impairments 38 . At least twenty-five DYRK1A gene mutations associated with ASD were identified. Dysfunction or absence of the DYRK1A enzyme causes aberrant regulation of gene expression and affects normal brain development.
For example, a frameshift mutation in the DYRK1A gene changes a normally coding tyrosine codon to a termination codon, which can lead to production of a truncated protein that in turn can have many detrimental effects on the brain.

Pogo transposable element derived with ZNF domain (POGZ)
POGZ is a zinc finger protein that can bind to chromatin for chromatin remodeling owing to its structure, which consists of a unique pattern of amino acids and one or more zinc ions. The folded form of the zinc finger domain stabilizes the protein and allows it to interact with many other molecules 39 . The POGZ protein modulates gene expression, which is critical for brain development, by interacting with the SP1 transcription factor, heterochromatin protein 1 (HP1), and chromodomain helicase DNA-binding protein 4 (CHD4) and by acting as a chromatin regulator 40,41 . The large majority of previously reported de novo POGZ gene mutations in patients with neurodevelopmental disorders (NDDs) are nonsense and frameshift mutations dispersed between the C2H2 Zn finger and centromere protein-B-like DNA-binding (CENP-DB) domains, as well as within the CENP-DB domain itself. Twenty-two pathogenic mutations in the POGZ gene were investigated; these may result in a POGZ protein with reduced chromatin binding ability, causing aberrant chromatin remodeling and eventually compromising the normal expression of genes essential in brain development 42 . Mutations in the 5' UTR and frameshift mutations in the POGZ gene can affect expression level and mRNA translation kinetics by producing DNA binding elements or cis-regulatory motifs.

SH3 and multiple ankyrin repeat domains 3 (SHANK3)
The SHANK3 protein is vital to the working of synapses because it acts as a platform that maintains the connections between neurons, ensuring that signals sent by one neuron are recognized by another 43 . Several missense mutations have been reported in the N-terminal region of SHANK3, which comprises the Shank/ProSAP N-terminal (SPN) domain and a set of ankyrin (Ank) repeats 44 . These mutations impact the Abi1 binding site in SHANK3, which is the core location of the SHANK protein, and missense mutations cause alterations outside of established interacting motifs in the C-terminal region of SHANK proteins, making it impossible to evaluate the precise role of these mutations 44 . In the N-terminal motif of Shank/ProSAP (SPN), which is essential in dendritic spine growth and maturation, seven ankyrin repeats (Ank) have been identified as a hotspot for missense mutations 45 . It has already been reported that disruption in communication between neurons contributes to the development of ASD 46 . At least 24 SHANK3 gene mutations associated with ASD were identified, most of which impair SHANK3 protein function or synthesis. Examples include a mutation of proline to alanine, which impedes the assembly of several proteins and the rate of peptide bond production by the ribosome; synonymous coding (cds-synon) mutations of arginine and tyrosine; a frameshift mutation of glutamic acid into a stop codon, which impairs memory and concentration; and a missense mutation of isoleucine into phenylalanine, which may directly affect gene function.

Synaptic Ras GTPase activating protein 1 (SYNGAP1)
The SynGAP protein, encoded by the SYNGAP1 gene, is located in synapses in the brain and regulates important biochemical signaling pathways that enable learning and memory 47 . SynGAP regulates synapse adaptations and promotes proper brain architecture by linking nerve cells, and it is especially vital during a crucial stage of early brain development that influences long-term psychological capability 48 . Pre-mRNA splicing in the SYNGAP1 gene depends on the concordance of "cis" sequences that define exon-intron boundaries, which is required for proper protein translation, and of regulatory sequences recognized by the splicing machinery. Improper exon and intron recognition caused by point mutations at these consensus sequences may lead to the development of an abnormal transcript of the affected gene 45 . At least thirty SYNGAP1 gene mutations associated with ASD were identified; these mutations hamper or completely block the function or production of the SynGAP protein. These mutations may be the underlying cause of behavioral impairments associated with ASD. For example, frameshift mutations that convert codons for amino acids (tryptophan, leucine, valine, and arginine) to stop codons, as well as mutations affecting the normal coding of the branched-chain amino acids leucine, isoleucine, and valine, which have multiple functions in the brain, can affect synaptic plasticity.
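The dependence of splicing on boundary consensus sequences can be illustrated with the canonical GT-AG rule, under which most introns begin with GT and end with AG on the coding DNA strand. The sketch below checks only those two dinucleotides and ignores the wider consensus and the regulatory sequences mentioned above. The toy gene is invented and is not SYNGAP1 sequence, and the function names are hypothetical.

```python
def introns_between(gene, exons):
    """Return the intron sequences lying between consecutive exons.

    gene:  genomic DNA string on the coding strand.
    exons: sorted list of (start, end) half-open, 0-based coordinates.
    """
    return [
        gene[prev_end:next_start]
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def intron_is_canonical(intron):
    """Check the GT...AG dinucleotides at the intron boundaries."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Toy gene: exon 1 (ATGGCC), one intron (GTAAGTTTTCAG), exon 2 (TTACGA).
GENE = "ATGGCC" + "GTAAGTTTTCAG" + "TTACGA"
EXONS = [(0, 6), (18, 24)]

# Hypothetical point mutation G>A at the first base of the donor site.
MUTANT = GENE[:6] + "A" + GENE[7:]

for label, sequence in [("reference", GENE), ("mutant", MUTANT)]:
    for intron in introns_between(sequence, EXONS):
        print(label, intron, intron_is_canonical(intron))
```

In this toy case a single base change at the donor site removes the canonical GT, which is the kind of boundary disruption that can lead to an abnormal transcript.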

CONCLUSION
The relevance of SNPs in ASD-associated genes as risk factors was investigated in this study using in silico analysis. Mutations in the ADNP, ARID1B, ASH1L, CHD2, and POGZ genes were identified as disrupting the development and functioning of the brain by altering gene regulation and causing abnormal chromatin remodeling. Many of the SNP mutations studied introduce a premature stop signal, resulting in a nonfunctional, exceptionally short protein. Furthermore, mutations in two other ASD-associated genes, SHANK3 and SYNGAP1, disrupt the function of synapses and, as a result, cell-to-cell communication. These mutations may cause changes in synaptic adaptation, which may exacerbate the behavioral problems associated with ASD. These SNPs can act as biomarkers and provide insight into the etiology of ASD. In addition, the SNPs identified above may aid in the development of medical interventions for ASD. Further examination and validation of the identified SNPs will be required to assess their clinical importance and their applicability in translational research as novel targets.