Transcriptomic profiling of mice brain under Bex3 regulation

BEX family genes are expressed in various tissues and play significant roles in neuronal development. In this study, a Bex3 knock-out mouse model was generated using the CRISPR-Cas9 system, and transcriptomic analysis of the brain was performed to identify genes and pathways under Bex3 regulation. Essential biological functions related to brain development were found to be under the control of Bex3. Ninety-five genes were differentially expressed in Bex3−/− mice, of which 53 were down-regulated and 42 up-regulated. Among the down-regulated genes, LOC102633156 is a member of the zf-C2H2 family, Xlr3a is an X-linked lymphocyte-regulated gene, LOC101056144 is a hippocampus-related gene, and 2210418O10Rik and Fam205a3 are cortex-related genes. Among the up-regulated genes, Zfp967 is a zinc finger protein, Tgtp2 is a T cell-specific regulator, Trpc2 is a neuron-related gene, and Evi2 is related to NF1. A total of 34 KEGG disease terms were identified in Bex3−/− mice; the most prominent was non-syndromic X-linked mental retardation, in which Fgd1 is enriched. Similarly, the IRF, MBD, SAND, zf-BED, and zf-C2H2 transcription factor families were significantly enriched. Further study is required to confirm and explain each of the findings reported here.


Introduction
To date, five BEX genes have been recognized in the human (BEX1-5), chimpanzee, and mouse (Bex1-4 and Bex6) genomes, whereas four have been recognized in the rat genome (Bex1-4). According to the phylogenetic grouping, both mice and rats lack the Bex5 member of the Bex gene family. Except for Bex6, which is found on chromosome 16 of the murine genome, all other members are located on the X chromosome (Alvarez et al., 2005). These genes exhibit moderate sequence homology and are predominantly expressed in the brain (Brown and Kay 1999). A notable characteristic of the BEX genes is their high expression in the mouse brain; they account for more than 12% of the rat brain's expressed sequence tags (Brown and Kay 1999;Alvarez et al., 2005). BEX proteins have roles in transcriptional control and in signalling pathways involved in neurodegenerative diseases, the cell cycle, and tumour growth (Naderi et al., 2012;Zhou et al., 2013;Meng et al., 2014;Fernandez et al., 2015).
BEX3 is a pro-apoptotic agent facilitated by the neurotrophic receptor p75NTR (Mukai et al., 2003). Others reported a reduction of tumour formation in human breast cancer in BEX3 xenograft mouse models (Tong et al., 2003). Besides, BEX3-induced downregulation of DRG-1 stimulated PC12 cell proliferation indicates its function in tumour suppression (Yu et al., 2006).
BEX3 is universally expressed in all human tissues (Uhlen et al., 2015). It has been suggested that BEX3 may be involved in various types of cancer (Krizman et al., 1999). The tumour suppressor role of the BEX3 protein was discovered in a xenograft MDA-MB-231 cancer cell model in mice, in which overexpression of mBex3 significantly reduced tumour formation compared with control cells (Tong et al., 2003). This pro-apoptotic characteristic of mBex3 was attributed to its interaction with p75NTR (Mukai et al., 2000). However, since the p75NTR interaction was not seen in human models, and rat Bex3 interacted with TrkA instead, the function of Bex3 remains controversial (Tong et al., 2003;Alvarez et al., 2005;Calvo et al., 2015). In recent years, with the rapid development of high-throughput sequencing technology, RNA sequencing has been widely used for transcriptomic analysis (Denoeud et al., 2008). Previously, we applied this technology to study the lung and brain transcriptomes under the control of different genes (Oo et al., 2020;Sah et al., 2020). In this study, a Bex3 KO mouse model was generated using the CRISPR-Cas9 system, and brain transcriptome analysis was performed. We found that several brain-related terms were enriched, indicating a potential functional role of Bex3 in the brain.

Housing of the animals
The study was approved by the Institutional Animal Care Committee and the Animal Experimental Ethics Committee of NENU (approval number NENU/IACUC, AP2018011). All NIH (USA) recommendations for the use of laboratory animals were strictly followed. Mice were housed in IVC cages (5-6 per cage) on a 12 h/12 h light-dark cycle in a pathogen-free environment with free access to food and water.

sgRNA design and plasmid construction
The CRISPR-Cas9 system relies on a 20-nucleotide guide sequence, carried by the single guide RNA (sgRNA), that is complementary to the target site (Cong et al., 2013) and is followed in the genome by a three-nucleotide motif, NGG, termed the protospacer adjacent motif (PAM) (Jinek et al., 2013). The Benchling tool (https://benchling.com) was used to design the sgRNAs. The sgRNAs and their respective positions on the genomic locus are shown in Table 1. The sgRNAs were checked for off-target effects using the NCBI BLAST tool (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The plasmid pX330 (px330-U6-Chimeric_BB-CBhSpCas9, obtained from Feng Zhang through Addgene, plasmid# 42230) was linearized using BbsI. Two pairs of sgRNA oligonucleotides (Bex3_T1F/Bex3_T1R and Bex3_T2F/Bex3_T2R) were annealed and ligated into the gel-purified pX330 plasmid. The ligation was confirmed by both BbsI restriction digestion and Sanger sequencing (Fig S1). The resulting plasmids were named Bex3_T1 and Bex3_T2. Potential off-target sites were identified for each sgRNA; primers were designed to amplify the respective fragments by PCR, and the products were sequenced (Figures S2 and S3).
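The targeting rule described above, a 20-nucleotide protospacer immediately followed by an NGG PAM, can be illustrated with a short sketch. It scans only the forward strand of a made-up sequence; the function name `find_protospacers` and the example sequence are hypothetical and are not the Bex3 sgRNAs listed in Table 1, and the sketch does not reproduce how Benchling selects or scores guides.

```python
import re

def find_protospacers(seq, guide_len=20):
    """Return (start, protospacer, pam) for each guide_len-nt site followed by NGG.

    Only the forward strand is scanned; a real design tool would also scan
    the reverse complement and score potential off-targets.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical 30-nt sequence, for illustration only.
example = "ATGCATGCATGCATGCATGCAGGTTTTTTT"
for start, protospacer, pam in find_protospacers(example):
    print(start, protospacer, pam)
```

Using a lookahead rather than a plain match keeps overlapping candidates, which matters because adjacent NGG motifs often yield guides shifted by a single nucleotide.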

Microinjection and genotyping
All preparatory steps were completed beforehand. As many eggs (oocytes) as possible were collected from each mouse into plates containing M2 medium (Figure 1A). The oocytes were incubated at 37 °C for 2-4 h in M16 medium (Sigma, USA). The plasmid vectors were then injected directly into the pronuclei of each oocyte following the standard protocol (Figure 1B). After injection, the oocytes were incubated for 4-6 h at 37 °C and transferred directly to pseudo-pregnant CD1 females (Figure 1C). An Olympus IX71 inverted microscope and a Narishige microinjector were used.
Bex3-specific primers flanking the sgRNA target sites upstream and downstream were designed manually. PCR was carried out according to the published protocol (Oo et al., 2020). Gels were imaged with a GelDoc system (Malumbres et al., 1997). The transgenic littermates were separated accordingly. The chimaeras were obtained by crossing the littermates.

Gene expression analysis
Total RNA was extracted from the brains of Bex3−/− and WT mice using Trizol reagent (Takara, Dalian, China) according to the manufacturer's instructions. The cDNA was synthesized using a reverse transcription kit following the manufacturer's instructions (Takara, Dalian). RT-qPCR was conducted in triplicate with SYBR green mix (Takara, Dalian, China), following previous protocols (Oo et al., 2020). GraphPad Prism 8 was used to draw the graphs.

Library preparation for transcriptome analysis
Total RNA was extracted from the brains of Bex3−/− and wild-type mice (n=3 each; pooled transcripts) and then treated with DNase I to remove genomic DNA contamination. The quality of the extracted RNA was tested on a NanoDrop One spectrophotometer (ThermoFisher Scientific, USA) and an Agilent 2100 Bioanalyzer (Santa Clara, USA). One μg of RNA from each sample was used for the RNA-seq library. RNA-seq was conducted on the BGISEQ-500 platform with paired-end reads. Clean reads were aligned to the mouse genome (GCF_000001635.26_GRCm38.p6) using the Bowtie2 tool (Langmead and Salzberg 2012).

GO analysis
To perform GO enrichment analysis, all DEGs were assigned to terms in the Gene Ontology (GO) database (http://www.geneontology.org), and the phyper function in R was used. The number of genes in each term was computed, and the hypergeometric test was employed to find GO terms significantly enriched in DEGs compared with the background of all genes in the reference species. Bonferroni correction was applied to the p-values (Abdi 2005). GO terms with a Q value (corrected p-value) < 0.05 were considered significantly enriched in DEGs.
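The hypergeometric enrichment test and Bonferroni correction described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function names and all counts except the 95 DEGs are hypothetical. The upper-tail probability computed here corresponds to what R's `phyper(k - 1, n, M - n, N, lower.tail = FALSE)` returns.

```python
from math import comb

def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k): N DEGs drawn from M background genes, n of which carry the term."""
    total = comb(M, N)
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical counts for a single GO term; only the 95 DEGs come from the text.
background_genes = 20000   # assumed size of the annotated background
term_genes = 200           # assumed number of background genes in the term
degs = 95
degs_in_term = 6           # assumed number of DEGs annotated to the term

p = hypergeom_upper_tail(degs_in_term, background_genes, term_genes, degs)
print(p)
```

In a full analysis one p-value would be computed per GO term, and `bonferroni` would be applied to the whole list before comparing each corrected value with 0.05.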

KEGG analysis
KEGG pathway classification was conducted by assigning all DEGs to the KEGG pathway database (www.genome.jp/kegg). The phyper function in R was applied to perform enrichment analysis on the annotated KEGG pathway classification. Pathways with a Q value ≤ 0.05 were defined as significantly enriched in DEGs (Kanehisa et al., 2008).

Statistical analysis
All results are shown as mean ± standard error of the mean (SEM). A P value < 0.05 (unpaired Student's t-test) was considered statistically significant. All graphs were prepared with GraphPad Prism 8 for Mac (GraphPad Software).
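As an illustration of the statistics named above, the following sketch computes the mean, the SEM, and the unpaired equal-variance Student's t statistic for two groups. The triplicate values are invented for the example and are not data from this study; the sketch stops at the t statistic and degrees of freedom, since obtaining the P value requires the t distribution (available, for example, through `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, stdev, variance

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def student_t(a, b):
    """Unpaired, equal-variance Student's t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

# Hypothetical relative expression values for triplicate samples.
wt = [1.00, 1.10, 0.95]
ko = [0.20, 0.25, 0.15]

print(f"WT: {mean(wt):.3f} ± {sem(wt):.3f}")
print(f"KO: {mean(ko):.3f} ± {sem(ko):.3f}")
print(student_t(wt, ko))
```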

Strategy of Bex3−/− mouse generation and screening of mutants
Bex3 is a small gene of 1726 bp located on the X chromosome. It has three exons, of which only the third codes for protein; the third exon was therefore targeted. Genotyping primers were designed to flank the sgRNA target sites upstream and downstream. A detailed sketch is provided in Figure 2. The pX330 plasmid was used as the vector; its map and design are provided in Figure S5. Six pups were obtained 19 days after oocyte transfer. Fifteen days after birth, the pups were weaned and toe-clipped for labelling. The same biopsies were used for genotyping following standard protocols. The primers used are shown in Table 2. Two transgenic mice were obtained successfully. The deleted (knocked-out) fragment was confirmed by Sanger sequencing following standard protocols. These two mice were mated with wild-type mice of the C57BL/6 background to segregate the alleles. F2 progeny were used as experimental animals.

Brain transcriptome study under Bex3 regulation
The cDNA libraries were constructed from brain mRNA of seven-week-old Bex3−/− and WT mice in three replicates (n=3), and RNA-seq was performed on the BGISEQ-500 platform. After adapter sequences and low-quality reads were filtered out, each sample yielded an average of 1.18 Gb of data, corresponding to 23.55M and 23.56M reads for Bex3−/− and WT, respectively, with Q30 base percentages of 89.33% and 90.59% and Q20 percentages of 97.43% and 89.77% (Table 3). The clean reads were aligned to the mouse reference genome (GCF_000001635.26_GRCm38.p6) using Bowtie2 (Langmead and Salzberg 2012), and the mapping efficiency of the clean reads was assessed. Transcript expression levels were computed using RSEM (Li and Dewey 2011). A total of 18,128 genes were detected. The detected genes were grouped into three categories by FPKM (fragments per kilobase of transcript per million mapped fragments): FPKM ≤ 1, FPKM 1-10, and FPKM ≥ 10 (data not shown). In the first category, 5362 genes were expressed in Bex3−/− and 5311 in WT mice; in the second category, 6049 genes were expressed in Bex3−/− and 6017 in WT mice; in the third category, 6717 genes were expressed in Bex3−/− and 6800 in WT mice, with WT mice serving as the control.
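The FPKM measure and the three expression categories described above can be sketched as follows. This is only an illustration: the study obtained expression values from RSEM, the gene names and counts below are invented, and the handling of the category boundaries (a value of exactly 1 goes to the low bin, exactly 10 to the high bin) is an assumption, since the text does not specify it.

```python
from collections import Counter

def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)

def expression_bin(value):
    """Assign an FPKM value to one of three bins (boundary handling assumed)."""
    if value <= 1:
        return "FPKM <= 1"
    if value < 10:
        return "1 < FPKM < 10"
    return "FPKM >= 10"

# Hypothetical genes: (fragment count, transcript length in bp).
genes = {"geneA": (100, 2000), "geneB": (5, 1500), "geneC": (4000, 1000)}
total_mapped = 20_000_000  # assumed number of mapped fragments in the sample

bins = Counter(
    expression_bin(fpkm(count, length, total_mapped))
    for count, length in genes.values()
)
print(bins)
```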
Genes whose expression differed significantly between Bex3−/− and WT mice are termed differentially expressed genes (DEGs). DEGs were identified as described previously (Audic and Claverie 1997). A volcano map of the expressed genes is provided (Figure S4). The criteria for identifying DEGs were an absolute log2 fold change > 1 and an FDR ≤ 0.001. According to these criteria, a total of 95 unigenes were differentially expressed between Bex3−/− and WT, of which 42 were upregulated and 53 were down-regulated. Highly significant DEGs are provided in Table S1. A heat map of the DEGs is shown in Figure 3. The colour intensity from 0.0 to 6.0 indicates the gene expression level; WT and Bex3−/− are compared side by side. Moreover, the statistics of the DEGs are provided in the bar graph, in which the red bar shows the upregulated genes and the blue bar shows the down-regulated genes.
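The DEG criteria can be expressed as a small filter. The sketch below assumes that the fold-change threshold applies to the absolute log2 value (since both up- and down-regulated genes are reported) and that the FDR value of 0.001 is an upper bound; the function name is hypothetical, the log2 FC values are taken from the Discussion, and the FDR values are invented for illustration.

```python
def classify_deg(log2_fc, fdr, min_abs_log2_fc=1.0, max_fdr=0.001):
    """Label a gene as 'up', 'down', or 'not DEG' under the assumed thresholds."""
    if fdr > max_fdr or abs(log2_fc) <= min_abs_log2_fc:
        return "not DEG"
    return "up" if log2_fc > 0 else "down"

# log2 FC values from the text; FDR values are hypothetical.
examples = {
    "Tgtp2": (8.1, 1e-5),
    "Xlr3a": (-6.75, 1e-5),
    "hypothetical_gene": (0.5, 1e-5),
}
for gene, (log2_fc, fdr) in examples.items():
    print(gene, classify_deg(log2_fc, fdr))
```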

GO enrichment analysis
Gene Ontology (GO) is a standard gene function classification system that systematically describes the attributes of genes and their products across organisms. GO annotation has three subcategories: 1) biological process, 2) cellular component, and 3) molecular function (Ashburner et al., 2000). The principal molecular functions identified were GTPase activity, GTP binding, organic acid binding, etc. (Figure 4A). Under biological process, the enriched terms were cellular response to interferon, defence response to protozoan, adhesion of symbiont to host, cellular response to interferon-gamma, defence response, etc. (Figure 4B). The most enriched cellular components were symbiont-containing vacuole membrane, haemoglobin complex, Golgi apparatus, etc. (Figure 4C). Other important GO terms, including 'cation channel complex', 'mesaxon', and 'neuronal ribonucleoprotein', may also be essential to brain function. We found that most of the GO terms are membrane- and extracellular-related, indicating that Bex3 may be necessary for the development of neural networks and for extracellular signalling.

KEGG pathway analysis
KEGG pathway analysis is used to infer the biological functions of gene networks. Here, KEGG pathways are classified into five categories: 1) cellular processes, 2) environmental information processing, 3) genetic information processing, 4) metabolism, and 5) organismal systems.
In the KEGG classification, 31 of the 95 DEGs regulated in Bex3−/− mice were assigned to multiple pathways from different categories. KEGG pathway enrichment analysis was then conducted to find pathways significantly enriched in DEGs relative to the whole-genome background. Figure 5 shows the five categories (marked with colours) with selected items. The most significantly enriched pathways under the defined criteria were the TNF signalling pathway, the NOD-like receptor signalling pathway, lysine degradation, cytokine-cytokine receptor interaction, etc. (Figure 6). A brief description is provided in Table S2.

KEGG disease-associated pathway analysis
The identified DEGs were subjected to KEGG disease enrichment analysis. A brief description is provided in Table S3. The KEGG diseases enriched in Bex3−/− mice were thalassemia, sickle cell anaemia, transient neonatal diabetes mellitus, postaxial polydactyly, non-syndromic X-linked mental retardation, and others (Figure 7).

Encoding transcription factor proteins
Transcription factors (TFs) are essential regulatory proteins that play a role in multiple biological processes. The TF families enriched in Bex3−/− mice were interferon regulatory factors, the methyl-CpG binding domain, the SAND DNA-binding protein domain, the BED zinc finger, and C2H2-type zinc finger transcription factors (Figure 8).

Validation of DEGs via qRT-PCR
To verify the reliability of the RNA-seq data, several DEGs were randomly selected and subjected to qRT-PCR analysis. The expression levels of these genes were consistent with the RNA-seq results. The primers are provided in Table S4. The selected genes were Tmem254c, Slc24a5, Xntrpc, Evi2, Tmem81, Tmsb151, and Trpc2. qRT-PCR was performed as explained earlier (Oo et al., 2020). The results are shown in Figure 9.

Discussion
RNA-guided genome manipulation based on the type II prokaryotic CRISPR/Cas system can be used effectively for gene modification (Jinek et al., 2013;Shen et al., 2013). Guided by an sgRNA, Cas9 can be programmed to catalyze a double-strand break (DSB) at any target site defined by the sgRNA sequence and followed by a PAM (Jinek et al., 2013). Constructs comprising sgRNA and Cas9-encoding mRNA have been injected directly into embryos to quickly generate transgenic mouse models with various altered alleles (Shen et al., 2013;Wang et al., 2013). CRISPR-Cas9 thus holds great promise for editing organisms that are otherwise genetically intractable. In this study, a Bex3−/− mouse model was generated using the CRISPR-Cas9 system. The target gene was selected from the brain-expressed X-linked (Bex) gene family on the basis of its importance in multiple physiological functions. All Bex members consist of three exons, of which only the last codes for protein (Fernandez et al., 2015). Two loci were targeted with sgRNAs to delete the coding exon. The brain transcriptome was then studied by RNA-seq. A total of 95 unigenes were recognized as differentially expressed, of which 42 were upregulated and 53 were down-regulated.
Tgtp2, a T cell-specific GTPase, is highly upregulated (log2 FC 8.1). Tgtp2 was found to be upregulated in spermatocyte-derived GC-2spd(ts) cells under controlled conditions (Kurihara et al., 2017). Trpc2, a TRPC subfamily member, is highly upregulated, with a log2 FC value of 6.4. TRPC2 is highly restricted to the dendritic tip of vomeronasal sensory neurons (Zufall et al., 2005). The gene has been established to play a role in mature sperm and the vomeronasal sensory system (Yildirim and Birnbaumer 2007). Evi2 is also highly upregulated, with a log2 FC value of 4.90. EVI2 has been proposed as a possible candidate gene in neurofibromatosis type 1 (NF1), one of the most common inherited human disorders. The potential role of Evi-2 in murine neoplastic disease and the map location of the human homolog indicate a possible role for EVI2 in NF1. Another highly upregulated gene, Gm45935 (log2 FC 5.88), is involved in the lysine degradation metabolic pathway. Zbp1 is moderately upregulated, with a log2 FC value of 3.9. ZBP1 has been shown to promote translocation of the β-actin transcript to actin-rich protrusions in primary fibroblasts and neurons (Huttelmaier et al., 2005). Ccl21a is upregulated with a log2 FC value of 3.6. The CC chemokine CCL21 is a strong chemoattractant for lymphocytes and dendritic cells in vitro (Chen et al., 2002). Other researchers demonstrated a specific action of the chemokines CCL19 and CCL21 that provides a unique model for studying HIV-1 latency in vitro (Saleh et al., 2007). Gbp4, with a log2 FC value of 2.8, is a multifunctional gene. Gbp chr3-deficient mice were highly susceptible to T. gondii infection, with a high parasite load in immune organs (Yamamoto et al., 2012).
Several important genes are also found among the down-regulated DEGs. Xlr3a, with a log2 FC value of -6.75, belongs to a new subfamily of the Xlr multigene family. Like Xlrl, members of this subfamily are upregulated during B-cell terminal differentiation in normal and neoplastic B cells and cross-hybridize with a message in testis RNA (Bergsagel et al., 1994). Evi2b is down-regulated, with a log2 FC value of -4.7. EVI2B is a transmembrane protein universally expressed in hematopoietic cells. It was initially discovered as a common viral integration site in murine retrovirally induced leukaemias, indicating that Evi2b might be a proto-oncogene (Kaufmann et al., 1999). Cyren is down-regulated, with a log2 FC value of -3.14. CYREN (cell cycle regulator of NHEJ) has been shown to be a cell-cycle-specific inhibitor of cNHEJ (Arnoult et al., 2017). Xlr3, an important gene, is down-regulated, with a log2 FC value of -6.7. A cluster of X-linked genes has been identified that shows transcriptional repression of parental alleles in the developing brain. Imprinting of these three genes, Xlr3b, Xlr4b, and Xlr4c, is independent of X-chromosome inactivation and shows dynamic and complex tissue and temporal specificity (Raefski and O'Neill 2005).
Figure 5. KEGG pathway classification of DEGs. The Y-axis denotes the category of the KEGG pathway, and the X-axis represents the number of genes assigned to each class. Blue indicates cellular processes, orange indicates environmental information processing, sea blue indicates genetic information processing, yellow indicates metabolism, and green represents organismal systems.
Interferon regulatory factors are proteins that regulate the transcription of interferons. Irf7 (log2 FC value 1.2) is a multifunctional transcription factor. Aberrant production of type I IFNs is associated with many diseases, such as cancer and autoimmune disorders (Tirone et al., 2021). Setdb2, a member of the methyl-CpG binding domain family, is responsible for multiple functions, including oncogenic roles. Zbed6, a member of the BED zinc finger family, is responsible for several physiological functions. Seven transcription factors were enriched in the C2H2-type zinc finger family. Under biological process, the enriched terms hypothetically related to brain development and function include cerebellar granular layer development and olfactory bulb mitral cell layer development. Other important GO terms, including 'cation channel complex', 'mesaxon', and 'neuronal ribonucleoprotein', may also be necessary for brain physiology. GO analysis showed that several terms are membrane- and extracellular-related; Bex3 may therefore be essential for the development of neuronal networks and for extracellular signalling (Calvo et al., 2015;Navas-Pérez et al., 2020). The TNF signalling pathway, the NOD-like receptor signalling pathway, lysine degradation, cytokine-cytokine receptor interaction, etc. were the most enriched KEGG pathways. Thalassemia, sickle cell anaemia, transient neonatal diabetes mellitus, postaxial polydactyly, non-syndromic X-linked mental retardation, and others were the KEGG diseases enriched in Bex3−/− mice.

Conclusion
A Bex3 KO mouse model was generated in this study. No phenotypic differences were observed between the transgenic and normal mice. Transcriptome analysis of the brain was performed. Several DEGs identified in this study are crucial for brain physiology and for the proper function of neurons. Ontology analysis indicated a role for Bex3 in the nervous system as well as in the immune system. Further study using the respective markers is required to confirm these results.

Figure S2. Off-target loci for each sgRNA and primer sequences for PCR amplification.
The CRISPR-Cas9 system can potentially cut undesired loci; such events are termed off-target breaks. To identify the off-target effects of the sgRNAs for Bex2, sites with similarity scores of more than 15/15 were selected for PCR amplification and Sanger sequencing. Since the off-target similarity scores were less than 15/15 for Bex3, PCR amplification and Sanger sequencing were skipped; instead, deletion of the fragment was verified at the mRNA level by RT-qPCR in comparison with WT mice (n=3).
Figure S3. Off-target analysis. The potential off-target loci were PCR amplified and Sanger sequenced. The chromatogram showed no off-target breaks.
Figure S4. Volcano map of the genes expressed, Bex3−/− brain vs WT brain. Red dots indicate up-regulated genes, green dots indicate down-regulated genes, and grey dots indicate non-DEGs.
Note. Several genes are involved in multiple pathways. Here, only the primary involvement is shown.