Exploration of the involvement of LncRNA in HIV-associated encephalitis using bioinformatics

Background HIV-associated encephalitis (HIVE) is a common complication of HIV infection. Its pathogenesis remains unclear, and lncRNA might be involved in it. In this study, we re-annotated the expression profile of a HIVE study from a public database, identified the lncRNAs that might be involved in HIVE, and explored the possible mechanism. Methods The microarray expression profile GSE35864, covering three brain regions (white matter, frontal cortex and basal ganglia), was chosen from the GEO public database, and updated annotation was applied to construct the non-coding RNA (ncRNA) microarray data. Morpheus was used to identify the differentially expressed ncRNAs, and GenBank of NCBI was used to identify lncRNAs. The StarBase, PITA and miRDB databases were used to predict the target miRNAs of the lncRNAs. The TargetScan, PicTar and MiRanda databases were used to predict the target genes of the miRNAs. GO and KEGG pathway analyses were used for functional analysis of the target genes. Results Seventeen differentially expressed lncRNAs were observed in the white matter, for which 352 target miRNAs and 6,659 target genes were predicted. The GO function analysis indicated that the lncRNAs were mainly involved in nuclear transcription and translation processes. The KEGG pathway analysis showed that the target genes were significantly enriched in 33 signaling pathways, of which 11 were clearly related to nervous system function. Discussion New and different microarray results can be obtained through updated annotation of the chips, and it is feasible to identify lncRNAs from ordinary expression chips. The results suggest that lncRNA may be involved in the occurrence and development of HIVE, which provides a new direction for further research on the diagnosis and treatment of HIVE.


INTRODUCTION
Cognitive impairment is one of the challenges that HIV patients may face (Clifford & Ances, 2013). HIV-1 crosses the blood-brain barrier into the central nervous system at the initial stage of infection, where it forms a site of viral replication isolated from the rest of the body (Stam et al., 2013). Before the introduction of highly active anti-retroviral therapy (HAART), many HIV patients soon developed severe cognitive impairment, known as HIV-associated dementia (HAD). Patients with HAD usually suffer from HIV-associated encephalitis (HIVE) (Masliah et al., 2000). Although HAART is now very effective, HIV-induced brain inflammation is still frequently found at autopsy, and neurocognitive test results are abnormal in most HIV patients (Clifford & Ances, 2013). Currently, no therapy can completely eradicate HIV, and anti-retroviral drugs can hardly cross the blood-brain barrier, so the central nervous system may become a viral reservoir that promotes the occurrence and development of HIVE (Kumar et al., 2007). At present, the pathogenesis of HIVE is not well understood. Studying the molecular signaling pathways involved in HIVE would be of significance for its prevention and treatment.
More than 70% of the human genome is transcribed into RNA, but less than 2% of it encodes proteins, so most transcripts are noncoding RNAs (Costa, 2010). Non-coding RNAs (ncRNAs) regulate the expression of their target genes through various pathways, and thus participate in various life processes of cells, tissues and organisms. According to their length, ncRNAs can be divided into small ncRNAs (<200 nt) and long ncRNAs (lncRNAs) (>200 nt). In recent years, studies have found interactions between RNAs of different lengths, especially among lncRNA, miRNA and mRNA, which form a lncRNA-miRNA-mRNA regulatory network. LncRNAs can act as ''molecular sponges'' of miRNAs: by binding to their target miRNAs, lncRNAs attenuate the ''silencing effect'' of those miRNAs on their target genes, and thereby regulate the target genes of the miRNAs (Salmena et al., 2011). This study aims to identify, from an expression profile, the lncRNAs that might be involved in HIVE and to explore the possible mechanisms.
In recent years, expression microarray technology has played an important role in research on the occurrence and development of disease, and many research results can be reviewed and downloaded from public databases. Because the annotation of microarrays is continuously updated, new results and novel insights can be obtained by re-annotating and re-analyzing published microarray studies. In this study, we retrieved the microarray data of a HIVE-related study from the GEO database, re-annotated and analyzed these data, identified the lncRNAs that might be involved in the pathogenesis of HIVE, and performed the correlation analysis (the workflow is shown in Fig. 1). We aimed to verify the feasibility of identifying lncRNAs from expression profiles in public databases and to explore the possible mechanisms by which lncRNAs participate in the pathogenesis of HIVE.

Microarray data
The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo), curated by the National Center for Biotechnology Information (NCBI), is a public functional genomics database whose data can be downloaded free of charge. From the GEO database, we chose the microarray expression profile GSE35864, which covers three regions of the brain (basal ganglia, white matter and frontal cortex) in normal subjects, HIV-infected patients, HIV-infected patients with neurocognitive impairment, and HIV-infected patients with both neurocognitive impairment and encephalitis. Twenty-four human subjects in four groups were examined: Group A (n = 6) HIV-1 uninfected with no neuropathological abnormalities at autopsy; Group B (n = 6) HIV-1-infected (HIV+) neuropsychologically normal with no neuropathology; Group C (n = 7) HIV+ with substantial HIV-associated neurocognitive impairment (HAND) as defined below, with no encephalitis (HIVE) or substantial neuropathological defect; Group D (n = 5) HIV+ with HAND and HIVE. RNA from neocortex, white matter, and neostriatum was processed with the Affymetrix Human Genome U133 Plus 2.0 Array platform.

Identifying differentially expressed lncRNAs
First, the latest annotation file of the Affymetrix Human Genome U133 Plus 2.0 Array (HG-U133_Plus_2 Annotations, CSV format, Release 36, 7/12/16) was downloaded from http://www.affymetrix.com/support/technical/annotationfilesmain.affx. It includes the probe set ID, gene symbol, gene title, Ensembl gene ID, RefSeq transcript ID and other probe-related information. The gene expression data of the chips were matched to the probe set IDs, and the probes were labeled with RefSeq transcript IDs through the NetAffx annotation. Probes whose RefSeq IDs carried the ''NR_'' logo were identified (NR representing non-coding RNA). Morpheus (https://software.broadinstitute.org/morpheus/) was used online to identify the differentially expressed ''NR_'' probes between the groups (Group A vs. Group D, Group B vs. Group D and Group C vs. Group D) in each tissue region (white matter, frontal cortex and basal ganglia). This analysis was based on the t-test and adjusted for the characteristic that the noise of microarray data is correlated with the peak value of the expression data. A p-value <0.01 was considered statistically significant (the number of upregulated and downregulated probes was 100). A Venn diagram was drawn from the intersection of the above results to select the differential ''NR_'' probes participating in the occurrence and development of HIVE. Pseudogenes, rRNAs, microRNAs and other short RNAs (including tRNAs, snRNAs and snoRNAs) were filtered out using GenBank from the NCBI database. The remaining transcripts were the differentially expressed lncRNAs.
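The probe selection and intersection steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the column names, the '' /// '' separator between multiple transcript IDs, the ''---'' placeholder for missing values, and all probe rows are assumptions made for illustration, and the per-comparison lists of significant probes stand in for the output of the Morpheus analysis.

```python
import csv
import io

# Hypothetical excerpt of a NetAffx-style annotation table. Column names,
# separators and rows are illustrative assumptions, not the real file.
ANNOTATION_CSV = """Probe Set ID,Gene Symbol,RefSeq Transcript ID
probe_1,GENE_A,NM_000001
probe_2,LNC_B,NR_000002
probe_3,GENE_C,NM_000003 /// NR_000004
probe_4,LNC_D,NR_000005 /// NR_000006
probe_5,LNC_E,NR_000007
"""


def nr_only_probes(csv_text):
    """Return probe IDs whose RefSeq transcript IDs all start with 'NR_'."""
    keep = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        ids = [t.strip() for t in row["RefSeq Transcript ID"].split("///")]
        ids = [t for t in ids if t and t != "---"]  # drop empty placeholders
        if ids and all(t.startswith("NR_") for t in ids):
            keep.add(row["Probe Set ID"])
    return keep


def shared_candidates(nr_probes, significant_by_comparison):
    """Intersect the significant NR_ probes across all group comparisons."""
    per_comparison = [
        nr_probes & set(hits) for hits in significant_by_comparison.values()
    ]
    return set.intersection(*per_comparison) if per_comparison else set()


nr_probes = nr_only_probes(ANNOTATION_CSV)

# Hypothetical probes with p < 0.01 in each comparison for one brain region.
significant = {
    "A_vs_D": ["probe_1", "probe_2", "probe_4", "probe_5"],
    "B_vs_D": ["probe_2", "probe_3", "probe_4"],
    "C_vs_D": ["probe_2", "probe_4", "probe_5"],
}
candidates = shared_candidates(nr_probes, significant)
print(sorted(candidates))
```

In this sketch a probe that maps to both an ''NM_'' and an ''NR_'' transcript is excluded, which is one possible reading of retaining probes labeled only with ''NR_''. The GenBank filtering of pseudogenes and short RNAs would follow as a separate manual or scripted step.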

Function of lncRNA target genes and pathway cluster analysis
The Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/) is an online program that provides a comprehensive set of functional annotation tools for researchers to understand the biological meaning behind large gene lists. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed for the predicted target genes using the DAVID database. In the GO functional enrichment analysis, biological process terms with a p-value <0.05 were considered statistically significant. In the KEGG analysis, pathways with a p-value <0.05 were considered statistically significant.
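As an illustration of what an enrichment p-value measures, the following sketch computes a plain upper-tail hypergeometric probability for one term. DAVID reports its own modified score, which this sketch does not reproduce; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    hits            -- genes in the list annotated to the term
    list_size       -- number of genes in the submitted list
    term_size       -- genes in the background annotated to the term
    background_size -- total number of genes in the background
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 50 listed genes fall in a term that covers
# 100 of 2,000 background genes (2.5 hits would be expected by chance).
p = enrichment_p_value(hits=8, list_size=50, term_size=100, background_size=2000)
print(f"p = {p:.3g}", "significant" if p < 0.05 else "not significant")
```

A term is then retained when its p-value falls below the chosen threshold, here 0.05 as in the text.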

Identifying differentially expressed lncRNAs
We re-annotated GSE35864 and preliminarily retained the 15,901 probes labeled only with the ''NR_'' logo in the RefSeq transcript ID. The Morpheus online tool was used to identify the differentially expressed probes with the ''NR_'' logo between the groups (Group A vs. Group D, Group B vs. Group D and Group C vs. Group D) in each tissue region (white matter, frontal cortex and basal ganglia). As shown in Fig. 2A, there were differentially expressed probes with the ''NR_'' logo between different groups in the white matter; among these, a total of 63 ncRNAs that may be involved in the occurrence and development of HIVE were identified. As shown in Fig. 2B, there were differentially expressed probes with the ''NR_'' logo between different groups in the frontal cortex, and the intersection of the comparisons was selected. Only one probe labeled with ''NR_'' was in the intersection. As shown in Fig. 2C, there were differentially expressed probes with the ''NR_'' logo between different groups in the basal ganglia, and the intersection of the comparisons was selected. Twelve probes labeled with ''NR_'' were in the intersection. All the differentially expressed ncRNAs with the ''NR_'' logo were searched in GenBank; 17 lncRNAs were identified in the white matter, and no differentially expressed lncRNAs were found in the frontal cortex or basal ganglia. These 17 lncRNAs were LINC00308, LOC100507387, SCOC-AS1, ALMS1-IT1, LINC00639, LOC101928847, LOC100134368, ZNF670-ZNF695, SHANK2-AS3, MEG9, SNHG7, TMEM44-AS1, LRRC8C-DT, MASP1, MAPT-AS1, TBX5-AS1 and LINC01770. As shown in Fig. 3, the expression of LINC00308, LOC100507387, SHANK2-AS3, SNHG7 and MAPT-AS1 was significantly increased in Group D, and that of the others was significantly decreased.

Prediction of target miRNAs of lncRNAs and target genes
The target miRNAs of the lncRNAs were identified from the databases, and a total of 352 target miRNAs were predicted for the 17 differentially expressed lncRNAs (Table 1). The 6,659 corresponding target genes of these miRNAs were predicted by TargetScan, PicTar and MiRanda.
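The two-step prediction (lncRNA to miRNA, then miRNA to gene) can be sketched as a chained lookup over several prediction sources. The text does not state how predictions from the different databases were combined, so the ''min_support'' rule below is an assumption, and every identifier in the example data is a placeholder rather than a real prediction.

```python
from collections import Counter


def consensus(source_predictions, min_support):
    """Keep targets reported by at least `min_support` prediction sources."""
    counts = Counter()
    for targets in source_predictions:
        counts.update(set(targets))  # count each source at most once
    return {t for t, n in counts.items() if n >= min_support}


def build_network(lnc_sources, mir_sources, min_support=2):
    """Map each lncRNA to {miRNA: set of target genes}."""
    lncrnas = set().union(*(src.keys() for src in lnc_sources))
    network = {}
    for lnc in sorted(lncrnas):
        mirnas = consensus([src.get(lnc, []) for src in lnc_sources], min_support)
        network[lnc] = {
            mir: consensus([src.get(mir, []) for src in mir_sources], min_support)
            for mir in sorted(mirnas)
        }
    return network


# Placeholder predictions from three hypothetical lncRNA-miRNA sources.
lnc_sources = [
    {"lncRNA_X": ["miR-a", "miR-b"]},
    {"lncRNA_X": ["miR-a", "miR-c"]},
    {"lncRNA_X": ["miR-a", "miR-b"]},
]
# Placeholder predictions from three hypothetical miRNA-gene sources.
mir_sources = [
    {"miR-a": ["GENE_1", "GENE_2"], "miR-b": ["GENE_3"]},
    {"miR-a": ["GENE_1"], "miR-b": ["GENE_4"]},
    {"miR-a": ["GENE_2", "GENE_1"]},
]

network = build_network(lnc_sources, mir_sources, min_support=2)
all_genes = set().union(*(g for m in network.values() for g in m.values()))
print(network)
print(sorted(all_genes))
```

Setting ''min_support'' to 1 gives the union of all sources, and setting it to the number of sources gives their strict intersection; the resulting gene set is what would be passed on to the GO and KEGG analyses.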

DISCUSSION
There are two main highlights in the research.
(1) The differentially expressed lncRNAs were identified through the re-annotation of published microarray results.
(2) The target miRNAs of the lncRNAs and their target genes were predicted using bioinformatics methods, and GO function and KEGG pathway analyses were performed to learn about the possible mechanisms by which lncRNAs are involved in the occurrence and development of HIVE.
In recent years, hundreds of lncRNAs have been discovered, and changes in the expression of lncRNAs have been associated with the occurrence and development of many diseases. Plenty of evidence has shown that lncRNA is involved in the replication process of viruses (Zhang et al., 2013) and that lncRNA is involved in the infection process of HIV through changes in the cellular environment (Barichievy, Naidoo & Mhlanga, 2015). However, the role of lncRNA in the occurrence and development of HIV-related encephalitis remains unclear. Identifying disease-related mRNAs, miRNAs and lncRNAs by microarray and bioinformatic methods has been applied in the study of many human diseases, and the same approach can be applied to HIV-related encephalitis. Because the annotation of microarrays is continuously updated, new results may be obtained by re-annotating and re-analyzing published chips in public databases. From the GEO public database, we retrieved a comprehensive HIVE-related microarray dataset (multiple groups and multiple tissue types), for which only differential analysis of mRNA expression in different brain tissues (white matter, frontal cortex and basal ganglia) of each group had been carried out. We re-annotated the microarray results and identified possible ncRNA probes to construct the ncRNA microarray data. We then compared and analyzed the results to identify differentially expressed ncRNAs that may be involved in the occurrence and development of HIVE.
We identified 63 probes with the ''NR_'' logo in the white matter, one probe with the ''NR_'' logo in the frontal cortex, and 12 probes with the ''NR_'' logo in the basal ganglia. All the probes with the ''NR_'' logo were retrieved, and only 17 of them, all in the white matter, were identified as lncRNAs. Among these 17 lncRNAs, the expression of five was increased in Group D, and that of the others was decreased. As we found differentially expressed lncRNAs only in white matter, we speculated that cerebral white matter lesions may play an important role in the pathogenesis of HIV-associated encephalitis, which is consistent with previous research. The central nervous system injury caused by HIV-1 usually manifests as microglial nodules comprised of multinucleated giant cells and inflammatory cells. These lesions are found particularly in white matter (Fischer-Smith et al., 2001). Neuronal damage in HIVE is generally attributed to fully activated microglia/macrophages, especially in white matter (Roberts, Masliah & Fox, 2004). In addition, multinucleated giant cells and perivascular demyelination leading to white matter pallor are typical features of HIVE. These lncRNAs could therefore be used as markers of white matter damage in HIVE. Based on the lncRNA-miRNA-mRNA mechanism, we used bioinformatics tools to predict the target miRNAs and target genes of these 17 lncRNAs. GO and KEGG analyses were carried out to perform cluster analysis on the target genes, in order to explore the potential mechanisms by which lncRNAs participate in HIVE. The GO cell component (CC) analysis revealed that the target genes were significantly clustered in the nucleus, cytoplasm, Golgi apparatus, lysosome, plasma membrane, etc.
The GO molecular function (MF) analysis showed that the target genes were significantly clustered in transcription factor activity, protein serine/threonine kinase activity, transcription regulator activity, ubiquitin-specific protease activity and other molecular functions. The GO biological process (BP) analysis revealed that the target genes were significantly clustered in the regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism, signal transduction, cell communication, transport and other biological processes. These results suggest that lncRNAs may be involved in the occurrence and development of HIVE through their participation in nuclear transcription and translation. The KEGG pathway analysis showed that the target genes were significantly clustered in mucin type O-glycan biosynthesis, proteoglycans in cancer, pathways in cancer, the glutamatergic synapse, long-term depression and other relevant pathways. Previous research has found that a variety of these pathways are involved in neurological disorders and even in the occurrence of HIVE. The glutamatergic synapse pathway is related to the occurrence of encephalitis, and patients with anti-NMDAR encephalitis had diminished function of the glutamatergic synapse (Hughes et al., 2010). Expression changes of the glutamatergic synapse in brain tissues were associated with the occurrence of hepatic encephalopathy (Montana, Verkhratsky & Parpura, 2014). For the axon guidance pathway, impaired expression of genes in this pathway and its downstream pathways (including the MAPK pathway, calcium signaling pathway, Jak-STAT signaling pathway and VEGF signaling pathway) has been reported in the brain tissues of patients with HIV-associated dementia, which provided new ideas for the diagnosis and treatment of HIV-associated dementia (Zhou et al., 2012).
Both the Rap1 signaling pathway and the Ras signaling pathway are involved in nervous system functions such as glutamatergic synaptic transmission (Imamura et al., 2003), synaptic excitability (Imamura et al., 2004) and synaptic reversibility (Masliah et al., 2004). Abnormal expression of these signaling pathways can cause encephalitis and other neuronal dysfunctions. Changes in the synaptic vesicle, especially the synaptic vesicle cycle, can cause abnormal neurotransmitter activity (Cortes-Saladelafont et al., 2016). For the ErbB signaling pathway, there were significant changes in the expression of its genes in the brain tissues of patients with HIV-associated dementia (Shityakov, Dandekar & Forster, 2015). The TGF-β signaling pathway is also involved in the pathophysiology of the nervous system, where it can limit inflammation and reduce neurological damage during nervous system infection (Cekanaviciute et al., 2014). The TGF-β signaling pathway is also related to the tolerance of dendritic cells (Esebanmen & Langridge, 2017). Together with the STAT2 signaling pathway, the TGF-β signaling pathway can inhibit the progression of autoimmune encephalopathy (Xu et al., 2014). Abnormality of the cGMP-PKG signaling pathway is associated with diseases of the nervous system, and reduced kinase activity in the cGMP-PKG signaling pathway was found in rats with hepatic encephalopathy. Cognitive disorders could be relieved when the NO/sGC/cGMP/PKG signaling pathway was inhibited in diabetic rats. The iNOS-NO-cGMP signaling pathway is also involved in nervous system inflammation and myelin formation (Raposo et al., 2014). Activation of the Wnt signaling pathway could promote the occurrence of autoimmune encephalitis (Schneider et al., 2016), and the Wnt signaling pathway is involved in the immunity and tolerance of dendritic cells (Swafford & Manicassamy, 2015).
Moreover, plasma Dickkopf-related protein 1, an antagonist of the Wnt signaling pathway, was associated with HIV-related cognitive deficits (Yu et al., 2017). Thus, the KEGG analysis showed that most of the significantly clustered pathways were related to the function of the nervous system. The differentially expressed lncRNAs, acting as ''molecular sponges'', could affect the function of their target miRNAs and thereby regulate the target genes. These results support the involvement of lncRNA in the occurrence and development of HIV-related encephalopathy through the lncRNA-miRNA-gene mechanism. They also indirectly indicate that it is feasible to construct a new chip by re-annotation and to identify differentially expressed lncRNAs from an expression chip.

CONCLUSIONS
New microarray results from a different perspective can be constructed through updated annotation of previously published microarray data. In this study, the lncRNA results were obtained through the re-annotation of microarray data, which provides a foundation for further research on the role of lncRNA in the occurrence and development of HIVE. From the perspective of the lncRNA-miRNA-target gene mechanism, cluster analysis was performed with various bioinformatic methods to explore the role of lncRNA. GO analysis showed that lncRNA may be involved in the occurrence and development of HIVE via its participation in nuclear transcription and translation. The KEGG pathway analysis showed that most of the statistically significant KEGG pathways were associated with the function of the nervous system. Therefore, we speculate that lncRNA is involved in the occurrence and development of HIVE, which is of significance for future research on lncRNA in HIVE. The results also indicate that it is feasible to identify lncRNAs from public databases.