Host transcriptome-guided drug repurposing for COVID-19 treatment: a meta-analysis based approach

Background Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been declared a pandemic by the World Health Organization, and effective therapeutic strategies against SARS-CoV-2 infection are urgently needed. In this scenario, drug repurposing is widely used for the rapid identification of potential drugs against SARS-CoV-2, considering both viral and host factors. Methods We adopted a host transcriptome-based drug repurposing strategy utilizing publicly available high-throughput gene expression data on SARS-CoV-2 and other respiratory infection viruses. Based on the consistency in expression status of host factors in different cell types and previous evidence reported in the literature, pro-viral factors of SARS-CoV-2 were identified and subjected to drug repurposing analysis based on DrugBank and the Connectivity Map (CMap) using the web tool CLUE. Results Upregulated pro-viral factors such as TYMP, PTGS2, C1S, CFB, IFI44, XAF1, CXCL2, and CXCL3 were identified in early infection models of SARS-CoV-2. By further analysis of the drug-perturbed expression profiles in the Connectivity Map, 27 drugs that can reverse the expression of pro-viral factors were identified, and importantly, twelve of them have reported anti-viral activity. Direct inhibition of the PTGS2 gene product can be considered as another therapeutic strategy for SARS-CoV-2 infection, and six approved PTGS2 inhibitor drugs are suggested for the treatment of COVID-19. This computational study proposes candidate repurposable drugs against COVID-19; further experimental studies are required for validation.


INTRODUCTION
COVID-19 is a pulmonary syndrome caused by a novel strain of coronavirus, and according to the World Health Organization report of 13th April 2020, 1,773,084 people had been infected, with about 111,650 deaths globally (Coronavirus Disease, 2019). The primary mode of SARS-CoV-2 transmission is through respiratory droplets generated during coughing and sneezing by infected patients. Symptoms include dry cough, fatigue, myalgia, fever, and dyspnea; however, the disease progresses to severe illness and leads to death in 6% of confirmed cases due to massive alveolar damage and progressive respiratory failure (Rothe et al., 2020). Coronaviruses possess the largest genomes (26.4-31.7 kb) of all RNA viruses and encode four main structural proteins: the spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins. After the virus enters the host cell, the genome is transcribed and then translated (Cascella et al., 2020). Invading respiratory viruses either suppress or evade the host's innate and adaptive immune responses, which increases virulence and leads to disease.
With its limited genome size, the virus extensively utilizes host factors for its replication by inducing alterations in host gene expression, resulting in a modulated immune response (De Wilde et al., 2018). Therefore, transcriptome analysis of the host cell upon virus infection is useful for identifying host immune response dynamics and the host factors that facilitate virus infection. To develop effective therapeutic strategies, it is necessary to understand the expression of host factors upon SARS-CoV-2 infection. In the current treatment scenario, repurposing FDA-approved drugs would be an effective alternative approach for modulating host factors against SARS-CoV-2 infection. Transcriptome-guided drug repurposing approaches utilize drug-perturbed expression profiles to identify potential drug candidates whose signatures are anti-correlated with the disease signature (Arakelyan et al., 2019). In the present study, we first conducted transcriptome analyses utilizing publicly available host transcriptome profiles of SARS-CoV-2 and other respiratory virus infections, which could provide information on host factors altered upon infection. Next, based on the identified SARS-CoV-2-induced pro-viral host factors, drug repurposing analyses were performed to identify possible drugs for the treatment of the pandemic infection caused by SARS-CoV-2, which could help to decrease the mortality of COVID-19 patients globally.

Data retrieval and sample selection
Gene expression profile datasets based on expression profiling by high-throughput sequencing and microarray in response to infection of the human host by different coronaviruses, such as Middle East respiratory syndrome coronavirus (MERS-CoV), SARS-CoV-1, and SARS-CoV-2, were retrieved from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) and OmicsDI (https://www.omicsdi.org/). Datasets with a coronavirus-infected group and a mock control group were included in the study. Sixteen datasets were obtained; three of them were based on expression profiling by high-throughput sequencing (RNA-Seq), and the remaining thirteen were from microarray experiments. The cell types in the identified datasets are human airway epithelium (HAE) cells, Calu-3 lung adenocarcinoma cells, primary fibroblasts, primary human microvascular endothelial cells, and primary human dendritic cells. The only available transcriptome profile dataset on SARS-CoV-2 (as of 9th April 2020) (GSE147507) was based on expression profiling in 24-h infected cells (Blanco-Melo et al., 2020). We therefore selected only samples from 24-h infected conditions and the corresponding mock controls for the meta-analysis. Based on the above criteria, thirteen datasets (GSE147507, GSE17400, GSE45042, GSE47962, GSE33267, GSE37827, GSE122876, GSE100496, GSE86528, GSE100509, GSE79172, GSE81909, GSE48142) were selected for the meta-analysis. A total of 104 samples from the thirteen datasets were selected for the meta-analysis (Table S1). Among the 104 samples, 52 belong to the respiratory virus-infected condition, and 52 are mock control samples. The overall workflow implemented in the study is reported in Fig. 1.

Data processing and statistical meta-analysis
The preprocessing and statistical meta-analysis of the selected 104 samples were performed using the web-based tool NetworkAnalyst (Zhou et al., 2015). The identifiers of transcripts from high-throughput sequencing and different microarray platforms were converted to Entrez gene IDs. The preprocessed microarray and RNA-Seq datasets were normalized using the log2 transformation. All the datasets passed the integrity check and were subjected to study-specific batch effect adjustment using the ComBat method in NetworkAnalyst. The density plot and PCA-3D plot before and after the batch-effect adjustment indicate the proper incorporation of the datasets (Fig. S1). A total of 9,836 features from 104 samples, covering different respiratory virus infection models under two experimental conditions (24-h infected and mock control), were subjected to analysis. The meta-analysis was performed with the combining effect sizes method using the Fixed Effect Model at a significance level of 0.01, and the genes differentially expressed in 24-h infection conditions of different respiratory viruses vs. mock control were obtained.
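The idea behind combining effect sizes under a fixed-effect model can be illustrated for a single gene with a minimal sketch. This is not NetworkAnalyst's implementation; the function name and its inputs (per-study effect sizes and their variances) are hypothetical, and the sketch assumes the standard inverse-variance weighting with a normal approximation for the P-value.

```python
import math


def fixed_effect_meta(effects, variances):
    """Pool per-study effect sizes for one gene with inverse-variance weights.

    effects, variances: hypothetical per-study effect sizes and their
    variances (NetworkAnalyst's internal computation may differ).
    Returns (pooled effect, standard error, z score, two-sided P-value).
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need one variance per effect size")
    weights = [1.0 / v for v in variances]          # precise studies weigh more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)              # SE of the pooled estimate
    z = pooled / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal P-value
    return pooled, se, z, p_value


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the study.
    pooled, se, z, p = fixed_effect_meta([1.1, 0.8, 1.4], [0.20, 0.10, 0.25])
    print(f"pooled={pooled:.3f} se={se:.3f} z={z:.2f} p={p:.4g}")
```

In a genome-wide setting the resulting P-values would still need multiple-testing correction before applying a cutoff such as 0.01.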
The RNA-Seq dataset with the GEO ID GSE147507 was used to identify the genes altered in SARS-CoV-2 infected conditions. The dataset consists of host transcriptome profiles from two cell lines, A549 and NHBE. The 24-h SARS-CoV-2 infected and mock control samples from the two cell lines were subjected to separate differential expression analyses using the gene expression table option of NetworkAnalyst. Each dataset consists of 6 samples corresponding to 24-h SARS-CoV-2 infection and mock control. The gene expression table was uploaded with the following options: organism, human; data type, bulk RNA-seq; gene-level summarization, mean. There were 16,375 and 15,770 features selected for A549 and NHBE cells, respectively. Unannotated genes were filtered out, and log2-counts per million normalization was applied. After normalization, the graphical outputs (box plots, PCA, and density plots) were inspected to check the quality of the data. Differential expression analysis of infected vs. mock conditions was then performed using DESeq2, and the differentially expressed genes were obtained. Significance thresholds of adjusted P-value 0.02 and log2 fold change 0.3 were applied to identify the significantly differentially expressed genes. Among the significant genes, those showing a consistent tendency in NHBE and A549 cells (31 genes) were selected for further analysis. Among the 31 genes, pro-viral and antiviral factors were identified based on a literature survey. The heatmap2 tool in Galaxy was used to generate the heat map of the differentially expressed genes.
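The thresholding and cross-cell-line consistency step can be sketched as follows. This is only an illustration: the function names and the toy gene labels are hypothetical, and the sketch assumes that the log2 fold change cutoff is applied to the absolute value and that both cutoffs are strict inequalities, which the text does not specify.

```python
def significant_genes(results, padj_cutoff=0.02, lfc_cutoff=0.3):
    """Keep genes passing both thresholds.

    results: dict mapping gene -> (log2 fold change, adjusted P-value).
    Assumption: the fold-change cutoff applies to |log2FC|.
    Returns a dict mapping gene -> log2 fold change.
    """
    return {
        gene: lfc
        for gene, (lfc, padj) in results.items()
        if padj < padj_cutoff and abs(lfc) > lfc_cutoff
    }


def consistent_genes(sig_a, sig_b):
    """Genes significant in both cell lines with the same direction of change."""
    shared = sig_a.keys() & sig_b.keys()
    return {
        gene: "up" if sig_a[gene] > 0 else "down"
        for gene in shared
        if (sig_a[gene] > 0) == (sig_b[gene] > 0)
    }


if __name__ == "__main__":
    # Toy values with placeholder gene labels, not results from the study.
    a549 = {"GENE_A": (1.2, 0.001), "GENE_B": (-0.8, 0.01), "GENE_C": (0.2, 0.001)}
    nhbe = {"GENE_A": (0.7, 0.015), "GENE_B": (0.6, 0.001), "GENE_C": (1.0, 0.001)}
    print(consistent_genes(significant_genes(a549), significant_genes(nhbe)))
```

A gene that is significant in both cell lines but changes in opposite directions is dropped by the second function.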

Protein interaction analysis and gene set enrichment analysis
Protein level interaction analysis was performed using the STRING Program (Szklarczyk et al., 2019). The selected differentially expressed genes were submitted to a multi-gene entry option in STRING to obtain the protein level interaction network. Cluster option was used to identify the cluster of interactions in the network and the obtained interaction details were used to construct a protein-protein interaction network using Cytoscape (Shannon et al., 2003). The Pathway and Gene Ontology enrichment details were obtained from STRING based on the FDR cut off <0.01.

Drug repurposing analysis
DrugBank Version 5.1.4 (http://www.drugbank.ca/) was used to obtain the drug-target links between existing drug molecules and the DEGs. The differentially expressed genes showing a consistent tendency were subjected to a target search in DrugBank. Only approved or experimental drug groups were included in the study. Drug repurposing analysis was performed with the Connectivity Map using the web application CLUE (https://clue.io) (Subramanian et al., 2017). The query option in the tool was selected to find the perturbagens that give rise to an expression signature opposing that of the pro-viral factors identified in the study. The query requires a minimum of 10 genes, so 8 pro-viral factors (TYMP, PTGS2, C1S, CFB, IFI44, XAF1, CXCL2 and CXCL3) showing consistent up-regulation in NHBE and A549 cells, together with two pro-viral factors (NFKB1 and TLR2) upregulated in NHBE cells, were subjected to analysis. The up-regulated pro-viral factors were compared to each signature in the CMap reference (Gene expression (L1000) Touchstone) database. The heatmap of the connectivity score (tau) of perturbagens (2837 small-molecule compounds) was obtained. The top-scored compounds with a CMap tau score <−99, that is, those with the highest anti-correlation with the ten upregulated pro-viral genes, were identified.
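The query assembly and the tau cutoff described above can be sketched in a few lines. The sketch does not call CLUE; it assumes the connectivity scores have already been exported into a plain mapping of compound name to tau, and the function names and placeholder compound labels are hypothetical. The gene list is the ten-gene query given in the text.

```python
# Ten upregulated pro-viral factors used as the query (from the text above).
PRO_VIRAL_QUERY = [
    "TYMP", "PTGS2", "C1S", "CFB", "IFI44",
    "XAF1", "CXCL2", "CXCL3", "NFKB1", "TLR2",
]
MIN_QUERY_GENES = 10  # minimum query size stated in the text


def validate_query(genes, minimum=MIN_QUERY_GENES):
    """Return the de-duplicated query, or raise if it is too small."""
    unique = sorted(set(genes))
    if len(unique) < minimum:
        raise ValueError(f"query has {len(unique)} genes, needs at least {minimum}")
    return unique


def top_reversing_compounds(tau_by_compound, threshold=-99.0):
    """Compounds with tau below the threshold, most negative first.

    tau_by_compound: hypothetical dict of compound name -> tau score,
    e.g. parsed from a table exported after the query.
    """
    hits = [(name, tau) for name, tau in tau_by_compound.items() if tau < threshold]
    return sorted(hits, key=lambda item: item[1])


if __name__ == "__main__":
    print(validate_query(PRO_VIRAL_QUERY))
    # Placeholder compounds and scores, not results from the study.
    scores = {"cmpd_1": -99.5, "cmpd_2": -98.0, "cmpd_3": -99.9, "cmpd_4": 95.0}
    print(top_reversing_compounds(scores))
```

With only the eight consistently upregulated genes, `validate_query` would raise, which mirrors why NFKB1 and TLR2 were added to the query.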

Identification of differentially expressed host genes with COVID-19 infection
In this study, the host factors responding to SARS-CoV-2 and other coronavirus infections were analyzed using a computational approach. A meta-analysis strategy was utilized to identify differentially expressed genes common to human host infection by different respiratory infection viruses. The 16 datasets from gene expression profiling studies based on high-throughput sequencing and microarray experiments were obtained from GEO (Table S1). The publicly available host gene expression profiles of respiratory infection viruses up to 9th April 2020 were used for the analysis. The dataset on SARS-CoV-2 (GEO ID: GSE147507) consists of 24-h infected and mock-control samples of primary human lung epithelium (NHBE) and transformed lung alveolar (A549) cells (Blanco-Melo et al., 2020). A total of 104 samples (52 respiratory virus-infected and 52 mock control samples) were selected for the meta-analysis, considering only 24-h infected samples from different datasets consisting of respiratory virus-infected human host models of SARS-CoV-2, SARS-CoV, MERS-CoV, and respiratory syncytial virus (RSV). The differential expression analysis of various respiratory infection viruses vs. mock control by meta-analysis reported 2,125 genes at FDR < 0.01 (Table S2). Next, the genes differentially expressed specifically in SARS-CoV-2 infected conditions were identified in A549 and NHBE cells. Table S3 reports 143 and 260 differentially expressed genes in A549 and NHBE cells, respectively, based on the adjusted P-value cutoff < 0.02. Together, a total of 371 unique genes were reported as differentially expressed in NHBE and A549 cells (Table S3). The Venn diagram reports the overlap of genes between the meta-analysis of different respiratory virus infections vs. mock and the SARS-CoV-2 vs. mock conditions in 24-h infection models (Fig. 2A).
Only 19 differentially expressed genes were found to be common to the meta-analysis of different respiratory infection viruses and the SARS-CoV-2 specific analysis in different cell lines, which indicates SARS-CoV-2 specific gene signatures in 24-h host infection models. The Venn diagram reports 32 genes common to A549 and NHBE cells (highlighted in Table S3). Among these, 31 genes were upregulated in both NHBE and A549 cells. The remaining gene, KRT4, was down-regulated in A549 and upregulated in NHBE cells. The 31 genes showing a consistent upregulation tendency in SARS-CoV-2 infected NHBE and A549 cells were selected for further analysis (Table 1). Among these, the 19 genes overlapping with the meta-analysis data (Fig. 2A) showed the same upregulation tendency and are included in Table 1. Therefore, the 19 genes are common to 24-h infection models of the different respiratory infection viruses, and they include IFI6, IFIT1, MX1, IRF9, IRF7, OAS1, IFIH1, IFI27, PLSCR1, OAS2, IFITM1, CXCL2, IFI44, PTGS2, BCL2A1, CXCL3, XAF1, and EDN1. The heat map of the thirty-one significant genes is reported in Fig. 2B and shows the same tendency (Table 1) as the original analysis reported by Blanco-Melo et al. (2020), though the statistical parameters vary.
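The gene counts reported above are mutually consistent, which can be checked with inclusion-exclusion: 143 A549 genes plus 260 NHBE genes minus the 32 shared genes gives 371 unique genes. A minimal sketch of the same bookkeeping follows; the function names are hypothetical, and the set-based helper is shown only to indicate how the Venn regions would be derived from actual gene lists.

```python
def union_size(n_a, n_b, n_shared):
    """Number of unique items across two lists, by inclusion-exclusion."""
    return n_a + n_b - n_shared


def venn_counts(genes_a, genes_b):
    """Sizes of the regions of a two-set Venn diagram."""
    a, b = set(genes_a), set(genes_b)
    return {
        "only_a": len(a - b),
        "shared": len(a & b),
        "only_b": len(b - a),
        "union": len(a | b),
    }


if __name__ == "__main__":
    # Counts reported in the text: 143 (A549), 260 (NHBE), 32 shared.
    print(union_size(143, 260, 32))  # prints 371
```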

Functional annotation and pathway enrichment of significant host genes in SARS-CoV-2 infection
The selected 31 genes reported in Table 1 were subjected to protein-level interaction and functional enrichment analysis using the web tool STRING. The protein-protein interaction network was constructed mainly considering interactions with a high confidence score (>0.9). A significant network with 94 edges was obtained, with a PPI enrichment P-value < 1.0E−16. Fourteen nodes of the network formed a cluster with a dense overlap of 88 edges; those nodes included IFI6, IFI27, IFIT1, IFIT3, IFITM1, and IFITM3, among others (Fig. 3A). The node size in the network indicates the connectivity degree, and the gene IRF7 was found to have the highest degree, 14. The cluster-forming genes are part of Interferon alpha/beta signaling, and importantly, the pathway enrichment analysis based on REACTOME reports this signaling as the top enriched pathway, with a significant FDR of 2.32E−24 and 14 enriched genes (Table 2). The other significant pathways are Cytokine signaling in the immune system, enriched with 16 genes; the Immune System pathway, with 20 genes; and Antiviral mechanism by IFN-stimulated genes, with six enriched genes. The top enriched pathways and biological processes are reported in Table 2. The other relevant KEGG and REACTOME pathways enriched by the set of genes are reported in Table S4. The Gene Ontology (GO) enrichment analysis of the 31 genes is also reported in Table S4. The top five biological processes are defense response, type I interferon signaling pathway, innate immune response, response to virus, and response to other organism. The enriched molecular functions are 2′-5′-oligoadenylate synthetase activity and double-stranded RNA binding (Table S4).
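The connectivity degree used to size the nodes can be illustrated with a short sketch that filters an edge list by confidence and counts edges per node. This is not the STRING or Cytoscape implementation; the function name and the placeholder protein labels are hypothetical, and the sketch assumes each interacting pair appears once with a combined score on a 0 to 1 scale.

```python
from collections import Counter


def node_degrees(edges, min_score=0.9):
    """Count high-confidence interaction partners per protein.

    edges: iterable of (protein_a, protein_b, combined_score) tuples,
    assumed to list each pair once with scores between 0 and 1.
    """
    degree = Counter()
    for a, b, score in edges:
        if score > min_score and a != b:   # drop weak edges and self-loops
            degree[a] += 1
            degree[b] += 1
    return degree


if __name__ == "__main__":
    # Placeholder proteins and scores, not interactions from the study.
    toy_edges = [
        ("P1", "P2", 0.95),
        ("P1", "P3", 0.99),
        ("P2", "P3", 0.50),
        ("P1", "P4", 0.91),
    ]
    degrees = node_degrees(toy_edges)
    print(degrees.most_common(1))
```

The node with the largest count corresponds to the hub of the network, in the same sense in which IRF7 is reported as the highest-degree gene.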

Drug repurposing against SARS-CoV-2
To find potential drug molecules for repurposing against SARS-CoV-2 infection, we performed a DrugBank search with the 31 upregulated genes. Drug groups with approved or experimental status were considered for the analysis, and the identified drug molecules for the target genes are tabulated in Table 3. Approved or experimental drugs were identified as modulators of the protein products of four genes: CFB, ASS1, IFNGR1 and PTGS2. The pharmacological action is unknown for the drugs targeting CFB and ASS1, and a protein binder is reported for IFNGR1. Importantly, PTGS2 is a pro-viral factor, and 76 approved drug molecules were identified as inhibitors/antagonists of the PTGS2 protein (Table 3). Next, we utilized a gene expression signature-based drug repurposing strategy based on CMap (Connectivity Map) using the web tool CLUE. A gene set can be queried against drug perturbation signatures to identify and rank drugs according to the similarity in gene expression. A positive score indicates similarity and a negative score indicates a reverse effect of the drug signature relative to the queried genes, and the magnitude of the score reflects the strength of the relationship. Among the 28 significantly upregulated genes, eight genes were found to be pro-viral factors that are important for viral infection and need to be targeted from a therapeutic point of view. Along with the eight genes, NFKB1 and TLR2 were also found to be pro-viral and upregulated in SARS-CoV-2 infected NHBE cells. Therefore, the ten upregulated genes, TYMP, PTGS2, C1S, CFB, IFI44, XAF1, CXCL2, CXCL3, NFKB1 and TLR2, were subjected to a Connectivity Map-based gene expression signature search to find therapeutic compounds. We identified compounds with signatures significantly correlated and anti-correlated with that of the ten pro-viral host factors considered for the analysis.
The top-scored compounds with a CMap connectivity score (tau score) <−99, showing the highest anti-correlation with the ten upregulated pro-viral genes, are reported in Table 4. There were 27 compounds with significant anti-correlation (Table 4), and the heat map obtained from the CLUE analysis is shown in Fig. 3B. The approved drugs among those include estrone (estrogen receptor agonist), hexylresorcinol (polyphenol oxidase inhibitor), pentobarbital (GABA receptor modulator), nitrendipine (calcium channel blocker), phenazopyridine (targeting SCN1A), heraclenol (Cobamamide), alprazolam (benzodiazepine receptor agonist), bromocriptine (dopamine receptor agonist) and WT-171 (Vorinostat) (HDAC inhibitor). Twelve of the 27 top drugs have reported anti-viral activity against different viruses: ritanserin, JAK3-inhibitor-I (BRD-K72541103), tipifarnib, W-12, Topoisomerase inhibitor (BRD-K52640952), hexylresorcinol, LY-303511, SB-239063, SD-169, alprazolam, dilazep, and danoprevir. These are highlighted in Table 4, and the structures of the drug molecules are shown in Fig. 4A. Next, we analyzed the potential impact of the reported 27 drugs on the natural defense against infection. For that, we used the Connectivity Map to check the correlation between the drug-perturbed expression profiles and the 23 anti-viral proteins reported from the study. Among the 27 drugs obtained based on pro-viral factors, seven drugs (tipifarnib, dilazep, GW-843682X, estrone, myricetin, guggulsterone, and bromocriptine) had a mean connectivity score <−90 and can therefore down-regulate the anti-viral proteins. These seven drugs could thus impair the host's natural defense against viral infection, and caution is needed. We also checked the connectivity scores of the PTGS2 inhibitors obtained from DrugBank (Table 3).
Among the 76 drugs targeting PTGS2, six have a high negative tau score (tau score <−75), which indicates the possibility of reducing the expression of pro-viral factors (Table 3). Those drugs include lenalidomide (anti-inflammatory drug), celecoxib (anti-inflammatory drug), tenoxicam (anti-inflammatory agent with analgesic and antipyretic properties), meclofenamic acid (anti-inflammatory and antipyretic properties), sulfasalazine (anti-inflammatory drug), and loxoprofen (non-steroidal anti-inflammatory drug) (Fig. 4B). The PTGS2-targeting anti-inflammatory drugs mefenamic acid, tolmetin, ibuprofen and dexketoprofen, the generic medication triamcinolone, and the anti-cancer drug etoposide showed high positive tau scores, which indicates a high correlation with the expression pattern of pro-viral host factors observed in SARS-CoV-2 infection and the possibility that these medications promote viral infection. Based on the current analysis, the potential PTGS2 inhibitors considered for drug repurposing against SARS-CoV-2 are reported in Fig. 4B.
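The caution check on anti-viral genes described in this section (flagging candidate drugs whose mean connectivity score against the anti-viral signature falls below −90) can be sketched as follows. How the mean was computed in the study is not stated, so the sketch simply assumes each drug has a list of connectivity scores to average; the function name, drug labels and scores are hypothetical.

```python
from statistics import mean


def flag_antiviral_suppressors(scores_by_drug, cutoff=-90.0):
    """Return drugs whose mean connectivity score is below the cutoff.

    scores_by_drug: hypothetical dict of drug name -> list of connectivity
    scores against the anti-viral gene signature. A strongly negative mean
    suggests the drug may also reverse the protective host response.
    """
    flagged = {}
    for drug, scores in scores_by_drug.items():
        if not scores:
            continue                      # no data: neither flag nor clear
        average = mean(scores)
        if average < cutoff:
            flagged[drug] = average
    return dict(sorted(flagged.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    # Placeholder drugs and scores, not results from the study.
    toy_scores = {
        "drug_x": [-95.0, -92.0, -97.0],
        "drug_y": [-60.0, -99.0],
        "drug_z": [40.0, 10.0],
    }
    print(flag_antiviral_suppressors(toy_scores))
```

Candidates returned by this filter reverse the pro-viral signature but may also dampen the anti-viral response, so they would be deprioritized rather than discarded outright.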

DISCUSSION
A meta-analysis of publicly available gene expression profiles was adopted to identify the host response mediated by respiratory virus infection. We identified host genes common to 24-h infection models of different respiratory infection viruses and SARS-CoV-2. A set of 31 differentially expressed genes showing a consistent expression pattern in SARS-CoV-2 infected NHBE and A549 cells was identified. Apart from mouse epithelial cell lines, NHBE and A549 are the commonly used human lung epithelial-derived cell lines for studying infection by SARS coronaviruses (both classical and CoV-2). Though A549 cells are susceptible to SARS-CoV infection, owing to the low expression level of the ACE2 receptor, this cell line is not highly permissive for SARS-CoV-2 and the infection rate is low (Harcourt et al., 2020). However, A549 cells supplemented with a vector expressing ACE2 enabled SARS-CoV-2 to replicate even in low infection conditions (Blanco-Melo et al., 2020). Based on the updates to the GEO dataset (GSE147507), the upregulated host factors common to A549 cells expressing the exogenous ACE2 receptor, Calu-3 cells, and NHBE cells were analyzed, and nine common genes were found: CXCL3, ASS1, UBE2L6, IRF7, C1S, SERPINA3, IRF9, IFI6 and CFB. The present study is based on a limited number of datasets on SARS-CoV-2, and more candidate genes may emerge with more samples and increased sequencing depth. However, the study aimed to select a robust set of genes with consistent expression patterns in two different cell types. The pathway enrichment analysis highlights the Interferon alpha/beta signaling pathway, cytokine signaling, and the immune system as the enriched pathways, and defense and immune response-associated processes as the enriched biological processes, which is consistent with the host response observed in various virus-infected conditions.
During the initial stage of virus infection, the innate immune system is stimulated to establish the first line of defense. Interferons (IFNs) are a multigene family of inducible cytokines that play a critical role in initiating host antiviral responses (Boasso, 2013). They are commonly grouped into type I, which includes IFN-α (leukocyte), IFN-β (fibroblast), and IFN-ω, and type II, also known as immune IFN (IFN-γ) (Schlaepfer et al., 2019). In most mild cases, type I IFN is highly effective at inhibiting viral replication during early, short periods of viremia. However, in severe forms of viral replication, the ability of type I IFN to inhibit viral replication is overwhelmed (Long et al., 2009). Studies showed that in BALB/c mice infected with severe acute respiratory syndrome (SARS) coronavirus, there was a delay in the type I IFN response, leading to enhanced viral replication and resulting in elevated lung cytokine/chemokine levels, vascular leakage, and impaired virus-specific T cell responses (Channappanavar et al., 2016).
IFN induces immune responses by activating the JAK/STAT pathway, which in turn forms a complex with interferon regulatory factor (IRF) and migrates to the nucleus to stimulate the expression of over 300 IFN-stimulated genes (ISGs) that are necessary to inhibit viral replication (Teijaro, 2016). Among the ISGs, our analysis showed that IFN-α-inducible protein 6 (IFI6) and IFN-α-inducible protein 27 (IFI27) were elevated. In response to viral infection, among the IRF family members, IRF-1, IRF-3, and IRF-7 in particular are necessary for the production of type I IFN and of IFN-inducible genes (Bego, Mercier & Cohen, 2012). In our analyses, we also observed an increase in IRF7 (log2FC = 2.17) and IRF9 (log2FC = 1.80) during SARS-CoV-2 infection and in other respiratory virus infections. Altogether, increased levels of IFN, IRFs, and ISGs play a central role in antiviral responses, probably during an early stage of infection, and we suggest that they could be used as diagnostic markers of infection with viruses such as SARS-CoV-2. Moreover, another study has shown that the multiplication of SARS-CoV in cell culture can be strongly inhibited by pretreatment with interferon-beta (Spiegel et al., 2004). We therefore suggest that treatment with interferon-α/β mimics immediately after infection might stimulate the JAK/STAT pathway, which in turn stimulates ISGs and ultimately enhances antiviral effects.
Interferon-inducible transmembrane proteins (IFITMs) are key ISGs that interfere with virus entry. IFITMs (IFITM1/2/3/7) are induced by IFN and are necessary for innate immunity (Chen et al., 2019). IFITM2 and IFITM3 might reduce the infectivity of viruses by regulating virus-endosome fusion rates and accelerating the trafficking of virus-containing endosomes to lysosomes (Spence et al., 2019). Similar to the IFITMs, the 2′-5′-oligoadenylate synthetases (OAS1, OAS2, OAS3) are induced by type I IFN. In the presence of viral dsRNA, OAS catalyzes the oligomerisation of ATP to form 2′-5′-linked adenosine oligomers (2-5A), which activate the RNase L degradative pathway that cleaves viral RNA and ultimately controls viral infection (Kristiansen et al., 2010; Leisching et al., 2019). In our analyses, we observed up-regulation of IFITM (1 and 3) and OAS (1, 2, 3) in virus-infected cells, which could be due to the early stage of infection.
Interferon-induced proteins with tetratricopeptide repeats (IFITs) are strongly induced by IFN-α/β, whereas IFN-γ acts as a weak inducer (Fensterl & Sen, 2011). IFIT3 is necessary for maintaining cell survival by decreasing the rate of apoptotic cell death. The knockdown of IFIT3 decreased the production of the antiviral cytokine IFN-β (Hsu et al., 2013). Our observations showed increased expression of IFIT1 and IFIT3 in virus-infected cells. Additionally, there was a significant increase in the anti-apoptotic protein Bcl-2-related protein A1 (BCL2A1), which might be involved in protecting cells from death during viral infection.
The interferon induced with helicase C domain 1 (IFIH1) gene belongs to the helicase family and encodes a cytoplasmic receptor critical for viral RNA sensing. During viral entry, IFIH1 interacts with viral RNA, which leads to polymerization of IFIH1 molecules into a filament that assembles further to initiate a signaling cascade, inducing type I IFN production and activating antiviral genes (Asgari et al., 2017). The SAM and HD domain-containing deoxynucleoside triphosphate triphosphohydrolase 1 (SAMHD1) protein binds to viral RNA and exhibits exonuclease activity on single-stranded nucleic acids. Studies showed that the degradation of SAMHD1 might enhance the replication of herpes viruses in co-infected patients (Kim et al., 2013). The MX dynamin-like GTPase 1 (MX1) protein recognizes and binds to the nucleocapsids of invading viruses and prevents the intracellular transport of the nucleocapsids into the cell nucleus, which ultimately leads to an early block of viral transcription and replication (Patzina, Haller & Kochs, 2014). In our analyses, there was an increase in IFIH1, SAMHD1, and MX1 during viral infection, which might be due to the early stage of infection.
E3 ubiquitin ligase deltex 3L is another protein induced by interferon. It forms a complex with the poly (ADP-ribose) polymerase PARP9 and targets both host histone H2BJ, to promote ISG expression, and the viral 3C protease, to disrupt viral assembly in both the nucleus and the cytoplasm (Zhang et al., 2015). Phospholipid scramblase 1 (PLSCR1), a multiply palmitoylated, lipid-raft-associated endofacial plasma membrane protein, is induced by IFN. Upon induction, newly synthesized PLSCR1 is not palmitoylated, and so it readily enters the nucleus via importin α/β nucleopore transport, where it binds to DNA and induces the expression of certain critical antiviral genes, including ISG15, ISG54, p56, and guanylate binding proteins (Wiedmer et al., 2003; Dong et al., 2004). Serine protease inhibitors (serpins) are elements of the innate immune system and belong to the largest and most diverse family of protease inhibitors. Serpins reportedly interfere with viral replication at both the entry and the reverse transcription stages (Asmal et al., 2012). The gene SERPINE encodes plasminogen activator inhibitor (PAI), and it has been reported that PAI-2 expression significantly reduced the surface expression of the virus receptor molecules DAF, CAR, and ICAM-1, thereby inhibiting virus binding and exerting an antiviral effect (Congote, 2006). In our study, we observed an increase in serpin expression in virus-infected cells. Argininosuccinate synthase (ASS) catalyzes the reversible ATP-dependent ligation of citrulline and aspartate to generate argininosuccinate. Importantly, ASS physically interacts with bacterial lipopolysaccharides and lipid A and inactivates their biological activities (Satoh et al., 2006).
However, the detailed mechanisms have not been investigated. In our study, along with ASS1, the E3 ubiquitin ligase deltex 3L, PLSCR1 and serpin were increased, which might be one of the antiviral effects induced by the host cell during the early stage of viral infection.
Besides antiviral proteins in the host system, a few pro-viral proteins involved in virus-mediated infection were consistently increased at the mRNA level, and they include TP, Cox-2, complement C1s and factor B, IFI44, XAF1, and CXCL3. Thymidine phosphorylase (TP) is a potent angiogenic factor and a putative marker of cellular oxidative stress. It is upregulated as an early event in the liver tissue of HBV- and HCV-infected patients, and it becomes more prominent as the disease progresses to cirrhosis. However, further research has not been carried out on TP during viral infection (Mimidis et al., 2005). Endothelin-1 (ET-1) has been shown to exhibit several physiologic functions, including roles in salt and water homeostasis, vascular tone, and inflammation. Increased cytokine production during viral infection would increase the levels of endothelin-1 in the cells (Bouallegue, BouDaou & Srivastava, 2007). Upon stimulation, ET-1 is reported to increase the production of IL-6, TNF-alpha, ICAM, VCAM, and E-selectin, which would potentiate inflammation (Christ-Crain, Schuetz & Müller, 2008). Thus, the use of endothelin blockers might effectively reduce virus-induced lung inflammation.
Prostaglandin-endoperoxide synthase 2 (PTGS2), or cyclo-oxygenase 2 (COX-2), is an inducible pro-inflammatory enzyme. Structural proteins of SARS-CoV have been reported to induce the expression of COX-2 in vitro (Liu et al., 2007), and increased levels of prostaglandins (PGs) have been reported in the blood of SARS-CoV-infected individuals (Lee et al., 2004). Additionally, COX activity might be required for efficient entry and for an initial step in RNA replication, and the authors suggested that it could be targeted for anti-CoV therapy (Raaben et al., 2007). COX-2 is also reported to play a crucial role in limiting the anti-viral cytokine/interferon response to viral infection; thus, the use of an effective COX-2 selective inhibitor during early viral infection may enhance and/or prolong endogenous interferon responses and thereby increase anti-viral immunity (Kirkby et al., 2013). In our analyses, we observed that COX-2 expression was increased in virus-infected cells.
The complement system is an important element of innate immunity that functions to recognize and eliminate invading microbes. In our current study, we observed increased expression of the classical pathway protein C1s and the alternative pathway protein factor B during viral infection. A study of SARS-CoV infection in mice, including C3-deficient mice, reported that complement activation results in immune-mediated damage in the lung, which suggested that inhibition of the complement pathway might be an effective therapeutic strategy against coronavirus-mediated respiratory diseases (Gralinski et al., 2018). Chemokines are a family of small secretory proteins that are expressed constitutively or in an inducible manner. Their important role is to attract leukocytes to sites of infection/inflammation. CXCL2 and CXCL3 have been reported to be increased in herpes virus, arenavirus, and rhabdovirus infections (Melchjorsen, Sørensen & Paludan, 2003), and thus CXCL2/3 could be effective therapeutic targets during viral infection.
IFI44 is a cytoplasmic, type I IFN-induced protein that is upregulated upon a variety of virus infections. Generally, interferon-stimulated genes (ISGs) display antiviral functions; however, IFI44 has been reported to negatively modulate antiviral responses induced by multiple viral systems. IFI44, by inhibiting activates NF-κB which migrates to the nucleus and activates pro-inflammatory cytokines (DeDiego et al., 2014). IFI44 is responsible for the pathogenesis of some viruses, including coronaviruses (DeDiego et al., 2019). XAF1 promotes apoptosis by binding to XIAP, which is involved in caspase suppression and inhibits cell death. It has been reported that XAF1 promotes apoptosis in DENV2-infected HUVECs by binding to XIAP and also collaborates with TNF-related apoptosis-inducing ligand (TRAIL) to induce cell death (Leaman et al., 2002). In our analysis, we observed increased expression of XAF1 and IFI44 in virus-infected cells, and both could be targeted against viral infection.
At present, drug-repurposing approaches against SARS-CoV-2 based on different host factors are being carried out (Zhou et al., 2020; Ge et al., 2020). We used host transcriptome-based pro-viral factors for drug repurposing against early infection conditions of SARS-CoV-2. Among the 31 up-regulated genes, the pro-viral host factors in SARS-CoV-2 infection were identified by literature survey; they include TYMP, PTGS2, C1S, CFB, IFI44, XAF1, CXCL2, and CXCL3. The genes TYMP, C1S, and CFB were found to be unique to SARS-CoV-2 infection. The other genes, PTGS2, IFI44, XAF1, CXCL2, and CXCL3, tend to be up-regulated in other respiratory virus infections as well. Among the pro-viral factors identified in the study, PTGS2 is targeted by drugs retrieved from the DrugBank search that are of potential use in therapeutic strategies against SARS-CoV-2. We noticed that six anti-inflammatory drugs targeting PTGS2 (lenalidomide, celecoxib, tenoxicam, meclofenamic acid, sulfasalazine, and loxoprofen) can induce an expression signature anti-correlated with that of the pro-viral factors and can be considered for the treatment of SARS-CoV-2 infection. Importantly, the PTGS2-inhibiting drugs celecoxib and loxoprofen have already been reported to be useful for the treatment of viral infection. Celecoxib was shown to decrease inflammatory gene expression in the context of TC-83 virus infection (Risner et al., 2019). Loxoprofen was found to be useful for patients with acute upper respiratory tract infection, including those with influenza infection (Azuma et al., 2011). Importantly, the PTGS2 inhibitors reported in the study, indomethacin, naproxen, ibuprofen, and thalidomide, are currently under clinical trials for treating COVID-19.
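The notion of a drug inducing an expression signature that is anti-correlated with the pro-viral factors can be illustrated with a minimal sketch. This is not the connectivity score computed by CMap or CLUE; it simply averages a drug's per-gene expression change over the eight pro-viral genes, so that a negative value indicates that the drug tends to lower the genes that infection raises. The function names, drug labels, and all numeric values below are hypothetical placeholders and are not data from this study.

```python
from statistics import mean

# Pro-viral host factors named in the text (all up-regulated in early infection).
PRO_VIRAL_UP = ["TYMP", "PTGS2", "C1S", "CFB", "IFI44", "XAF1", "CXCL2", "CXCL3"]


def reversal_score(drug_profile, up_genes=PRO_VIRAL_UP):
    """Mean drug-induced change over the up-regulated signature genes.

    drug_profile maps gene symbol -> expression change after drug
    treatment (for example a z-score). Genes absent from the profile
    are skipped. A negative score means the drug tends to reverse the
    signature. Returns None if no signature gene was measured.
    """
    values = [drug_profile[g] for g in up_genes if g in drug_profile]
    return mean(values) if values else None


def rank_by_reversal(profiles, up_genes=PRO_VIRAL_UP):
    """Return (drug, score) pairs, strongest reversal (most negative) first."""
    scored = [(drug, reversal_score(p, up_genes)) for drug, p in profiles.items()]
    scored = [(d, s) for d, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy profiles; the numbers are invented for illustration only.
toy_profiles = {
    "drug_A": {"PTGS2": -2.0, "CXCL2": -1.0, "IFI44": -0.5, "TYMP": 0.5},
    "drug_B": {"PTGS2": 1.0, "CXCL3": 2.0},
    "drug_C": {"ACTB": 0.1},  # measures no signature gene, so it is dropped
}

ranking = rank_by_reversal(toy_profiles)
```

In this toy ranking, a drug that lowers most signature genes is placed ahead of one that raises them. A real analysis would rely on genome-wide perturbation profiles and a rank-based enrichment statistic rather than a simple mean.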
The approved drug molecules obtained from the CMap analysis are from different therapeutic areas, including menopausal hormone therapy (estrone), antiseptic (hexylresorcinol), anti-anxiety agent (pentobarbital), antihypertensive agent (nitrendipine), analgesic (phenazopyridine), neurological (heraclenol), anxiety disorder (alprazolam), dopaminergic activity (bromocriptine), and anticoagulant activity (WT-171 (Vorinostat)). However, the anti-anxiety agent pentobarbital is reported to have various adverse effects, including hepatotoxicity, laryngospasm, anemia, bradycardia, and respiratory depression (AbouKhaled & Hirsch, 2008). Anti-viral activity has been reported for many of the experimental or investigational compounds identified in our study. Nukuzuma et al. (2009) reported that the 5HT(2A)R antagonist ritanserin has an inhibitory effect on infection and reproduction of the human polyomavirus JCV. JAK3 inhibition is another anti-viral strategy that is found to block multiple cytokines and protect against a super-inflammatory response (Xu et al., 2012). Tipifarnib may inhibit the prenylation step of Hepatitis Delta Virus (HDV) replication (Lempp & Urban, 2017).
An anti-viral effect against species associated with upper respiratory tract infections (URTIs) or known to cause acute sore throat was observed with the drug hexylresorcinol (Shephard & Zybeshari, 2015). Alprazolam and dilazep were found to be effective in influenza virus infection and HIV, respectively, and danaprevir is a hepatitis C virus protease inhibitor (Freire-Garabal et al., 1993; Zeng et al., 2014; Gane et al., 2014). Apart from that, the calmodulin antagonist and topoisomerase I inhibitor have reported antiviral activity against Dengue virus and HIV infection (Bautista-Carbajal, 2017; Showalter & Blair, 2016). Therapeutically targeting kinases such as PI3K and p38 kinase has been found to be useful in viral infection (Wu et al., 2015; Griego et al., 2000). The proposed compounds from the study can be considered for validation experiments.

CONCLUSIONS
In summary, we used publicly available human host gene expression profiles from early infection conditions of SARS-CoV-2 and other respiratory viruses and identified the important host factors in SARS-CoV-2 infection. Considering the consistent expression signature and literature evidence, we identified 31 upregulated host factors, eight of which are pro-viral factors in SARS-CoV-2 infection. The study is an effort to identify repurposable drugs for the treatment of SARS-CoV-2 infection based on the pro-viral host factors. The connectivity map-based repurposing proposed twelve compounds with antiviral activity reported in the literature. Apart from that, we propose that inhibition of PTGS2 can be a potential therapeutic strategy for viral infection and that the six proposed approved PTGS2 inhibitor drugs can be repurposed for the treatment of SARS-CoV-2 infection. The study is based on the datasets presently available in public domain databases, and the results may vary with the selection of samples for analysis. Therefore, we applied strict criteria of consistency in expression patterns and evidence from the literature in the selection of host factors for repurposing studies. This computational approach yields rapid therapeutic recommendations; it can be repeated as more RNA-Seq datasets become available and could be extended to the protein level, and its recommendations need to be validated by wet-lab experiments and clinical trials.
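The consistency criterion lends itself to re-running as new datasets appear, and a minimal sketch of one possible form is given here. It keeps a gene only if its log2 fold change meets a cutoff in every dataset supplied. The cutoff value, dataset labels, gene placeholder `GENE_X`, and fold-change numbers are assumptions made for illustration; the sketch does not reproduce the thresholds or the exact procedure used in this study.

```python
def consistently_up(datasets, min_log2fc=1.0):
    """Genes whose log2 fold change is >= min_log2fc in every dataset.

    datasets maps dataset name -> {gene symbol: log2 fold change}.
    min_log2fc is an assumed cutoff, not the one used in the study.
    A gene missing from any dataset is treated as not consistent.
    """
    if not datasets:
        return set()
    per_dataset = [
        {gene for gene, fc in table.items() if fc >= min_log2fc}
        for table in datasets.values()
    ]
    return set.intersection(*per_dataset)


# Hypothetical toy input; values are invented for illustration only.
toy_datasets = {
    "cell_model_1": {"PTGS2": 2.1, "IFI44": 1.5, "GENE_X": 0.2},
    "cell_model_2": {"PTGS2": 1.3, "IFI44": 0.4, "GENE_X": 3.0},
}

consistent = consistently_up(toy_datasets)
```

Adding a further dataset to the input can only shrink or preserve the resulting set, which reflects why a strict consistency requirement makes the selection less sensitive to any single sample.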