Genetic and Epigenetic Host–Virus Network to Investigate Pathogenesis and Identify Biomarkers for Drug Repurposing of Human Respiratory Syncytial Virus via Real-World Two-Side RNA-Seq Data: Systems Biology and Deep-Learning Approach

Human respiratory syncytial virus (hRSV) affects more than 33 million people each year, but there are currently no effective drugs or vaccines approved. In this study, we first constructed a candidate host–pathogen interspecies genome-wide genetic and epigenetic network (HPI-GWGEN) via big-data mining. Then, we employed reversed dynamic methods via two-side host–pathogen RNA-seq time-profile data to prune false positives in candidate HPI-GWGEN to obtain the real HPI-GWGEN. With the aid of principal-network projection and the annotation of KEGG pathways, we can extract core signaling pathways during hRSV infection to investigate the pathogenic mechanism of hRSV infection and select the corresponding significant biomarkers as drug targets, i.e., TRAF6, STAT3, IRF3, TYK2, and MAVS. Finally, in order to discover potential molecular drugs, we trained a DNN-based DTI model by drug–target interaction databases to predict candidate molecular drugs for these drug targets. After screening these candidate molecular drugs by three drug design specifications simultaneously, i.e., regulation ability, sensitivity, and toxicity. We finally selected acitretin, RS-67333, and phenformin to combine as a potential multimolecule drug for the therapeutic treatment of hRSV infection.


Introduction
Human Respiratory Syncytial Virus (hRSV) is a negative-sense, single-stranded RNA virus that causes infections of the respiratory tract. More than 80% of children have been infected at least once by 2 years of age, and half of these children have had RSV twice [1]. In 2015, Li et al. have approximately estimated that 33.1 million RSV-associated lower respiratory tract infections (LRTI), resulted in about 3.2 million hospital admissions and 59,000 deaths in children younger than 5 years [2]. Thompson et al. have estimated that RSV accounts for approximately 10,000 deaths annually in elders who are over the age of 65 years in the US [3]. As stated above, we concluded that RSV not only infects children but also elders. In a recent study, it indicated that RSV infection accounted for 11 percent of hospitalizations for pneumonia, 11 percent for chronic obstructive pulmonary disease, 5 percent for congestive heart failure, and 7 percent for asthma [3]. Thus, RSV is not only a threat itself, but it can also develop other complications that are more threatening.
Currently, no powerful drug is available to prevent RSV. Although Ribavirin is the only drug approved by the Food and Drug Administration (FDA) for the treatment of RSV, it is very costly, teratogenic, and inconvenient [1]. In 1996, Randolph and Wang did a double-blind trial on hRSV. Although there were trends in the direction of benefit, the The whole flowchart of the systematic drug discovery method by big-data mining and dynamic modeling of PPIN and GRN for investigating hRSV pathogenic mechanism by two-side RNA-Seq data of hRSV and human A549 cell via systems biology methods and deep-learning approach, including system identification, DNN-based DTI model, and drug-specification screening Figure 1. The whole flowchart of the systematic drug discovery method by big-data mining and dynamic modeling of PPIN and GRN for investigating hRSV pathogenic mechanism by two-side RNA-Seq data of hRSV and human A549 cell via systems biology methods and deep-learning approach, including system identification, DNN-based DTI model, and drug-specification screening to select molecular drugs which can target significant biomarkers can comprise a multimolecule drug as the clinical-trial recommendation for hRSV treatment.

Overview of Systematic Drug Discovery for hRSV Infection via Systems-Biology Method
In order to investigate the pathogenic mechanism for hRSV infection, we tried to reverse the hRSV infection progress and construct the real HPI-GWGEN via two-side RNA-Seq data of hRSV and human A549 cells. Since A549 is the most widespread cell line and hRSV is a respiratory virus, it seems reasonable to use a lung cell line even though hRSV mostly infects respiratory epithelial cells first. In addition, the construction of the candidate HPI-GWGEN is obtained from various datasets and cell lines, so we need the system identification to prune false positives to let real HPI-GWGEN suitable for hRSV infection. The GenBank accession number of hRSV is AF035006, which is a recombinant mutant virus of subgroup A. Given that the real HPI-GWGEN was too complicated to annotate by KEGG pathways, we applied the PNP to rank the significance of all node proteins, genes, microRNA, lncRNA, and virus genes in real HPI-GWGEN and then extract the most significant nodes to consist of core HPI-GWGEN. With the annotation of KEGG pathways, we could construct the core HPI signaling pathways of hRSV infection. From the downstream of cellular dysfunctions and the upstream of regulations of core signaling pathways of hRSV infection, we could finally discover the significant biomarkers of the pathogenic mechanism as drug targets for the therapeutic treatment of hRSV infection. After finding the drug targets, we purposed a DNN-DTI model to help us find the potential drugs which target these biomarkers. The DNN-DTI model was trained by drug-target interaction databases and was used to predict the candidate drugs that could interact with target genes. Subsequently, we purposed three drug specifications, i.e., regulation ability, sensitivity, and toxicity on molecular drugs to select a multimolecule drug as the clinical trial recommendation for hRSV treatment. The flowchart of the whole systematic drug discovery procedure is shown in Figure 1.

Construction of Candidate HPI-GWGEN by Database Mining and Integration
The purpose of constructing candidate HPI-GWGEN by big-data mining from databases is to investigate the possible interactions and regulations among proteins and genes between the host and hRSV. In the candidate HPI-GWGEN, if any two genes and proteins may have a regulatory or interaction relationship, we will set it true, and then use the Boolean type to express the entire HPI-GWGEN.
The candidate HPI-GWGEN contains two networks, including the candidate HPI protein-protein interaction network (HPI-PPIN) and the candidate HPI gene-regulation network (HPI-GRN). The host intraspecies of candidate HPI-PPIN were constructed by the big-data mining of several databases, including the Database of Interacting Proteins (DIP) [6], the Biological General Repository for Interaction Datasets (BioGRID) [11], the Biomolecular Interaction Network Database (BIND) [12], IntAct [13], and the Molecular Interaction Database (MINT) [14]. The host intraspecies of candidate HPI-GRN include miRNA, lncRNA, and transcription factors regulation. The candidate host miRNA and lncRNA regulations were constructed by the big-data mining of the Target Scan Human database [15], CircuitsDB [16], and starBase v2.0 [17]; the candidate host transcription factors regulations were constructed by the big-data mining of ITFP [18], TRANSFAC [19], and HTRIdb [20]. Due to the lack of hRSV-host interactions and regulations database, we assumed that all the hRSV-host interactions and regulations in candidate HPI-GWGEN are true; then, we purposed a system order identification method [7,8] to estimate the true order by two-side host-pathogen RNA-Seq time-profile data and eliminate the false positives in candidate HPI-GWGEN to obtain real HPI-GWGEN.

HPI RNA-Seq Time-Profile Data of Human A549 Cell and hRSV
The HPI RNA-seq data of the human A549 cell and hRSV used in the study are from National Center for Biotechnology Information (GEO number: GSE55963). This dataset included 8 time points: 0, 2, 4, 8, 12, 16, 20, and 24 hpi (hour postinfection) on human A549 cell and hRSV. These two-side RNA-seq time-profile data were employed to identify the true system parameters of candidate HPI-GWGEN by system-identification methods. We also purposed Genecode v35/v27 annotation on this data; the genes were sorted into five types, including proteins, miRNA, lncRNA, receptors, and transcription factors.

Construction of Dynamic Model of HPI-GWGEN for hRSV Infection
For candidate HPI-PPIN, the dynamic-interaction model of the host proteins can be modeled as the following equation: where p i H (t), p j H (t) indicate the expression level of the ith host protein, the jth host protein at time point t, respectively; HH i and HP i indicate the number of host proteins and pathogen proteins interacting with the ith host protein, respectively; I ij HH and I ik HP indicate the interaction ability between the ith host protein and the jth host protein and between the ith host protein and the kth pathogen protein, respectively; α Hi , −γ Hi , β Hi indicate the translation rate, the degradation rate, and the basal activity level of the ith host protein, respectively; n Hi (t) is the stochastic noise of the ith host protein at time point t; H indicates the total number of host proteins. Basal activity level can model the unknown regulations or interactions, ex. methylation or phosphorylation. For candidate HPI-PPIN, the dynamic-interaction model of the pathogen proteins can be modeled as the following equation: where p i P (t) and p j H (t) indicate the expression level of the ith pathogen protein and the jth host protein at time point t, respectively; PH i and PP i indicate the number of host proteins and pathogen proteins interacting with the ith host pathogen, respectively; I ij PH and I ik PP indicate the interaction ability between the ith pathogen protein and the jth host protein and between the ith pathogen protein and the kth pathogen protein, respectively; α Pi , −γ Pi , and β Pi indicate the translation rate, the degradation rate, and the basal activity level of the ith pathogen protein, respectively; n Pi (t) is the stochastic noise of the ith pathogen protein at time point t; P indicates the total number of pathogen proteins. For candidate HPI-GRN, the dynamic-regulation models of the host genes, which including proteins, miRNAs, lncRNAs, transcription factors, and receptors, can be modeled by the following equations: where g i H (t), f j H (t), m k H (t), and l q H (t) indicate the expression level of the ith host genes, the jth host transcription factors, the kth host miRNAs, and the kth host lncRNAs at time point t, respectively; GT i , GM i , and GL i indicate the number of host transcription factors, miRNA, and lncRNA interacting with ith host gene, respectively; R ij GT , R ik GM , and R iq GL indicate the regulation ability of the ith host gene regulated by the jth host transcription factor, the kth host miRNA, and the qth host lncRNA respectively; −γ Gi and β Gi indicate the degradation rate and the basal activity level of the ith host gene, respectively; n Gi (t) is the stochastic noise of the ith host gene at time point t; G indicates the total number of host genes. For candidate HPI-GRN, the dynamic-regulation models of the pathogen genes can be modeled as the following equations: where g i P (t), f j H (t), m k H (t), and l q H (t) indicate the expression level of the ith pathogen gene, the jth host transcription factor, the kth host miRNA, and the kth host lncRNA at time point t, respectively; VT i , V M i , and VL i indicate the number of host transcription factors, miRNAs, and lncRNAs interacting with the ith pathogen (virus) gene, respectively; R ij VT , R ik V M , R iq VL , and R iz VV indicate the regulation ability of the ith pathogen (virus) gene regulated by the jth host transcription factor, kth host miRNA, qth host lncRNA, and the zth pathogen gene respectively; −γ Vi and β Vi indicate the degradation rate and the basal activity level of the ith pathogen (virus) gene, respectively; n Vi (t) is the stochastic noise of the ith pathogen (virus) gene at time point t; V indicates the total number of pathogen (virus) genes.

Parameter Estimation of Dynamic Model for Candidate HPI-GWGEN by System Identification Method for hRSV Infection Progression
Since the databases are noisy, there are a large number of false positives in the candidate HPI-GWGEN. Based on the discrete-time dynamic model Equations (1)-(4), we purposed system identification based on the HPI RNA-seq time-profile data (GSE55963) to prune false-positive regulations and interactions in candidate HPI-GWGEN to obtain the real HPI-GWGEN during the hRSV infection. In our raw RNA-seq data, there are only 8 time points that may lead to the overfitting in the parameter estimation of the HPI-GWGEN, so we applied the cubic spline interpolation to expand our time points to prevent overfitting in the parameter estimation process. Then, we purposed the Akaike information criterion (AIC) to find the correct system order by the trade-off between the model complexity and least square parameter estimation error [7].
To estimate the parameters in Equations (1)-(4), we rearranged each equation in linear regression form as follows: . . .
Then, we could transform Equations (5)-(8) to the following augmented regression equations, respectively. T indicates the total number of time points of HPI RNA-seq time-profile data after interpolation.
Biomedicines 2023, 11, 1531 Due to the inherent biological mechanism, we need to set constraints on some parameters, such as degradation rate ≤ 0 and translation rate ≥ 0. Thus, real parameters of regulation and interaction order were estimated by solving the following constrained least square problems, respectively: Biomedicines 2023, 11, 1531 9 of 26 After solving the above constrained least square problems to estimate parameters of regulations and interactions for each protein, gene, miRNA, and lncRNA of the host cell and hRSV, we purposed the Akaike information criterion (AIC) to prune false positives [8]. AIC considers both the model complexity (order) and estimated error to find the fittest parameter order. The AIC value for each system order was shown as the following equations, and our goal is to find the minimum AIC value as the real parameter order.
where dim(Î Hi ), dim(Î Pi ), dim(R Gi ), and dim(R Vi ) denote the parameter vector dimension of each model, respectively. We solved constrained least square Equations (13)-(16) by using the MATLAB lsqlin function and then calculated the AIC value by Equations (17)- (20). Increasing the number of parameters would decrease the least square error term in the AIC equation but it would increase the model complexity (dimension) in the second term. By finding the minimum AIC value [7,8], we would find the trade-off between model complexity and estimated error to find the real order of interactions and regulations of each gene, miRNA and lncRNA of infected cells, and hRSV. The HPI-GWGEN is still very complex and cannot be annotated by KEGG pathways; therefore the principal-network projection (PNP) method is employed to extract the core HPI-GWGEN from real HPI-GWGEN. For the convenience of PNP, the HPI-GWGEN is represented by the following matrix: In Equation (21), HP, PP, HG, HL, HM, and PG represent host protein, pathogen protein, host gene, host lncRNA, host miRNA, and pathogen gene, respectively; I and R represent interaction ability in PPIN and regulation ability in GRN, respectively. H, P, G, M, L, and V represent the number of host protein, pathogen protein, host gene, host lncRNA, host miRNA, and pathogen gene, respectively.

Extracting Core HPI-GWGEN via Principal-Network Projection
Since real HPI-GWGEN was still too complex to be annotated by KEGG pathways to investigate the pathogenic mechanism of hRSV infection, we purposed principal-network projection methods to extract the core HPI-GWGEN. The PNP method is an application of singular-value decomposition [21] to rank each node (genes, proteins, miRNA, and lncRNA) in HPI-GWGEN so as to know the significance of each node in HPI-GWGEN. First, we did the singular value decomposition (SVD) on HPI-GWGEN as follows: where H, P, G, M, L, and V represent the number of host protein, pathogen protein, host gene, host miRNA, host lncRNA, and pathogen (virus) gene, respectively.
In order to extract the core HPI-GWGEN, we chose top-k singular values in S which contain 85% energy, and top-k singular vectors of U and V which consist of the principal-network structure of HPI-GWGEN matrix M as core HPI-GWGEN. The top-k singular values satisfy with the following inequality: Then, in order to rank each node (row) of HPI-GWGEN M in (21) and obtain the projection value on the core HPI-GWGEN, we define the projection value for each row M i as follows: where M i and V j T represent the ith row of matrix M and the jth column of singular matrix V, respectively. R represents the row number of matrix M which means the total number of all nodes in HPI-GWGEN (i.e., H + P + G + M + L + V).
At the end, we finally obtained the projection value of each node, and we could rank them based on their significance of each node in HPI-GWGEN to obtain the top-6000 significant nodes to consist of core HPI-GWGEN. Then, we proposed DAVID to help us do KEGG enrichment analysis by top-6000 significant genes, proteins, miRNA, and lncRNA in core HPI-GWGEN. With the aid of KEGG enrichment analysis and annotation of the core HPI-GWGEN to obtain core HPI signaling pathways in Figure 4 by KEGG pathways, we investigated the pathogenic mechanism for hRSV infection. By the core HPI signaling pathways, we selected five significant biomarkers as our drug target for hRSV infection treatment.

Systematic Drug Repurposing Design of hRSV Infection via DNN-Based DTI Model and Drug Specifications
After obtaining the five significant biomarkers as drug targets for hRSV infection treatment, the DNN-DTI model is employed to be trained by drug-target databases to predict the potential molecular drugs for these five significant biomarkers of hRSV infection. Based on the prediction of the DNN-based DTI model, we chose several candidate drugs which can target these five drug targets for hRSV infection treatment. Due to the complexity of drug mechanisms in human beings, we purposed three drug design specifications (i.e., regulation ability, sensitivity, and toxicity) to make our candidate drug helpful in hRSV infection treatment. The flowchart of the drug-discovery design is shown in Figure 2.

DNN-Based DTI Model for Drug Repurposing of hRSV Infection
Before training the DNN model as a DTI model by drug-target interaction database to predict potential molecular drugs for each biomarker of hRSV infection, we first preprocessed the drug-target interaction (DTI) data as shown in Figure 2. We collected the drug-target interaction data from several DTI databases including ChEMBL, BindingDB, Pubchem, UniProt, and DrugBank. In order to input these data into the DNN model, we must transform them into numerical vectors. We utilized PyBioMed to transform drug chemical structures and protein sequences into numerical feature vectors. At the same time, the drug feature vector and protein feature vector were concatenated as the following drug-target vector: where F D and F T represent the drug feature vector and target (gene) feature vector. The drug-target feature vector F DT would be the input of the DNN-based DTI model. The collected DTI data includes 80,291 positive interactions and 100,294 unknown as negative interactions. Due to the imbalance of the dataset, we randomly downsampled the negative interaction dataset to 80,291. Then, we divided the whole dataset into a training set (fourth three) and a testing set (fourth one). As all the feature vectors were located in different scales, we applied standardization to each training data to solve this problem. Moreover, because of the complexity of each drug-target feature vector, it might let the DNN model hard-learn the features or increase the computational complexity of the DNN model. We proposed PCA to reduce the dimension of every feature vector. Every drug-target feature vector was reduced to 1000 dimensions as the input layer of 1000 DNN neurons, as shown in Figure 2. At the beginning, we integrated drug-target databases, transformed them to feature vectors, and performed data preprocessing. Then, we employed drug-target pairs in DTI databases to train DNN as a DTI model. At the end, we used the well-trained DTI model to predict the candidate molecular drugs for each drug target (biomarker) in Table 4 and then filtered the candidate molecular drugs with three drug specifications as potential molecular drugs. These potential molecular drugs are selected to consist of a multimolecular drug to target these biomarkers as shown in Table 5 Before training the DNN model as a DTI model by drug-target interaction database to predict potential molecular drugs for each biomarker of hRSV infection, we first preprocessed the drug-target interaction (DTI) data as shown in Figure 2. We collected the At the beginning, we integrated drug-target databases, transformed them to feature vectors, and performed data preprocessing. Then, we employed drug-target pairs in DTI databases to train DNN as a DTI model. At the end, we used the well-trained DTI model to predict the candidate molecular drugs for each drug target (biomarker) in Table 4 and then filtered the candidate molecular drugs with three drug specifications as potential molecular drugs. These potential molecular drugs are selected to consist of a multimolecular drug to target these biomarkers as shown in Table 5.
For the deep neural-network model, we employed four hidden layers which contained 512, 256, 128, and 64 neurons, respectively. In a hidden layer, we used ReLU as the activation function and dropout set 0.2 to avoid overfitting [22]. The ReLU function has a strong biological underpinning [23] and helps the DNN model learn the nonlinearity. As drug-target interaction basically was a binary classification problem, the output layer only contained one neuron to indicate the probability of drug-target interaction.
As we discussed before, DTI was a binary classification problem. We proposed binary cross-entropy as the loss function which was shown as the following equation: where p n andp n represent whether the nth sample is interacted and the probability of the nth sample predicted in the DNN model, respectively. After defining the loss function, we applied ADAM [24] as our backward propagation algorithm and completed the whole DNN-DTI model architecture. The model is trained by Keras with 64 batch size and 200 epochs, and we also proposed 5-fold validation to evaluate the predicted performance of the trained model. To visualize the high performance of the DNN-based DTI model, we plotted the receiver operating characteristic (ROC) curve in Figure 6. ROC curve is used for the binary classification model and is aimed to examine whether the model can distinguish the positive and negative sample. The area under the ROC curve (AUC), which was shown in Figure 6, was also applied for a benchmark in the binary classification problem, and the higher the AUC value, the model performance is better.

Drug Design Specifications
After obtaining the candidate molecular drugs for five biomarkers by the prediction of the proposed DNN-based DTI model, we started considering the reliability of these drugs, so we purposed three benchmarks as design specifications, including regulation ability, toxicity, and sensitivity to make sure the drugs were reliable. The LINCS L1000 database [25] is used for the specification of regulation ability and the PRISM database [26] is used for the specification of sensitivity. The sensitivity indicates the drug utility perturbation for human cells. The most important of all is toxicity, and we employed ADMETlab 2.0 [27] to specify the toxicity (LC50). LC50 is the abbreviation of lethal concentration 50%, which means the higher the LC50 value is, the lower toxicity for the human being.

Extracting Core Signaling Pathways via System Identification and Principal-Network Projection Approach
The overall flowchart is shown in Figure 1. First, we constructed candidate HPI-GWGEN by database mining and integration. Then, we proposed a system identification approach to eliminate the false positives in candidate HPI-GWGEN and then obtained the real HPI-GWGEN. The node and edge number between the candidate and real HPI-GWGEN are shown in Table 1. After the deletion of the false positives in candidate HPI-GWGEN for real HPI-GWGEN of hRSV infection, we still need to apply the PNP method because the real HPI-GWGEN is still pretty complex for the annotation of KEGG pathways. Therefore, we used the PNP method to extract core HPI-GWGEN with 85% energy of real HPI-GWGEN and rank all nodes (genes and proteins) according to their projection (significance) values. To visualize the real HPI-GWGEN and core HPI-GWGEN, we used the software Cytoscape [28] to visualize the whole networks and help us intuitively understand the power of PNP, as shown in Figure 3. We also uploaded top-6000-significance nodes to the DAVID functional annotation tool [29] to help us investigate core HPI signaling pathways for the pathogenic mechanism of the hRSV infecting progression. DAVID presented the enrichment analysis of KEGG which indicates that hRSV infection may involve in what kinds of pathways. The KEGG enrichment analysis is shown in Table 2.

Investigation of Core HPI Signaling Pathways for Pathogenic Mechanism of hRSV Infection Progression
According to the KEGG enrichment analysis of core HPI-GWGEN and the annotation of KEGG pathways, we found the core HPI signaling pathways in the hRSV infection progression, as shown in Figure 4. With the aid of core HPI signaling pathways and their downstream target genes, we could investigate the pathogenic mechanism of hRSV infection and find key biomarkers to help us design a multimolecule drug for therapeutic treatment of hRSV infection in the next step. In this section, we introduced these core HPI signaling pathways and the potential pathogenic mechanism of the biomarkers of hRSV infection.

Investigation of Core HPI Signaling Pathways for Pathogenic Mechanism of hRSV Infection Progression
According to the KEGG enrichment analysis of core HPI-GWGEN and the annotation of KEGG pathways, we found the core HPI signaling pathways in the hRSV infection progression, as shown in Figure 4. With the aid of core HPI signaling pathways and their downstream target genes, we could investigate the pathogenic mechanism of hRSV infection and find key biomarkers to help us design a multimolecule drug for therapeutic treatment of hRSV infection in the next step. In this section, we introduced these core HPI signaling pathways and the potential pathogenic mechanism of the biomarkers of hRSV infection.

The Significant Signaling Pathways Involved with Biomarkers TRAF6 and RELA
As shown in Figure 4, TRAF6 is activated by both the IL-17α pathway and the TLR2 pathway. IL-17α is stimulated by the microenvironment and interacted with its receptor IL-17Rα. The second pathway involved in TRAF6 is the TLR2/MyD88 pathway. RSV utilizes receptor TLR2 for launching a proinflammatory response in the RSV infected progression [30]. RSV G protein binds to receptor TLR2 and then activates with MyD88 [31]. MyD88 is an adaptor protein that mediates toll and interleukin (IL)-1 receptor signaling [32]. TRAF6 interacts with signaling proteins MAP2K1, MAPK8, CEBPB and the MAP3K7-TAB2 complex. For its downstream, TRAF6 finally activated TFs FOXO3, RELA, JUN, and CEBPB, respectively. TF FOXO3 is one of the FOXO family members that promotes the expression of cyclin-dependent kinase-inhibitor genes CDKN1A and IL6. Target gene CDKN1A triggers a proliferation arrest [33]. In addition to TFs, FOXO3 and RELA upregulated IL6; the mir-coRNA LET-7i also directly inhibits IL6 expression. Ligand IL6 induces phosphorylation and nuclear entry of the STAT3 [34] and then IL6 causes a positive feedback loop during hRSV infection [35]. In the IL6 feedback loop, there is also an important TF called RELA. RELA/NF-κB is key in innate inflammation, controlling the expression of inflammatory chemokines as well as mucosal interferons (IFNs) through a process of regulated transcriptional elongation [36]. TF RELA upregulates the target genes ICAM1, AP-1, IL6, and MMP9. Although we do not identify any virus gene binding to ICAM1, one study shows that ICAM1 facilitates RSV entry and infection of human epithelial cells by binding to its F protein which means that ICAM1 is significant for viral replication [37]. According to the severity of hRSV determined by environmental exposure to lipopolysaccharide (LPS), we also find LPS induces ICAM1, which is a positive correlation. Nuclear transcription factor JUN was involved in the activation of many cellular functions, including cell proliferation and apoptosis [38]. In clinical trials, MMP9 demonstrates that it is involved in the antiviral and anti-inflammatory effects [39].
In summary, ligand IL17α and hRSV protein G activate these two signaling pathways, i.e., IL-17α pathway and TLR2 pathway; then, these signals are transducted through their downstream signaling protein TRAF6. TRAF6 interacted with many downstream target genes involved in the upregulations of inflammation, cell proliferation, host defense, and the downregulations of apoptosis. Especially IL6 forms a positive feedback loop to influence on IL6 pathways and its downstream signaling proteins, STAT3 and TF RELA, which also involve in another significant pathway for hRSV infection, i.e., the RIG-1/MAVS pathway [40]. As TRAF6 and RELA are located at the pivot of two signaling pathways and are involved in many apoptosis-related, LPS-related, proliferation-related target genes in HPI signaling pathways, we considered TRAF6 and RELA as biomarkers of hRSV infection. The RIG-1/MAVS pathway will be discussed in the following section in detail.

The Significant Signaling Pathways Involved with Biomarkers MAVS and IRF3
In this section, we will discuss the RIG-1/MAVS pathway. RIG-1/MAVS pathway is the initial intracellular sensor for hRSV infection and the upstream of the canonical NF-κB pathway [41]. RIG-I is upstream of the MAVS protein which is known as an interferon-β promoter stimulator or virus-induced signaling adaptor [42]. RIG-1 also senses nucleic acids derived from viruses and triggers antiviral innate immune responses [43]. It means RIG-1 plays a vital role in hRSV infection. According to the core HPI signaling pathways, as shown in Figure 4, hRSV NS1 protein was identified as interacting with MAVS which was consistent with a recent study [44]. The interaction of NS1 and MAVS disrupts the RIG-1/MAVS pathway and the downstream TF IFNA1 and target gene IL6, which leads to the upregulations of inflammation [44]. We also proved that TRAF3 interacts with hRSV NS2 protein in a recent study [45] by the proposed system-identification approach. MAVS interacted with TRAF3, and TRAF3 interacted with the TKB1-IKB1CE complex which resulted in the phosphorylation of IRF3. Further both the TKB1-IKB1CE complex and ISG15 interacted with IRF3 too. Therefore, IRF3 was viewed as another biomarker in our signaling pathways because of the multi-interaction in this pathway. IRF3 upregulates the target gene CXCL8, TF JUN, and IFNA1. The identification of high expression of CXCL8 is consistent with a recent study [46]. Target gene CXCL8 is a key driver of the antiviral inflammatory response during hRSV infection [46] and CXCL8 also correlates positively with disease severity during RSV infection [47]. The phosphorylation of IRF3 leads to its dimerization and translocation to the nucleus, where it drives the expression of IFNA1. After upregulating IFNA1, IFNA1 also affects another pathway and another biomarker TYK2. We will discuss the IFNA1 pathway in detail in the following section.

The Significant Signaling Pathways Involved with Biomarker TYK2
Tyrosine kinase 2 (TYK2), a kinase belonging to the JAK family, is constitutively bound to receptor IFNAR1 [48]. From the upstream of the IFN pathway, ligand IFNA1 interacts with receptor IFNAR1 because of the RIG-1/MAVS pathway or the microenvironment. Subsequently, TYK2 phosphorylates STAT2 and STAT1 and binds to TF IRF9 [49]. TF IRF9 translocates to the nucleus and upregulates target genes OAS2 to upregulate host defense and CDKN1A to downregulate the apoptosis. A recent study showed that the inhibition of OAS2 expression can inhibit the antiviral effect of IFN against hRSV [50]. CDKN1A is a target gene involved in proliferation arrest [51]. TYK2 also interacts with STAT3, which is vital to virus infection. TYK2 induces STAT3 phosphorylation and then STAT3 interacts with RELA, which is another significant biomarker [52]. According to a recent study, patients with STAT3 mutation presented with more viral infections [53]. Based on the above discussion, we can explain why the IFN signaling pathway is so important in hRSV infection and why TYK2 is selected as a biomarker in the pathogenic mechanism of hRSV infection.

TNF Signaling Pathway
Tumor necrosis factor TNF is a proinflammatory cytokine that plays a vital role in the innate host defense [54]. As we know, RELA activation in infected cells is important to ligand TNF production [55]. Thus, the activation of RELA leads to the activation of the TNF signaling pathway. At the downstream of the TNF signaling pathway, TFs CREB1, CEBPB, and JUN were activated. CREB1 can upregulate with target genes ICAM1, FOS, IL6, and MMP9. CEBPB can upregulate with target genes ICAM1, FOS, and CXCL8. A recent study shows that a replicating virus and RELA activation are required for the high expression of CXCL8 [56]. CXCL8 is mainly involved in the initiation and amplification of acute inflammatory reactions [57]. The high expression of IL6 also inhibits cell apoptosis which has an advantage to viral replicating. From the above analysis, we can know that RELA activation induces the TNF signaling pathway, and then the TNF signaling pathway can upregulate the target genes ICAM1, FOS, IL6, and CXCL8 to cause the upregulation of inflammation and the downregulation of apoptosis. IL6 is a cytokine that transmits defense signals from a pathogen invasion or tissue damage site to stimulate acute phase reactions and immune responses to prepare for host defense [58]. Since the TNF signaling pathway targets IL6, and IL6 transmits a signal to RELA, the activation of RELA, the production of TNF, and the high expression of IL6 form a positive cyclic signaling pathway.

Conclusion of HPI Signaling Pathways during hRSV Infection
First, we discussed IL17 and TLR2 pathways. The hRSV G protein first interacted with TLR2 and then upregulated IL6 at the downstream. After the activation of the IL6 pathway, ligand IL6 interacted with STAT3 indirectly and then interacted with TF RELA. RELA also induces the production of TNF, and then the TNF signaling pathway targets IL6 at the downstream. This process induced a positive feedback loop during the hRSV infection. Subsequently, we concluded the RIG-1/MAVS pathway was important to viral replication. In this signaling pathway, hRSV protein NS1 interacted with MAVS, and hRSV protein NS2 interacted with IRF3. At the downstream of the pathway, IRF3 upregulated TF IFNA1, and the IFN pathway was activated too. The IFN pathway upregulated target genes OAS2 and CDKN1A which led to the regulation of cellular functions, i.e., apoptosis, cell proliferation, and virus defense. Based on the above discussion, we could know that all these signaling pathways were highly correlated via analyzing the core HPI signaling pathway during hRSV infection.

Multimolecule Drug Repurposing by DNN-Based DTI Model and Drug Design Specifications
After identifying significant biomarkers TYK2, RELA, IRF3, TRAF6, and MAVS as drug targets for the therapeutic treatment of hRSV infection, we purposed the DNN-based DTI model to discover potential multimolecule drugs for our drug targets. The proposed DNN-based DTI model was good at predicting unknown drug-target pairs for these drug targets (biomarkers) after the training via DTI databases, as discussed in Section 2.7 and shown in Figure 2. After candidate molecular drugs were predicted for these five significant biomarkers by the well-trained DNN-based DTI model in Section 2.7, we filtered these potential drugs from the candidate molecular drugs with three drug design specifications including regulation ability, sensitivity, and toxicity as a multimolecule drug for a further clinical trial.

DNN-Based DTI Model
The flowchart of multimolecular drug design via the DNN-based DTI model trained by DTI databases as discussed in Section 2.7 is shown in Figure 2. The architecture of DNN consisted of one input layer, four hidden layers, and one output layer. We set batch size as 64, epoch as 200, and Adam algorithm [24] as optimizer. We also used fivefold cross-validation to evaluate the performance of the DNN-based DTI model to prevent overfitting, which is shown in Table 3. The accuracy and loss followed by each epoch are shown in Figure 5. Since the DNN-based DTI model is binary classification essentially, we used the ROC curve and AUC to evaluate the drug-target prediction performance by the DNN-based DTI model. The ROC curve is shown in Figure 6 and AUC is 0.981, which indicates that the drug-target prediction performance of the proposed DNN-based DTI model is powerful in the drug-target binary classification problem.

Multimolecule Drug Repurposing for hRSV Infection Treatment
After training the DNN-DTI model by DTI data in the DTI database, as shown in Figure 2, we can predict candidate molecular drugs for five biomarkers as drug targets. Due to the complexity of the drug mechanism, we screened candidate molecular drugs with three drug specifications, including regulation ability, sensitivity, and toxicity, simultaneously, as shown in Table 4. The multimolecule drug we selected, based on three design specifications and their interactions with biomarkers, was shown in Table 5.

Multimolecule Drug Repurposing for hRSV Infection Treatment
After training the DNN-DTI model by DTI data in the DTI database, as shown in Figure 2, we can predict candidate molecular drugs for five biomarkers as drug targets Due to the complexity of the drug mechanism, we screened candidate molecular drug with three drug specifications, including regulation ability, sensitivity, and toxicity, sim ultaneously, as shown in Table 4. The multimolecule drug we selected, based on thre design specifications and their interactions with biomarkers, was shown in Table 5.     Table 5. The potential multimolecule drug screened by three drug design specifications from Table 4 for the therapeutic treatment of hRSV infection via a systematic drug-discovery approach.

Discussion
Although hRSV infection affects tens of millions of people each y mechanism is barely known. Nowadays, there is still no vaccine or eff prevent hRSV infection. In this study, we first constructed a candidat big-data mining and used the system-identification method by two-s RNA-seq data to identify the true system-parameter orders to prune fal didate HPI-GWGEN as the real HPI-GWGEN in the hRSV infection. S the PNP method to extract the significant part of the network matrix for to further annotate as core HPI signaling pathways by KEGG pathway the core HPI signaling pathways were employed to investigate the path during hRSV infection and identify significant biomarkers for therapeu fourth step was to predict candidate molecule drugs for these significa training the DNN-based DTI model by DTI data in DTI databases. The

Discussion
Although hRSV infection affects tens of millions of people each year, its patho mechanism is barely known. Nowadays, there is still no vaccine or effective medic prevent hRSV infection. In this study, we first constructed a candidate HPI-GWG big-data mining and used the system-identification method by two-side host/pat RNA-seq data to identify the true system-parameter orders to prune false positives i didate HPI-GWGEN as the real HPI-GWGEN in the hRSV infection. Secondly, we the PNP method to extract the significant part of the network matrix for core HPI-GW to further annotate as core HPI signaling pathways by KEGG pathways. In the third the core HPI signaling pathways were employed to investigate the pathogenic mech during hRSV infection and identify significant biomarkers for therapeutic treatmen fourth step was to predict candidate molecule drugs for these significant biomarke training the DNN-based DTI model by DTI data in DTI databases. The final step aim select potential molecule drugs by three drug design specifications for these sign

Discussion
Although hRSV infection affects tens of millions of people each year, its pathogenic mechanism is barely known. Nowadays, there is still no vaccine or effective medicine to prevent hRSV infection. In this study, we first constructed a candidate HPI-GWGEN by big-data mining and used the system-identification method by two-side host/pathogen RNA-seq data to identify the true system-parameter orders to prune false positives in candidate HPI-GWGEN as the real HPI-GWGEN in the hRSV infection. Secondly, we used the PNP method to extract the significant part of the network matrix for core HPI-GWGEN to further annotate as core HPI signaling pathways by KEGG pathways. In the third step, the core HPI signaling pathways were employed to investigate the pathogenic mechanism during hRSV infection and identify significant biomarkers for therapeutic treatment. The fourth step was to predict candidate molecule drugs for these significant biomarkers via training the DNN-based DTI model by DTI data in DTI databases. The final step aimed to select potential molecule drugs by three drug design specifications for these significant

Discussion
Although hRSV infection affects tens of millions of people each year, its pathogenic mechanism is barely known. Nowadays, there is still no vaccine or effective medicine to prevent hRSV infection. In this study, we first constructed a candidate HPI-GWGEN by big-data mining and used the system-identification method by two-side host/pathogen RNA-seq data to identify the true system-parameter orders to prune false positives in candidate HPI-GWGEN as the real HPI-GWGEN in the hRSV infection. Secondly, we used the PNP method to extract the significant part of the network matrix for core HPI-GWGEN to further annotate as core HPI signaling pathways by KEGG pathways. In the third step, the core HPI signaling pathways were employed to investigate the pathogenic mechanism during hRSV infection and identify significant biomarkers for therapeutic treatment. The fourth step was to predict candidate molecule drugs for these significant biomarkers via training the DNN-based DTI model by DTI data in DTI databases. The final step aimed to select potential molecule drugs by three drug design specifications for these significant biomarkers to combine as a multimolecule drug for the therapeutic treatment of hRSV infection.
At the end, we selected three kinds of molecular drugs to combine as a multimolecule drug for hRSV infection treatment, which contained acitretin, RS-67333, and phenformin. Acitretin is a retinoic acid derivative, and it has been approved by the US Food and Drug Administration (FDA) [59]. Traditionally, acitretin is an effective treatment for psoriasis [60] and can enhance the RIG-1 signaling pathway [59]. A recent study even showed that STAT3 and STAT1 were downregulated by acitretin [61] and these studies all indicated a high correlation with HPI signaling pathways for hRSV infection. RS-67333 has been investigated as a potential rapid-acting antidepressant, nootropic, and treatment for Alzheimer's disease. RS-67333 can reduce the expression levels of IL6 and TNF [62]. As we have discussed in the previous section, IL6 could cause a positive-feedback loop, which means that downregulating IL6 is pretty important for the hRSV infection treatment. Phenformin was introduced in the U.S. in 1957 to treat noninsulin-dependent diabetes mellitus (NIDDM). Although phenformin was withdrawn in 1977 because of the high incidence of associated lactic acidosis [63], more and more studies repurposed phenformin as the breast cancer and COVID-19 treatment [64,65]. Given the shorter duration of hRSV infection treatment, if the treatment time and dose are shortened and the patient's renal function is pre-examined [66], phenformin may be a good drug for the hRSV infection treatment.
Finally, we discussed the drug-drug interaction of multimolecule drugs. Although drug-drug interactions constitute only a small proportion of adverse drug reactions, they are still important [67]. We used Medscape, WebMD, and Drug.com [68] to find the interactions between three drugs and there were no interactions in their database, which indicates the availability of selected molecular drugs as multimolecule drugs for hRSV infection treatment.

Conclusions
We purposed a systems biology approach by host/pathogen RNA-Seq data to construct core signaling pathways to investigate the pathogenic mechanism of hRSV infection. Then, based on the pathogenic mechanism of hRSV infection, we selected five biomarkers, including TRAF6, IRF3, MAVS, IL6, and STAT3, as our drug targets. By using the systematic drug-discovery method, based on potential molecular drugs of these five biomarkers via the prediction of a well-trained DNN-based DTI model by DTI data in the DTI database, we selected three multimolecule drugs, including acitretin, RS-67333, and phenformin, based on three drug design specifications, including regulation ability, sensitivity, and toxicity as the multimolecule drug of hRSV infection. To the best of our knowledge, these three molecular drugs have no drug-drug interaction. These three molecular drugs can also regulate all biomarkers in hRSV infection.

Conflicts of Interest:
The authors declare no conflict of interest.