Co-controllability of drug-disease-gene network

Controllability of a single network often focuses on the determination of the network’s minimum dominating set, which aims to elaborate how to control the whole network with minimum driver nodes. This paper proposes a new framework, co-controllability of multiple networks, which stresses the control of one network by another network as well as the mutual control characteristics of multiple networks based on minimum dominating sets. We take a drug–disease–gene network that consists of a drug–drug network, a disease–disease network and a gene–gene network as an example to study co-controllability of multiple networks. The results show that driver nodes tend to be conserved, e.g. diseases highly associated with driver nodes of the drug–drug network tend to be driver nodes in the disease–disease network compared with random networks. In addition, co-controllability of multiple networks is probably associated with the networks’ node degree, which is more stringent than controllability of a single network that is mainly determined by the network’s degree distribution. We also find that diseases and drugs tend to be mapped as two different subnetworks of human protein–protein interaction (PPI) network, drugs are inclined to dominate diseases by controlling the PPI network, and the coded proteins of disease-related genes exhibit a low tendency to be drug targets for the control of diseases. The results in this paper not only play an important role in understanding co-controllability of multiple networks, but also are helpful for understanding the mechanisms of drug–disease–gene, disease treatments and drug design in a network-based framework.


Introduction
The study of complex networks and systems promotes the development of systems biology [2-4, 8, 10, 11, 13, 21, 25, 27-32, 40, 45]. Biological network analysis, as an efficient way to study the mechanisms of complex biological systems, has become increasingly important for systems biology [4, 5, 7, 8, 12, 14, 15, 17-22, 33-36, 47]. In general, we use biological data to build a biological network for a biological system, and by studying the network's characteristics we try to elaborate how the molecules, interactions and structures of the network determine the functions of the biological system [4, 5, 7, 8, 12, 14, 15, 17-22, 33-36, 47]. Biological network analysis not only helps us understand cellular organizations, processes and functions, but is also very helpful for disease diagnosis, treatments and drug design [1, 2, 4, 8, 11, 13, 21, 25, 27-30, 32, 40, 45, 48]. Systems biology strongly focuses on the vertical integration from microscopic processes to macroscopic biological phenomena [2-4, 8, 10, 11, 13, 21, 25, 27-30, 32, 40, 45], while network physiology and network medicine, two new and rapidly developing research fields, focus on the horizontal integration of different hierarchies of biological systems [49, 50]. Specifically, network physiology tries to understand physiologic functions as behaviors that emerge from the collective coordination and communication of different dynamical components [49], and network medicine addresses the correlations among genes, drugs and diseases [50]. The new fields of network physiology and network medicine are therefore distinct in focus from systems biology.

Drug-disease-gene network
Recently, a drug-disease-gene network that consists of a drug-drug network (DrDrN), a disease-disease network (DiDiN) and a gene-gene network (GeGeN) was built to elaborate the mechanism of drug-disease-gene in a network-based framework [33]. The network is constructed based on 5018 drug-disease associations [9], 3702 drug-target (drug-gene) associations [37] and 877 disease-gene associations [16, 24].
In the network, the DrDrN, the DiDiN and the GeGeN are each composed of two parts, e.g. the DrDrN consists of a DrDrN disease and a DrDrN gene. In the DrDrN disease (21 429 links, 288 drugs), two drugs are connected if they are associated with the same disease [12, 33]. In the DrDrN gene (14 665 links, 704 drugs), two drugs are connected if they are associated with the same drug target gene [12, 33].
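The construction rule above (two drugs are linked when they share a disease or a target gene) can be sketched as a projection of an association list. This is a minimal illustration only: the function name `co_association_network` and the toy associations are hypothetical and are not taken from the source data.

```python
from collections import defaultdict
from itertools import combinations


def co_association_network(associations):
    """Link two items (e.g. drugs) that share at least one partner (e.g. a disease)."""
    items_by_partner = defaultdict(set)
    for item, partner in associations:
        items_by_partner[partner].add(item)
    links = set()
    for items in items_by_partner.values():
        # Every pair of items attached to the same partner becomes one undirected link.
        for u, v in combinations(sorted(items), 2):
            links.add((u, v))
    return links


# Hypothetical toy drug-disease associations (not from the source).
toy_associations = [
    ("drugA", "disease1"),
    ("drugB", "disease1"),
    ("drugB", "disease2"),
    ("drugC", "disease2"),
]
toy_links = co_association_network(toy_associations)
print(sorted(toy_links))
```

The same function would give a DrDrN gene-style network if the pairs were drug-target associations instead of drug-disease associations.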
Here, we take the drug-disease-gene network as an example to elaborate the controllability and the co-controllability of the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug.

Controllability of drug-disease-gene network
We often study the controllability of a single network by determining the network's minimum dominating set. In this paper, $G(V, E)$ is used to denote an undirected network, where $V$ and $E$ correspond to the set of nodes and the set of links respectively, and $|V| = n$, $|E| = l$.
$(a_{ij})_{n \times n}$ denotes the adjacency matrix of $G$, where $a_{ij} = 1$ indicates that node $i$ and node $j$ interact directly, and $a_{ij} = 0$ otherwise.
A dominating set $V' \subseteq V$ is a set of nodes such that every node of $G$ either belongs to $V'$ or is adjacent to a node of $V'$, and a minimum dominating set is a dominating set of the smallest size. Membership is encoded by a binary indicator for each node $i$, which equals 1 if $i \in V'$, and 0 otherwise. Table 1 shows the characteristics of minimum dominating sets of the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug compared with three randomization procedures based on degree distribution conservation, node degree conservation and the Erdős-Rényi (E-R) model [10] respectively. In the table, $n$ and $l$ correspond to the number of nodes and the number of links in the networks; $n_{mds}$ indicates the size of the minimum dominating set; $\%_{mds}$ indicates the fraction of nodes that belong to the minimum dominating set; $\langle k \rangle$ indicates the average node degree of the network; and $\langle k_{mds} \rangle$ indicates the average node degree of the minimum dominating set.
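As an illustration of the quantities reported in table 1, the sketch below finds a minimum dominating set of a small network by exhaustive search and derives the size, fraction and average degree of the set. It is only a toy: the solver used in the source is not shown in this text, exhaustive search is feasible only for small networks, and the names `is_dominating`, `minimum_dominating_set` and the example path network are assumptions made for illustration.

```python
from itertools import combinations


def is_dominating(adj, candidate):
    """True if every node is in `candidate` or adjacent to a node in it."""
    covered = set(candidate)
    for node in candidate:
        covered |= adj[node]
    return covered == set(adj)


def minimum_dominating_set(adj):
    """Try candidate sets by increasing size; the first dominating one is minimum."""
    nodes = sorted(adj)
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            if is_dominating(adj, candidate):
                return set(candidate)
    return set()


# Hypothetical toy network: the path 1-2-3-4-5, stored as adjacency sets.
toy_adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
toy_mds = minimum_dominating_set(toy_adj)

n_mds = len(toy_mds)                                      # size of the set
frac_mds = n_mds / len(toy_adj)                           # fraction of driver nodes
k_avg = sum(len(nb) for nb in toy_adj.values()) / len(toy_adj)
k_mds_avg = sum(len(toy_adj[v]) for v in toy_mds) / n_mds
print(toy_mds, n_mds, frac_mds, k_avg, k_mds_avg)
```

A minimum dominating set is generally not unique, so the specific nodes returned depend on the search order, while the size does not.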
From the table, we can see that the controllability of a single network is mainly determined by the network's degree distribution [23], e.g. the DrDrN gene and the randomized DrDrN gene based on degree distribution conservation share the same $n_{mds}$ and almost the same $\langle k_{mds} \rangle$, in contrast to the randomized DrDrN gene based on node degree conservation and that based on the E-R model. We can also see that $\langle k_{mds} \rangle$ is associated not only with the network's degree distribution, but also with the network's density, e.g. in densely connected networks such as the DrDrN disease and the DiDiN drug, the average degree of driver nodes tends to be greater than the networks' average degree.
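Two of the null models mentioned above can be sketched as follows. The text does not spell out the randomization procedures, so this is an assumption-laden illustration: node degree conservation is approximated by repeated double-edge swaps, and the E-R model by drawing the same number of links uniformly at random. The function names and the toy links are hypothetical.

```python
import random


def degree_preserving_rewire(links, n_swaps, seed=0):
    """Double-edge swaps (a,b),(c,d) -> (a,d),(c,b); every node keeps its degree."""
    rng = random.Random(seed)
    link_list = [tuple(sorted(link)) for link in links]
    link_set = set(link_list)
    if len(link_list) < 2:
        return link_set
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(link_list)), 2)
        a, b = link_list[i]
        c, d = link_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible pairings at random
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or duplicate links.
        if len({a, b, c, d}) < 4 or new1 in link_set or new2 in link_set:
            continue
        link_set -= {link_list[i], link_list[j]}
        link_set |= {new1, new2}
        link_list[i], link_list[j] = new1, new2
        done += 1
    return link_set


def erdos_renyi_gnm(nodes, n_links, seed=0):
    """Random network with the given nodes and exactly `n_links` links."""
    rng = random.Random(seed)
    nodes = sorted(nodes)
    possible = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]
    return set(rng.sample(possible, n_links))


def degrees(link_set):
    """Degree of every node that appears in at least one link."""
    deg = {}
    for u, v in link_set:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical toy network with 6 nodes and 7 links.
toy_links = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (1, 6), (1, 3)}
rewired = degree_preserving_rewire(toy_links, n_swaps=10)
er_links = erdos_renyi_gnm(range(1, 7), len(toy_links))
```

The rewired network keeps each node's degree exactly, whereas the E-R network only keeps the number of nodes and links, which is why the two serve as null models of different stringency.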
The genes in the GeGeN drug (17 248 links, 1036 genes) are connected by 19 524 links in the human protein-protein interaction (PPI) network, which we call the PPI (GeGeN drug), and the genes in the GeGeN disease (2173 links, 638 genes) are connected by 3423 links in the human PPI network, which we call the PPI (GeGeN disease). In figure 2(a), we can see that the GeGeN drug and the GeGeN disease share 123 genes and 85 links, while the PPI (GeGeN drug) and the PPI (GeGeN disease) share 123 genes and 461 links. In addition, the GeGeN drug is just a spanning subnetwork of the PPI (GeGeN drug), and the GeGeN disease is also a spanning subnetwork of the PPI (GeGeN disease). These results may indicate that diseases and drugs can each be mapped to a subnetwork of the human PPI network, and that the two subnetworks tend to be different.
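The comparisons in this paragraph (shared genes, shared links, and the spanning-subnetwork relation) reduce to set operations. The sketch below shows one way to compute them, assuming that a spanning subnetwork has the same node set and a subset of the links; the helper names and the toy data are hypothetical.

```python
def normalize(links):
    """Store undirected links as sorted tuples so (u, v) equals (v, u)."""
    return {tuple(sorted(link)) for link in links}


def shared_parts(nodes_a, links_a, nodes_b, links_b):
    """Return the nodes and the links that two networks have in common."""
    return set(nodes_a) & set(nodes_b), normalize(links_a) & normalize(links_b)


def is_spanning_subnetwork(nodes_sub, links_sub, nodes_full, links_full):
    """True if both networks have the same nodes and the first has a subset of the links."""
    same_nodes = set(nodes_sub) == set(nodes_full)
    return same_nodes and normalize(links_sub) <= normalize(links_full)


# Hypothetical toy gene sets and links (not from the source).
gene_net_nodes = {"g1", "g2", "g3"}
gene_net_links = [("g1", "g2")]
ppi_nodes = {"g1", "g2", "g3"}
ppi_links = [("g2", "g1"), ("g2", "g3")]
other_nodes = {"g2", "g3", "g4"}
other_links = [("g3", "g2"), ("g3", "g4")]

common_nodes, common_links = shared_parts(ppi_nodes, ppi_links, other_nodes, other_links)
spanning = is_spanning_subnetwork(gene_net_nodes, gene_net_links, ppi_nodes, ppi_links)
print(common_nodes, common_links, spanning)
```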
We also characterize the minimum dominating sets of the GeGeN drug, the GeGeN disease, the PPI (GeGeN drug) and the PPI (GeGeN disease). Since the PPI (GeGeN drug) and the PPI (GeGeN disease) are more densely connected than the GeGeN drug and the GeGeN disease respectively, they need fewer driver nodes to control and have a greater $\langle k_{mds} \rangle$.
In the DiDiN drug, only one driver node (OMIM ID: 166710) can control the whole network, and we need two driver nodes (DB00316, DB00783) to control the DrDrN disease. We can also see that in both the GeGeN drug and the GeGeN disease the driver nodes are not housekeeping genes (maintenance genes) [38], but 26.13% (29/111) and 41.22% (61/148) of them are essential genes respectively [46], which may indicate that disease-related genes tend to be essential compared with drug target genes.

Co-controllability of drug-disease-gene network
Controllability of a single network often focuses on the determination of the network's minimum dominating set, which aims to elaborate how to control the whole network with minimum driver nodes [23, 26, 41-44]. A large biological system is actually composed of plenty of small biological networks that cooperate with one another to complete the functions of the biological system, which motivates us to study co-controllability of multiple biological networks to elaborate the mechanisms of biological systems. Here, we propose a new framework, co-controllability of multiple networks, which stresses how a network can control another network and the mutual control of multiple networks. Figures 3 and 4 illustrate the co-controllability of multiple networks.
Here, we first define some measures to evaluate the co-controllability of multiple networks. The measures build on a mapping from the minimum dominating set of the network A to that of the network C via the minimum dominating set of the network B (see figures 3(e), (f)).
In these definitions, $|X|$ is the size of $X$ and $C_{mds}$ is the minimum dominating set of the network C. Equation (4) denotes the indirect control from the driver nodes of A to those of C via the driver nodes of B. Equation (6) denotes the indirect co-control between the driver nodes of A and those of C via the driver nodes of B. Equation (7) denotes the indirect co-control between the network A and the network C via the network B. Similarly, it is easy to define direct control and direct co-control (see figure 4). Equation (8) denotes the direct control from the driver nodes of A to those of B, which is defined as the fraction of driver nodes of B that are associated with the driver nodes of A (see figures 4(b), (c)). A greater value indicates that the driver nodes of B tend to be conserved with respect to those of A.
Equation (10) denotes the direct co-control between the driver nodes of A and those of B.
Equation (11) denotes the direct co-control between the network A and the network B.
Results are reported for the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug, and for their randomized counterparts based on degree distribution conservation, node degree conservation and the E-R model respectively. From figures 5(a)-(c), we can find that driver nodes in real networks tend to be conserved compared with random networks, e.g. driver nodes in the DiDiN drug are highly associated with those of the DrDrN disease and the DrDrN gene. Similarly, in figure 5(b), driver nodes in the DrDrN disease are closely related to those of the DiDiN gene. In figure 5(c), driver nodes in the DiDiN drug are highly associated with those of the GeGeN disease, which may indicate that the gene-gene network can be considered as a bridge linking diseases and drugs.
Table 2 shows the results of direct control between the drug networks and the disease networks. We can see in the table that driver nodes of the disease networks as targets of control are highly associated with those of the drug networks. For example, the driver node (OMIM ID: 166710) as a target of control from the DrDrN disease to the DiDiN drug is associated with osteoporosis and bone mineral density variation QTL [16, 24]. The driver node (DB00873) as a target of control from the DiDiN drug to the DrDrN disease is associated with loteprednol [16, 24]. Loteprednol is used to treat allergic conjunctivitis, uveitis, acne rosacea, selected infective conjunctivitides, herpes zoster keratitis, iritis, cyclitis, and superficial punctate keratitis [16, 24, 37]. It is also used for the treatment and management of seasonal allergic rhinitis [16, 24, 37].
The driver node (DB00316) as a target of control from the DrDrN disease to the DiDiN gene is associated with acetaminophen [16, 24, 37]. Acetaminophen (DB00316) is also known as paracetamol, and is widely used due to its analgesic and antipyretic effects. Acetaminophen's therapeutic effects are similar to those of salicylates, but it lacks gastric ulcerative, antiplatelet and anti-inflammatory effects [16, 24, 37]. As an analgesic and antipyretic drug, acetaminophen is widely used for the relief of fever and headaches [16, 24, 37]. Acetaminophen is also a major ingredient in numerous cold and flu medications and many prescription analgesics [16, 24, 37]. It is often used separately or in combination with pseudoephedrine, dextromethorphan, diphenhydramine, doxylamine, codeine, chlorpheniramine, hydrocodone, or oxycodone [16, 24, 37].
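The direct control measure behind the table 2 results is described in the text as the fraction of driver nodes of B that are associated with the driver nodes of A. A minimal sketch of that verbal definition follows. The exact formula of equation (8) is not legible in this text, so the function `direct_control`, its argument layout and the toy data are assumptions for illustration only.

```python
def direct_control(a_mds, b_mds, associations):
    """Fraction of driver nodes of B associated with at least one driver node of A.

    `associations` is an iterable of (node_of_A, node_of_B) pairs.
    """
    a_mds, b_mds = set(a_mds), set(b_mds)
    if not b_mds:
        return 0.0
    controlled = {b for a, b in associations if a in a_mds and b in b_mds}
    return len(controlled) / len(b_mds)


# Hypothetical toy data: drug driver nodes, disease driver nodes and
# drug-disease associations (none of these come from the source).
drug_drivers = {"d1", "d2"}
disease_drivers = {"s1", "s2", "s3"}
drug_disease = [("d1", "s1"), ("d3", "s2"), ("d2", "s3"), ("d1", "s4")]

forward = direct_control(drug_drivers, disease_drivers, drug_disease)
# Control in the opposite direction uses the same associations with pairs reversed.
backward = direct_control(
    disease_drivers, drug_drivers, [(s, d) for d, s in drug_disease]
)
print(forward, backward)
```

In this toy case two of the three disease driver nodes are associated with a drug driver node, while both drug driver nodes are associated with a disease driver node, so the measure is not symmetric in general.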
Table 3 shows the results of direct control between the drug networks and the gene networks, and the results of direct control between the disease networks and the gene networks are given in table 4. From the tables, we can see that driver nodes of the gene networks as drug target genes/disease-causing genes are closely associated with those of the drug networks/the disease networks respectively. For example, driver genes as targets of control from the DiDiN gene to the GeGeN drug are associated with homocystinuria-megaloblastic anemia, cblG complementation type and neural tube defects (MTR); diabetes mellitus, permanent neonatal with neurologic features (KCNJ11); hypertension and malaria (NOS2); adenocarcinoma of lung, gastric cancer, glioblastoma and ovarian cancer (ERBB2); and arterial calcification, Cole disease, obesity and hypophosphatemic rickets (ENPP1) [16, 24]. Table 5 shows the gene ontology (GO) enrichment of genes as targets of control from the drug/disease networks to the gene networks [1]. For instance, driver genes as targets of control from the DrDrN gene to the GeGeN drug are involved in response to chemical (biological process, p-value = $4.254 \times 10^{-9}$), ion binding (molecular function, p-value = $4.138 \times 10^{-4}$), and proteinaceous extracellular matrix (cellular component, p-value = $2.481 \times 10^{-3}$) [1]. Driver genes as targets of control from the DrDrN gene to the GeGeN disease are related to behavioral response to cocaine (biological process, p-value = $1.493 \times 10^{-4}$), drug binding (molecular function, p-value = $2.495 \times 10^{-2}$), and cell body (cellular component, p-value = $2.495 \times 10^{-2}$) [1].
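The text does not state which statistical test produced the GO enrichment p-values. A common choice for this kind of analysis is a one-sided hypergeometric test, sketched below purely as an illustration. The function name, its parameters and the toy numbers are assumptions and do not reproduce the values of table 5.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 background genes, 5 annotated with a term,
# 4 driver genes, 2 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=2)
print(p_value)
```

A small p-value would mean that the term appears among the driver genes more often than expected by chance for a gene set of that size.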
Driver genes as targets of control from the DiDiN gene to the GeGeN drug are associated with cellular response to chemical stimulus (biological process, p-value = $7.341 \times 10^{-2}$), small molecule binding (molecular function, p-value = $5.084 \times 10^{-2}$), and ATP-sensitive potassium channel complex (cellular component, p-value = $7.655 \times 10^{-1}$) [1]. Driver genes as targets of control from the DiDiN gene directly to the GeGeN disease are involved in response to stimulus (biological process, p-value = $2.379 \times 10^{-9}$), protein binding (molecular function, p-value = $1.407 \times 10^{-8}$), and cell part (cellular component, p-value = $7.600 \times 10^{-4}$) [1].
Table 6 shows the essential genes as targets of control from the drug/disease networks to the gene networks. From the table, we can see that driver genes tend to be essential in the control of the gene networks [46]. For instance, 43.59% (17/39) of the genes as targets of control from the DiDiN gene to the GeGeN disease, such as EDARADD, BRCA2, TP53, ITGB4, PAX6, MTR and PIK3CA, are essential [46]. In addition, we can also see that the driver genes are not housekeeping genes [38].
Results are reported for the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug, and for their randomized counterparts based on degree distribution conservation, node degree conservation and the E-R model respectively. As found above, the gene networks play an important role in linking diseases and drugs, e.g. in figure 6(c), the GeGeN disease highly controls the DrDrN disease and the DiDiN drug.
Table fragment, control from the DiDiN gene to the GeGeN drug: genes MTR, KCNJ11, NOS2, ERBB2, ENPP1; OMIM IDs 157300, 600807, 104300, 143890.
Table fragment, control from the DiDiN gene to the GeGeN disease: genes CIITA, JAK2, PAX6, MAPT, NRAS, RET, PIK3CA, NSD1, HBA1, HNF1A, TERT, DRD4, CCL2, TP53, ITGB4, IFNGR1; OMIM IDs 180300, 252150, 607208, 251880, 256000, 130060, 145001, 219100.
Results are reported for the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug, and for their randomized counterparts based on degree distribution conservation, node degree conservation and the E-R model respectively. From the figure, we can find that the results of co-controllability of random networks based on node degree conservation are closer to those of the real networks than are the results of random networks based on degree distribution conservation and the E-R model (see figures 7(a), (c) and (e), (g)). That is to say, the co-controllability of multiple networks is highly associated with the networks' node degree, which is more stringent than the controllability of a single network that is mainly determined by the network's degree distribution.
Figure caption: results for the DrDrN disease, the DrDrN gene, the DiDiN drug, the DiDiN gene, the GeGeN disease and the GeGeN drug. (a)-(c) correspond to the real networks; (d)-(f) to the randomized networks based on degree distribution conservation; (g)-(i) to the randomized networks based on node degree conservation; (j)-(l) to the randomized networks based on the E-R model.
Figure caption: results for the real networks and the three kinds of random networks based on degree distribution conservation, node degree conservation and the E-R model, respectively.
Figure 8 shows the results for the DrDrN gene, the DiDiN gene, the GeGeN disease and the GeGeN drug. In figures 8(a)-(c), we can see that driver nodes exhibit a low tendency to be conserved based on indirect control/co-control compared with direct control/co-control. In figure 8(d), we can find that the ability with which the DrDrN gene indirectly controls the DiDiN gene via the GeGeN drug is not only stronger than that via the GeGeN disease, but is also comparable with the direct control from the DrDrN gene to the DiDiN gene. In addition, as found above, diseases and drugs can be considered as two different subnetworks of the human PPI network. Therefore, we may conclude that drugs tend to dominate diseases by controlling the PPI network, and the coded proteins of disease-related genes exhibit a low tendency to be drug targets for the control of diseases. The results above can not only help us understand the mechanism between disease phenotypes and herbal formulae, but also help discover effective compounds and their combinations and develop a rational drug design.
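The indirect control compared in figure 8 is described in the text as a mapping from the driver nodes of A to those of C via the driver nodes of B. The sketch below is one reading of that description, by analogy with the verbal definition of direct control: the fraction of driver nodes of C reached from driver nodes of A through at least one driver node of B. The exact form of equation (4) is not legible in this text, so the function `indirect_control` and the toy data are assumptions.

```python
def indirect_control(a_mds, b_mds, c_mds, assoc_ab, assoc_bc):
    """Fraction of driver nodes of C reached from driver nodes of A
    through driver nodes of B (an assumed reading of the definition)."""
    a_mds, b_mds, c_mds = set(a_mds), set(b_mds), set(c_mds)
    if not c_mds:
        return 0.0
    # Driver nodes of B that are associated with some driver node of A.
    relay = {b for a, b in assoc_ab if a in a_mds and b in b_mds}
    # Driver nodes of C that are associated with some relay node of B.
    reached = {c for b, c in assoc_bc if b in relay and c in c_mds}
    return len(reached) / len(c_mds)


# Hypothetical toy data: drugs (A), genes (B), diseases (C).
drug_drivers = {"d1", "d2"}
gene_drivers = {"g1", "g2"}
disease_drivers = {"s1", "s2"}
drug_gene = [("d1", "g1"), ("d2", "g3"), ("d3", "g2")]
gene_disease = [("g1", "s1"), ("g2", "s2"), ("g3", "s2")]

via_genes = indirect_control(
    drug_drivers, gene_drivers, disease_drivers, drug_gene, gene_disease
)
print(via_genes)
```

Here only g1 acts as a relay, because g3 is not a gene driver node and g2 is linked only to a non-driver drug, so one of the two disease driver nodes is reached.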
In general, network-based approaches give us a new way to understand complex biological systems, and co-controllability of multiple networks, as a new powerful tool, can help us study the correlation between the control of biological networks and the control of diseases as well as the mechanisms of drug-disease-gene in a network-based framework, which is also helpful for new drug design and disease treatments.

Conclusions
This paper proposes a new framework, co-controllability of multiple networks, which stresses the control of one network by another network as well as the mutual control characteristics of multiple networks based on minimum dominating sets. We take a drug-disease-gene network that consists of a drug-drug network, a disease-disease network and a gene-gene network as an example to study the co-controllability of multiple networks. The results show that driver nodes tend to be conserved, e.g. diseases highly associated with driver nodes of the drug-drug network tend to be driver nodes in the disease-disease network compared with random networks. In addition, the co-controllability of multiple networks is probably associated with the networks' node degree, which is more stringent than the controllability of a single network, which is mainly determined by the network's degree distribution. We also find that diseases and drugs tend to be mapped as two different subnetworks of the human PPI network, the drug-drug network tends to dominate the disease-disease network by controlling the PPI network, and the coded proteins of disease-related genes exhibit a low tendency to be drug targets for the control of diseases. The results in this paper not only play an important role in understanding the co-controllability of multiple networks, but are also helpful for understanding the mechanisms of drug-disease-gene, disease treatments and drug design in a network-based framework. In future work, we will continue to focus on the network model and study the co-controllability of drug-disease-gene in different species.