HTINet2: herb–target prediction via knowledge graph embedding and residual-like graph neural network

Abstract Target identification is one of the crucial tasks in drug research and development, as it aids in uncovering the action mechanism of herbs/drugs and discovering new therapeutic targets. Although multiple algorithms of herb target prediction have been proposed, due to the incompleteness of clinical knowledge and the limitation of unsupervised models, accurate identification for herb targets still faces huge challenges of data and models. To address this, we proposed a deep learning-based target prediction framework termed HTINet2, which designed three key modules, namely, traditional Chinese medicine (TCM) and clinical knowledge graph embedding, residual graph representation learning, and supervised target prediction. In the first module, we constructed a large-scale knowledge graph that covers the TCM properties and clinical treatment knowledge of herbs, and designed a component of deep knowledge embedding to learn the deep knowledge embedding of herbs and targets. In the remaining two modules, we designed a residual-like graph convolution network to capture the deep interactions among herbs and targets, and a Bayesian personalized ranking loss to conduct supervised training and target prediction. Finally, we designed comprehensive experiments, of which comparison with baselines indicated the excellent performance of HTINet2 (HR@10 increased by 122.7% and NDCG@10 by 35.7%), ablation experiments illustrated the positive effect of our designed modules of HTINet2, and case study demonstrated the reliability of the predicted targets of Artemisia annua and Coptis chinensis based on the knowledge base, literature, and molecular docking.


Introduction
The formulas of traditional Chinese medicine (TCM), characterized by its multi-component and multi-target effects, naturally excels in regulating complex diseases.It has been widely practiced in clinical diagnosis and treatment in China and is gradually gaining global application [1,2].The targets of drugs/herbs are the origin of a drug's therapeutic action in the human body.Therefore, identifying herb-target interaction (HTI) is crucial as it aids in understanding the mechanisms of TCM, guiding rational clinical medication, and new drug development.Traditionally, the discovery of herb's targets primarily relied on wet experiments or clinical trials.However, most herbs contain dozens to thousands of components, making the experimental technologies for discovering the targets of herb, which require simultaneous identification of effective components of herbs and their corresponding biological targets, extremely costly and time-consuming [3,4].In recent years, with the advancement of informatics technology and the accumulation of a vast amount of biomedical data, computational techniques such as machine learning and deep learning are increasingly used to predict the targets of herbs, becoming a mainstream approach [5].Compared with traditional methods, computational prediction significantly reduces the cost of target discovery, enhances prediction accuracy, and accelerates the pace of new drug development [6].
In recent years, researchers have designed various computational methods for predicting the targets of herbs, primarily employing techniques such as network propagation, network embedding, and random walks among other machine learning algorithms.Yang et al. [7] developed a network propagationbased model for herb-target prediction (HTP), motivated by the idea that herbal targets with similar efficacy are also similar.This model used the efficacy-based herb similarity to initialize the values of node in protein-protein interaction (PPI) network, and employs random walks on the network to score candidate proteins.Additionally, Yang et al. [8] proposed a model named heNetRW, which leverages random walks on a heterogeneous herb-target network, demonstrating commendable predictive performance.Wang et al. [9] introduced an HTP method named HTINet based on heterogeneous network embedding.They construct a heterogeneous network of symptoms-herbs-targets, and use network embedding to learn low-dimensional vectors of herbs and proteins, ultimately achieving precise target prediction.
Furthermore, with the advancement of deep learning, various drug prediction methods based on deep neural networks have been proposed.Zhang et al. [10] introduced a multi-view deep learning model, DrugAI, which combines graph neural networks, network embedding, and multi-view learning to predict activation-inhibition relationships between drug and target.Yang et al. [11] proposed a drug repositioning algorithm named DRONet based on network embedding and ranking learning, aimed at discovering new indications for drugs.Lin et al. [12] developed an end-to-end algorithm based on deep neural networks, named GraphCPI, which employs graph neural networks to predict the targets of compounds.Concurrently, several pharmacological databases have been developed.Mohanraj et al. [13,14] constructed the IMPPAT digital database, which focuses on the phytochemicals of Indian medicinal plants and integrates information from traditional books and published research articles.Wu et al. [15] developed SymMap, an integrative database of TCM enhanced by symptom mapping.SymMap aligns TCM with modern medicine at both the phenotypic and molecular levels in shared aspects.These pharmacological databases support and facilitate the discovery of drug-target relationships.
Although various prediction algorithms of drug/herb's targets have been proposed, HTI prediction task still faces several challenges.For example, the existing HTI models do not make full use of TCM knowledge.On the one hand, HTI relationships are closely associated with the TCM properties of herbs, such as their property, f lavor, and meridian, as well as their efficacy.On the other hand, in clinical practice, the therapeutic action of herbs is manifested through the diseases and symptoms they treat.However, to date, only HTINet [9] has considered the information of diseases and symptoms in target prediction, but it does not take into account the TCM properties of herbs.Therefore, integrating both the TCM properties and clinical treatment of herbs into the HTI modeling to achieve more accurate predictions remains a challenge.
Additionally, the existing models predict HTI relationships primarily using unsupervised learning techniques, such as random walks and network embedding [8,9].These methods are unable to use known HTI as the supervisory information of model and fail to capture the full spectrum of known interactions, which hinders further improvement in prediction accuracy.Therefore, designing more efficient supervised deep learning models to enhance the accuracy of HTI poses another challenge.
In this study, we proposed a deep learning-based HTP framework termed HTINet2, which includes three key modules, i.e.TCM knowledge graph embedding (KGE), residual graph representation learning, and supervised target prediction.In the module of KGE, we constructed a knowledge graph (termed TMKG) that fuses the TCM property and clinical treatment knowledge of herbs, and learned the deep knowledge embedding (KE) of herbs and targets.In the module of graph representation learning and target prediction, we designed a residual graph convolution neural network to capture the deep interactions among herbs and targets, and a Bayesian personalized ranking (BPR) to achieve supervised model training.Finally, we designed comprehensive experiments including performance comparison, ablation experiments, hyperparameter analysis, and case study, which indicated the excellent performance and the reliability of predicted results of HTINet2.

Herb-target interactions
The data of HTIs are derived from SymMap [15], which is a known database centered on symptoms, herbs, and molecular.First, we downloaded all the HTI, totaling 267 809 relationships.To ensure the reliability of relationships, we filtered out a relation subset based on an inferred evidence score exceeding the average value of 0.4616 and a P-value ¡0.05.Finally, we obtained a benchmark dataset of 38 002 HTI relationships, including 563 distinct herbs and 2106 distinct targets.

Knowledge graph of TCM and western medicine (TMKG)
To make full use of medical knowledge in modeling HTI prediction, we constructed a knowledge graph of TCM and western medicine through the extraction and integration from multiplex biomedical knowledge bases and TCM books.These databases include SymMap [15], soFDA [16] centered on the ontology of TCM syndromes, STRING [17] v11 including large-scale PPI, KEGG [18] of genomic encyclopedia, and GO of gene ontology [19].The TCM books consist of "Pharmacopoeia of the People's Republic of China 2020", "100 Classic Famous Formulas", "Traditional Chinese Medicine Diagnostics", "Treatise on Febrile Diseases", "Formulas of Traditional Chinese Medicine", and "Internal and External Women and Children's Diseases".To ensure the quality of our constructed knowledge graph, we used programs or scripts for data cleaning, and we invited medical experts to review and correct the data, as detailed in Section 1.1 of Supplemental Materials file 1.

Overall architecture of HTINet2
We proposed HTINet2 (Fig. 1), a HTI prediction framework, which comprises three key modules, namely KG construction and embedding learning, graph representation learning, and target prediction.In the first module, we construct a knowledge graph TMKG centered on molecular, TCM theories and clinical treatments of herbs.Then we applied KE learning algorithms to learn the complex relationships among entities in TMKG, resulting in their respective embedding vectors.In the second module, we constructed a heterogeneous network specifically tailored for target prediction, introducing a learning method of residual graph convolutional network (GCN).This supervised method builds upon KE features to further learn intensified representation of herbs and proteins.In the final module of target prediction, we implement a BPR loss function to optimize the parameters of residual GCN, facilitating the prediction of potential targets.

KE learning of TMKG
To obtain implicit knowledge of the TMKG, we utilized KE algorithms to learn the embedding representations of all entities within the KG.KE algorithms are adept at learning the latent relationships between various entities in the KG (i.e.implicit local features) as well as the overall network structure of the KG (i.e.implicit global features).Subsequently, the learned implicit features are presented as the embedding vectors of nodes, which can be used for downstream tasks such as classification or link prediction, particularly predicting HTI in this study.
Among various KE algorithms, DeepWalk is particularly noteworthy.DeepWalk uses random walks to explore the KG, thereby ( 1 ) where L is the loss function, v i is the current node in the graph, {v i−w , ..., v i+w } represents the neighborhood of the current node, and w is the size of the window.(v i ) is the function mapping nodes to their vector embeddings (as detailed in the Section 1.2 of Supplemental Materials file 1).Additionally, owing to the impact of different KE algorithms and embedding dimensions on these downstream tasks, we compared five KE algorithms, namely LINE [20], GraRep [21], node2vec [22], DeepWalk [23], and HOPE [24], along with different dimensions (i.e.32, 64, 96, 128, 256, and 512).Ultimately, the most effective KGE algorithms and embedding dimensions for the task of HTI prediction are selected.

Graph representation learning for HTI network
Graph representation learning, particularly GCNs, has emerged as one of the key methodologies for learning and representing complex relationships within graph-structured data.To fully leverage the intricate interactions between herbs and targets, thereby forming more precise representation vectors of herbs and targets, we have developed a residual-like graph convolutional neural network [25].This network takes the HTI network as its initial input, with the embedding vectors derived from KE learning module serving as the initial node features.Interaction information of herbs and targets could be propagated through residuallike GCN operations, ultimately learning deeper representation features of them.
Formally, based on the known drug-target relationships in the training set, we constructed a heterogeneous drug-target graph G = V, A , where V is the set of nodes (including herbs and targets), and the adjacency matrix A encodes the interaction relationships between herbs and targets.In this matrix, if a herb i has an interaction with target j, then a ij = 1, otherwise, it is 0. Subsequently, we computed the normalized matrix S = D −0.5 A D −0.5 , to weight the inf luence of each node's degree, where A = A + I represents the adjacency matrix with self-loops (I being the identity matrix), and D is the degree matrix of A.
In the multi-layer GCN we constructed, if the node feature matrix at the k-th layer is denoted as E k , then after the propagation and learning of node features, the features at the (k + 1)-th layer can be represented as where W k represents the weight matrix; it adjusts the relative importance of different input variables and is continuously optimized during the training process of neural networks.To maintain the integrity of neighborhood information, and reduce computational costs, we replaced the nonlinear activation functions in each layer with linear activation functions for feature propagation.In the HTI network, the node degrees of an herb h and a target t are denoted as d h and d t , respectively.Their respective neighbor node sets are represented as R h and R t .Their vector representations in the k-th layer of the graph neural network are e k h and e k t .Consequently, after one layer of GCN feature propagation, their representations at the (k + 1)-th layer can be obtained: Overly , e k t , along with the initial node representations e 0 h and e 0 t .
The residual-like feature aggregation operation captures more complex features of the GCN by considering the feature outputs at each layer, and makes the information transfer between network layers more efficient, finally obtaining more precise representations of herbs and targets.

Herb-target prediction
Building upon graph representation learning module, we designed an optimization loss of BPR for predicting HTI relationships.As a commonly used method in the field of recommendations, BPR loss facilitates learning the interactions between herbs and targets by maximizing the margin between positive and negative samples.This margin is assessed by comparing the scores of herb-target pairs, where the score signifies the level of similarity or matching of a herb-target pair.
Mathematically, for a triple (h, t i , t j ), h represents an herb, t i a target associated with the herb (i.e.positive sample, composed of targets associated with the herb in the training set), and t j a target not associated with the herb (i.e.negative sample, composed of several negative samplings of positive samples).We utilized vector distance, simplified to dot product in this study, to calculate the scores for both the positive and negative sample pairs based on the embedding vectors of herbs and targets.We then constructed a pairwise ranking-based objective loss function, which seeks to maximize the score of positive samples while minimizing that of negative samples, thereby updating and optimizing the parameters of the neural network, as follows: Here, s(x) signifies the Sigmoid function, D h encompasses all the HTI triples in the training set, represents all the parameters for neural network, || || 2 is the regularization term for the parameters, λ denotes the weight of the regularization term, and M is the number of herbs.Ultimately, based on the trained neural network, score for each drug target pair can be obtained, allowing for the ranking prediction of targets based on these scores.

Experimental settings
In our experiments, we divided the 38 002 HTI relationships into a training set, validation set, and test set in an 8:1:1 ratio.We compared HTINet2 with two categories of baseline methods.The first is the methods of traditional link prediction based on common neighbors, including CN, Salton [26], Jaccard [27], HPI [28], LHN-1 [29], AA [30], and RA [31].The other comprises previously proposed models for HTI prediction, such as heNetRW [8] and Prince [32].Detailed introduction for these baselines is described in the Section 1.3 of Supplemental Materials file 1.Furthermore, we selected hit ratio (HR) and normalized discounted cumulative gain (NDCG) as the evaluation metrics [33].For a given herb h, P h@K represents the top-K predicted candidate genes, and G h represents the known genes of h in the test set.HR@K is given by The Normalize Discounted Cumulative Gain (NDCG@K) can be calculated as follows: where DCG represents Discounted Cumulative Gain, IDCG represents Ideal Discounted Cumulative Gain, r i represents the relevance score of the gene at position i in the ranked list, and |REL| represents the results sorted by relevance.

TMKG and KE
We constructed a large-scale knowledge graph of Traditional Chinese and Modern Medicine TMKG by the information extraction and curation from multiple public databases.The TMKG comprises 15 types of entities totaling 74 529 and 31 types of triples totaling 1920 415.For subsequent HTP, TMKG included two types of vital information related to herbs.The first is the TCM properties of herbs, including herbal property and class, taste, meridian, and efficacy, etc.The second is the clinical treatment information of the herbs, including diseases and symptoms.As shown in Fig. 2A and B, we took Artemisia annua and Coptis as examples, and illustrated their connections to syndromes, diseases, efficacy, ingredients, and proteins.
To capture the complex and deep relationships among biological entities (especially herb and target) in TMKG, we utilized multiplex methods of KE to learn the embedding representations of entities, which imply extensive and accurate medical knowledge, e.g. the TCM properties and clinical treatment information of herbs.These embedding vectors encapsulate information pertaining to the current node and its neighboring nodes within the context of a complex knowledge graph.Then we used t-SNE [34] to reduce the dimension of the embedding vectors of these entities learned by different KE models, and visualize them on a two-dimensional plane.We calculated the Silhouette score (SC) for various entities, as shown in Fig. 2C-F and the Section 2.1 of Supplemental Materials file 1.These entities, including herbs, proteins, symptoms, and diseases, etc. are grouped into different clusters.Second, the results also showed that there are obviously differences on the embedding visualization of KE methods with different model characteristics.For example, in the visualization results of GraRep, the cluster distance between herbs and diseases, genes, and efficacy are relatively close, while the distance between herbs and ingredients was relatively far away.For the result of LINE, the herb cluster was in the center; in addition to the distance from the herbal prescription, it has a relatively close distance from other entities' cluster.Finally, the KE model can capture the complex relationship between entities and get a good embedding representation, which provides a basis for fine-tuning the downstream task of modeling HTI prediction.

Overall performance comparison with baselines
In the experiments, we compared the performance of HTINet2 with multiple baseline methods, which include traditional link prediction algorithms and the known HTI prediction methods, i.e. heNetRW and PRINCE.
The experimental results (Table 1 and Fig. 3) showed that in the baseline of local similarity, CN and AA, which are both network topology-based methods, achieved the best performance (HR@5=0.1209and NDCG@10=0.276),followed by RA as well, and the HPI and LHN-1 performed worst.In the two known baselines, heNetRW obtained the highest performance (HR@10=0.0915and NDCG@10=0.0942).The baseline methods based on local similarity most often outperformed the two known baselines.For example, the HR@10 and NDCG@10 of CN are 118.7% and 193.0%higher than those of heNetRW, respectively, indicating that local similarity methods can effectively capture the structural information of HTI network, demonstrating superior performance in HTI prediction.
By comparing the performance of HTINet2 with all baseline methods, the results showed that HTINet2 achieves the best performance (HR@10=0.4458and NDCG@10=0.3746),which are significantly higher than all baseline methods.For example, the HR@10 and NDCG@10 metrics of HTINet2 are 122.8% and 35.7% higher than those of the best baseline (i.e.CN).The excellent performance of HTINet2 is benefit from the pre-training of KE learning and the fine-tuning of residual-like GCN.
To obtain deep knowledge representations for herbs and targets, we utilize different algorithms of pre-training KE to learn the pre-training features of herbs and targets from the TMKG.As shown in Fig. 4 and the Section 2.2 of Supplemental Materials file 1, the results indicate that the pre-training of HTINet2 has a good robustness that all pre-training methods improve prediction performance compared with non-pre-training.Specifically, Deep-Walk and HOPE exhibit the best performance in HTI prediction.This may be attributed to the graph structure of TMKG; both DeepWalk and node2vec employ random walk to capture the structural information.However, the uniform random walk used in DeepWalk might more accurately ref lect the true structure of the graph in the context of HTI prediction tasks.In addition, HOPE is designed to capture high-order proximities in a graph.Capturing deep interactions in the HTI network may be more crucial than considering direct neighbors alone.We have presented the target predictions for all drugs in the dataset, predicted by the HTINet2 model, in the Supplemental Materials file 2.

Performance influence of key components in HTINet2
To evaluate the contribution of different components in HTINet2, we performed ablation experiments of HTINet2, including KE pretraining (termed PT) and residual GCN (termed RES).The results (Table 2 and Fig. 4B) showed that, except in rare metrics (i.e.HR@1 and NDCG@1), the removal of any components from HTINet2 leads to a performance decline in the most terms of metrics.Specifically, the performance decreases the most (HR@10 by 8.69% and NDCG@10 by 7.89%) when removing the residual and KG pretraining components simultaneously.The performance decreases the least (HR@10 reduced by 2.42% and NDCG@10 by 0.51%) when removing the pre-training component.This indicates that each component of HTINet2 contributes to its performance, with the contribution of the residual in GCN contributing more than the KG pre-training.
Above results indicate that pre-training of TMKG enhances the expressive capabilities of subsequent components.Moreover, the interactions between herbs and genes within the KG provide additional context for the model.By utilizing the effective information transformed through residual-like GCN, the model is better equipped to capture the latent characteristics of the nodes, improving the prediction performance.

Interpretability of HTI prediction model
To investigate the interpretability of HTI prediction models, we used t-SNE [34] to reduce the dimension of the embedding vectors of herbs and targets learned by different prediction models, and visualize them on a two-dimensional plane.First, the visualization result (Fig. 5A-F) indicated that five KE methods (i.e.LINE, GraRep, node2vec, DeepWalk, and HOPE) effectively divided herbs and targets into two distinct clusters, showing a clear advantage over random embedding vectors.The results of SC for these clusters indicate there are high SC of GraRep, Deep-Walk, and HOPE, of which two (HOPE and DeepWalk) also achieve high prediction performance.This indicates there is a positive correlation between performance of HTINet2 and clustering of KE models.
To intuitively display and further explain the predicted results of HTINet, we also showed the visualization result (Fig. 5G) after the KE and residual-like GCN of HTINet2.Then we selected some HTI relationships in which herbs and targets are very close in a two-dimensional plane.For example, the Rehmanniae Radix is close to two known targets, PRKAA1 and PRKAA2, with distances of 0.1604 and 0.1521, respectively.The two targets PRKAA1 and PRKAA2 that almost coincide in position are also proteins from the same family.There is a closer distance (0.0545) between the Polygalae Radix and its target TRPV3.We revisited the TMKG,  analyzing the selected HTI relationships and the neighbors of these entities.Figure 5H-I demonstrates these relationships within the TMKG we constructed, where entities with closer vector distances show strong relevance.We have highlighted some possible paths with red lines.For example, the secondorder neighbors of Rehmanniae Radix are PRKAA1 and PRKAA2, and those of Polygalae Radix are TRPV3.On the whole, most herb targets with existing or potential relationships are closely spaced in the embedding visualized results of our HTINet2.

Hyper-parameter sensibility of HTINet2
To evaluate the robustness of HTINet2, we conducted sufficient experiments on hyper-parameter sensitivity analysis.In experiments, the main hyper-parameters of HTINet2 include the number of GCN layers in the Graph Representation Learning module, the embedding dimension of entities (e.g.herbs and targets), and the window size of the DeepWalk in KE learning stage.
When conducting experiments on the number of GCN layers in the component of Graph Representation Learning, we fixed the dimension of embedding vector at 64.The results (Fig. 5C-D, Table 3 of Supplemental Materials file 1) show a noticeable impact of layer quantity on the performance of HTINet2.Specifically, as the number of layers increased, there was a significant improvement in the metrics of HR and NDCG, followed by a reduction and then a stable trend thereafter.
We fixed the number of GCN layers and conducted experiments by adjusting the embedding vector dimensions.The results (Fig. 4E-F, Table 4 of Supplemental Materials file 1) indicate that the performance of our model gradually improved across all metrics with the increase in embedding vector dimensions.This improvement can be attributed to the fact that higher dimensional vectors encapsulate more information, enabling a more comprehensive representation of herbs and targets, improving the prediction performance of HTINet2.Finally, we adjusted the window size of the DeepWalk method during the KE learning stage, and the results (Fig. 4G-H, Table 5 of Supplemental Materials file 1) show that the model exhibits slight sensitivity to the window size.As the window size increases, the overall performance of the model exhibits a declining trend, and there is a slight rebound in performance when the window size is set to 10.

Case study
To demonstrate the reliability of HTINet2's prediction results, we selected the two herbs, namely Artemisia annua and Coptis chinensis, and showed the results of top 15 targets predicted by HTINet2, excluding the targets that were existing in the test set.To verify the potential relationships between herbs and their predicted targets, we conducted three types of validation, including database, literature and molecular docking.Database validation is based on the two authoritative Chinese herb databases, namely ETCM [35] and HERB [36].Literature validation is based on newly published medical literatures from PubMed to identified possible associations.In docking validation, we selected the key compounds of Artemisia annua and Coptis chinensis, namely artemisinin and berberine, and conducted docking-based virtual screening (using molecular docking software Autodock 1.5.7) to investigate the potential relationships between these predicted targets and the two compounds.As detailed in Table .3,the targets highlighted in bold are those that have been verified based on three types of evidences.
• Target validation of Artemisia annua.In the top 15 candidate targets of Artemisia annua, evidence in databases, literature or docking supports 13 of these targets.The ETCM database developed by Zhang et al. [35] recorded TLR4 (ranked first) and TRPM8 (ranked sixth) as the targets of Artemisia annua.The HERB database studied by Fang et al. [36] recorded EIF6 (ranked eighth) as a target of this herb.Gao et al. [37] have explored that Dihydroartemisinin inhibits endothelial cell tube formation by suppression of the STAT3(ranked fourth) signaling pathway.Additionally, Ilamathi et al. [38] have demonstrated Artesunate, an anti-cancer agent, targets STAT3 and effectively suppresses hepatocellular carcinoma.Liu et al. [39] found that EGFR (ranked 11th) is related to the treatment of cervical cancer; oral administration of Dihydroartemisinin for 28 days reduced the expression of p53, EGFR, and Ki-67 antigens.Furthermore, several studies [40][41][42][43] have reported potential targets such as CDH1, AHR, MYC, IL-2, which may be associated with Artemisia annua or its derivatives.In addition, docking results indicated that there is strong binding energy with artemisinin for the four potential targets, namely ADH1A (ranked third, BE=-6.56),MC1R (ranked fifth, BE=-6.58),NR3C1 (ranked 10th, BE=-8.15), and CD86 (ranked 12th, BE=-6.34).For example, artemisinin docked onto the amino acid residue of ADH1A, namely LYS-188 (Fig. 6A).Other results of docking analysis are shown in the Section 2.4 of Supplemental Materials file 1. • Target validation of Coptis chinensis.In the results of the top 15 candidate targets of Coptis chinensis, the evidence support nine of these targets.The study of Tang et al. [44] demonstrated that berberine, a major component of Coptis chinensis, may improve insulin resistance by affecting the expression of the SLC2A4 gene (ranked second) and increasing GLUT4 levels.Shou et al. [45] found that berberine activates PPARδ, initiating transcriptional regulatory functions and promoting the expression of NQO1 (ranked sixth).The CHRM2 gene, a target for the Coptis chinensis, is retrievable in both the HERB and ETCM databases.Additionally, several studies [46,47] have reported potential targets, such as RHO and IL18, which may be related to Coptis chinensis or its derivatives.In addition, the results of docking showed there is strong binding energy with berberine for the four potential targets, including CHRM3 (ranked first, BE=-5.99),POR (ranked fifth, BE=-6.61),PPARA (ranked eighth, BE=-7.61), and PRSS3 (ranked tenth, BE=-6.9).Taking the potential target CHRM3 as an example, berberine docked onto the three amino acid residues of the protein, namely HIS-62, ARG-150, and LEU-192 (Fig. 6B).Other results of docking analysis are shown in the Section 2.4 of Supplemental Materials file 1.
The above case results indicated that our HTINet2 can obtain the reliable prediction results, capable of not only identifying existing drug targets but also suggesting new candidate targets for which evidence has not yet been found.

Discussion
Benefiting from the efficiency of computation methods, computational prediction of drug targets has emerged as a hot topic.In this study, we proposed a deep learning framework to predict herbal targets, which includes TCM and clinical KGE, residual graph representation learning, and supervised target prediction.The comprehensive experiments and case study showed the excellent performance and the reliability of predicted results of HTINet2.
The high performance of HTINet2 benefits primarily from two points.On the one hand, the knowledge graph TMKG we constructed contains rich knowledge, especially both TCM properties and clinical treatment of herbs that help to learn rich KE vectors of herbs and targets.TMKG is a foundational knowledge graph that can be further applied to various application scenarios, e.g.biomedical relationships inference (e.g.genes of diseases, symptoms, or syndromes), intelligent auxiliary diagnosis and treatment (e.g.disease diagnosis or drug/treatment recommendation).On the other hand, the residual-like graph convolution we designed can effectively capture the deep interactions among herbs and targets, capable of effectively integrating complex interaction data from different biomolecules; the BPR is a supervised module designed specifically for HTP.This supervised neural network framework does perform better on target prediction.Meanwhile, the f lexible design of the HTINet2 framework is not only  suitable for predicting drug targets but can also be extended to other graph-based bioinformatics problems such as disease gene identification and protein function prediction.We describe the process for applying HTINet2 to other tasks or datasets in the Section 2.5 of Supplemental Materials file 1.This wide applicability makes HTINet2 a versatile tool for various biomedical research applications.
There are still several work to do in the future.First, the current data of herb-target relationships are still noisy, how to maintain consistency and high quality with data from multiple sources remains a problem, and high-quality herb-target relationships remain a limitation.In the future, we will combine the two high-quality datasets SymMap [15] and SympGAN [48] built from our previous studies, integrating technologies such as entity

Figure 1 .
Figure 1.Overall architecture of HTINet2; HTINet2 consists of three key modules, i.e.KG construction and embedding learning (A and D), graph representation learning (B and E), and target prediction (C).

Figure 2 .
Figure 2. (A and B) Schematic representation centered on Artemisia annua and Coptis in knowledge graph TMKG; (C-F visualization of embedding vectors of entities in TMKG by different KE methods.

Figure
Figure Performance comparison of HTP methods.

Figure 4 .
Figure 4. Performance comparison and parameter sensitivity analysis of HTINet2; (A) performance comparison of different KE methods; (B) performance comparison of different components of HTINet2; (C and D) HTINet2's performance with different numbers of GCN layers; (E-F) HTINet2's performance with different dimensions of KE; (G-H) HTINet2's performance with different window size.

Figure
Figure (A-E) Visualization of the pretrained features of herbs and targets in TMKGs by different KE methods; (F) visualization of embedding features of herbs and targets initialized from normal distribution; (G) visualization of embedding features of herbs and targets by HTINet2 and several examples of HTI; (H) relationships between Rehmanniae Radix, PRKAA1, and PRKAA2 in TMKG; (I) relationships between Polygalae Radix and TRPV3 in TMKG.

Figure 6 .
Figure 6.3D diagrams of molecular docking results; (A) artemisinin docked onto the amino acid residue of ADH1A, namely LYS-188; (B) berberine docked onto the three amino acid residues of the protein, namely HIS-62, ARG-150 and LEU-192.
deep network architectures may lead to issues of gradient vanishing or explosion in GCN.Inspired by the residual operation of ResNet, we obtained the final embedding vectors of herbs/targets, e h and e t , by fusing the vectors of after k layers graph convolutional operations, e1

Table 1 .
Performance comparison of HTP methods.

Table 2 .
Performance comparison of key modules of HTINet2.

Table
Prediction analysis of Artemisia annua and Coptis chinensis.Note: Target validation including three types, namely database, literature, and molecular docking.Database validation is based on the two authoritative Chinese herb databases, namely ETCM and HERB.Literature validation is based on newly published medical literatures from PubMed.Docking validation is based on docking-based virtual screening, and BE denotes the binding energy.