
The massive number of research articles on the Web makes it difficult for researchers to identify related works that match their preferences and interests. Consequently, various network representation learning-based models have been proposed to produce citation recommendations. Nevertheless, these models do not exploit the semantic relations and contextual information between the objects of bibliographic paper networks, which can result in inadequate citation recommendations. Moreover, existing citation recommendation methods face problems such as lack of personalization, cold start, and network sparsity. To mitigate these problems and produce individualized citation recommendations, we propose a heterogeneous network embedding model that jointly learns node representations by exploiting semantics corresponding to the author, time, context, field of study, citations, and topics. Compared to baseline models, the results produced by the proposed model over the DBLP datasets show 10% and 12% improvements on the mean average precision (MAP) and normalized discounted cumulative gain (nDCG@10) metrics, respectively. Also, the effectiveness of our model is analyzed on the cold-start paper and network sparsity problems, where it gains 12% and 9% better MAP and recall@10 scores, respectively.


Introduction
In the last few years, recommender systems [1] have been introduced in different domains, including videos [2,3], books [4], news [5], and locations [6]. Such models assist users by suggesting items and information that meet the preferences and information needs of the corresponding users [7]. In the same direction, the number of research publications in digital libraries is growing at a rapid pace, which makes it hard for researchers to find papers relevant to their research interests. To tackle this issue, various models have been proposed in the literature that produce personalized research paper recommendations [8][9][10][11][12]. Typically, these models are divided into three categories, viz., Content-based Filtering (CB) [10,13], Collaborative Filtering (CF) [8,9,14], and Deep Learning (DL)-based models [15][16][17][18]. The CF models employ the ratings of users and their friends to recommend research papers [19]. They can create justifiable results when such information is available and the system can find users with similar tastes for the query user. Nevertheless, CF techniques are unable to produce quality results and encounter the sparsity problem [20] when the users' rating information is sparse [21]. On the contrary, CB systems [10,13] produce recommendations by employing the descriptions/features of items and users [22]. For CB methods, it is mandatory that the descriptions of items and users' profile information are available; otherwise, the CB model will confront cold-start problems [23].
In contrast, traditional graph-based approaches [24,25] use random-walk methods, which may bring poor and biased results since they conceive recommendation as a link prediction task. That is, such models give more importance to old nodes in the network, which leads to biased recommendations [21]. To resolve this problem, Network Representation Learning (NRL) models, namely VOPRec [26], BNR [15], TMR [18], and GCR-MHNE [27], have been proposed. These models exploit heterogeneity among the network objects and utilize papers' content to present recommendations. Notwithstanding, existing NRL-based models do not exploit salient information factors, including papers' topics, temporal dynamics, and papers' field of study, which are capable of capturing researchers' preferences and producing adequate recommendation results. Also, these models cannot effectively address the cold-start paper and network sparsity problems, which occur due to the unavailability of sufficient information about papers.
Intuitively, researchers prefer to study a comparatively small number of research papers that are relevant to their research topics, authored by their favorite/followed authors, target a similar field of study, and are published in recent years. For example, the author(s) of a manuscript can have a significant influence on its readers and citations [28]. Mostly, researchers follow a specific author or research group(s) with the same research preferences. Additionally, researchers read and write papers that target specific research topic(s). Also, research works [8,10] show that researchers are more interested in manuscripts whose topics match their research interests. Additionally, the field of study [29] and abstract-level semantic relations [30] have a great impact on capturing researchers' preferences and creating quality recommendation results. Similarly, paper publication time [31] has a prominent influence on the interests of researchers. Specifically, a paper published in recent (the last two or three) years is more important than one published twenty years ago. Nevertheless, existing models fail to capture such useful semantics and topological features in Heterogeneous Bibliographic Networks (HBNs).
In view of the above analysis, we argue that there is a necessity to develop a model that exploits such semantics and produces personalized citation recommendations. In this research, we propose a model called Citation Recommendation using Heterogeneous Bibliographic Network Embedding (CR-HBNE), which exploits researchers' preferences, authors' information, topics, field of study, citation-based proximity, abstract-level semantics, and temporal dynamics to learn semantic-aware representations of nodes. Furthermore, CR-HBNE utilizes these representations to produce relevant citation recommendations. The proposed model has implications for academia, the research community, and practitioners in the form of personalized citation recommendations. It can be integrated into digital libraries, such as Springer, IEEE, ACM, Science Direct, and Google Scholar, to exploit researchers' preferences in personalizing scholarly exploration for them. Also, research community sites, including ResearchGate and others, can use this model as a recommendation feature to boost user experience and generate revenue for the organization. The model can also be incorporated into existing recommendation models to address the cold-start problem, which is one of the notorious issues faced by most recommender systems. Finally, we present the main contributions of this work as follows:
• We formally describe the problem of heterogeneous bibliographic network embedding, which can efficiently capture semantic relationships among the objects of heterogeneous paper networks.
The remainder of the paper is organized as follows. Section 2 discusses related citation recommendation models. Section 3 presents the problem statement and preliminaries. Section 4 demonstrates the methodology of our model. Section 5 presents the comparative analysis of the proposed model against its counterparts. Finally, Sect. 6 concludes our work.

Related work
In this section, we discuss the current citation recommendation (CR) models proposed in the literature.

Collaborative Filtering and Content-based Models
The CF models use the rating/feedback information of users and their friends to present citation recommendations [32]. To this end, PCTR [14] employed a CF method to produce paper recommendations. Bansal et al. [9] made paper predictions in the CF task by applying a GRU to the content of research articles. Sugiyama et al. [31] recognized potential citation papers by exploiting the paper-citation matrix to form a neighborhood. The model employed the Pearson correlation to compute the similarities between the citation vectors and the target paper. In contrast, CB systems create recommendations by employing the descriptions and features of research papers and the corresponding users [22]. For instance, a CB model [10] applied Latent Dirichlet Allocation (LDA) to the textual content of research papers to generate their latent representations. Specifically, the model builds the representations of the researcher's profile (using their authored papers) and candidate articles employing LDA. Then, the similarities between the two representations are calculated to create the final recommendations. Likewise, Science Concierge [33] applied Latent Semantic Analysis (LSA) to papers' textual content to offer recommendations. Models that use CB and CF methods can generate quality recommendations. Nevertheless, models based on CF methods encounter different problems, such as cold start and sparsity. That is, when the model has no user ratings or other required information, it is difficult for the system to produce justifiable results. Thus, recommendations delivered on such insufficient information can cause inaccurate results [21]. On the other hand, CB models require paper and user descriptions/features. Due to the unavailability of such information, they face the cold-start and overspecialization problems [23,34]. Besides, traditional CB and CF methods do not utilize side information, and thus they fail to deliver justifiable results.

Deep learning and NRL-based models
In the last few years, different models have employed deep learning methods [35], such as MLPs [36], CNNs [37], RNNs [9,17], and GANs [18,38], to produce quality recommendations. To this end, Huang et al. [36] used the semantic representations of citation contexts and relevant papers to create citation recommendations. Specifically, the model utilized a multi-layer neural network to learn the probability of articles given the citation contexts. In the same direction, PCCR [17] adopted LSTM [39] to learn the embeddings of citation contexts and research papers based on a context encoder and scientific paper encoders, respectively. Then, the similarity between these representations is computed to suggest the top-k citations. Likewise, an RNN-based model [40] exploits author information and citation relations using Bi-LSTM to produce context-aware personalized citation recommendations. Contrarily, Yin et al. [37] proposed p-CNN, a personalized citation recommendation model that makes recommendations using a convolutional neural network [41,42]. In addition, to compute the relevance between a citation context and a relevant manuscript, the model exploits the authors' information. It employs a discriminative training strategy to learn the parameters and generate relevant recommendations. Similarly, POLAR [16] proposed an attention-based CNN model to produce citation recommendations.
In recent years, more sophisticated graph embedding [43] and network embedding [44][45][46] methods have been proposed, which encode nodes, graphs, or a network into an embedding space. To this end, various models [18,26,47] used such embedding methods to generate citation recommendations. For instance, Gupta and Varma [47] used DeepWalk and Doc2vec [48] to learn the vector representations of the network and papers' content, respectively. Finally, similarities between the vector representations are computed to generate recommendations. Likewise, VOPRec [26] generated recommendations by integrating text-based nearest nodes and structure-based vectors learned using the Paper2vec [49] and Struc2vec [50] embedding methods, respectively. On the contrary, BNR [18] used the network structure and contents of objects (authors and papers) based on the Node2vec [51] embedding method to generate their vector representations. Existing network embedding-based citation recommendation models, such as BNR, GAN-HBNR [52], and VOPRec, produce more significant results than random-walk models. However, such models fail to exploit the significance of semantic relations between HBN objects and to deal with the ''network sparsity'' and ''cold-start papers'' problems. That is, NRL-based models, viz., BNR and VOPRec, do not exploit the rich semantics corresponding to the content of research papers and their authors, and therefore fail to generate semantic-preserving node representations. In the same direction, GAN-HBNR employs Doc2vec to learn nodes' content-based embeddings, which could not effectively capture the contextual information and relevant semantics. Also, these models are unable to employ topical relevance, temporal dynamics, and corresponding labels, which can improve the quality of results and personalize recommendations. To overcome these limitations, the proposed model uses multiple semantic relations, including paper-paper citation, paper-paper semantic linking, paper-topic, paper-FOS, author-paper, and paper-time, to produce semantic-aware node representations. Also, the SPECTER language model used in CR-HBNE can capture rich semantics and generate robust content-based representations of network nodes.

Problem formulation and preliminaries
This section covers the core concepts and modules of the proposed CR-HBNE model. The concepts and notations used in this research are given in Table 1. Some preliminaries are defined as follows. It is evident that participating nodes, including authors, papers, fields of study, topics, and time, maintain meaningful connections with each other. An example of a citation relation is shown between paper P1 and paper P5. Likewise, node P1 is linked with paper P4 based on time period T′.
Similarly, P2 maintains a relation with P3 by sharing content. In contrast, authors A1 and A2 form a relationship on the basis of collaborative research. In this research, we exploit such semantic relations to produce more adequate recommendation results.
Definition 2 (Heterogeneous Bibliographic Network Embedding): Heterogeneous Bibliographic Network Embedding (HBNE) aims to embed the nodes v ∈ N of G into an embedding space with d ≤ |N| by learning a mapping function Ψ : N → R^d. Nodes with similar semantics in G are kept close in the embedding space and therefore possess close representations. Next, the problem statement is defined as follows. Problem Statement: Given a query paper p and a heterogeneous bibliographic network G = (V, E), the model aims to exploit the semantic relations between the objects of the HBN and recommend the top-k related manuscripts.

Heterogeneous bibliographic network embedding
Here, we introduce the principal concepts utilized in this model. In particular, CR-HBNE jointly learns the low-dimensional representations of the participating nodes, such as papers, authors, fields of study, abstracts, topics, and time, by exploiting the HBN, and produces personalized citation recommendations. To preserve network structure proximity and semantic relations between the nodes, we exploit the structure of the bibliographic network to learn low-dimensional node representations. The proposed model aims to exploit inter-node and intra-node relationships and learn latent representations of objects, which are utilized in suggesting relevant citations. Further details are given in the following subsections.

Author-paper representation learning
To learn embeddings over the participating information networks, the proposed model extends Node2vec [51], which adopts a flexible and controllable biased random-walk method for investigating diverse neighborhoods of nodes in the HBN. In particular, CR-HBNE uses the Breadth First Search (BFS) and Depth First Search (DFS) strategies with tunable parameters p and q to exploit relations among the heterogeneous bibliographic network objects. Considering this notion, our model constructs a biased random walk employing the HBN. Specifically, it learns node embeddings by extending the architecture of the Skip-gram model [53], going from sentences in text to random walks in the HBN. To learn inter-node and intra-node relations, it maximizes the probability of the neighbors of a given node in the random walks. Finally, these relationships are integrated into a framework to learn the final representations of nodes.
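As an illustration, the second-order biased walk with return parameter p and in-out parameter q can be sketched as follows. This is a minimal sketch, not the authors' implementation; the graph is assumed to be a plain adjacency dictionary and all edge weights are taken as 1:

```python
import random

def alpha(p, q, t, x, graph):
    """Second-order bias alpha_pq(t, x): t is the previous node,
    x is a candidate next node; the distance d_tx is in {0, 1, 2}."""
    if x == t:            # d_tx = 0: step back to the previous node
        return 1.0 / p
    if x in graph[t]:     # d_tx = 1: x is also a neighbor of t (BFS-like)
        return 1.0
    return 1.0 / q        # d_tx = 2: move outward (DFS-like)

def biased_walk(graph, start, length, p=1.0, q=1.0):
    """One fixed-length second-order random walk starting at `start`."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break
        if len(walk) == 1:  # first step: uniform over neighbors
            walk.append(random.choice(neighbors))
            continue
        t = walk[-2]
        weights = [alpha(p, q, t, x, graph) for x in neighbors]
        walk.append(random.choices(neighbors, weights=weights)[0])
    return walk
```

Small p favors revisiting (BFS-like exploration), while small q pushes the walk outward (DFS-like exploration), matching the neighborhood-sampling behavior described above.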
Considering the indispensable role of an author in a manuscript, we exploit the relations between authors and papers. In Author-Paper relation modeling, for a node p_i appearing in random walks w ∈ W_1, our model maximizes the probability of the neighboring nodes given node a_i as follows.
Here, W_1 and v_{a_i} denote the generated biased random-walk corpus and the output embedding vector of the current node a_i, respectively. The neighbors of node a_i, i.e., N_p(a_i), are generated by employing a neighborhood sampling approach [51], where the process starts from a source node c_0 = u, and the i-th node is generated by taking a fixed-length walk with transition probability P(c_i = x | c_{i-1} = v) = π_xv / Z if (v, x) ∈ E, and 0 otherwise, where π_xv represents the unnormalized transition probability between nodes x and v, and Z denotes a constant used for normalization. One approach to biasing random walks is to sample the next vertex using static edge weights w_xv, i.e., π_xv = w_xv in the case of weighted graphs; otherwise, w_xv = 1. However, this approach does not explore the whole network; therefore, a more sophisticated second-order random walk with two parameters, p and q, is adopted, where the unnormalized transition probability is set to π_xv = α_pq(t, x) · w_xv. α_pq(t, x) is defined as follows.
α_pq(t, x) equals 1/p if d_tx = 0, 1 if d_tx = 1, and 1/q if d_tx = 2, where d_tx denotes the shortest-path distance between nodes x and t, which is always in {0, 1, 2}. Using the two parameters, i.e., the return parameter p and the in-out parameter q, our model can effectively guide the walk on the network. In particular, this method investigates the neighborhoods of a node employing the DFS and BFS mechanisms. Based on the symmetry and conditional independence assumptions [51], the objective function in Equation 1 simplifies to log P(p_j | a_i) = v_{p_j} · v_{a_i} − log Z_{a_i}, where Z_{a_i} = Σ_{s=1}^{n_p} exp(v_{p_s} · v_{a_i}) represents the per-node partition function. However, this loss function is computationally expensive when applied to large networks; therefore, we approximate it by employing negative sampling [54].
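The negative-sampling approximation mentioned above can be illustrated with a short sketch. This is illustrative only, with hypothetical vectors: the per-pair loss replaces the expensive partition function Z with K sampled negatives, as in standard Skip-gram training:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(v_center, v_context, v_negatives):
    """Negative-sampling approximation of -log P(context | center):
    -log sigma(v_ctx . v_c) - sum_k log sigma(-v_neg_k . v_c)."""
    v_c = np.asarray(v_center, dtype=float)
    loss = -np.log(sigmoid(np.asarray(v_context, dtype=float) @ v_c))
    for v_neg in v_negatives:
        loss -= np.log(sigmoid(-(np.asarray(v_neg, dtype=float) @ v_c)))
    return float(loss)
```

Minimizing this loss pulls the embeddings of co-occurring walk nodes together while pushing randomly sampled nodes apart, avoiding the full softmax over all nodes.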

Paper-paper representation learning
A graph between papers is maintained if there is a citation relation between them. A citation relation is a strong indication of relatedness between papers, as authors cite papers very carefully and a bibliography contains relevant papers. To capture such semantics, CR-HBNE optimizes the following objective.
where R_2 represents the generated biased random-walk corpus of research papers, N(p_i) ⊆ N_p denotes the neighbor nodes of paper p_i generated using the neighborhood sampling mechanism [51], and v_{p_i} denotes the output representation of node p_i. Finally, we can simplify the objective defined in Equation 5 as follows.
where Z_{p_i} = Σ_{s=1}^{n_p} exp(v_{p_s} · v_{p_i}).

Paper-topic representation learning
The relation between papers and topics is maintained if a paper contains a research topic, or a topic belongs to a research paper; for example, the topic ''Citation recommendation'' is covered in a manuscript p_i. Moreover, researchers take more interest in papers that target the topics of their research interests. To encode such semantic relations, our model optimizes the following objective.
where R_3 represents the corresponding corpus and N(p_i) ⊆ N_p denotes the neighbors of node p_i. Additionally, we can write the objective function in Equation 7 as follows.
where Z^T_{p_i} = Σ_{s=1}^{n_p} exp(v_{t_s} · v_{p_i}) represents the per-node partition function.

Paper-time period representation learning
The paper-time graph is established between papers and time periods. For instance, the relation between paper p and time period t′ is established when p is published in time t′. In this work, we define time periods as distinct time intervals over the entire dataset, which we have divided into equal-sized bins of years. The time factor plays a crucial role in citation recommendation, as researchers are more interested in papers published in recent years, because recent works present novel ideas and current results. Besides, researchers' interests evolve over time, which makes the temporal factor an important relationship to utilize. To explore such meaningful relations, our model utilizes the following objective function.
where R_4 is the corpus and N(p_i) ⊆ N_p denotes the neighbor nodes of p_i. Also, the objective function in Equation 9 can be modified as follows.
where Z^{T′}_{p_i} = Σ_{s=1}^{n_p} exp(v_{t′_s} · v_{p_i}) represents the per-node partition function.
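The paper-time relation above relies on dividing publication years into equal-sized bins. A toy sketch of such binning follows; the bin width is an assumption of this sketch, as the paper does not state the exact interval size:

```python
def time_bins(years, bin_size):
    """Map each publication year to an equal-width time-period label,
    counting bins from the earliest year in the dataset."""
    start = min(years)
    return {y: (y - start) // bin_size for y in years}
```

Each resulting label would then serve as a time-period node t′ linked to the papers published within that interval.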

Abstract-Abstract Representation Learning
The Abstract-Abstract relation between papers is maintained when they possess semantically similar information. The abstract of a manuscript presents the main idea of that research work. Therefore, to exploit semantic relations between papers, we utilize the abstracts of research manuscripts to compute their low-dimensional representations using the SPECTER [55] document embedding model, which employs citation-informed Transformers. The SPECTER model is initialized using SciBERT [56], which is trained on a large corpus of scientific documents, defined as follows.
It takes as input the concatenated WordPieces (of abstract Ab_i) and the [CLS] token, separated by the [SEP] token. Additionally, SPECTER trains the Transformer model to learn optimal representations of abstracts by optimizing the triplet margin loss objective [55].
After learning the abstract embeddings, we apply cosine similarity to these vectors to construct the abstract-abstract relation network. To choose semantically similar abstracts, we use cosine scores in the range of 0.9 to 1. The model exploits the resulting relationship by optimizing the following log-probability objective.
where R_5 represents the generated biased random-walk corpus of abstracts, N(Ab_i) ⊆ N_Ab denotes the neighbor nodes of abstract Ab_i generated using the neighborhood sampling mechanism [51], and v_{Ab_i} denotes the output representation of node Ab_i. The objective defined in Equation 12 can be simplified analogously, with the per-node partition function Z_{Ab_i} = Σ_{s=1}^{n_Ab} exp(v_{Ab_s} · v_{Ab_i}).
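Building the abstract-abstract network from the learned abstract embeddings can be sketched as follows. This is illustrative only: it thresholds pairwise cosine similarity in the stated [0.9, 1] range, and the embeddings are assumed to come from a model such as SPECTER:

```python
import numpy as np

def abstract_edges(embeddings, low=0.9, high=1.0):
    """Connect abstract pairs whose cosine similarity lies in [low, high]."""
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    sim = X @ X.T                                     # pairwise cosine matrix
    edges = []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if low <= sim[i, j] <= high:
                edges.append((i, j))
    return edges
```

The returned pairs would form the Abs-Abs edges over which the biased random walks of this subsection are run.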

Paper-FOS representation learning
To model the Paper-FOS relationship, we collect the field of study (FOS) corresponding to each research paper. A field of study defines the narrow subarea of a research paper; therefore, exploiting such relations can better capture a researcher's preferences. To model the semantic relations between papers and FOS, the proposed model employs the objective defined as follows.
where R_6 represents the corpus of FOS generated using biased random walks, and N(p_i) ⊆ N_p are the neighbors (i.e., fields of study) of node p_i. Also, Equation 14 can be simplified analogously, with the corresponding per-node partition function Z^F_{p_i}.

Joint learning
The proposed heterogeneous bibliographic network consists of six relation graphs established between nodes, including papers, authors, topics, fields of study, time, and abstracts. To exploit the semantics between the participating nodes in the different networks, it is necessary to jointly optimize the objective functions defined for the various networks and learn context-preserving node embeddings. To do so, the proposed model first merges the intra-node relations and inter-node relation correlations into an integrated framework. The aim is to optimize the joint objective O = O_ap + O_pp + O_pt + O_pt′ + O_{Ab_i Ab_j} + O_pf.
where O_ap, O_pp, O_pt, O_pt′, O_{Ab_i Ab_j}, and O_pf denote the objective functions of the author-paper, paper-paper, paper-topic, paper-time, abstract-abstract, and paper-FOS relation networks, respectively. In this manner, CR-HBNE learns the embeddings of objects, which are utilized in producing citation recommendations.

Unified model for citation recommendation
After learning the low-dimensional representations of the participating nodes in the heterogeneous bibliographic network, the proposed model makes paper predictions for a user a. In particular, it recommends the top-k citations corresponding to a seed manuscript provided by a researcher. To produce the top-k recommendations, CR-HBNE uses the following score computation method.
where N_PR, N_PT, N_AR, and N_{PR_Ab} denote the vector representations of the training papers, their topics, authors, and abstracts, respectively. Additionally, N_Qp, N_QAb, N_Qt, N_Qt′, N_Qa, and N_f denote the vector representations of the query manuscript, the query manuscript's abstract, topics, publication time, author, and field of study, respectively. In this manner, the CR-HBNE model utilizes multiple information graphs and jointly learns node embeddings to make recommendations for a researcher. Finally, the model uses the parameters β, φ, ϑ, μ, λ, and ζ to tune the significance of the different relations on the final results.
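A minimal sketch of the final ranking step follows. For illustration, it assumes candidates are ranked by cosine similarity between a single query embedding and the candidate paper embeddings; the actual model combines several weighted per-relation similarity terms:

```python
import numpy as np

def recommend_top_k(query_vec, paper_vecs, k=10):
    """Rank candidate papers by cosine similarity to the query embedding
    and return the indices of the top-k candidates."""
    P = np.asarray(paper_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    q = q / np.linalg.norm(q)
    scores = P @ q                 # cosine scores for every candidate
    order = np.argsort(-scores)    # descending by score
    return order[:k].tolist()
```

In the full model, the score for each candidate would be a weighted sum of such similarities over the paper, topic, author, abstract, time, and FOS representations.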

Experiments
To conduct the experiments, we divided the dataset randomly into two sets, the training set v_t and the test set v_p. The training set consists of 80% of the papers, while the test set v_p possesses the remaining 20%. Additionally, v = v_t ∪ v_p and v_t ∩ v_p = ∅. For a query paper in the dataset, we provide the top-K paper recommendations from v_t. If the ground truth is recommended in the top K, then the result is considered relevant; otherwise, it is not.
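The 80/20 split described above can be sketched as follows (a minimal illustration; the seed and shuffling scheme are assumptions):

```python
import random

def split_papers(papers, train_frac=0.8, seed=0):
    """Randomly split papers into disjoint training and test sets,
    so that v = v_t U v_p and v_t and v_p share no papers."""
    rng = random.Random(seed)
    shuffled = papers[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```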

Datasets
To evaluate the experimental results, we utilized two datasets, i.e., DBLP-Citation-network V11 and DBLP-Citation-network V12. Details of these datasets are provided in Table 2, where P-P, Abs-Abs, A-P, P-T, P-T′, and P-F represent the paper-paper citation, abstract-abstract, author-paper, paper-topic, paper-time, and paper-FOS relations established between the participating objects.

DBLP-Citation-network V11:
The DBLP-Citation-network V11 contains papers from various domains such as computer science, mathematics, and economics. The citation data is gathered from various sources, including DBLP and the Microsoft Academic Graph (MAG). After pruning the dataset, we have 4,314,995 P-P relations. Similarly, there are 298,937 Abs-Abs relations and 13,556,123 A-P relations. Also, the other participating objects maintain the relations shown in the first row of Table 2.
DBLP-Citation-network V12: The DBLP-Citation-network V12 is relatively large among the DBLP datasets. After cleaning the dataset for basic information, we have 3,553,935 P-P relations and 230,524 Abs-Abs relations. Moreover, this dataset provides information similar to DBLP V11. To extract the top-k topics with higher topic coherence, we followed the procedure adopted in [10]. We extracted 8,345 topics utilizing the inverted-indexed abstracts. Also, for both datasets, we considered a maximum of five fields of study and authors corresponding to a research paper.
To compare the results of the corresponding models, we used the recall, MAP, and nDCG metrics [15,19], defined as follows.
Recall: This metric measures the significance of a model based on the percentage of relevant recommendations that appear in its top-k generated results. We choose k ∈ {20, 40, 60, 80, 100}.
where Q represents all target manuscripts, while R_p denotes the list of top-k recommendations delivered for seed paper p. Mean average precision: This metric analyzes the significance of a model by examining whether the relevant manuscripts are suggested in the top-k or not. Additionally, it penalizes those errors that happen high up in the top-k list.
where TP_seen denotes the total true positives that occurred up to position k. The cutoff value for the Average Precision (AP) is set to AP@10. MAP computes the average of the APs as follows.
where AP_p denotes the AP for a query manuscript p and S represents the total number of queries. nDCG: nDCG [57] analyzes the rank/position of the truly relevant papers within the top-k list of recommended articles. It assesses the effectiveness of a model by considering the graded relevance of the recommended papers, which is computed as follows.
where nDCG_g denotes the normalized gain accumulated up to position g, while G represents the list of relevant papers in the corpus up to position g. Discounted Cumulative Gain (DCG) is the weighted sum of the degrees of relevance/relatedness of the recommended/ranked papers, with the aim that the most relevant papers should arrive at the top of the list. IDCG_g denotes the DCG of the ideal ordering, which is used for the normalization of the DCG scores. The cutoff value for nDCG is 10.
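The three metrics can be sketched as follows. These are binary-relevance versions for illustration; in particular, normalizing AP by min(|relevant|, k) is an assumption of this sketch:

```python
import math

def recall_at_k(recommended, relevant, k):
    """Fraction of relevant papers that appear in the top-k list."""
    hits = sum(1 for p in recommended[:k] if p in relevant)
    return hits / len(relevant) if relevant else 0.0

def average_precision_at_k(recommended, relevant, k):
    """Precision averaged over the ranks of hits within the top-k."""
    hits, score = 0, 0.0
    for i, p in enumerate(recommended[:k], start=1):
        if p in relevant:
            hits += 1
            score += hits / i
    return score / min(len(relevant), k) if relevant else 0.0

def ndcg_at_k(recommended, relevant, k):
    """DCG of the top-k list divided by the DCG of the ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 1)
              for i, p in enumerate(recommended[:k], start=1)
              if p in relevant)
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal > 0 else 0.0
```

MAP is then simply the mean of the per-query AP@10 values over all S queries.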

Baseline models
This section presents the details of the baseline methods used to compare the experimental results of the proposed model.
• Doc2Vec [48] is a representation learning method applied to the textual content of research papers. Cosine similarity between papers is computed using the document embedding vectors generated from the textual content using Doc2Vec. The dimension of the word vectors is set to 300.
• VOPRec [26] learns the embeddings of nodes by exploiting the textual content and network structure. The learned embeddings are then used to find the top-k recommendations for a query manuscript.
• CCA [47] learns low-dimensional representations using content and network proximity. To learn the content and network embeddings, it uses the Doc2Vec and DeepWalk embedding methods, respectively. Finally, the similarities between the learned vectors are computed to make relevant recommendations. We set the dimensions to 64 for DeepWalk and 300 for Doc2Vec.
• GAN-HBNR [52] generates citation recommendations by utilizing Doc2vec [48] and a Denoising Autoencoder (DAE) [58] to exploit the network structure and relevant content. We set the hidden state of the DAE to 100.
• BNR [15] is an NRL method that explores network proximity and relevant papers' content to provide recommendations against a seed paper.
• CR-HBNE is the proposed model; we experimented with its five variants, viz., CR-HBNE V1, CR-HBNE V2, CR-HBNE V3, CR-HBNE V4, and CR-HBNE V5. CR-HBNE V1 is the version that employs the paper-paper and abstract-abstract relationships only. CR-HBNE V2 extends the previous version by adding the paper-FOS network to enhance the results. CR-HBNE V3 enriches the previous version with the topics relationship, with no use of authors' information or P-T′ relations. CR-HBNE V4 is an updated variant of CR-HBNE V2 that incorporates A-P relations. The final and proposed version is CR-HBNE V5, which utilizes all the relation networks, including paper-time.

Comparative analysis of models
In this section, we analyze the experimental results of the proposed CR-HBNE model compared to other state-of-the-art baselines, employing MAP, nDCG, and recall as the evaluation metrics. Table 3 shows that CR-HBNE V5 produces more precise results than the other baselines on the DBLP-V11 dataset. Doc2vec has created very poor results compared to the other baseline approaches, because it employs only the contents of papers while ignoring auxiliary side information. On the other hand, the BNR model has presented the second-best results; its significance is attributed to its ability to exploit metadata, i.e., a paper's abstract, title, and venue, to make citation recommendations. Yet, the CR-HBNE model outperforms BNR by gaining nearly 10% and 9% improved MAP and recall@100 scores, respectively. The significance of our model is credited to its use of contextual information employing the SPECTER language model, which is trained on an in-domain corpus and uses citation-informed transformers to learn semantic-preserving representations. The results of BNR are insignificant compared to our model, as it ignores useful factors, namely papers' topics, field of study, publication time, and contextual information, which help generate more adequate results. Additionally, the nDCG@10 results demonstrate that the proposed model has created better-ranked results compared to the other counterparts.
Table 4 presents the results created on the DBLP-V12 dataset. It is clear that BNR presents the second-best outcomes relative to the other counterparts, as it utilizes the contents and network structure proximity to create recommendations. On the contrary, Doc2vec and CCA produce comparatively trivial results. The reason is that these models do not utilize heterogeneous information networks, which can yield more robust results. Notwithstanding, the MAP and nDCG results of CR-HBNE have significantly beaten all the competitors, since it utilizes the prominent information factors, including paper citation relations, abstract-abstract contextual information, field of study, topical relevance, and temporal dynamics, that help the model capture researchers' preferences. Finally, Fig. 2b demonstrates that CR-HBNE has produced nearly 7% improved recall@100 compared to the second-best performer, viz., BNR.

Effect of using relation graphs
To analyze the influence of the participating information graphs, we conduct an ablation study, whose results are shown in Table 5. In particular, we analyzed the influence of each information graph on the results produced by CR-HBNE by testing different information networks, viz., papers-authors, papers-topics, papers-time, papers-papers, and abstracts-abstracts. The results exhibit that the abstract-abstract and papers-authors relations are the most influential, as they have a significant effect on the final recommendations. Additionally, the results on DBLP-V11 and DBLP-V12 demonstrate that incorporating the temporal dimension has comparatively little effect on the final recommendations. To conclude, this study shows that the use of authors' information and abstract-level semantic relations has the greatest overall impact on capturing researchers' preferences.
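A leave-one-out ablation of this kind can be harnessed with a few lines of code. This is a sketch under our own naming; the graph labels mirror the relations listed above, and `evaluate` stands in for any scoring routine (e.g., one returning MAP on a validation split).

```python
GRAPHS = ["papers-authors", "papers-topics", "papers-time",
          "papers-papers", "abstracts-abstracts"]

def ablation_variants(graphs):
    """Yield (dropped_graph, remaining_graphs) pairs for a leave-one-out ablation."""
    for g in graphs:
        yield g, [h for h in graphs if h != g]

def run_ablation(graphs, evaluate):
    """Score each variant; the drop in score when a graph is removed
    indicates how influential that graph is."""
    full = evaluate(graphs)
    return {dropped: full - evaluate(kept)
            for dropped, kept in ablation_variants(graphs)}
```

A large positive delta for `abstracts-abstracts` or `papers-authors` would reproduce the trend reported in Table 5.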

Performance regarding network sparsity
Our model can produce relevant recommendations even when the heterogeneous bibliographic network is sparse, so it is important to analyze how sparsity affects the recommendation results. To this end, we evaluate and compare the performance of the citation recommendation models with respect to network sparsity on DBLP-V12. We select DBLP-V12 because its network is much larger and denser than that of DBLP-V11. To assess the models, we randomly cut off different percentages of the edges of the original HBN, built a new network from the remaining edges, and used that new network to analyze the models' performance. We employ the recall metric to judge the performance of the different recommendation models. The results depicted in Fig. 3 show that all models degrade compared to the former results on DBLP-V12, which is natural because the sparse network carries insufficient information about the corresponding objects.
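The edge-cutting procedure described above can be sketched as follows. This is a minimal illustration with our own function name and a fixed seed for reproducibility; the paper does not specify its sampling routine.

```python
import random

def sparsify(edges, keep_fraction, seed=42):
    """Randomly keep a fraction of the edges to simulate a sparse
    bibliographic network; the sample is reproducible via the seed."""
    rng = random.Random(seed)
    return rng.sample(edges, int(len(edges) * keep_fraction))
```

Repeating the evaluation while varying `keep_fraction` (e.g., 0.3 to 1.0) produces recall curves like those in Fig. 3.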
The results reveal that the final version of the proposed model, viz., CR-HBNE V5, yields the most significant recommendations even though the network comprises only about 30% of the edges of the original DBLP-V12. The performance of the models that use only the network structure degrades significantly, whereas the models that use contextual information along with auxiliary information sources remain comparatively stable.

Parameter sensitivity analysis

In this section, we analyze the impact of different parameters, such as the embedding dimensions, context size, and regularization parameters, on the results of CR-HBNE (Fig. 4). The dimensionality of the node embeddings plays an important role in the quality of the results. In Fig. 5, we can notice that the MAP score increases as the dimension grows and then plateaus after reaching d = 140 and d = 120 on the DBLP-V11 and DBLP-V12 datasets, respectively. In contrast, the results are greatly affected by tuning the parameters p and q: our model generates more significant results when the values of p and q decrease. A small value of p restricts the walk from moving away from the starting node, while a low value of q does the opposite. Our model achieves the best results when we set p = 1 and q = 2. In addition, Fig. 4a, b shows that the results of the model improve with the number of walks per node T and the walk length l; it is evident that the model produces the best results for T = 12 and l = 100. Furthermore, the neighborhood (context) size yields improved results at higher values, so we fixed it to 12.
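The roles of p and q described above match a node2vec-style second-order biased walk, which can be sketched as below. This is our own minimal implementation under that assumption, not the paper's code; `adj` is a plain adjacency-list dictionary.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=2.0, seed=0):
    """Second-order random walk: returning to the previous node is weighted
    by 1/p, staying among its neighbors by 1, and moving farther away by 1/q."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj.get(cur, [])
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is unbiased
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)          # return to previous node
            elif prev in adj.get(x, []):
                weights.append(1.0)              # stay close (distance 1 from prev)
            else:
                weights.append(1.0 / q)          # explore outward (distance 2)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk
```

With the reported settings p = 1 and q = 2, outward moves are down-weighted by 1/2, keeping walks relatively local.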
To set the dimensionality of the abstract embeddings, we use the default setting adopted in the SPECTER [55] model. On the other hand, the regularization parameters, i.e., b, #, u, l, k, and f, are employed to analyze the impact of the participating graphs on the final recommendations; that is, we analyze the effect of authorship information, paper topics, paper citations, field of study, contextual information, and temporal dynamics. For the sake of simplicity, we assign values to these parameters such that their sum equals 1, and we select the values that produce the best results. CR-HBNE yields the best results for b = 0.1, # = 0.3, u = 0.2, l = 0.1, k = 0.2, and f = 0.1 on DBLP-V11, as shown in Table 7. On DBLP-V12, the model gives the best results using the same parameter settings as for DBLP-V11. The results show that the model achieves significant MAP scores for large weights on the paper-author and abstract-abstract graphs, which demonstrates that exploiting author information and contextual relations boosts the recommendation results.
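Combining per-graph objectives under weights constrained to sum to 1 can be sketched as below. The mapping between graphs and the numerical weights here is purely illustrative (the paper does not state which symbol weighs which graph); the weight values echo the ones reported above.

```python
def joint_loss(losses, weights):
    """Combine per-graph losses with mixing weights that must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[g] * losses[g] for g in losses)

# Hypothetical graph-to-weight assignment; only the values come from the paper.
WEIGHTS = {"paper-author": 0.1, "paper-topic": 0.3, "paper-paper": 0.2,
           "field-of-study": 0.1, "abstract-abstract": 0.2, "paper-time": 0.1}
```

Because the weights are convex coefficients, raising one graph's weight necessarily trades off against the others, which is what the sensitivity analysis in Table 7 explores.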

Conclusion and future research
In the past years, various citation recommendation models have been proposed to assist researchers in their scholarly exploration. However, these models fail to exploit salient factors and the heterogeneity of the network to capture researchers' preferences and produce relevant results, and they suffer from cold-start and network sparsity problems. To address these issues, we introduced a network embedding model termed CR-HBNE, which exploits semantic relationships between the participating objects and captures the preference dynamics of users to produce relevant citation recommendations. The experimental results revealed the effectiveness of CR-HBNE against the baseline models. In future work, we plan to analyze the significance of other factors and contextual information by introducing attention mechanisms, and to conduct a user study to judge the applicability of the proposed model.

Definition 1 (Heterogeneous Bibliographic Network): G = (N, E) is a network equipped with two mapping functions, a node-type mapping φ : N → O and a relation-type mapping ψ : E → R. Each node v ∈ N and each edge e ∈ E is associated with a specific node type and relation type, respectively. Additionally, E = ∪_{r∈R} E_r denotes the overall edge set of the network, where E_r represents the set of edges of view/relation type r ∈ R, with |R| > 1. In an HBN, we have |O| + |R| > 2. An example of an HBN is depicted in Fig. 1a.
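The definition above can be mirrored by a small data structure. This is a minimal sketch with our own class and method names, keeping the node-type map and the per-relation edge sets explicit.

```python
from collections import defaultdict

class HeterogeneousBibNetwork:
    """Minimal HBN: a node-type map (phi: N -> O) and relation-typed edge sets E_r."""

    def __init__(self):
        self.node_type = {}            # phi: maps each node to its object type in O
        self.edges = defaultdict(set)  # E_r: one edge set per relation type r in R

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, u, v, rtype):
        self.edges[rtype].add((u, v))

    def is_heterogeneous(self):
        # |O| + |R| > 2, per Definition 1
        n_types = len(set(self.node_type.values()))
        return n_types + len(self.edges) > 2
```

For instance, a network with paper and author nodes linked by "writes" and "cites" relations already satisfies |O| + |R| = 4 > 2.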

Fig. 1
Fig. 1 An illustration of the CR-HBNE recommendation process, where (a) shows an example of an HBN whose objects include authors, papers, fields of study, abstracts, topics, and time periods

Fig. 2
Fig. 2 Comparison of the recommendation models on the recall score computed on the test set, where (a) reports recall on the DBLP-V11 and (b) on the DBLP-V12 dataset

Fig. 4 Fig. 5
Fig. 4 Analysis of the recommendation results under different parameters, where (a) shows the effect of the number of walks per node and (b) the length of walks, on the DBLP-V11 and DBLP-V12 datasets

Table 1
Symbols and notations employed in this research:
A: set of authors, A = {a_1, a_2, ..., a_n}
P: set of papers, P = {p_1, p_2, ..., p_n}
T: topics of research papers, T = {t_1, t_2, ..., t_n}
Abs: set of abstracts, Abs = {Abs_1, Abs_2, ..., Abs_n}
R: set of biased random-walk corpora
N: set of vertices, N = {v_1, v_2, ..., v_n}

Table 2
Specifications

Table 3
Results

Table 5
The effect of integrating different relation graphs

Recall score on the DBLP dataset when the network is sparse: the models that use contextual information along with auxiliary information sources have proved stable in their results. Finally, it is worth noticing that as we increased the number of edges in the bibliographic network, the performance of CR-HBNE improved, which proves the effectiveness of the proposed model's optimization. The examination in Table 6 demonstrates that even with missing information, the model can exploit auxiliary information sources to produce useful recommendations. For instance, if a paper does not contain citation relations, our model can utilize field of study, topics, abstract-abstract semantic relations, and authors' information to arrive at adequate results. It is evident from the results that CR-HBNE V5 gains 12% and 9% better MAP and recall@100 scores than the second-best performer, the BNR model.

Table 6
Results over cold-start papers, where bold results indicate the best model and * marks the runner-up model

Table 7
Analysing the results of the CR-HBNE by tuning parameters b, #, u, l, k, and f