Dynamic Analysis of User-role and Topic-influence for Topic Propagation in Social Networks

Hot events spread quickly on social networks. Predicting event diffusion on social networks, also known as topic propagation, is an important task. The two important factors that affect topic propagation are users and topics, and both users’ roles and topics’ influences are time dependent on social networks. However, existing studies have largely overlooked this fact, so topic propagation prediction is still a major challenge. In this paper, a Topic Propagation Prediction method is proposed based on dynamic analysis of user-role and topic-influence, named TPP-DA, which predicts the topic propagation on social networks from both users’ and topics’ perspectives. First, we introduce a temporal perspective to improve the static analysis to the dynamic analysis of user-role, which is more adaptable to the changeable user-roles on social networks. Second, we introduce a metric called the topic heat to dynamically analyze the topic-influence on a single user and social group. Third, we combine the dynamic analysis of user-role and topic-influence with a weighted probability model to accurately predict topic propagation trends. The weights are determined by the dynamic analysis of user-role and topic-influence. Finally, several experiments are conducted to evaluate TPP-DA. Compared with TPP, the average error rate of TPP-DA is reduced by approximately 33%, which proves the efficiency of TPP-DA.


I. INTRODUCTION
Today, social networks are playing an increasingly important role in people's lives. Hot events or topics spread rapidly in social networks. Analyzing hot events in social networks can help governments and companies in mining useful information or studying rumor diffusion models [1,2,3]. For example, for a good topic in social networks, e.g., "Olympic Games", we must predict how many people will pay attention to the topic in the future. In contrast, we should understand the propagation law of a rumor so that the government can refute it. Therefore, topic propagation prediction, which predicts the spread of topics, is an important task.
Topic propagation refers to how information is propagated through social networks [4]. It may be affected by the users and topics in social networks. Additionally, topic propagation can be affected by other factors, such as the community structure and the temporal information of social networks [5,6,7]. While user relationships are important parts of community structure, and the time-varying user interest and topic heat embody the temporal information of social networks, they also depend on users and topics. Therefore, we only focus on analyzing the impact of users and topics on topic propagation in this paper.
First, users have different interests in different topics in social networks. A user may have different roles in different topics, so he or she may have varying degrees of influence on other users regarding various topics. Meanwhile, users have many friends with the same or similar interests in social networks, and they interact with each other in various ways. They may also have different roles in their relationships and have varying degrees of influence on each other to affect topic propagation. Furthermore, the analysis of user-role should consider the changeability of users' interests or hobbies over time. In brief, user-role analysis should be considered in topic propagation prediction.
Second, the topic itself affects topic propagation. Different topics in social networks have different influences. These topics compete for users' attention. For example, many people focus on hot events; therefore, hot topics can spread quickly. However, other topics may diffuse slowly in social networks. Meanwhile, the heat of the topics will change over time. As the topic heat changes, the topic-influence also changes. Therefore, topic-influence should be considered when studying topic propagation.
Several topic propagation models have been studied, such as the linear threshold (LT) model [8,9] and the independent cascade (IC) model [10,11,12,13]. Some works utilized epidemic models [14,15,16], data-driven models [17,18,19,20], various dynamical models [21,22,23,24,25] or deep learning neural networks [26,27,28,29] to explore information diffusion. However, these works did not conduct both user-role and topic-influence analyses, so they could be further improved. In our previous work [30], we proposed a user-role-based topic propagation prediction (TPP) model, but it analyzed user-role statically, and it did not consider topic-influence. Social networks are time-varying, and user-roles and topic-influences are time dependent. Because existing studies on topic propagation have overlooked the time dependency of user-roles and topic-influences, predicting topic propagation trends remains a major challenge.
In this paper, we propose a Topic Propagation Prediction method based on the dynamic analysis of user-role and topic-influence, named TPP-DA. Its purpose is to predict the topic propagation in social networks considering both users' and topics' perspectives.
First, we analyze why our previous static user-role analysis adapts poorly to user-roles that change over time in social networks. To address the problem, a temporal perspective is introduced into the static analysis, so that user-role is analyzed dynamically. Similar to static user-role analysis, four user-role factors are incorporated to characterize user attributes along two dimensions. In one dimension, the user expert-factor and leader-factor are analyzed based on a single user's behavior. In the other dimension, the user social-factor and similarity-factor are described based on social behavior.
Second, topics are time dependent, and they compete for users' attention. The topic itself also plays an important role in topic propagation. Topic-influence should be considered when predicting topic propagation. Topic-influence analysis consists of two parts. The first part is the topic-influence on a single user, and the second part is the topic-influence on a social group. We also introduce a metric called topic heat to calculate the topic-influences on a single user and social group.
Third, the dynamic analysis of user-role and topic-influence is combined with a weighted probability model to predict topic propagation trends more accurately. Here, the weights are determined by the dynamic analysis of user-role and topic-influence. Behavior probability, relationship probability and time probability are utilized to form the probability model, and they are calculated based on user behaviors, group relationships and time spans.
Finally, we present the algorithms in detail and analyze their computational complexities. Some experiments are conducted to evaluate TPP-DA. Compared with TPP (our previous topic propagation method with static user-role analysis), the average error rate of TPP-DA is reduced by approximately 33%. The experimental results show its efficiency.
To summarize, the main contributions of this paper are as follows:
1) The dynamic analysis of user-role and topic-influence is utilized to conduct topic propagation prediction. To the best of our knowledge, this paper is the first to predict topic propagation based on the dynamic analysis of user-role and topic-influence in social networks from both users' and topics' perspectives.
2) A temporal perspective is introduced to extend our previous static user-role analysis into a dynamic analysis, which is more adaptable to user-roles changing over time in social networks. Topic-influences are also dynamically analyzed on both single users and social groups, and they are calculated based on topic heat.
3) The dynamic analysis of user-role and topic-influence is combined with a weighted probability model to predict topic propagation trends more accurately. Some experiments are conducted to evaluate TPP-DA, and the experimental results show its efficiency.
The remainder of this paper is organized as follows. Section II describes the related works. Section III presents the topic propagation problem formulation. Section IV shows the dynamic analysis of user-role and topic-influence in detail. Section V gives the weighted probability model and topic propagation (TPP-DA) algorithm. Section VI evaluates the method. Finally, the conclusion is presented in Section VII.

II. RELATED WORKS
One of the main issues in social networks is to predict information diffusion. With the development of networking and informatization in recent years in particular, the study of topic propagation models has been a popular topic in academia.
Several representative topic propagation models are available, such as the linear threshold (LT) model [8,9] and the independent cascade (IC) model [10,11,12,13]. For strong communities that can facilitate global diffusion by enhancing local intracommunity spreading, Nematzadeh et al. [8] investigated topic propagation with the linear threshold model. Guille et al. [9] proposed a practical solution that aimed to predict the temporal dynamics of diffusion in social networks. The approach was based on machine learning techniques and the inference of time-dependent diffusion probabilities from a multi-dimensional analysis of individual behaviors. Gray et al. [10] studied the effect of graph structure on the flow of information over a network using Watts's simple model of global cascades. Wang et al. [11] developed a user representation learning model to solve the information diffusion prediction problem on social media. The model learned the role-based representations based on a cascade modeling objective and employed the matrix factorization objective of reconstructing structural proximities to regularize the representations. A diffusion model based on a cascade model framework was proposed to generate the retweeting network in [12]. Gao et al. [13] proposed a novel information-dependent embedding-based diffusion prediction (IEDP) model. The proposed model further learned the propagation probability of information in the cascade as a function of the relative positions of information-specific user embeddings in the information-dependent subspace.
Due to the similarity between topic propagation in social networks and virus propagation on biological networks, epidemic models [14,15,16] have been widely applied to explore the information dynamics in social networks. Considering that the numerical solutions (both continuous and discrete) of traditional SIR-based (Susceptible-Infective-Removed) models cannot match the corresponding simulation results of node behavior, Rui et al. [14] extended the classic SIR model to the new SPIR (Susceptible-Potential-Infective-Removed) model by introducing the new concept of the potential spread set. The model proposed in [15] characterized a practical reinfection-reemergence scenario in the diffusion process caused by changes in the social attributes of individuals. Kong et al. [16] introduced a connection between generalized stochastic SIR models and self-exciting point processes in a finite population. However, nearly all of the previous epidemic models failed to sufficiently consider the influence of social ties. Subsequently, some studies using data-driven models [17,18,19,20] for topic propagation have emerged. A method for analyzing the influence of potential edges on topic propagation was studied using a simple topic propagation model of the networks [17]. Molaei et al. [18] proposed a novel heterogeneous deep diffusion (HDD) approach in which functional heterogeneous structures of the network were learned by a continuous latent representation through traversing meta-paths. Li et al. [19] proposed a diffusion model based on multiple messages and a multiplex network space. Stai et al. [20] defined informed Twitter users as those who have produced/reproduced tweets with a specific hashtag to more precisely capture real information propagation on Twitter.
The above topic propagation models do not generally follow the principle of the conservation of matter. Therefore, the majority of studies on topic propagation use various dynamical models [21,22,23,24,25]. Based on the propagation dynamics, Liu et al. [21] proposed a nonlinear dynamic emergency topic propagation system and mathematical model for public events. Cao et al. [22] investigated topic propagation from an evolutionary game-theoretic perspective, and they derived the evolutionary dynamics and evolutionarily stable states (ESSs) of diffusion. Farajtabar et al. [23] proposed a temporal point process model, COEVOLVE, which efficiently simulated interleaved diffusion and network events and generated traces obeying the common diffusion and network patterns observed in real-world networks. Hu et al. [24] proposed a hydrodynamic topic propagation prediction model (hydro-IDP) that exploits a hydrodynamic model to describe the spreading process of information on online social networks. Saha et al. [25] proposed the Competing Recurrent Point Process (CRPP), a probabilistic deep learning machine that unifies the nonlinear generative dynamics of a collection of diffusion processes and interprocess competition, the two ingredients of visibility dynamics.
In addition, owing to the significant recent successes of deep learning in multiple domains, attempts have been made to predict information diffusion by developing neural network-based approaches [26,27,28,29]. Chen et al. [26] proposed a deep multitask learning-based information cascade model (DMTLIC), which explicitly modeled and predicted cascades through a multitask framework with a newly designed shared-representation layer. Wang et al. [27] proposed a novel sequential neural network with structural attention to model topic propagation. The proposed model explores both the sequential nature of an information diffusion process and the structural characteristics of a user connection graph. Mishra et al. [28] proposed a recurrent neural network model by modeling a social cascade with a marked Hawkes self-exciting point process and then learned a predictive layer on top for popularity prediction using a collection of cascade histories. Sankar et al. [29] presented a novel variational auto-encoder framework (Inf-VAE), which utilized powerful graph neural network architectures to learn social variables to predict the set of all influenced users.
With the increase in the interaction frequency on hot topics among users in social networks, users and topics play increasingly important roles in topic propagation. Because the user-roles and topic-influences are time dependent in social networks, predictions of topic propagation trends should account for this changeability. However, the existing models either rely on the probabilistic modeling of information diffusion based on partially known network structures or discover the implicit structures of diffusion from users' behaviors without considering the dynamic analysis of user-role and topic-influence; therefore, they need to be further optimized and improved. In this paper, based on our previous work [30], we introduce the dynamic analysis of user-role and topic-influence to the topic propagation prediction model. The model considers both users' and topics' perspectives to predict topic propagation trends in social networks more accurately.

III. TOPIC PROPAGATION PROBLEM FORMULATION
In this paper, we define topic propagation as predicting the number of people who will spread a topic on a certain day in the social network. Specifically, given a social network and a topic, we should be able to predict how many people will pay attention to the topic in a few days. In this section, we will give the problem formulation of the topic propagation.
In brief, given a social network S with user information (relationships) and all messages from time 1 to time T, and a topic in S, we should predict how many people will pay attention to the topic at time t.
As mentioned earlier, topic propagation may be affected by the users and topics in the social network, and user interest and topic heat change over time. Therefore, we introduce the dynamic analysis of user-role and topic-influence in topic propagation. Figure 1 shows the framework of our topic propagation model. The model contains three parts.
The first part is the dynamic analysis of user-role. Taking the social network as the input, user-roles are first dynamically analyzed from four aspects. Four user-factors are introduced to analyze user-role. The second part is the dynamic analysis of topic-influence, which takes the social network as the input and calculates the topic-influences on a single user and a social group. Based on the dynamic analysis of user-role and topic-influence, the third part introduces a weighted probability model to accurately predict topic propagation trends.

IV. DYNAMIC ANALYSIS OF USER-ROLE AND TOPIC-INFLUENCE
In this section, we introduce the dynamic analysis of user-role and topic-influence in topic propagation, which is the basis of the topic propagation model.

A. DYNAMIC USER-ROLE ANALYSIS
In our previous work [30], we studied how to statically analyze user-roles. However, because social networks are time-varying systems and user-roles and topic-influences are time dependent, the analysis of user-role should consider this changeability over time. For example, according to the static user-role analysis in our previous work, one of the user-role factors, the expert-factor, has a value that remains constant over time. This is unrealistic because users' interests, attributes, and features may change over time. As a result, the expert-factor value will also change over time.
We calculate the expert-factor value using static user-role analysis with a dataset over 30 days. In addition, we divide the 30-day dataset into 10 consecutive sub-datasets, each spanning 3 days. Since we want to conduct static user-role analysis on topic propagation, we conduct the same calculation for the time windows (each 3-day sub-dataset) independently. That is, for each time window, we calculate the user expert-factors using data from every 3 days of the time window. These results are shown in Fig. 2. The static expert-factor over 30 days is constant, and the values every 3 days are different, which proves that user-roles change over time. This changeability should be considered when predicting topic propagation.
In addition to the user expert-factor, the user leader-factor values are calculated over 30 days and every 3 days. The results are shown in Fig. 3. As the user leader-factor is calculated based on his/her relationships, which are generally stable, it is less influenced by time. As a result, the change in the user leader-factor is very small. In fact, the content-based factors (expert-factor and similarity-factor) are more time dependent, while the social-based factors (leader-factor and social-factor) are less time dependent. In summary, introducing a temporal perspective is necessary when analyzing user-role in social networks.
(1) DYNAMIC USER EXPERT-FACTOR
This factor calculates the relative expertise of different users in a social network. The expert users in a social network exert a greater influence on topic propagation than other users. Based on our previous work, the dynamic expert-factor can be denoted by introducing the time dimension:
(2) DYNAMIC USER LEADER-FACTOR
This factor represents the influence of users' social relationships in social networks. Users with more social relationships may have greater influence than other users. Similarly, the static user leader-factor is modified by adding the time dimension. The PageRank algorithm is used to calculate all user leadership factors based on the regression relationship. According to the PageRank algorithm, the dynamic user leader-factor can be summarized as:
where $U_t$ is the set of users who have discussed topic z at time t.
(3) DYNAMIC USER SOCIAL-FACTOR
The dynamic user social-factor measures the strength of all social relationships between the users in a social network. In other words, the more users who have the same friends, the closer and stronger the social relationships among users. Here, using time as the variable and adding parameter t, the dynamic user social-factor is obtained as:
(4) DYNAMIC USER SIMILARITY-FACTOR
The dynamic user similarity-factor is defined as the number of messages through which users interact with each other in social networks, which describes the similarity of each user pair in terms of subject preferences. The more interactions on the same topic, the more users exert similar influences in social networks. The dynamic user similarity-factor between $u_i$ and $u_j$ is calculated as:

B. DYNAMIC TOPIC-INFLUENCE ANALYSIS
In social networks, the topic cannot propagate without social users. In addition to users, the topic itself plays an important role in topic propagation. For example, hot topics can spread quickly while other topics may diffuse slowly in social networks. Generally, social networks have many topics. These topics have different influences, and they also compete for users' attention; therefore, topic-influence affects topic propagation. Some hot topics will spread to more users, which can further enhance their influences. As a result, the impact of topic-influence on topic propagation should be considered.
Topic-influence analysis consists of two parts. One part is the topic-influence on a single user, and the other part is the topic-influence on a social group. In this subsection, we analyze the topic-influence on both a single user and a social group.
(1) TOPIC-INFLUENCE ON A SINGLE USER
In general, a user has dynamic interests; he or she may pay close attention to several topics that will consume the user's attention. Therefore, the first factor affecting the topic-influence on a single user depends on how much attention the user directs on the topic, i.e., the topic heat on the user. The following formula is adopted to calculate the topic heat on a user, which is the ratio of the user's messages on a certain topic to the user's messages on all topics at time t.
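The ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: messages are represented as hypothetical `(user, topic, day)` tuples, "at time t" is read as "on day t", and a user with no messages on that day is given a heat of zero. None of these conventions are taken from the paper.

```python
from collections import Counter


def topic_heat_on_user(messages, user, topic, t):
    """Share of `user`'s messages on day `t` that are about `topic`.

    `messages` is an iterable of (user, topic, day) tuples; this record
    layout is an assumption made for illustration.
    """
    # Count this user's messages on day t, grouped by topic.
    per_topic = Counter(tp for u, tp, day in messages if u == user and day == t)
    total = sum(per_topic.values())
    if total == 0:
        return 0.0  # assumed convention: no activity means zero heat
    return per_topic[topic] / total


messages = [
    ("alice", "smog", 1),
    ("alice", "smog", 1),
    ("alice", "elections", 1),
    ("alice", "elections", 2),
    ("bob", "smog", 1),
]
heat = topic_heat_on_user(messages, "alice", "smog", 1)  # two of three messages
```

Because the value is a share of the user's own activity, topics compete with each other for the same unit of attention, which matches the motivation given in the text.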
Furthermore, users can interact with each other in social networks. If a user's friends are discussing a topic, then the user is highly likely to be interested in the topic; therefore, the topic-influence on a single user is related to the topic heat on his or her friends. As a result, the second factor affecting the topic-influence on a single user is:
(2) TOPIC-INFLUENCE ON A SOCIAL GROUP
The topic-influence on social groups depends on the topic heat of topic z for all users in the social group. The topic heat on a social group is denoted as:
If the total number of messages on topic z from time 1 to time t is large, then topic z may be popular, and its influence will also be large. Therefore, the topic-influence of topic z on a social group is:

C. LAGRANGE INTERPOLATION POLYNOMIAL
For user $u_i$ and topic z, if we have the previous s values of the expert-factor, we can construct a Lagrange interpolation polynomial through them. This formula works because each basis term takes the value of 1 at its own time point and 0 at the other time points. Then, we can obtain the expert-factor of $u_i$ on topic z at time (t + 1) by evaluating the polynomial at (t + 1). As for other user factors and topic-influences, we can construct similar Lagrange polynomials with the previous s values and then obtain the dynamic value at time t + 1.
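The one-step prediction described above can be sketched as follows. This is a minimal illustration of standard Lagrange interpolation, not the paper's algorithm listing: the function names `lagrange_extrapolate` and `predict_next` are hypothetical, and the observations are assumed to be equally spaced in time (one value per day).

```python
def lagrange_extrapolate(times, values, target):
    """Evaluate the Lagrange polynomial through (times[j], values[j]) at `target`."""
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and of equal length")
    total = 0.0
    for j, (tj, vj) in enumerate(zip(times, values)):
        # Basis term: equals 1 at times[j] and 0 at every other node.
        basis = 1.0
        for m, tm in enumerate(times):
            if m != j:
                basis *= (target - tm) / (tj - tm)
        total += vj * basis
    return total


def predict_next(history, s):
    """Predict the next value from the last `s` equally spaced observations."""
    if s < 1:
        raise ValueError("s must be at least 1")
    recent = history[-s:]
    times = list(range(len(recent)))  # assumed spacing: one step per observation
    return lagrange_extrapolate(times, recent, len(recent))


history = [1.0, 4.0, 9.0, 16.0]        # toy factor values on four consecutive days
predicted = predict_next(history, 3)   # quadratic through the last three values
```

With s = 3 the polynomial is quadratic, so the toy series of squares is continued exactly; with s = 2 the same call reduces to a straight-line extrapolation from the last two values.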

V. DYNAMIC TOPIC PROPAGATION MODEL
In this section, we present the weighted probability model used to predict topic propagation. First, we briefly introduce the probability model from our previous work [30]. Second, we study how to combine the dynamic analysis of user-role and topic-influence with the probability model to predict topic propagation trends. Third, we present the pseudocode of the TPP-DA algorithm.

A. THE PROBABILITY MODEL
User behaviors, relationships and time spans are used to calculate the probability of user engagement on a topic. Specifically, the number of times a user participates in a topic, the number of people participating in a topic and the popularity of the topic will affect the user's behavior in different ways. In other words, a probability model is built according to behavior probability, relationship probability, and time probability.
(1) BEHAVIOR PROBABILITY
If a user often participates in the discussion of a certain topic, then he or she is interested in the topic and may discuss the topic in the future. As a result, we designate a behavior probability function $p_b(u_i, z, t)$ for user $u_i$ engaging in topic z at time t, which is:
(2) RELATIONSHIP PROBABILITY
Considering a user's social relationships, if more users or friends participate in a discussion, then he or she will be more willing to discuss the topic. We designate a relationship probability function for user $u_i$ engaging in topic z at time t. In this function, $y_z$ is the ratio of the change of the user number, which is calculated by dividing the total number of users involved in the discussion of topic z from time (t − (q′ − 1)) to (t − 1) by the total number of those involved in the discussion of topic z from time (t − q′) to (t − 2). If the total number of users decreases during the time period, $y_z$ is less than 1. In addition, q′ means that a user's relationship probability at time t is related to the relationship probabilities from time (t − q′) to time (t − 1). Finally, $l_z$ is a parameter that is trained for topic z in advance, and e is the base of the natural logarithms.
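The ratio of the change of the user number can be sketched as below. This is a hedged illustration: `daily_users` is a hypothetical mapping from day to the number of users discussing the topic on that day, `q` stands for q′, and "total number of users" is read as the sum of daily counts over the window. The source does not say whether users active on several days are counted once or repeatedly, so that reading is an assumption.

```python
def user_change_ratio(daily_users, t, q):
    """Ratio of users active on days t-(q-1)..t-1 to users active on days t-q..t-2.

    `daily_users` maps day -> number of users discussing the topic that day
    (hypothetical layout); `q` plays the role of q' in the text.
    """
    recent = sum(daily_users.get(d, 0) for d in range(t - q + 1, t))   # t-(q-1) .. t-1
    earlier = sum(daily_users.get(d, 0) for d in range(t - q, t - 1))  # t-q .. t-2
    if earlier == 0:
        return None  # ratio undefined; how to handle this case is an assumption
    return recent / earlier


growing = {1: 10, 2: 20, 3: 40, 4: 30}
shrinking = {1: 40, 2: 30, 3: 20, 4: 10}
ratio_up = user_change_ratio(growing, t=5, q=3)      # (40 + 30) / (20 + 40)
ratio_down = user_change_ratio(shrinking, t=5, q=3)  # (20 + 10) / (30 + 20)
```

Both windows have the same length and are shifted by one day, so a value below 1 indicates that participation is shrinking, in line with the remark in the text.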
(3) TIME PROBABILITY
Regarding the effectiveness of user interests, we assume that users will gradually lose interest after participating in topic discussions for a period. Generally, the total number of participants decreases after reaching the peak number of participants, and the rate of decline increases with the period of time after the peak time. We designate a time probability function $p_t(u_i, z, t)$ for user $u_i$ engaging in topic z at time t to calculate the probability using a time lapse factor:
where $t_m^z$ is the peak time when the number of participants is the highest from the initial time to time t on topic z, and the lapse exponential coefficient of topic z is evaluated by experience. If ( ) 1 z i xt , then the value of the time probability function is larger than 1; and the greater t is, the lower the value of the function will be. Thus, the longer the interval from the peak time to the predicted time is, the lower the probability of a user joining the discussion will be.

B. TOPIC PROPAGATION MODEL
The dynamic user-role and topic-influence analyses are combined with the probability model and integrated into a unified topic propagation prediction model. First, according to the analysis, users have different roles in different topics, and the user-role influences also change over time. Dynamic user-role analysis should be considered in topic propagation. Second, the topic itself plays an important role in topic propagation; therefore, topic-influence should be considered in topic propagation. The user-role (the user leader-factor and expert-factor) and topic-influence on a single user are combined with the behavior probability to obtain the following weighted probability:
Similarly, the user social relationships (i.e., the social-factor and similarity-factor) and topic-influence on social groups are combined with the relationship probability to obtain:
As time passes, the probability of a particular user engaging in a specific topic discussion decreases. Moreover, this probability is affected by the user's roles. In general, if a user is an expert user on a topic, he or she will continue to pay attention to it. Otherwise, he or she may lose interest in the topic after a period of time. Thus, we revise the time probability of a user on a topic as:
Considering user-role and topic-influence analysis and the weighted probability model together, we can devise a topic propagation prediction model.

C. ALGORITHMS
Because the TPP-DA is based on the dynamic analysis of user-role and topic-influence, we first should train the model and obtain the user-role factors and topic-influences. Based on the training and testing results, we provide the details of the TPP-DA algorithm.

VOLUME XX, 2017
We sort all messages from time 1 to time T in chronological order, and then we divide them into a training set and a testing set.
The processes of topic-influence training and testing are similar to those of user-role training and testing. Algorithms 3 and 4 give the training and testing processes of the topic-influence analysis, respectively. In Algorithm 3, we first update some information for each subset (lines 5-9), and then calculate the topic-influences (lines 10-15). In Algorithm 4, we construct the Lagrange formulas and predict the values of topic-influences at time (t + k) (lines 5-11). Next, we update $x_i^z(t + k)$ with the subset, which is prepared for the next iteration at time (t + k + 1). In Algorithm 3, the computational complexities of lines 5-9 and lines 10-15 are (

A. DATASET
To evaluate the effectiveness of the TPP-DA method, suitable social network datasets are needed. The datasets are collected from Sina Weibo and Twitter. After crawling the data, the Sina dataset and Twitter dataset are obtained. The Sina dataset contains 8586 users, 416826 microblogs (messages), and 98362 user relationships, while the Twitter dataset has 6060 users, 15485 tweets (messages), and 11617 user relationships. These data are managed in three tables: a user table, a blog table, and a user relationship table. The user table contains user information; the blog table contains all attributes of all messages; and the relationship table consists of two fields, "suid" and "tuid", indicating that "suid" pays attention to "tuid".
As there are many topics in social networks, we focus on only hot topics for the sake of simplicity. In the Sina dataset, four hot topics are chosen: "US Presidential Elections", "Novel Coronavirus Pneumonia", "Smog", and "Elon Musk", which are labeled as the 1-1st topic, 1-2nd topic, 1-3rd topic and 1-4th topic, respectively. In the Twitter dataset, two topics, "tokyo_2020" and "capitol_2020", are selected and labeled as the 2-1st topic and 2-2nd topic, respectively. During data preprocessing, we remove messages that do not relate to the above topics. Then, the datasets are divided into six sub-datasets. For example, 19458 microblogs and 5472 relationships among 1228 users are related to the 1-1st topic "US Presidential Elections" in the first sub-dataset, 17184 microblogs and 5164 relationships among 1305 users are related to the 1-2nd topic "Novel Coronavirus Pneumonia" in the second sub-dataset, and so on. Last, we obtain the Sina dataset with 50127 microblogs and the Twitter dataset with 12367 tweets. The Sina dataset is related to 3672 users and four topics, and there are 14846 relationships among these 3672 users, while the Twitter dataset contains 982 users, 7636 relationships and 12367 tweets related to two topics. The detailed information of the datasets is shown in Table II and Table III. Note that each user may have discussed more than one topic, so the sum of the users on all topics may be larger than the number of users in the dataset; the same holds for the numbers of relationships and messages. All messages in each of the two datasets are sorted in chronological order. Then, the datasets are divided into a training set and testing set. The training set and testing set are independent, and both of them cover all users and topics in each dataset. To show the dynamic analysis of user-role and topic-influence, the training set and testing set are divided into daily subsets.
The training set covers messages from time 1 to t, and the testing set contains the latter messages ranging from (t+1) to T. Here, the time unit in [1,T] is one day.
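The chronological split described above can be sketched as follows. This is a minimal illustration rather than the preprocessing code used in the experiments: the tuple layout `(day, user_id, topic_id)` and the function name `split_by_day` are assumptions introduced here, and the four messages are toy data.

```python
from collections import defaultdict


def split_by_day(messages, t):
    """Split messages into daily training and testing subsets.

    messages: iterable of (day, user_id, topic_id) tuples with day in 1..T
              (hypothetical layout, not the schema of the blog table).
    t:        last day of the training period; days t+1..T go to testing.
    """
    train, test = defaultdict(list), defaultdict(list)
    # Sorting the tuples orders them by day first, i.e., chronologically.
    for day, user, topic in sorted(messages):
        target = train if day <= t else test
        target[day].append((user, topic))
    return dict(train), dict(test)


# Toy messages, invented for illustration only.
msgs = [(3, "u2", "1-1"), (1, "u1", "1-1"), (2, "u1", "1-2"), (4, "u3", "1-1")]
train, test = split_by_day(msgs, t=2)
```

Keeping one list per day makes it straightforward to recompute user-role and topic-influence values day by day, which is what the dynamic analysis requires.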

B. PARAMETER SETTINGS
(1) PARAMETER OF THE LAGRANGE FORMULA TRAINING
The first important parameter, s, is used to construct the Lagrange interpolation formula in the dynamic analysis of user-role and topic-influence. As mentioned in Eq. (11), we can predict the next value when we know the previous s values, so the value of s determines the form of the Lagrange interpolation formula. Generally, the larger s is, the more complex the constructed Lagrange formula will be. As a result, s cannot be set too large. In our experiments, we set s to 2, 3, 5, 7, and 9. Then, we construct the Lagrange formula in Eq. (11) to dynamically predict user expert-factors from time (t + 1) to T with Eq. (12).
After constructing the Lagrange formula with the previous s values, we calculate the predicted expert-factors with Eq. (12). Next, we compare them to the results calculated with Eq. (1) using the actual data. The average error rates between them for varying s are shown in Table IV. Table IV shows that, depending on the topic, the average error rates are smallest when s is set to 5 or 7. When s = 2 or s = 3, the Lagrange formulas are linear and quadratic polynomial functions, respectively. In these settings, the average error rates are very large because such low-degree polynomials cannot follow sudden changes in user expert-factor values. When s is set to 9, the constructed Lagrange formulas are too complex and the average error rates are also large, which may be due to overfitting. The results of setting s = 5 and s = 7 are the same. We therefore set s = 5 to construct the Lagrange formulas when analyzing the user expert-factor dynamically.
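To make the role of s concrete, the following is a minimal sketch of one-step prediction by Lagrange interpolation. It is not a transcription of Eq. (11) or Eq. (12), which are not reproduced in this section; it assumes the last s values are equally spaced (one per day), places them at nodes 0, 1, ..., s-1, and evaluates the standard interpolating polynomial at node s. The function name `lagrange_predict` is hypothetical.

```python
def lagrange_predict(history, s):
    """Extrapolate the next value from the last s values of `history`.

    Assumption: the values are sampled once per day, so the last s
    observations sit at nodes x = 0, 1, ..., s-1 and the prediction
    is the interpolating polynomial evaluated at x = s.
    """
    if s < 1 or len(history) < s:
        raise ValueError("need at least s previous values")
    ys = history[-s:]
    x_next = s
    total = 0.0
    for i, y_i in enumerate(ys):
        # Lagrange basis polynomial l_i evaluated at x_next.
        basis = 1.0
        for j in range(s):
            if j != i:
                basis *= (x_next - j) / (i - j)
        total += y_i * basis
    return total


# s = 2 fits a line through the last two values.
linear = lagrange_predict([1.0, 2.0], s=2)
# s = 3 fits a quadratic through the last three values.
quadratic = lagrange_predict([1.0, 4.0, 9.0], s=3)
```

With s points the polynomial has degree s - 1, which matches the observation that s = 2 and s = 3 give linear and quadratic functions. High-degree polynomials tend to oscillate when extrapolated, which is one plausible reading of the large errors at s = 9.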
As mentioned previously, the user expert-factor and similarity-factor are content-based factors, and the leader-factor and social-factor are relationship-based factors; therefore, we take the user leader-factor as an example of a relationship-based factor and conduct a dynamic user-role analysis experiment with s set to 2, 3, 5, 7, and 9. The results are listed in Table V. As analyzed in Section IV, the user leader-factor is calculated based on the relationships, which are generally stable in social networks; therefore, the change in the user leader-factor is very small when s varies. For simplicity, we set s = 5 when constructing the Lagrange formulas to predict the leader-factor value.
Last, we test the average error rates of topic-influence analysis, and the results are similar. In summary, we set s = 5 in the dynamic analysis of user-role and topic-influence.
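The selection of s by average error rate can be sketched as below. The exact definition of the average error rate is not restated in this section, so the sketch assumes it is the mean relative error |predicted - actual| / |actual| over the test days; the function names and all numbers are hypothetical and are not taken from Table IV or Table V.

```python
def average_error_rate(predicted, actual):
    """Mean relative error over all time steps (assumed definition).

    Actual values must be non-zero, otherwise the ratio is undefined.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual)) / len(actual)


def select_s(predictions_by_s, actual):
    """Return the s with the smallest average error rate.

    Ties are broken in favor of the smaller s, i.e., the simpler formula.
    """
    rates = {s: average_error_rate(p, actual) for s, p in predictions_by_s.items()}
    best = min(sorted(rates), key=lambda s: rates[s])
    return best, rates


# Toy numbers, invented purely for illustration.
actual = [0.50, 0.40, 0.80]
predictions_by_s = {
    2: [0.60, 0.50, 0.60],
    5: [0.50, 0.42, 0.76],
    9: [0.40, 0.30, 1.00],
}
best, rates = select_s(predictions_by_s, actual)
```

Breaking ties toward the smaller s mirrors the choice made above, where s = 5 is preferred over s = 7 when both give the same error.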

C. EXPERIMENTAL RESULTS
(1) COMPARISON RESULTS WITH OUR PREVIOUS WORK
In this subsection, the TPP-DA method is evaluated by comparing the prediction results of TPP-DA against those of our previous work (TPP). Furthermore, we compare the results of TPP-DA with actual data. We take time (in days) as the abscissa and the number of users who may discuss the topic at time n as the ordinate to draw curves. The prediction results of TPP-DA and TPP against the actual data on all topics are shown in Fig. 4. Moreover, we calculate the average error rates between the prediction results and the actual data, which are shown in Table VII. Fig. 4 and Table VII show that both TPP-DA and TPP can predict topic propagation trends accurately. Moreover, TPP-DA outperforms TPP on all topics in both datasets. In particular, Table VII shows that all average error rates of TPP are over 10%, and the error rate of TPP on the 1-2nd topic is nearly 16%. In contrast, all average error rates of TPP-DA are less than 10%. Compared with TPP, the average error rate of TPP-DA is reduced by approximately 33%. There are two reasons for this improvement. The first is that TPP-DA utilizes dynamic user-role analysis, which is more adaptable to user-roles changing over time in social networks. Conversely, TPP analyzes user-role only statically, which ignores that user-roles are time dependent. The second is that TPP-DA introduces topic-influence analysis. TPP-DA considers the competition among topics to analyze topic-influences on both a single user and social groups, so it can model topic propagation trends more accurately. To summarize, TPP-DA is based on the dynamic analysis of user-role and topic-influence. Thus, it has better results than TPP, which demonstrates the advantage of dynamic analysis. In addition, both TPP-DA and TPP incorporate behavior probability, relationship probability and time probability to build a weighted probability model that predicts topic propagation. 
They can handle multiple peaks in the topic propagation process. For example, there are approximately four peaks at days 11, 15, 21, and 24 in the spreading of the 1-1st topic. At these peaks, both TPP-DA and TPP predict similar topic propagation trends. Because TPP-DA obtains the weights of the probability model based on the dynamic analysis of user-role and topic-influence, it can adjust these weights over time to adapt to changes dynamically. Therefore, TPP-DA has more accurate results than TPP. The experimental results in Fig. 4 and Table VII show the effectiveness of TPP-DA on topic propagation prediction in social networks.
(2) COMPARISON RESULTS WITH OTHER METHOD
In this subsection, TPP-DA is compared with another information diffusion method called temporal dynamics of information diffusion (TDID) [20], which was also compared with TPP in our previous work. According to [20], the total number of infected users at time t can be obtained by:
The actual total number of infected users reaches its maximum (862 users) on the 10th day and does not change afterwards; therefore, the results for only the first 10 days are shown. TPP-DA performs the best among the three methods, and TPP has the second-best results. This is because both TPP-DA and TPP consider user-role analysis in topic propagation, while TDID develops only an epidemic model of the temporal dynamics of information propagation for specific topics, which indicates that user-role analysis is helpful for topic propagation. Moreover, TPP-DA performs slightly better than TPP, which shows that the dynamic analysis of user-role and topic-influence used in TPP-DA is better than the static user-role analysis used in TPP. In addition, we compare TPP-DA and TPP with TDID using the MAE (mean absolute error), RMSE (root mean square error) and R² (R-squared, i.e., the coefficient of determination), and the results are listed in Table VIII. 
TPP-DA has the smallest average errors (MAE and RMSE) and the largest R². For example, the R-squared of TPP-DA is 0.948 on the 1-1st topic, which is higher than those of TPP and TDID, and the MAE and RMSE of TPP-DA are smaller than those of TPP and TDID. These results mean that TPP-DA is closest to the actual results on topic propagation, which also indicates that introducing dynamic analysis of user-role and topic-influence is more suitable for topic propagation prediction in social networks.
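For reference, the three evaluation metrics can be computed as in the following minimal sketch, which uses their standard textbook definitions. The function names and the four-point series are illustrative only and are unrelated to the values in Table VIII.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy series, invented for illustration: one day is over-predicted by 1 user.
actual_users = [1.0, 2.0, 3.0, 4.0]
predicted_users = [1.0, 2.0, 3.0, 5.0]
scores = (
    mae(actual_users, predicted_users),
    rmse(actual_users, predicted_users),
    r_squared(actual_users, predicted_users),
)
```

Smaller MAE and RMSE and an R² closer to 1 all indicate that the predicted curve lies closer to the actual one, which is how the comparison among TPP-DA, TPP and TDID is read.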

VII. CONCLUSION
In this paper, we analyzed the shortcomings of both the existing work and our previous work. Then, we applied dynamic analysis of user-role and topic-influence to propose the TPP-DA method, which predicts topic propagation from both users' and topics' perspectives. First, TPP-DA introduces a temporal perspective to static user-role analysis to analyze user-role dynamically and accurately, which is helpful for topic propagation. Second, the topic competition and topic-influences in social networks are considered when studying topic propagation. TPP-DA also utilizes a metric called the topic heat to calculate the topic-influences on a single user and social group. Third, TPP-DA combines the dynamic analysis of user-role and topic-influence with a weighted probability model to predict topic propagation. Last, two datasets are crawled from Sina and Twitter, and experiments are conducted to evaluate TPP-DA on these two datasets. The experimental results show that TPP-DA predicts topic propagation more accurately than the compared methods. Moreover, the average error rate of TPP-DA is approximately 33% lower than that of TPP, which also demonstrates the effectiveness of the dynamic analysis of user-role and topic-influence in topic propagation.
Although TPP-DA obtains better results than the other methods, it is a probability-based prediction model. In the future, we will consider how to utilize deep learning methods to predict topic propagation. Moreover, we will crawl more data to further verify our method.