A PageRank-Based WeChat User Impact Assessment Algorithm

Abstract: In recent years, the mobile Internet has developed rapidly and online social platforms have emerged in response, with more and more people using them to make friends, chat, and share updates. An online social platform is the virtual embodiment of a social network, in which each user represents a node in a directed graph. As the most popular online social platform in China, WeChat has grown rapidly in recent years; its large user base, powerful mobile payment capabilities, and massive amounts of data have given it great influence. At present, research on the WeChat network at home and abroad focuses mainly on communication studies and sociology, while research from the perspective of influence is scarce. Therefore, based on the basic principle of PageRank, this paper proposes WURank, an influence evaluation algorithm suited to WeChat network users. The algorithm addresses shortcomings of the traditional PageRank algorithm and objectively evaluates the real-time influence of WeChat users from the perspective of user behavior (sharing, commenting, mentioning, bookmarking, and liking) and time factors.


Introduction
With the advent of the Internet age, mobile devices have become an integral part of people's lives, and with the rise of social media platforms, social media shows a trend of overtaking traditional media. In recent years, WeChat has risen to unexpected prominence. According to the 2017 WeChat data report released by Tencent's WeChat team [1], as of November 2017 WeChat's market penetration in mainland China had reached 93%, with more than 938 million active users worldwide, a daily average of 38 billion messages, social payments up 23 percent, and offline payments up 280 percent. A huge amount of data is published on the WeChat platform every day, and more and more users rely on it for information, social contact, leisure, and entertainment.
As one of the most popular social platforms in China, WeChat has a large user base, a powerful mobile payment function, and a large amount of information, so studying real-time influence on it is of great significance. The main purpose of studying real-time influence in the WeChat network is to explore the network's communication value and commercial value. First, the WeChat network is the virtual embodiment of a social network. Its huge user base and real-time information dissemination, which is not limited by space, provide ample samples for studying the mechanisms of human information dissemination. Second, the powerful mobile payment function of the WeChat network has set off a "micro-business fever". Large numbers of merchants use WeChat official accounts to push the latest news, store concepts, and related activities in order to market, publicize, and stimulate consumption, breaking the limitations of traditional consumption channels. Research on real-time influence in the WeChat network therefore also has commercial value.

Related Works
The starting point of social networking was email, which solved the problem of delivering mail over long distances. With the development of the Internet, virtual social networks have gradually diversified, from instant messaging (IM), blogs, Twitter, and Facebook abroad to Renren, QQ, Weibo, and WeChat in China, and their functionality has grown steadily. As a typical home-grown social network platform in China, WeChat is attracting more and more attention. In recent years, the evaluation of user influence on virtual social networking platforms has been a hot topic among scholars. The research mainly adopts the PageRank algorithm and its improved variants, as well as the topic extraction algorithm HITS. Research abroad on user influence evaluation models for virtual social network platforms focuses mainly on Twitter and Facebook. Romero et al. [2] used the HITS algorithm to evaluate the influence of Twitter users in terms of both the followers themselves and the number of followers. The algorithm does not take into account how readily users forward messages, so a user's followers may be very active and yet never forward that user's messages. Cha et al. [3][4] studied the influence of Twitter users from the perspective of user behavior. Their experiments and calculations show that a user may have many followers, yet the user's messages are not necessarily followed or reposted; that is, the number of followers contributes little to a user's influence. Many domestic scholars have also studied the problem in depth. Chen et al. [5] proposed MURank, a real-time influence algorithm based on forwarding behavior. MURank mitigates the topic drift problem of the traditional PageRank algorithm, analyzes the distribution of the intervals between reposts by Weibo users, and redefines the damping coefficient (d).
It evaluates the real-time influence of Weibo users comprehensively in terms of the number of reposts, the repost time intervals, and the number of followers, which makes the evaluation time-sensitive. However, the algorithm still has shortcomings; for example, it does not consider the effect of personal interests, geographic location, and other factors on user attention. Li [6] proposed an improved HITS algorithm to evaluate the influence of nodes in the GitHub social network. The algorithm addresses the problem that different users receive the same authority and hub attributes in the traditional HITS algorithm, and it takes both associated and unrelated users into account, but it does not take into account the users' own attributes. Xu et al. [7][8] proposed the DynamicPriortyPush algorithm, which improves on the even weight distribution of the PageRank algorithm. In a continuously updated dynamic network, it can efficiently calculate the influence of Weibo users and keep tracking it.
The PageRank algorithm is one of Google's classic algorithms. Since Larry Page and Sergey Brin proposed it in 1998, it has remained a hot research topic, with applications extending to sociology, economics, and other fields. The main research directions include web data mining and search engine optimization, and influence evaluation.

Web Data Mining and Search Engine Optimization
Against the background of explosive growth in network information, most of the information stored in search engines is unstructured or semi-structured, which makes it difficult for users to retrieve information quickly and accurately. Studying the PageRank ranking method and applying it to search engines can guide web structure mining and link structure optimization, so that results better match users' retrieval habits and the required information is found more accurately and quickly. Xian [9] proposed an improved PageRank algorithm that introduces user interests on top of the time factor and applied it to search engine ranking. The algorithm can improve the completeness of system queries and the accuracy of page ranking, but its mining of user personalization is not deep enough. Gao [10] introduced the text similarity of latent semantic analysis and a time factor into the PageRank algorithm, proposed the LSA-PageRank algorithm, and applied it to web structure mining in search engines. The algorithm can improve topic relevance and, to a certain extent, timeliness, but it does not take into account the importance of web content. Ping [11] proposed the TLSPR algorithm, which is based on topic link similarity, does not increase time complexity, and can rank web pages according to user satisfaction and improve search efficiency. Although the algorithm controls topic drift, it does not adequately address the neglect of user interests.

Evaluation of Influence
Influence evaluation is mainly applied to social network nodes and to publications (books, papers, journals, etc.). When the PageRank algorithm is applied to influence evaluation, each user or each publication corresponds to a page in the PageRank algorithm, and the associations between users or the citations between publications correspond to the links between pages, which together constitute a directed graph. Tunkelang [12] proposed the Tunk Rank algorithm, which is based on the basic model of

Measurement of WeChat's Influence
The evaluation of WeChat user influence mainly considers three factors. First, the user's own influence: for example, if a celebrity and an ordinary person share the same Moments post at the same time, the public pays more attention to the celebrity, whose post therefore has more influence. Second, the number of friends the user has: the more friends a WeChat user has, the greater the probability that a message will be shared, and the greater the influence. Third, the interactions between users: WeChat user behaviors mainly include sharing, mentioning ('@'), commenting, bookmarking, and liking. The more often a user's message is shared, mentioned, commented on, bookmarked, and liked, the greater the user's influence. The evaluation of WeChat user influence should therefore take user behavior into account.

PageRank Algorithm
The PageRank algorithm, also known as the page ranking algorithm or the Page algorithm, is commonly counted among the ten classic algorithms of data mining. It was proposed in 1998 by Google founders Larry Page and Sergey Brin as an algorithm for ranking the importance of pages. The core idea is to tie the importance of a page to the links between web pages. A link from one page to another indicates support for the target page, and each such link strengthens that support. If a page is pointed to by many pages, it receives a lot of support; that is, its PageRank value (PR value for short), which measures the importance of a page, is higher. The importance of a page depends not only on the number of links pointing to it but also on the importance of the linking pages: the higher the PR value of a page, the more it raises the PR value of the pages it links to. The PageRank algorithm also assumes that the PR value of a page is distributed evenly among the pages it links to. For example, suppose Page1 contains two links, to Page2 and Page3, Page2 contains two links to other pages, and Page3 contains three links to other pages. The weight assignment process is shown in Fig. 1. The links between web pages essentially form a directed graph G(N, M), where N is the set of all web page nodes and M is the set of directed edges formed by the links between web pages. Let A and B be two nodes in N. If there is a link from A to B, it constitutes a directed edge <A, B> that belongs to M. If A has other hyperlinks, A distributes its PR value evenly among all the pages it links to. Similarly, B distributes its PR value evenly among the page nodes it links to. The PR values are passed on in this way until they reach a stable, convergent state.
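The even distribution in the Page1 example can be traced with a minimal Python sketch. The starting PR value of 1.0 for Page1, the helper name `distribute_evenly`, and the page names Page4 to Page8 (standing in for the unnamed "other pages") are assumptions made only for illustration.

```python
def distribute_evenly(pr_value, out_links):
    """Split a page's PR value equally among the pages it links to."""
    share = pr_value / len(out_links)
    return {target: share for target in out_links}

# Page1 (assumed PR value 1.0) links to Page2 and Page3.
from_page1 = distribute_evenly(1.0, ["Page2", "Page3"])

# Page2 passes what it received on to two pages, Page3 to three pages.
# Page4..Page8 are placeholder names for the unnamed "other pages".
from_page2 = distribute_evenly(from_page1["Page2"], ["Page4", "Page5"])
from_page3 = distribute_evenly(from_page1["Page3"], ["Page6", "Page7", "Page8"])

print(from_page1)  # each of Page2 and Page3 receives half of Page1's value
print(from_page2)  # each target receives a quarter
print(from_page3)  # each target receives a sixth
```

Under these assumptions every target of the same page receives an identical share, regardless of how closely it is related to the source page.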
From this, the initial model of the PageRank algorithm, formula (1), can be derived. In it, A and B represent two different web pages, PR(A) and PR(B) represent the PR values of A and B respectively, k is the control coefficient (usually 0.85), C(A) represents the set of all pages that link to page A, and N(B) represents the number of out-links of web page B.
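Assuming formula (1) takes the simplified form given by Page et al., in which a page's value is the scaled sum of the shares it receives from the pages linking to it, it can be written with the symbols above as follows. This reading takes C(A) to be the set of pages that link to A.

```latex
\[
  PR(A) = k \sum_{B \in C(A)} \frac{PR(B)}{N(B)} \tag{1}
\]
```

Each page B that links to A contributes the share PR(B)/N(B), which is the even distribution described earlier.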
The initial model has a serious problem. In a real link structure, the links between web pages are not necessarily ordered from the inside outward as in Fig. 1; they are more likely to be random and disordered. A special situation can therefore arise: some pages have no links to the outside but form a loop among themselves during linking. The PR value is then calculated iteratively around the loop and is eventually consumed down to 0. Page et al. call this phenomenon a rank sink, as shown in Fig. 2. To solve this problem, Page et al. introduced a random walk factor n, the damping coefficient, and modified formula (1) to obtain the revised PageRank algorithm, formula (2). Formula (2) shows that the PR value of a web page is affected by two things. On the one hand, it depends on the number of in-links: for a given weight per link, the more in-links a page has, the greater its PR value. On the other hand, it depends on the PR values of the linking pages: for a given set of links, the greater the PR values of the linking pages, the greater the PR value of the page. The PR value of a web page is thus calculated from the PR values of the pages that link to it. If each web page is given an arbitrary non-zero initial PR value, then after repeated iterative calculation the values eventually reach a stable, convergent state, and the PageRank value a page finally obtains is its importance value.
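The convergence described above can be illustrated with a minimal Python sketch. It assumes that formula (2) takes the widely cited form from Brin and Page's original description, PR(A) = (1 - n) + n * sum of PR(B)/N(B) over the pages B linking to A. The function name `pagerank`, the tolerance, and the four-page graph are illustrative assumptions, not taken from the paper.

```python
def pagerank(out_links, n=0.85, tol=1e-9, max_iter=1000):
    """Iterate PR(A) = (1 - n) + n * sum(PR(B) / N(B)) until the values settle."""
    pages = list(out_links)
    pr = {page: 1.0 for page in pages}  # arbitrary non-zero initial values
    for _ in range(max_iter):
        new_pr = {page: 1.0 - n for page in pages}
        for source, targets in out_links.items():
            if not targets:
                continue  # a page without out-links passes nothing on in this sketch
            share = pr[source] / len(targets)  # even split over the out-links
            for target in targets:
                new_pr[target] += n * share
        delta = max(abs(new_pr[page] - pr[page]) for page in pages)
        pr = new_pr
        if delta < tol:
            break
    return pr

# Hypothetical graph: C and D link only to each other, a loop with no way out.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["D"], "D": ["C"]}
ranks = pagerank(graph)
for page, value in sorted(ranks.items()):
    print(page, round(value, 4))
```

In this sketch the loop formed by C and D still collects most of the rank, but the (1 - n) term keeps every page, including A with no in-links, at a non-zero value.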

Problem Description
With the rapid development of the mobile Internet, WeChat quickly penetrated the Chinese market with its diverse ways of transmitting information and its simple operation. As the most popular social platform today, WeChat has a large user base, a powerful mobile payment function, and massive amounts of information, which has attracted many scholars to study the influence of its users. In the WeChat network, if a friend of a WeChat user shares that user's Moments message, the friend supports or endorses the user's views. If many friends share the user's messages, the user receives a lot of support, and the user's influence on other users is greater. Studying the influence of WeChat users is thus similar to ranking the importance of web pages: the greater a user's influence, the higher the user's importance. The PageRank algorithm, a classic page importance ranking algorithm, neatly reduces the problem of ranking the importance of web pages to the links between them. In the PageRank algorithm, a web page distributes its PR value evenly among all its outgoing links; that is, every page it links to receives the same share of its PR value. This idea easily leads to a lack of timeliness: a web page that has existed for a long time may have a significantly higher PR value than a newly linked one, so a page with a high PR value may be old and outdated. As real-time communication software, WeChat is highly time-sensitive, and friends can respond to a message not only by sharing it but also by mentioning ('@'), liking, commenting, and bookmarking. These behaviors are positively related to the influence of WeChat users. The traditional PageRank algorithm therefore cannot provide an objective evaluation of the influence of WeChat users.
The WeChat User Rank algorithm (WURank for short) partially modifies the PageRank algorithm. The WURank value (WR value for short) indicates the importance of a user: the higher the WR value, the more important the user. The support that a user's friends give the user is called attention; that is, the more a friend supports a user, the more attention the friend pays to that user. In the WURank algorithm, the WR value that a user passes along its out-links is not distributed evenly but in proportion to attention: the more attention a friend pays to a user, the larger the share of the WR value distributed to that user. The attention paid to the same user may differ over time, and it is affected by five user behaviors: sharing, mentioning ('@'), liking, commenting, and bookmarking. From this, the mathematical expression of WURank can be obtained. The parameters of the WURank formula are as follows:
(1) A and B represent two different WeChat users.
(2) WURank(A,t) represents the influence level of user A at time t, that is, the WR value. Its mathematical expression is formula (3).
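The idea of distributing a WR value in proportion to attention can be sketched in Python. This is only an illustration under stated assumptions and is not the paper's formula (3): the equal behavior weights, the helper name `attention_shares`, and the interaction counts are all hypothetical, since the text names the five behaviors but not how they are weighted.

```python
# Hypothetical weights: the five behaviors are named in the text, their weights are not.
BEHAVIOR_WEIGHTS = {
    "share": 1.0,
    "mention": 1.0,
    "like": 1.0,
    "comment": 1.0,
    "bookmark": 1.0,
}

def attention_shares(interactions):
    """Turn one friend's behavior counts toward each user into distribution ratios.

    interactions maps a user to the counts of the behaviors that the friend
    performed on that user's messages up to time t.
    """
    scores = {
        user: sum(BEHAVIOR_WEIGHTS[behavior] * count for behavior, count in counts.items())
        for user, counts in interactions.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {user: 0.0 for user in scores}  # no interaction, nothing to distribute
    return {user: score / total for user, score in scores.items()}

# Made-up counts: friend B's interactions with the messages of users A and C.
shares = attention_shares({
    "A": {"share": 2, "like": 3, "comment": 1},
    "C": {"share": 1, "like": 1},
})
print(shares)
```

With these made-up counts, A would receive three quarters of B's WR value and C one quarter, in contrast to the even split of PageRank.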

WURank Algorithm
It can be seen from formula (3) that obtaining a user's WR value is a recursive process that iterates layer by layer until a stable state is reached. At time t, the WURank values of the WeChat user network are calculated as follows:
(1) Determine the damping coefficient n(t); a value of 0.85 generally works best.
(2) Assign an initial WR value to each WeChat user, generally 1.
... WeChat user has to other WeChat users at time t.
(5) Substitute the values obtained in Steps (3) and (4) into formula (3), and recalculate the WR value of each WeChat user based on the initial value.
(6) Use the recalculated WR value as the initial value for each WeChat user's next calculation.
(7) Repeat Steps (5) and (6) until the difference between the WR values from the current and the previous calculation is extremely small, that is, a stable state is reached.
(8) Stop the calculation and sort the WR values of the WeChat users obtained by the above process in descending order.
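The iteration in the steps above can be sketched in Python. The sketch assumes that the update has the same shape as damped PageRank, with the even split replaced by attention ratios; this is an assumption and not a reproduction of formula (3). The function name `wurank_sketch`, the tolerance, and the attention ratios are hypothetical, and the damping coefficient is held fixed at 0.85 for the single time t considered.

```python
def wurank_sketch(attention, n=0.85, tol=1e-6, max_iter=1000):
    """Iterate WR values, passing each user's WR value on in proportion to attention.

    attention[b][a] is the fraction of user b's WR value that goes to user a
    at the chosen time t; each user's fractions are assumed to sum to 1.
    """
    users = list(attention)
    wr = {user: 1.0 for user in users}  # step (2): initial WR value of 1
    for _ in range(max_iter):
        new_wr = {user: 1.0 - n for user in users}
        for giver, ratios in attention.items():
            for receiver, ratio in ratios.items():
                new_wr[receiver] += n * ratio * wr[giver]  # step (5): recalculate
        change = max(abs(new_wr[user] - wr[user]) for user in users)
        wr = new_wr  # step (6): reuse as the next initial value
        if change < tol:  # step (7): stable state reached
            break
    # step (8): sort the WR values in descending order
    return sorted(wr.items(), key=lambda item: item[1], reverse=True)

# Hypothetical attention ratios among three users.
attention = {
    "A": {"B": 0.75, "C": 0.25},
    "B": {"A": 1.0},
    "C": {"A": 0.5, "B": 0.5},
}
ranking = wurank_sketch(attention)
for user, value in ranking:
    print(user, round(value, 3))
```

Because each user's ratios sum to 1, the total WR value in this sketch stays equal to the number of users while the individual values shift toward the users who attract the most attention.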
The process in Steps (1) to (8) is a single loop, so its time complexity is O(n). Calculating the WR values for all users of the WeChat network repeats this process inside another loop, that is, a double loop, and the time complexity can be calculated as O(n^2). The flow chart is as follows:

Examples of the WURank Algorithm
As shown in Fig. 4, user A has two in-links with WR values of 1 and 2, and user A has shared the WeChat messages of both B and C. B also has an in-link with a weight of 5. B shares the messages of other users, and C shares the messages of other users twice. The calculation results show that WURank(B,t) > WURank(A,t) > WURank(C,t); that is, user B is more influential than user A, who in turn is more influential than user C.

Discussion of WURank Algorithm
Building on the PageRank model, the WURank algorithm takes the time factor into account, which makes the evaluation of a WeChat user's influence time-sensitive, and it remedies the even weight distribution of the PageRank algorithm. It comprehensively considers how five user behaviors (sharing, mentioning ('@'), commenting, bookmarking, and liking) express different levels of support for a WeChat user and thereby affect that user's influence. The WURank algorithm can thus take both the user's own influence and user behavior into account to produce a relatively objective, time-sensitive assessment of influence. However, the algorithm still has shortcomings, mainly in the following aspects. First, user behaviors are not the only factors that affect a WeChat user's influence; users' interests and geographic location also affect how their attention is distributed. Second, the timeliness of the damping coefficient is not considered: the attention users pay to a given user's sharing behavior is not static but follows a certain pattern over time. Third, as the WeChat network grows more complex, the recursion becomes increasingly expensive and requires parallel computing and corresponding server support.

Summary and Outlook
The WeChat network is subtly changing our lives, and more and more users cannot do without the WeChat platform. This paper studies a user influence evaluation model suited to WeChat. Based on the basic idea of the PageRank algorithm, it proposes the WeChat User Rank algorithm (WURank). The algorithm takes the time factor into account, which makes the evaluation of a WeChat user's influence time-sensitive, and it remedies the even weight distribution of the PageRank algorithm. It considers how five user behaviors (sharing, mentioning ('@'), commenting, bookmarking, and liking) express different levels of support for a WeChat user and thereby affect that user's influence. This research can improve the timeliness of evaluating users' real-time influence in social networks and help explore the commercial value of rapidly disseminating product information in the WeChat network. However, the WURank algorithm still has shortcomings, such as insufficient consideration of the timeliness of the damping coefficient, and a computational cost that grows with network complexity and calls for parallel computing and server support. The following work therefore remains. First, the timeliness of the damping coefficient and the distribution of users' attention to a given user over different time intervals should be studied and incorporated into the WURank algorithm. Second, other behaviors of WeChat users, such as the number of chats, also have a great impact on user influence and merit further research.