Research on the Application of User Recommendation Based on the Fusion Method of Spatially Complex Location Similarity

Because the user recommendation matrix is strongly sparse, it is difficult to recommend relevant services to users correctly with methods based on location and collaborative filtering, and the similarity measured between users is low. This paper proposes a fusion method based on KL divergence and cosine similarity. A comparison of three similarity metrics at different K values shows that KL divergence and cosine similarity have advantages. The fusion of the two is then combined with the user's preference to compute user similarity. Compared with the location-based collaborative filtering (LCF) algorithm, the user-based collaborative filtering (UCF) algorithm, and the friend-based user recommendation algorithm (F2F), the proposed method has advantages in precision, recall, and overall experimental effect. The proposed method also retains its advantage across different parameter settings.


Introduction
With the rapid development of spatial information technology, spatial information from smart sign-in, mobile services, and GPS has become a research hotspot in recent years. Accurate position prediction has important application value in urban planning [1,2], traffic forecasting [3,4], advertising push [5,6], and disease prevention [7]. The existing models for discovering the laws of user mobility, such as the Markov model and PMM, have their own advantages. However, they still have the following defects: (1) the impact of time on user location changes cannot be truly and quantitatively reflected; (2) successive and interrelated effects cannot be truly and quantitatively reflected.
With the maturity of smart sensing devices, mobile phone positioning systems, and similar technologies, people usually carry more than one sign-in tag. Each type of sign-in tag produces a corresponding type of check-in data, and the various types of check-in data are generated in large quantities, providing researchers with a large amount of data for analysis.
In recent years, many scholars have used check-in data in their research, for example, using network sign-in data to cluster interest groups and recommend content to user groups [8], and using urban residents' bus card information to study travel characteristics and popular business districts [9]. However, the existing research is limited to a single type of sign-in data. Although such data are large in volume, they are usually sparse in time and space [10], which reduces the reliability of the user similarity calculation and the quality of the neighbor search, so the recommendation effect needs to be improved.
For the traditional recommendation problem, researchers have improved the similarity calculation method in order to improve the search quality of neighboring users. For example, literature [11] calculates the similarity between users with a Jaccard similarity coefficient improved by a modified formula. By considering the relationship between the items rated in common and all items rated by the users, as well as the effect of differences in the users' scores on common items, it improves the search quality of neighboring users. Wang et al. [12] proposed an entropy-based user similarity measurement method that considers the relative error of user scores and improves the search quality of neighboring users, but this method does not consider the influence of time and geographical location on the recommendation results. In location recommendation, existing research methods use friend relationships and check-in time information to improve the quality of the neighbor search [13]. Literature [14] proposes a friend-based collaborative filtering algorithm, which only searches for neighboring users among a user's friends; the accuracy of this method is limited, and the literature [15,16] also shows that the user's friend relationships bring limited improvement in recommendation accuracy. Other work constructs a user-location-time three-dimensional matrix, considers the time periodicity of the user's sign-ins, and obtains a time-aware user similarity, which improves the accuracy of the recommendation. The existing location recommendation algorithms give little consideration to the geographical location factor when calculating user similarity. In view of this problem, this paper proposes a fusion algorithm that integrates geographic location preference with multiple similarity measures; by improving the user similarity calculation, it improves the quality of the neighbor search and thereby the recommendation accuracy.

Geographical Relationship Recommendation Complex Model
Based on geographical location relationships, this section considers how users in different locations can establish connections with other users. Relationships can be established between users who are not geographically related and between users in the same geographical location. For users who have not yet established a relationship, we analyze whether their spatial positions are similar and use that similarity to establish a relationship between them.
Let $U = \{u_1, u_2, \ldots, u_m\}$ be the set of all users, where $m$ is the number of users, and let $L = \{l_1, l_2, \ldots, l_N\}$ be the set of $N$ geographic locations. A user $u$ visits a geographic location $l_u$, and each location $l_u$ is described by <longitude, latitude> coordinates. The check-in matrix of the user set is $F = [f_{ij}]_{m \times N}$, where $f_{ij}$ is the number of visits by user $u_i$ to geographic location $l_j$; the count $f_{ij}$ determines the degree of the user's interest in that location. The matrix is obviously extremely sparse. The user's model and interest predictions for geographic locations are obtained by analyzing the matrix at different points in time together with the associated frequencies.
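As a minimal sketch of the check-in matrix described above, the following Python builds $F$ from a list of (user, location) check-in records and reports the fraction of zero cells. The function names and the toy records are hypothetical and are not taken from the paper's datasets.

```python
from collections import Counter

def build_checkin_matrix(records, users, locations):
    """Return F as a list of rows; F[i][j] counts visits of users[i] to locations[j]."""
    counts = Counter(records)  # (user, location) -> number of check-ins
    return [[counts[(u, l)] for l in locations] for u in users]

def zero_fraction(matrix):
    """Fraction of (user, location) cells with no check-in at all."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)

# Toy data (hypothetical): three users, four locations, five check-ins.
records = [("u1", "l1"), ("u1", "l1"), ("u1", "l3"), ("u2", "l2"), ("u3", "l3")]
users = ["u1", "u2", "u3"]
locations = ["l1", "l2", "l3", "l4"]

F = build_checkin_matrix(records, users, locations)
print(F)
print(zero_fraction(F))
```

Even in this tiny example most cells are zero, which illustrates why similarity computed directly on such a matrix can be unreliable.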

Location Information Model.
This paper first uses the whole user set to model geographic locations. Such a model captures user behavior as a whole, but it has difficulty describing individual preferences. To describe individual user preferences as closely as possible, we model each user with an adaptive kernel density estimate over geographic locations. In this estimate, $dis(l_i, l_k)$ is the spatiotemporal distance between geographic locations $l_i$ and $l_k$, $X_u$ is the sample set of geographic locations from which $l_i$ and $l_k$ are drawn, $|X_u|$ is the number of samples in $X_u$, and $K(\cdot)$ is a kernel function. This paper uses the Gaussian kernel function with smoothing parameter $\tau_b$. A normality test is applied to the distribution of distances between the user and geographic locations, and the optimal kernel width estimate is chosen accordingly to obtain the best effect. If the distance distribution is approximately normal, the kernel width is computed from the standard error of the sample $X_u$; otherwise, it is computed from $\delta_b$, the median absolute difference of adjacent samples. According to formula (5), the objective function modeled by the geolocation kernel estimation method yields $C^{g}_{uj}$, the user's preference for the geographic location, from which the interest of user $u$ in geographic location $l_i$ is predicted. In the prediction, $\alpha_b \in [0, 1]$ is a weighting parameter used to evaluate the user's preference for geography, $A(l_j)$ is a normalization coefficient of $b(x)$, and $b(x)$ indicates the strength of the correlation between geographic location $l_k$ and location $l_i$.
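To make the kernel density idea concrete, the sketch below evaluates a Gaussian kernel density estimate from the distances between a candidate location and a user's sampled locations. It uses the textbook estimator $\frac{1}{n h}\sum_k K(d_k / h)$ and the normal-reference bandwidth $1.06\,\hat{\sigma}\,n^{-1/5}$; both are assumptions chosen for illustration, and the paper's own adaptive bandwidth formulas may differ. All function names and the sample distances are hypothetical.

```python
import math

def gaussian_kernel(x):
    """Standard Gaussian kernel K(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_reference_bandwidth(distances):
    """Assumed bandwidth: 1.06 * sample standard deviation * n^(-1/5).

    Needs at least two samples that are not all identical.
    """
    n = len(distances)
    mean = sum(distances) / n
    variance = sum((d - mean) ** 2 for d in distances) / (n - 1)
    return 1.06 * math.sqrt(variance) * n ** (-0.2)

def kernel_density(distances, bandwidth):
    """Density at a candidate location, given its distances to the user's sampled locations."""
    n = len(distances)
    return sum(gaussian_kernel(d / bandwidth) for d in distances) / (n * bandwidth)

# Hypothetical distances (in km) from one candidate location to a user's check-ins.
distances_km = [0.5, 1.2, 0.8, 3.0]
h = normal_reference_bandwidth(distances_km)
print(kernel_density(distances_km, h))
```

A candidate that lies close to many of the user's past check-ins receives a higher density, which is the behavior the preference model relies on.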

Geographic Similarity Calculation Model.
Users who are geographically close to each other are more likely to communicate offline. We assume that users with higher geographic overlap are more likely to become friends in both virtual and real social interactions, and we obtain the geographic relationship between users by calculating the similarity of their check-in histories. We explored three commonly used similarity calculation methods: cosine similarity, KL divergence, and Jaccard similarity. The users' check-in history data, that is, the matrix $F$, is used to evaluate the geographic commonality between users.

Cosine Similarity.
Cosine similarity is the normalized inner product of two vectors. It equals the cosine of the angle between the two vectors and is independent of their magnitudes. For nonnegative vectors such as check-in counts, its value lies in [0, 1]. In matrix $F$, the geographic similarity of users $u_i$ and $u_j$ is $Sim_{cos}(u_i, u_j) = \frac{\sum_{k=1}^{N} f_{ik} f_{jk}}{\sqrt{\sum_{k=1}^{N} f_{ik}^2}\,\sqrt{\sum_{k=1}^{N} f_{jk}^2}}$, where $N$ is the number of locations.
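The cosine measure can be sketched in a few lines of Python operating on two rows of the check-in matrix. The function name is hypothetical, and returning 0 for a user with no check-ins is an assumption made here to avoid division by zero.

```python
import math

def cosine_similarity(f_i, f_j):
    """Cosine of the angle between two users' check-in count vectors."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # assumption: a user without check-ins is similar to nobody
    return dot / (norm_i * norm_j)

# A frequent visitor and an occasional visitor with the same relative preferences.
print(cosine_similarity([2, 0, 1], [4, 0, 2]))
# Two users with no location in common.
print(cosine_similarity([1, 0], [0, 1]))
```

Scaling one user's counts does not change the result, which is why the measure reflects relative preference and not how often a user goes out.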

KL Divergence Similarity.
The KL divergence, also called relative entropy, measures the difference between two probability distributions. For users $u_i$ and $u_j$, the probability distributions over the next geographic location preference are $P_i$ and $P_j$. The probability $p_{ik}$ that $u_i$ prefers geographic location $l_k$ is computed from the check-in counts; to prevent zero probabilities, 1 is added to both the numerator and the denominator. The KL divergence of $P_i$ and $P_j$ is then $D_{KL}(P_i \| P_j) = \sum_{k=1}^{N} p_{ik} \log \frac{p_{ik}}{p_{jk}}$. When the paths or location preferences of the two users are the same, the divergence is 0. To map the similarity into [0, 1], the KL divergence is normalized and then subtracted from 1, which ensures that users with more similar geographic probability distributions also receive a larger similarity value. The resulting geographic similarity between two users is denoted $Sim_{KL}(u_i, u_j)$. Because the KL divergence is asymmetric, $Sim_{KL}(u_i, u_j)$ is used, when recommending to user $u_i$, to indicate the similarity between that user and another user $u_j$.
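The following sketch shows one way to turn the KL divergence into a similarity in (0, 1]. Two choices are assumptions made for illustration and are not confirmed by the text: add-one smoothing over all $N$ locations, and the mapping $1 - \frac{D}{1 + D} = \frac{1}{1 + D}$ as the normalization. The function names are hypothetical.

```python
import math

def preference_distribution(counts):
    """Smoothed probability distribution over locations for one user.

    Assumption: add-one smoothing over all N locations, so the values sum to 1.
    """
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_k p_k * log(p_k / q_k); entries must be strictly positive."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q))

def kl_similarity(counts_i, counts_j):
    """Similarity from user i to user j; the 1 / (1 + D) mapping is an assumption."""
    d = kl_divergence(preference_distribution(counts_i),
                      preference_distribution(counts_j))
    d = max(d, 0.0)  # guard against tiny negative values from rounding
    return 1.0 / (1.0 + d)

print(kl_similarity([3, 1, 0], [3, 1, 0]))  # identical histories
print(kl_similarity([5, 1, 0], [1, 1, 1]))  # direction i -> j
print(kl_similarity([1, 1, 1], [5, 1, 0]))  # direction j -> i differs
```

The last two calls give different values, which reflects the asymmetry of the divergence: the similarity is always read from the target user toward the candidate.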

Jaccard Similarity.
The Jaccard similarity is used to evaluate the similarity of two sets. Here it is computed over the geographic locations preferred by the users: the locations that two users have in common are compared with all the locations that either of them has visited, so the overall similarity is determined by the shared subset. In its calculation, $F_i$ represents the geographic location history of user $u_i$ and $N$ is the total number of locations.
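Since the exact form of the Jaccard formula is not recoverable here, the sketch below shows two common variants as assumptions: the set-based Jaccard coefficient on visited locations, and a count-weighted (generalized) Jaccard that sums minima over maxima across the $N$ locations. The function names are hypothetical.

```python
def jaccard_set(f_i, f_j):
    """Set-based Jaccard: shared visited locations over all visited locations."""
    visited_i = {k for k, count in enumerate(f_i) if count > 0}
    visited_j = {k for k, count in enumerate(f_j) if count > 0}
    union = visited_i | visited_j
    return len(visited_i & visited_j) / len(union) if union else 0.0

def jaccard_weighted(f_i, f_j):
    """Count-weighted Jaccard: sum of minima over sum of maxima across locations."""
    denominator = sum(max(a, b) for a, b in zip(f_i, f_j))
    if denominator == 0:
        return 0.0
    return sum(min(a, b) for a, b in zip(f_i, f_j)) / denominator

# Same single preferred location, very different visit counts.
print(jaccard_set([10, 0], [1, 0]))
print(jaccard_weighted([10, 0], [1, 0]))
```

The weighted variant drops sharply when two users share preferences but differ in how often they check in, which is one way a Jaccard-style measure can become sensitive to data magnitude.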

Recommendation Algorithm Based on Geographic Location Preference
After the user's preference for geographic locations and the similarity between users have been calculated, the remaining question is how to recommend location information to the user effectively. This paper integrates the two, considering both the individual user's preference and the similarity between users: recommendations are made by ranking individual user preferences and then applying user similarity.
For a given geographic location, if multiple users visit it continuously during a time period, these users share the same geographic preference during that period. From the vector model of users and the clustering of preferred geographic locations, we compute the users that visit locations together. The more times two users visit the same location, the higher their interest similarity at that location. In the formula for the users' interest similarity at a location, $c_k$ is the center of the $k$th cluster (geographic location $k$), $F_{i,k}$ is the frequency with which the $i$th user visits location $k$, and $Sim^{int}_{i,j}$ is the similarity of user $i$ and user $j$ in geographic interest, with values ranging from 0 to 1. Formula (14) uses this similarity evaluation to measure the similarity of different users' points of interest over the $K$ geographic locations, and its value is the recommended location similarity. A value of 1 means that the users' interests in the locations are extremely similar, and a value of 0 means that there is no similarity.
The more times a user visits a location, the greater the probability that the user will visit it next time. If the user's visit count is also large relative to the average for that location, the user has a stronger preference for the geographic location than other users. The preference model is quantified in terms of the following quantities: $P_k = \sum_{t \in U} p_{t,k} / |U|$ is the average number of visits to location $k$ over all users; $|p_{i,k}| / \sum_{j \in P} p_{i,j}$ is the ratio of the number of visits by user $i$ to geographic location $k$ to the total number of that user's visits; $p_{i,k}$ is the number of times user $i$ visits location $k$; $P$ is the set of all visited locations, with $j \in P$; and $|p_{i,k}| / P_k$ is the ratio of the number of times user $i$ visits location $k$ to the average number of times users visit that location.
Since the user's geographic location is added to the matrix decomposition optimization formula as a constraint rule, the loss function must satisfy both the constraints of the matrix decomposition and the geographical constraints. In the constraint formula, $q$ represents the implicit feature factor of a user, and the predicted probability that users $u_i$ and $u_j$ are recommended as friends is $q_i^{T} \Lambda q_j$. $G(u_k)$ represents all current friends of user $u_k$. This function is based on the assumption that the more similar two users are, the more similar the implicit factors obtained for them by the matrix decomposition. The smaller $Sim(u_i, u_j)$ is, the greater the difference between users $u_i$ and $u_j$ and the greater the difference between their implicit factors. Conversely, the larger $Sim(u_i, u_j)$ is, the smaller the difference between the users and the smaller the difference between their implicit factors (Algorithm 1). The above constitutes the recommendation algorithm based on user similarity. Through this algorithm, the similarity between the target user and other users is calculated to determine which users are candidates for recommendation, and the candidates are recommended according to their recommendation level.
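The two ratios that enter the preference model can be computed directly from a visit-count table, as in the sketch below. How the ratios are combined into a single preference score is not shown, since that step is not recoverable from the text; the function name and the toy counts are hypothetical.

```python
def preference_ratios(visits, i, k):
    """Return the two ratios for user i and location k.

    visits[t][j] is the number of times user t visited location j.
    share:    visits of user i to k over all visits of user i
    relative: visits of user i to k over the average visits to k across users
    """
    user_total = sum(visits[i])
    location_average = sum(row[k] for row in visits) / len(visits)
    share = visits[i][k] / user_total if user_total else 0.0
    relative = visits[i][k] / location_average if location_average else 0.0
    return share, relative

# Toy visit counts (hypothetical): three users, three locations.
visits = [
    [6, 2, 2],
    [1, 0, 3],
    [2, 4, 0],
]
share, relative = preference_ratios(visits, i=0, k=0)
print(share, relative)
```

In the toy data, user 0 spends 60% of all visits at location 0 and visits it twice as often as the average user, so both ratios point to a strong preference.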

Experimental Evaluation Method.
We use the Gowalla [17] and Foursquare [18] datasets. Gowalla and Foursquare are location-based social networking sites that let users share locations, events, routes, and so on by signing in. Both are public geographic location datasets that contain a large number of points of interest, sign-in records, and user status information, so experiments on them are persuasive and credible. The Gowalla data were randomly extracted and cover 3,000 users, 2,530 points of interest, and 50,724 check-in records. The Foursquare dataset contains 1,083 users, 400 location types, 38,333 locations, and 227,427 user check-in records, with a dataset sparseness of 99.45%. To ensure the effectiveness of the experiment, user data with fewer than four check-ins and fewer than four registrations are deleted. This leaves 1,071 users, 307 place types, 5,291 locations, and 92,056 user sign-in records, with a dataset sparseness of 98.38%.
The goal of user recommendation is to recommend $N$ geographic locations of interest to the user; all candidates are ranked by the user's preference to obtain the recommended result. The experiments use the precision $P$, the recall $R$, and the $F_1$-measure as evaluation indexes of the experimental effect, where $F_1$ is a comprehensive index based on precision and recall. The indicators are calculated as $P = \frac{|R(u) \cap T(u)|}{|R(u)|}$, $R = \frac{|R(u) \cap T(u)|}{|T(u)|}$, and $F_1 = \frac{2PR}{P + R}$, where $R(u)$ is the set of geographic locations recommended to user $u$ and $T(u)$ is the set of geographic locations in the test set at which user $u$ has checked in. These three indicators, the precision, the recall, and the $F_1$ value, are used to measure the recommendation results.
This paper first analyzes the $F_1$ performance of the three methods for calculating geographic similarity, and then compares each baseline method with the proposed hybrid method in terms of precision, recall, and $F_1$. Many experiments and analyses were carried out for the top 1, 4, 8, 12, 18, 22, and so on recommended users. Finally, we analyzed the influence of the blending parameter and of the user recommendation constraints on the proposed method. The similarity performance is shown in Figure 1. Figure 1 shows the results of ranking recommendations using geographic similarity only: cosine similarity, KL divergence, and Jaccard similarity are each used to calculate the geographic similarity between users, the similarity is used as the recommendation score, and the k users with the highest scores are recommended to the target user.
In this paper, the purpose of geographic commonality is to explore users' relative preferences for geographic locations; it is not concerned with the specific number of times a user visits a certain location. Because each user's living habits or patterns are different, some users may rarely go out and have few visits, while others go out often and have many. The difference caused by this data magnitude is therefore useless information, and we should pay more attention to whether a user prefers location A or location B. Since the Jaccard similarity calculation is greatly affected by the magnitude of the data, this method performs the worst. Cosine similarity and KL divergence perform comparably: cosine similarity is slightly better at first, while the KL divergence performs better when k > 8. The cosine similarity method treats a user's check-in history as a vector in which each location is one dimension, and the calculated cosine similarity reflects the angle between the vectors. The KL divergence treats a user's history as the user's preference distribution over locations, and the calculated divergence is the difference between two distributions, which matches the notion of measuring differences in users' location preferences. Therefore, cosine similarity and KL divergence both achieve good results. In a recommendation list, users at the top of the list are more likely to get the attention and acceptance of the target user, so when measuring recommendation performance these top users are more relevant than later ones. Since cosine similarity and KL divergence both perform well, we use a fusion of cosine similarity and KL divergence to calculate user similarity.
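As a worked illustration of the evaluation indicators, the sketch below computes precision, recall, and $F_1$ for one user from a recommended set and a test set. It also checks the reported sparseness figures under the assumption that sparseness is defined as $1 - \text{records} / (\text{users} \times \text{locations})$, a definition that is not stated in the text but reproduces both reported percentages. The function names and the example sets are hypothetical.

```python
def precision_recall_f1(recommended, visited):
    """Per-user precision, recall, and F1 from recommended and test-set locations."""
    rec, vis = set(recommended), set(visited)
    hits = len(rec & vis)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(vis) if vis else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def sparseness_percent(n_records, n_users, n_locations):
    """Assumed definition: 100 * (1 - records / (users * locations))."""
    return 100.0 * (1.0 - n_records / (n_users * n_locations))

# Hypothetical example: four recommendations, three test-set locations, two hits.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
print(p, r, f1)

# Foursquare figures before and after filtering, as reported in the text.
print(round(sparseness_percent(227427, 1083, 38333), 2))
print(round(sparseness_percent(92056, 1071, 5291), 2))
```

Longer recommendation lists tend to raise recall and lower precision, which is why $F_1$ is used as the combined index.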

Comparison of Experimental Methods.
To verify the validity of the experiment, the following comparison algorithms are used in this paper: (1) the location-based collaborative filtering (LCF) algorithm [19]; (2) the user-based collaborative filtering (UCF) algorithm [20]; (3) the F2F algorithm [21], which ranks candidate users by the number of friends they have in common with the target user and recommends the top k users with the most common friends. The method proposed in this paper, which fuses KL divergence and cosine similarity, is denoted KL-C. The influence of the parameter α on the recommendation result is shown in Figure 2. The recommended list length is fixed at 10, the number of neighboring users is fixed at 10, and α is taken in the interval [0, 1]. As α increases from 0, the $F_1$ value increases gradually. When α = 0.6, the $F_1$ value reaches its maximum, indicating that the comprehensive recommendation effect is the best. When α continues to increase, the $F_1$ value gradually decreases. When α = 0, the method reduces to cosine similarity; when α = 1, it reduces to KL divergence. Figures 3, 4, and 5 compare the precision, recall, and $F_1$ values of each algorithm for different recommended list lengths N, with α fixed at 0.6 and the number of neighboring users fixed at 10. It can be seen that when N is small, the precision is higher and the recall is lower.
Algorithm 1: Recommendation based on user similarity.
(i) Input: user set U, historical access record set F, target user u, and user to be recommended r.
(ii) Output: Top-u.
(iii) Algorithm steps:
(a) Calculate the similarity $Sim(u_i, r_j)$ between the user u and the recommended user r according to formulas (9), (11), and (13);
(b) Calculate the preference prediction value $Sim^{int}_{u,r}$ of the user u in U according to formulas (7) and (8);
(c) Calculate the similarity of the similar-user recommendation with preference $Pre_{u,r}$ according to formulas (14) and (15) and the $Sim^{int}_{u,r}$ value;
(d) Calculate the recommended score S(u, r) according to formula (16).
As N increases, the precision gradually decreases and the recall gradually increases. The recommendation effect of KL-C is better than that of UCF and LCF, which indicates that, compared with the traditional neighbor-based collaborative filtering algorithms, the proposed method finds neighbors more effectively. At the same time, the recommendation effect of KL-C is better than that of F2F. This shows that user similarity that takes spatiotemporal perception into account is better than a single-aspect similarity. It can be seen from Figure 5 that the comprehensive index $F_1$ of KL-C is better than that of the other algorithms; when N = 20, the $F_1$ value of KL-C reaches its highest, and the recommendation effect is the best. When N is 25, 30, 35, and 40, F2F is inferior to UCF, indicating that there is a certain deficiency in using the spatial similarity of users. The LCF effect is poor in all comparisons and does not achieve the expected recommendation quality.
Figure 6 shows the $F_1$ values of the four methods for different numbers of recommended users K, with α = 0.6 and the number of neighboring users fixed at 10.
As shown in Figure 6, K ranges from 2 to 18. UCF and KL-C are better than the LCF and F2F methods. When K = 10, K = 14, and K = 16, LCF is better than F2F. Among the four methods, KL-C performs best; it reaches its maximum at K = 12, where it is significantly higher than the other three methods. These comparison experiments show that the user similarity fusion algorithm improves the search quality of neighboring users and thus the recommendation effect.
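The role of the parameter α can be illustrated with a short sketch. It assumes a linear blend of the two similarities, which is consistent with the statement that α = 0 gives cosine similarity and α = 1 gives KL divergence but is not confirmed as the exact fusion formula. The function names and the candidate scores are hypothetical.

```python
def fused_similarity(sim_kl, sim_cos, alpha=0.6):
    """Assumed linear fusion: alpha * KL similarity + (1 - alpha) * cosine similarity."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * sim_kl + (1.0 - alpha) * sim_cos

def top_k(scores, k):
    """Return the k candidate ids with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical (KL similarity, cosine similarity) pairs for three candidate users.
candidates = {"r1": (0.80, 0.40), "r2": (0.30, 0.90), "r3": (0.55, 0.50)}
scores = {user: fused_similarity(kl, cos) for user, (kl, cos) in candidates.items()}
print(top_k(scores, 2))
```

With α = 0.6 the KL term dominates slightly, so a candidate with a high KL similarity can outrank one with a higher cosine similarity.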

Conclusion
To recommend geographic locations on the basis of the similarity between users in spatial geographic location, this paper proposes a method based on the fusion of KL divergence and cosine similarity. A comparison of three similarity metrics at different K values shows that KL divergence and cosine similarity have advantages. The fusion of the two is then combined with the user's preference to compute user similarity. Finally, compared with LCF and UCF, the proposed method has advantages in precision, recall, and $F_1$. The proposed method also retains its advantage in $F_1$ across different parameter settings. Future work will address the problem of matching similar users through graph theory [22,23] to improve the effect of user recommendation.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.