Semantic Recommendation Model via Fusing Knowledge Graph and Formal Concept Analysis

The core idea of semantic recommendation is to incorporate semantic knowledge into the recommendation process. The semantic recommendation algorithm, based on knowledge graph, ignores the deep implicit semantics of the evaluation data. The semantic recommendation algorithm based on the deep matrix decomposition model is limited to the implicit semantics of the evaluation data. The semantic recommendation algorithm based on the collaborative filtering algorithm performs only the selection of the nearest neighbors of the user or the item unilaterally and ignores the influence of other aspects, which naturally leads to a decrease in the recommendation accuracy. To solve the above problems, this paper introduces Formal Concept Analysis (FCA) based on collaborative filtering. Using the property that the formal concept in FCA can cluster objects (users) and attributes (items) simultaneously, we propose a semantic recommendation algorithm (SRKGFCA) based on the knowledge graph and formal concept analysis to solve the problem of ignoring user or item factors. Finally, the proposed semantic recommendation algorithm is validated on two public datasets in this work. By using traditional algorithms and current semantic recommendation algorithms as benchmarks, extensive experiments show that our proposed algorithm consistently outperforms state-of-the-art methods.


I. INTRODUCTION
In recent years, researchers have introduced semantic information into traditional recommendation algorithms to alleviate problems in various types of algorithms. The most classical application is the personalized recommendation for the Amazon online shopping mall [1]. Although semantic recommendation algorithms have facilitated the development of recommender systems [2], their main drawbacks are the cold-start and data sparsity problems. Zhang et al. [3] achieved high-quality movie recommendation by embedding movie synopses, posters, and ontology as movie knowledge in a low-dimensional vector space, which largely alleviated the cold-start and data sparsity problems. Semantic recommendation algorithms recommend items based on user preferences, but it is difficult to perform feature extraction when the items are multimedia data.
The associate editor coordinating the review of this manuscript and approving it for publication was Chong Leong Gan.
Deep learning-based recommendation algorithms [4] can extract semantic information about users or items from large amounts of text data, but they make recommender systems overly dependent on external text data and raise questions about the source and reliability of that data. The emergence of knowledge graphs, which contain rich semantic information, has brought semantic recommendation research to a new stage of development. The large number of public knowledge graphs (e.g., DBpedia [5], YAGO [6], Wikidata [7], WordNet [8], and CN-DBpedia [9]) has enabled researchers to access and utilize semantic information more easily, and to improve recommendation accuracy by merging the semantic top-k nearest neighbors with the collaborative top-k nearest neighbors to form the final recommendation list. Translation-based (Trans) graph embedding techniques are the most widely used, including TransE [10], TransH [11], TransR [12], and TransD [13]. They embed knowledge graph triples into a low-dimensional vector space through gradient-based training, and are simpler and more interpretable than methods based on graph neural networks [14]. Wu et al. [15] learned representations of movies by constructing a knowledge graph in the movie domain, and then combined the semantic similarity of movies with the item similarity from collaborative filtering to achieve higher recommendation accuracy and coverage. In [16], a knowledge graph representation learning method is used to embed the semantic information of the items into a low-dimensional semantic space; the semantic similarity between the recommended items is then calculated and fused into the collaborative filtering recommendation algorithm, which improves recommendation performance at the semantic level.
Although the above knowledge graph-based recommendation methods reflect user preferences through rating data, the high sparsity of the rating matrix is the major drawback of the above algorithms. The key is to unlock the hidden semantics of the rating matrix.
In recent years, with the development of deep learning, He et al. [17] proposed the NeuMF (Neural Matrix Factorization) model, which combines probabilistic computation and a multi-layer perceptron. Xue et al. [18] proposed the DMF (Deep Matrix Factorization) model to address the shortcomings of NeuMF in processing user ratings and obtained better results. Although matrix factorization can effectively determine the latent semantics in user-item interaction data, it ignores external semantic knowledge. Because it is limited to mining the latent semantics of the items rated by users, and because deep learning itself relies on heuristic optimization, the deep matrix factorization model easily falls into a local optimum during training when the rating matrix has too many missing values. Preprocessing the input of the matrix factorization model, so that the input matrix or vectors describe the characteristics of users and items more accurately, can therefore often achieve better results. Jiarong et al. [19] trained the DMF model after completing the rating matrix with a mathematical matrix completion method, and achieved good recommendation accuracy. Yuan et al. [20] achieved better results by performing graph embedding through a modified TransHR model and then populating the rating matrix with semantic similarity mixed with collaborative filtering similarity. However, Jiarong et al. [19] simply used mathematical formulas to complete the rating matrix and ignored external semantic information, while Yuan et al. [20] used only a shallow model for latent semantic mining and ignored the deeper semantic information of the rating matrix. In recent years, researchers have found that FCA can cluster both objects (users) and attributes (items) and is able to find the nearest neighbors of users or items more efficiently than the k-nearest neighbor algorithm.
Both user-based k-nearest neighbor and item-based k-nearest neighbor methods consider only a one-sided neighbor relationship. By introducing FCA into collaborative filtering recommendation and considering the neighbor relationships of users and items at the same time, researchers have achieved good recommendation effects [21], [22]. Jiang [23] proposed an FCA based on ontology semantic extension to improve the computation of formal concept similarity by incorporating semantic information from knowledge graph. In FCA-based recommendation, Boucher-Ryan et al. [24] first introduced the concept lattice of FCA into the collaborative filtering recommendation algorithm to find the nearest neighbor users based on the concept lattice, so as to make product recommendations. Zou et al. [21] used a concept lattice in cosine similarity computation to improve the prediction accuracy of a recommendation system with collaborative filtering. Mezni et al. [22] proposed a cloud server recommendation system based on fuzzy FCA to determine the relationship between users and items at a deeper level, alleviating the problems of data sparsity and cold start and achieving better results. The approaches described above can only make recommendations for small amounts of data, since a concept lattice must be constructed. At present, recommendation algorithms based on FCA lag far behind the popular recommendation algorithms based on matrix factorization and knowledge graph in terms of recommendation accuracy [21], [22], [24]. How to effectively use the theory of FCA and its semantic extension, and integrate it into existing recommendation algorithms to improve their recommendation effect, is therefore a problem worth studying.
In summary, the recommendation model based on knowledge graph ignores the deeper implicit semantics of the evaluation data, while the recommendation model based on deep matrix factorization is limited to the implicit semantics of the evaluation data. FCA can effectively solve the problem of selecting the nearest neighbors of users or items in traditional collaborative filtering recommendation algorithms, and has the theoretical basis for combining with related semantic techniques. However, concept lattices are difficult to create for large-scale datasets, and the accuracy of heuristic concept algorithms is not high. In this paper, we combine knowledge graph and FCA to propose a new semantic recommendation algorithm. The main contributions are as follows.
1) Building on collaborative filtering and combined with the deep matrix factorization model, a semantic recommendation algorithm based on knowledge graph (Semantics based Recommendation using Knowledge Graph, SRKG) is proposed.
2) A fuzzy FCA-based concept area calculation method and an attribute (item) based heuristic concept construction algorithm are proposed to solve the problems of lack of consideration of user ratings and difficulties in operating with large-scale datasets.
3) FCA is integrated into the SRKG algorithm. A semantics-based recommendation algorithm using knowledge graph and formal concept analysis (Semantics based Recommendation using Knowledge Graph and Formal concept analysis, SRKGFCA) is proposed to solve the problem that SRKG considers only the item factor and ignores the user factor in the PR1 rating prediction.
The remainder of this paper is organized as follows.
In Section II, we briefly introduce the related work. Section III presents semantic recommendation algorithms based on knowledge graph. Section IV introduces our SRKGFCA algorithm in detail. Section V evaluates our implementation based on experimental results. Section VI concludes this paper with future work.

II. RELATED RESEARCH

A. KNOWLEDGE GRAPH
A Knowledge Graph (KG) is a practical approach to modeling factual information in the form of entity-to-entity relationships, and can represent extensive information from different domains. The DBpedia knowledge graph [5] is a specific example of a Semantic Web application that automatically extracts semi-structured data from Wikipedia and transforms it into structured data that can be linked via Linked Data to other knowledge graphs such as Wikidata [7], YAGO [6], and Freebase. DBpedia's Semantic Web technology facilitates the application of Wikipedia data and was awarded Best Application Service at the 2009 Semantic Web Awards. The knowledge graph contains more than 6 million entities, and in particular it includes all the entities of the MovieLens movie recommendation dataset required in this work.

B. FORMAL CONCEPT ANALYSIS
Formal Concept Analysis (FCA) [25] is a theoretical framework that provides the basis for data analysis and knowledge processing and can represent the relationships between objects and attributes in a given domain. In real life, many relationships between objects and attributes are ambiguous. For example, even if someone has seen a certain movie, their personal preference can take a variety of levels, such as not bad, average, or not so good. Fuzzy information like this cannot be defined by a simple 0 or 1. Therefore, fuzzy set theory was introduced to define Fuzzy Formal Concept Analysis (FFCA) [25].
Definition 1 (Fuzzy Formal Background [25]): A fuzzy formal background is a quadruple K = (O, A, V, R = ϕ(O × A)), where O is the set of objects, A is the set of attributes, R denotes the fuzzy relation between the set of objects O and the set of attributes A, and V = [0, 1] denotes the range of affiliation degrees taken in the fuzzy relation. Each object-attribute pair in FFCA has an affiliation µ1(o, a) ∈ V = [0, 1]. Given a pair (o, a) ∈ O × A, the affiliation µ1 indicates the extent to which the object o has the attribute a. In the movie rating setting, this corresponds to how much user o likes movie a, that is, user o rates movie a as µ1. The fuzzy formal background is shown in Table 1.

Definition 2 (Fuzzy Formal Concept [16]): Given a fuzzy background K and an affiliation threshold α, define the operations E* and I* with respect to α, where µ1(o, a) is the affiliation degree of o corresponding to a. When the pair C = (ϕ(E), I) satisfies E* = I and I* = E, C is said to be a fuzzy formal concept, where E is called the extent of the fuzzy concept and I is called the intent of the fuzzy concept.
In the fuzzy concept C, an affiliation µo is also computed for each object o ∈ O with respect to the fuzzy set ϕ(E). Taking the fuzzy formal background of Table 1 as an example, let the affiliation threshold α be 0.5. Since the only rating of user 2 for movie b is 0.4 < α, which indicates that user 2 does not like movie b, this correspondence is not considered when generating fuzzy concepts. One of the fuzzy concepts that can be obtained from the above definition is {(2/0.6), (a, c, d)}, where '2/0.6' indicates that user 2 has an affiliation of 0.6 for the movie set (a, c, d).
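The operations in Definition 2 can be illustrated with a minimal sketch. It assumes the common threshold-based formulation of fuzzy FCA: E* keeps the attributes that every object in E has with affiliation at least α, I* keeps the objects that have every attribute in I with affiliation at least α, and the affiliation of an object in a concept is the minimum of its affiliations over the intent. The `CONTEXT` table and the function names are hypothetical; they are chosen only to agree with the worked values quoted in the text, not copied from Table 1.

```python
from typing import Dict, Iterable, Set

# Hypothetical fuzzy formal background: user -> {movie: affiliation}.
CONTEXT: Dict[int, Dict[str, float]] = {
    1: {"b": 1.0, "c": 1.0},
    2: {"a": 0.6, "b": 0.4, "c": 0.8, "d": 0.7},
    3: {"b": 0.7, "c": 0.5},
    4: {"a": 1.0, "b": 1.0},
    5: {"b": 0.3, "d": 0.9},
    6: {"a": 1.0, "b": 0.8},
}
ATTRIBUTES: Set[str] = {"a", "b", "c", "d"}


def up(objects: Iterable[int], alpha: float) -> Set[str]:
    """E*: attributes held by every object in E with affiliation >= alpha."""
    objs = list(objects)
    return {a for a in ATTRIBUTES
            if all(CONTEXT[o].get(a, 0.0) >= alpha for o in objs)}


def down(items: Iterable[str], alpha: float) -> Set[int]:
    """I*: objects holding every attribute in I with affiliation >= alpha."""
    attrs = list(items)
    return {o for o in CONTEXT
            if all(CONTEXT[o].get(a, 0.0) >= alpha for a in attrs)}


def affiliation(obj: int, items: Iterable[str]) -> float:
    """Assumed aggregation: minimum affiliation of the object over the intent."""
    return min(CONTEXT[obj].get(a, 0.0) for a in items)


def is_fuzzy_concept(objects: Set[int], items: Set[str], alpha: float) -> bool:
    """A pair (E, I) is a concept when E* = I and I* = E."""
    return up(objects, alpha) == set(items) and down(items, alpha) == set(objects)
```

With α = 0.5, `up({2}, 0.5)` drops movie b because 0.4 < α, and `affiliation(2, {"a", "c", "d"})` gives the 0.6 quoted for the concept {(2/0.6), (a, c, d)}.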
In FFCA, the concept lattice is defined similarly to single-valued FCA: given the fuzzy concepts, the corresponding fuzzy concept lattice is obtained by connecting the concepts through the partial order relation '≤' between them. Figure 1 is the fuzzy concept lattice corresponding to Table 1.

III. SEMANTIC RECOMMENDATION ALGORITHMS BASED ON KNOWLEDGE GRAPH

A. SIMILARITY CALCULATION METHOD BASED ON KNOWLEDGE GRAPH
For semantic recommendation algorithms, similarity computation is the core work, including the semantic similarity of items based on knowledge graph and the similarity of items based on ratings.
VOLUME 11, 2023

1) ITEM SIMILARITY BASED ON SEMANTICS OF KNOWLEDGE GRAPH
In the semantic recommendation algorithm, each item $I_i$ is represented by a vector $I_i = (e_{i,1}, e_{i,2}, \ldots, e_{i,d})^T$, where $d$ is the dimension of the vector space into which the items are embedded and $e_{i,q}$ is the value of the $q$-th dimension. The Euclidean distance $dist(I_i, I_j) = \sqrt{\sum_{q=1}^{d} (e_{i,q} - e_{j,q})^2}$ is used to measure how close the entities of two recommended items are.
The Euclidean distance leads to a distance-based semantic similarity of the two items, denoted $sim_{graph}$.
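As a minimal sketch of this step, the following code computes the Euclidean distance between two item embeddings and turns it into a similarity. The mapping 1 / (1 + distance) is an assumption made for illustration; it is one common way to obtain a value in (0, 1] and may differ from the formula used in the paper. The function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two d-dimensional item embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sim_graph(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed distance-to-similarity mapping: identical vectors give 1.0."""
    return 1.0 / (1.0 + euclidean(u, v))
```

Under this assumption, nearby embeddings have similarity close to 1 and the similarity decays smoothly as the embeddings move apart.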

2) SCORING-BASED ITEM SIMILARITY
We use the user-item interaction data, i.e., the user-item rating data, to construct a rating matrix $R_{m \times n}$ of user set $U$ to item set $I$. The ratings that the users give to an item form its rating vector, e.g., the vector for item $i$ is denoted as $I_i = \{r_{1i}, r_{2i}, \cdots, r_{mi}\}^T$, where $r_{u,i}$ denotes user $u$'s rating of item $i$. The cosine similarity $sim_{cos}$ of two items $I_i$ and $I_j$ is then computed from their rating vectors. The cosine similarity takes values in the range [−1, 1], so normalization is performed to obtain the rating-based item similarity $sim_{rate}$, where $\bar{r}_u$ denotes the average rating of user $u$, and $U_{i \cap j}$ denotes the set of users who rated both items $i$ and $j$.
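A minimal sketch of a rating-based item similarity is given below. It assumes an adjusted cosine over the co-rating users, centered on each user's mean rating, and then mapped from [−1, 1] to [0, 1] by (1 + cos) / 2. Both the centering and the linear normalization are assumptions for illustration, and the data layout (`ratings`, `user_means`) is hypothetical.

```python
import math
from typing import Dict, Hashable

Ratings = Dict[Hashable, Dict[Hashable, float]]  # user -> {item: rating}


def sim_rate(ratings: Ratings, user_means: Dict[Hashable, float],
             i: Hashable, j: Hashable) -> float:
    """Adjusted cosine over users who rated both i and j, scaled to [0, 1]."""
    common = [u for u, r in ratings.items() if i in r and j in r]
    num = sum((ratings[u][i] - user_means[u]) * (ratings[u][j] - user_means[u])
              for u in common)
    den_i = math.sqrt(sum((ratings[u][i] - user_means[u]) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings[u][j] - user_means[u]) ** 2 for u in common))
    if den_i == 0.0 or den_j == 0.0:
        return 0.0  # no usable co-ratings, e.g. a cold-start item
    cos = num / (den_i * den_j)
    return 0.5 * (1.0 + cos)


# Hypothetical toy data.
ratings = {
    "u1": {"x": 5, "y": 4, "z": 1},
    "u2": {"x": 4, "y": 5, "z": 2},
    "u3": {"x": 1, "z": 5},
}
user_means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
```

Returning 0 when no co-ratings exist matches the cold-start case (sim rate = 0) that the fusion step treats separately.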

B. SEMANTIC RECOMMENDATION ALGORITHMS BASED ON KNOWLEDGE GRAPH
In this paper, we propose a semantic recommendation algorithm based on knowledge graph similarity (SRKG). It first fuses the two similarities and makes an initial rating prediction, then trains an improved DMF model whose inputs include the semantic information carried by the predicted ratings, and finally produces the recommendations.

1) SIMILARITY FUSION AND SCORE PREDICTION
The semantic information of the knowledge graph and the rating information of each user are incorporated into the algorithm, i.e., the two item similarities mentioned above are fused. According to $sim_{graph}$ and $sim_{rate}$ (3-6), we can obtain the fused similarity $sim$ of two items,
where a < 1 is the fusion factor; when a = 0, the proposed SRKG algorithm considers only the item similarity calculated from the knowledge graph.
The function is defined piecewise, mainly to handle the cold start ($sim_{rate} = 0$) and cases where the knowledge graph cannot provide a similarity (e.g., the movie entity does not exist, so $sim_{graph} = 0$). By introducing the item similarity computed from the knowledge graph, we can more accurately predict the user's rating of unrated items. The predicted rating PR1 of user u for item i is given by (3-8), a k-nearest-neighbor prediction over the fused item similarities that ends with the offset $+\bar{r}_u$, where N(u) denotes the items rated by user u, S(i, k) denotes the top k items with the greatest similarity to item i, and the intersection of the two is the reference set for rating prediction.
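The fusion and prediction steps can be sketched as follows. The piecewise fusion follows the description in the text (fall back to the other similarity when one of them is 0, and a = 0 uses only the knowledge graph). The exact form of PR1 is an assumption: the sketch uses the usual mean-centered, similarity-weighted item-based k-NN estimate with the user mean added back. All function and parameter names are hypothetical.

```python
from typing import Dict, Hashable


def fuse(sim_rate: float, sim_graph: float, a: float) -> float:
    """Piecewise fusion: a weights the rating similarity, 1 - a the graph one."""
    if sim_rate == 0.0:   # cold start: no rating-based similarity
        return sim_graph
    if sim_graph == 0.0:  # entity missing from the knowledge graph
        return sim_rate
    return a * sim_rate + (1.0 - a) * sim_graph


def predict_pr1(user_ratings: Dict[Hashable, float], user_mean: float,
                sims_to_i: Dict[Hashable, float], k: int) -> float:
    """Assumed PR1: user mean plus weighted deviations over N(u) ∩ S(i, k).

    user_ratings: {item: rating} for the items rated by user u, i.e. N(u).
    sims_to_i:    {item: fused similarity to the target item i}.
    """
    top_k = sorted(sims_to_i, key=sims_to_i.get, reverse=True)[:k]  # S(i, k)
    neighbors = [j for j in top_k if j in user_ratings]
    den = sum(abs(sims_to_i[j]) for j in neighbors)
    if den == 0.0:
        return user_mean  # no usable neighbor: fall back to the user mean
    num = sum(sims_to_i[j] * (user_ratings[j] - user_mean) for j in neighbors)
    return user_mean + num / den
```

For instance, with ratings {a: 5, c: 3}, a user mean of 4 and similarities {a: 0.9, c: 0.3, d: 0.8}, k = 2 leaves only a as a usable neighbor, while k = 3 also brings in c and pulls the estimate back toward the mean.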

2) DMF MODEL TRAINING
The purpose of the DMF (deep matrix factorization) model is to represent item and user features with low-dimensional vectors while accurately measuring the similarity of users and items. For this purpose, its objective function is defined first. Then, we measure the training loss through the cosine similarity, and in the specific loss function Y+ and Y− denote the non-zero and zero values in the rating matrix, respectively; max(rate) denotes the highest rating of the recommendation system, which is mainly used to normalize the ratings; and Y ij and Ŷ ij denote the true and predicted values of the ratings, respectively.
In Algorithm 1, both R and RPR1 are m × n matrices. The missing values of the R matrix are filled with 0. The RPR1 matrix represents the predicted rating matrix, which is derived from the PR1 predicted ratings computed above and is filled with 0 for items that have already been selected by the user. Compared to the original DMF training approach [18], the improvement in this paper is mainly reflected in line 8 of Algorithm 1, where the predicted ratings PR1, which contain the external semantics, are populated into the input vectors p i , q j . It is worth noting that the positive and negative samples of the model, as well as the true ratings Y ij required by the loss function, are still derived from the original rating matrix R in order to guarantee the accuracy of the model. The main purpose of populating the input vectors p i , q j is to make them characterize the user and the item more accurately. Because the positive and negative training samples of the original model are not changed, the time complexity of the DMF training method proposed in this paper differs little from that of the original model.
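For orientation, the prediction and loss of the original DMF model [18] can be sketched in plain Python. The sketch assumes the formulation of Xue et al.: the prediction is the cosine similarity of the user and item representations, floored at a small positive value, and the loss is a cross-entropy in which the true rating is normalized by max(rate). The function names, the floor value and the `(rating, prediction)` sample layout are illustrative assumptions, and the network that produces the representations is omitted.

```python
import math
from typing import Iterable, Sequence, Tuple


def cosine(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity of a user representation p and an item representation q."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def dmf_prediction(p: Sequence[float], q: Sequence[float],
                   floor: float = 1e-6) -> float:
    """Predicted normalized score, kept strictly positive so log() is defined."""
    return max(floor, cosine(p, q))


def normalized_cross_entropy(samples: Iterable[Tuple[float, float]],
                             max_rate: float, eps: float = 1e-6) -> float:
    """Sum over Y+ and Y- of the cross-entropy with ratings scaled by max(rate).

    samples: (true rating Y_ij, predicted score in (0, 1)) pairs;
             negative samples carry a true rating of 0.
    """
    loss = 0.0
    for y, y_hat in samples:
        t = y / max_rate
        p = min(max(y_hat, eps), 1.0 - eps)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss
```

In this sketch only the inputs that produce `p` and `q` would change in SRKG; the samples fed to the loss still come from the original rating matrix R, as stated above.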
After the DMF model is trained, we obtain the cosine similarity of u and i by inputting the target user u and the target item i into the model, i.e., the normalized prediction score $sim_{DMF}$ (3-13). Since the normalized cross-entropy loss function of the model normalizes the ratings and considers the bias term, the predicted rating PR2 of the target user u for item i is obtained from $sim_{DMF}$ and max(rate), where max(rate) indicates the maximum rating in the recommender system, e.g., if the highest user rating in MovieLens is 5, then max(rate) = 5. Rating predictions are made for each target user, and the recommendation list is produced by ranking the predicted ratings. The SRKG algorithm can be summarized as the following process.
1) Retrieve the corresponding entity and its triples from DBpedia by the name of the item.
2) Train TransE on the triples to learn the embedding representation, which yields a low-dimensional vector for each item.
3) Compute the knowledge graph-based item similarity from the low-dimensional item vectors and the rating-based item similarity from the rating information, and fuse the two similarities.
4) Use the fused item similarity to calculate the prediction score PR1 with k-nearest neighbors.
5) Fill the predicted scores PR1 into the input vectors p i , q j and train the DMF model.
6) After the model is trained, obtain the prediction results PR2 and the recommendation results by inputting the target users and items into the model.

IV. SEMANTIC RECOMMENDATION ALGORITHMS FOR FORMAL CONCEPT ANALYSIS

A. STRONG CONCEPT CALCULATION METHODS
Following the FCA of Yuncheng Jiang et al. in semantic retrieval [26] and the semantized FCA [23], for fuzzy formal concepts C(ϕ(E), I) the main purpose is to determine the closest set of items, and we also include semantic information in the calculation of the fuzzy concept area. For item i, the fuzzy concept area is computed from the affiliation µ of each object in the extent ϕ(E), which follows from (2-3), and the similarity sim(i, i′) of two items, which follows from (3-7).
L. Wei, X. Zhu: Semantic Recommendation Model via Fusing Knowledge Graph and Formal Concept Analysis
To find the set of nearest neighbors of an object or item, we need to exclude from the concept lattice the concepts that contain only one item or object. For example, in the concept lattice of Figure 2, there are three fuzzy concepts that contain item b. To find the nearest neighbors of item b, we first exclude the concept C 1 = {(1/1.0, 3/0.7, 4/1.0, 6/0.8), (b)}. For the remaining two concepts, let the similarity between attributes be 1 for the sake of the example. Concept C 2 = {(4/1.0, 6/0.8), (a, b)} has the largest concept area, S′ = (1 + 0.8) * (1 + 1) = 3.6, so C 2 is the strong concept of item b.
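The strong-concept selection can be sketched as below. The area formula is inferred from the worked example, S′ = (1 + 0.8) * (1 + 1) = 3.6: the sum of the object affiliations in the extent multiplied by the sum of the similarities between the target item and each item in the intent. This reading, the function names and the data layout are assumptions, and the sketch only filters out single-item concepts.

```python
from typing import Callable, Dict, List, Optional, Tuple

Concept = Tuple[Dict[int, float], Tuple[str, ...]]  # (extent with affiliations, intent)


def concept_area(extent: Dict[int, float], intent: Tuple[str, ...], item: str,
                 sim: Callable[[str, str], float]) -> float:
    """Assumed fuzzy concept area for `item`, inferred from the worked example."""
    return sum(extent.values()) * sum(sim(item, j) for j in intent)


def strong_concept(item: str, concepts: List[Concept],
                   sim: Callable[[str, str], float]) -> Optional[Concept]:
    """Pick the largest-area concept containing `item` with more than one item."""
    candidates = [c for c in concepts if item in c[1] and len(c[1]) > 1]
    return max(candidates,
               key=lambda c: concept_area(c[0], c[1], item, sim),
               default=None)


# The three concepts containing item b that are quoted in the text.
C1: Concept = ({1: 1.0, 3: 0.7, 4: 1.0, 6: 0.8}, ("b",))
C2: Concept = ({4: 1.0, 6: 0.8}, ("a", "b"))
C3: Concept = ({1: 1.0, 3: 0.5}, ("b", "c"))


def unit_sim(i: str, j: str) -> float:
    """All item similarities set to 1, as in the example."""
    return 1.0
```

With `unit_sim`, C 2 has area 3.6 and the concept over (b, c) has area 3, so `strong_concept("b", [C1, C2, C3], unit_sim)` selects C 2.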

B. STRONG CONCEPT-BASED SCORING PREDICTION METHODS
From the above, we can calculate a strong concept for each item. The intent I of a strong concept is the set of nearest items of that item. For example, if the concept C 2 is a strong concept of item b, then the intent (a, b) is the set of nearest items of item b, and a is the nearest item of item b.
With this set of nearest neighbors, we can improve (3-8). The improved prediction PR1 FCA keeps the form of (3-8), including the offset $+\bar{r}_u$, where I (C i ) denotes the intent of the formal concept C i . Compared to (3-8), this equation differs mainly in the selection of nearest neighbors: S(i, k) is replaced with I (C i ) ∪ S(i, k), which solves the problem that only item factors are considered in the SRKG semantic recommendation algorithm. We also compare the effectiveness of this substitution experimentally in Section V.

C. HEURISTIC CONCEPT CONSTRUCTION ALGORITHM
Integrating FCA into semantic recommendation solves the problem that the SRKG algorithm lacks consideration of user factors. However, to implement FCA-based rating prediction, we need to construct concept lattices, whose construction has extremely high time complexity. We therefore propose a heuristic concept construction (HCC) algorithm that heuristically constructs a strong concept for each item.
For a fuzzy formal background K ′ α = (O, A, V , R = ϕ(O × A)) processed by affiliation thresholding, Algorithm 2 heuristically finds the strong concept of any attribute (item) a ∈ A. Taking the processed fuzzy background in Table 2 and item b as an example, we describe the operation of Algorithm 2. We first obtain the extent b * = {1, 3, 4, 6} of b; user 5 is filtered out by the * operation because its affiliation is lower than the affiliation threshold (α = 0.5). For convenience of demonstration, we set the similarity between attributes in the intent to 1. After combining each object in the extent with b in turn, we find that users 1 and 4 yield the largest concept area, a candidate area of 2. We randomly select user 1 or 4 as the candidate user and continue adding the remaining objects to the extent. If user 1 is chosen as the candidate user, we end up with the strong concept {(1/1.0, 3/0.5), (b, c)} with an area of 3. If user 4 is chosen, we get the above concept C 2 = {(4/1.0, 6/0.8), (a, b)} with an area of 3.6.
The heuristic algorithm does not always give the optimal result, but it aims to make the product of the extent and intent terms as large as possible.
Finally, we apply the HCC algorithm to the FCA-based rating prediction in place of concept lattice generation, which reduces the complexity of the algorithm.
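The walk-through above can be reproduced with a small greedy sketch. It is not Algorithm 2 itself but an interpretation of the description: seed the extent with the single user that gives the largest area, then keep adding users from the item's extent while the area grows. The `CONTEXT` table is hypothetical and was chosen to agree with the values quoted in the example; the `start` parameter stands in for the random choice between tied seeds, and single-item intents are given area 0.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical thresholded-by-alpha fuzzy background: user -> {item: affiliation}.
CONTEXT: Dict[int, Dict[str, float]] = {
    1: {"b": 1.0, "c": 1.0},
    2: {"a": 0.6, "b": 0.4, "c": 0.8, "d": 0.7},
    3: {"b": 0.7, "c": 0.5},
    4: {"a": 1.0, "b": 1.0},
    5: {"b": 0.3, "d": 0.9},
    6: {"a": 1.0, "b": 0.8},
}


def hcc(ctx: Dict[int, Dict[str, float]], item: str, alpha: float,
        sim: Callable[[str, str], float],
        start: Optional[int] = None) -> Tuple[List[int], Set[str], float]:
    """Greedy sketch: returns (extent, intent, area) of a strong concept of `item`."""
    attrs: Set[str] = set().union(*ctx.values())
    candidates = [o for o in ctx if ctx[o].get(item, 0.0) >= alpha]  # item*
    if not candidates:
        return [], set(), 0.0

    def intent_of(objs: List[int]) -> Set[str]:
        return {a for a in attrs
                if all(ctx[o].get(a, 0.0) >= alpha for o in objs)}

    def area(objs: List[int]) -> float:
        intent = intent_of(objs)
        if len(intent) < 2:  # single-item concepts are excluded
            return 0.0
        mu_sum = sum(min(ctx[o][a] for a in intent) for o in objs)
        return mu_sum * sum(sim(item, a) for a in intent)

    seed = start if start is not None else max(candidates, key=lambda o: area([o]))
    extent = [seed]
    best = area(extent)
    improved = True
    while improved:  # keep adding users while the area increases
        improved = False
        for o in candidates:
            if o not in extent and area(extent + [o]) > best:
                extent.append(o)
                best = area(extent)
                improved = True
    return extent, intent_of(extent), best
```

Calling `hcc(CONTEXT, "b", 0.5, lambda i, j: 1.0, start=4)` ends with users 4 and 6 over (a, b), and `start=1` ends with users 1 and 3 over (b, c), which mirrors the two outcomes of the example and shows why the result depends on the seed.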

D. SRKGFCA SEMANTIC RECOMMENDATION ALGORITHM
By using the intent of the strong concept of an item as the nearest neighbors of that item, we set the number of nearest neighbors dynamically through the clustering property of formal concepts. For the semantic recommendation algorithm, the number of nearest neighbors of each item is not fixed, so setting it dynamically through FCA can improve the efficiency of rating prediction to some extent. We evaluate the accuracy with comparative experiments in Section V. To solve the problem that the concept lattice is difficult to generate for large-scale datasets, we propose a heuristic method to construct concepts and a method to measure the area of fuzzy concepts, which reduces the complexity of the algorithm to a great extent. By incorporating the heuristic concept construction algorithm into SRKGFCA, our algorithm can be executed on large-scale datasets. The SRKGFCA algorithm can be summarized as the following process.

1) Perform the same preprocessing as in the SRKG algorithm, compute the two similarities, and fuse them.
2) Construct a fuzzy background using the rating matrix R.
3) Generate a strong fuzzy concept for each item using the heuristic concept construction algorithm HCC.
4) Use the intent of the strong concept of an item as the nearest neighbor set of that item and make the PR1 FCA score prediction.
5) Use PR1 FCA in place of PR1 in SRKG for the same input vector filling operation and DMF model training, and finally obtain the prediction score PR2 and the recommendation results.
In this way, FCA solves the problem of ignoring the user factor in the SRKG algorithm, and the heuristic generation of strong concepts allows the algorithm to run on large-scale datasets. The resulting change in recommendation accuracy of the improved SRKGFCA algorithm is evaluated in the experiments.

V. EXPERIMENT AND RESULT ANALYSIS

A. EXPERIMENTAL DATA
In our experiments, we use the two standard movie datasets in Table 3 (MovieLens 100K (ML100K) and MovieLens 1M (ML1M)) to test the rating prediction error and recommendation effect of the proposed algorithm. ML100K is a dataset with a relatively small data volume and ML1M is a dataset with a relatively large data volume.
In this paper, DBpedia is selected as the knowledge graph of the experiment, and the version number of DBpedia is 2019-8-30. We used DBpedia Spotlight [27] to map the movie name data contained in the ML100K and ML1M datasets into DBpedia. Although all movie name information has been integrated into the DBpedia knowledge graph [5], there are cases where movie names cannot be matched to the knowledge graph. The main reasons are as follows: 1) the movie name is missing in the ML100K and ML1M datasets and the movie is marked as 'unknown'; 2) the movie name contains special characters, such as ' 1 3 '; 3) movies are released in different years. In the end, we obtain 1621 movie entities corresponding to DBpedia for ML100K and 3892 movie entities for ML1M. After obtaining the corresponding movie entities in DBpedia, we extract and filter the triples (head entity, relation, tail entity) of the 1621 and 3892 movie entities in the ML100K and ML1M datasets. All relations that appear fewer than 10 times in the triple set are removed. This leaves about 20 semantic relations of the knowledge graph, such as 'dbo:wikiPageWikiLink' and 'dbo:starring', which indicate relations such as categories and actors. The triple information that we finally use for knowledge graph representation learning is shown in Table 4.

B. BENCHMARK AND EVALUATION CRITERIA
We adopt the evaluation metrics used in the literature [17], [28], [29], which are computed with the following formulas,
where hit(i) of (5-1) indicates whether the target item appears in the Top-k recommendation list, 1 if it exists and 0 if it does not; p i of (5-2) indicates the position of the target item in the recommendation list, with p i → ∞ if the target item does not exist in the list; both metrics take values in the range [0,1]. y, k in (5-3) and (5-4) denote the predicted and true scores, respectively, and n denotes the number of samples in the test set.
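The four metrics can be sketched as follows. The ranking metrics are assumed to be the hit ratio and NDCG as used in [17], with a per-sample gain of 1 / log2(p_i + 1), which tends to 0 as p_i → ∞; MAE and RMSE take their standard forms. The function names and the use of `None` for a target item missing from the Top-k list are illustrative choices.

```python
import math
from typing import Optional, Sequence


def hit_ratio(positions: Sequence[Optional[int]]) -> float:
    """Share of test cases whose target item appears in the Top-k list."""
    return sum(1 for p in positions if p is not None) / len(positions)


def ndcg(positions: Sequence[Optional[int]]) -> float:
    """Average of 1 / log2(p + 1) over test cases; a missing item contributes 0."""
    return sum(1.0 / math.log2(p + 1) for p in positions if p is not None) / len(positions)


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and true ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))
```

Positions are 1-based, so a target item ranked first contributes a gain of 1 to NDCG, and both ranking metrics stay within [0, 1].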

1) SEMANTIC VALIDATION EXPERIMENTS FOR FUSED KNOWLEDGE GRAPHS
We fuse the rating-based item similarity with the semantic similarity of the knowledge graph by a fusion factor α, which increases from 0 to 1 in steps of 0.1, i.e., from using only the semantics of the knowledge graph for recommendations to using only collaborative filtering for recommendations. The graph embedding dimension is 200 and default values are used for all other parameters. We run the SRKG algorithm on the ML100K and ML1M datasets (with a nearest neighbor k value of 20) and obtain the experimental results in Figure 3 and Figure 4. As seen in Figure 3 and Figure 4, both MAE and RMSE first decrease as the fusion factor α increases and then rise again. For ML100K, the lowest error is reached at α = 0.6, while for ML1M, the lowest error is reached at α = 0.5. The reason is that ML1M has more rating data, so its rating-based item similarity is more accurate than that of ML100K and the best fusion factor differs between the two datasets. It is worth noting that both errors are relatively larger when α = 0 or 1, which shows the effectiveness of merging the two similarities. In the following experiments, we choose 0.5 as the fusion factor for ML1M and 0.6 as the fusion factor for ML100K.

2) INTEGRATED FCA VALIDATION EXPERIMENTS
To calculate MAE and RMSE, we use two prediction scores, PR1 and PR1 FCA . To find the best value for nearest neighbor k, for the ML100K dataset, k is set from 1 to 100 with a step size of 10. For the ML1M dataset, k is set from 1 to 200 with a step size of 20. Finally, we obtain the experimental results in Figure 5 and Figure 6 below.
As seen in Figure 5, the smallest MAE and RMSE are obtained for k = 30 on the ML100K dataset. The figure shows that SRKGFCA has lower error values than the SRKG algorithm in all cases for the same k. In the extreme case of k = 1, the error of SRKGFCA remains at a low level, which indicates that SRKGFCA gives better recommendations under 'cold start'. Extreme values of k often occur in practical applications, so the algorithm is better able to deal with extreme cases after integrating FCA. We can therefore conclude that the integrated FCA effectively solves the problem of the SRKG algorithm ignoring user factors. In particular, the significant improvement of the SRKGFCA algorithm in terms of RMSE in the extreme case (k = 1) is due to the fact that the algorithm finds the set of nearest neighbors of each item more accurately instead of using a fixed k nearest neighbors for every item.
For the ML1M dataset with a large amount of data (see Figure 6), the two algorithms achieve the lowest error at k = 40, and the SRKGFCA algorithm has a lower error, confirming the conclusions drawn from Figure 5. In general, the SRKGFCA algorithm shows better recommendation results and confirms the effectiveness of the integrated FCA.

3) COMPARISON EXPERIMENTS AND ANALYSIS OF RESULTS
To verify the effectiveness and rationality of SRKG and SRKGFCA, we experimentally compared them with the current recommendation algorithms, which performed well on the MovieLens dataset.Benchmark methods are as follows ItemKNN [28] : Item-based collaborative filtering recommendation algorithm proposed by Rosing et al. By integrating the score similarity into the classic Amazon item collaborative filtering recommendation algorithm [17], this algorithm achieves good recommendation effect and is one of the benchmark comparison algorithms for most current recommendation algorithms.
UserKNN+FCA [21]: A user-based collaborative filtering recommendation algorithm based on FCA and the concept lattice, proposed by Zou C. et al. in 2015. By integrating FCA into collaborative filtering, the authors mine the neighbor relationships between users more effectively. It is a strong algorithm among recent FCA-based recommendation methods, but it cannot be run on large-scale datasets. It is worth noting that, compared with recommendation algorithms based on knowledge graphs or deep matrix factorization, FCA-based recommendation performs relatively poorly. Therefore, only the representative UserKNN+FCA [21] algorithm is selected in this section, rather than other FCA-based algorithms [22].
NeuMF [17]: He et al. proposed the deep matrix factorization model NeuMF in 2017, which combines probabilistic computation with a multi-layer perceptron. The method decomposes the rating matrix through deep learning to achieve high-quality semantic recommendation. It is currently the reference model for most deep matrix factorization models.
DMF [18]: Xue et al. proposed an improved deep matrix factorization model that uses a normalized cross-entropy loss to address the shortcomings of NeuMF in handling user ratings. It is a strong algorithm among recent implicit semantic recommendation algorithms, and it is also the model used in this paper to mine implicit semantics.
TransHR+SVD [20]: A semantic recommendation algorithm implemented by Yuan Quan et al. in the past two years using the TransHR model and the SVD matrix factorization model. It introduces external semantic knowledge through knowledge graph representation learning and then uses matrix factorization to mine implicit semantics, achieving good recommendation results.
MC+DMF [19]: An implicit semantic recommendation algorithm implemented by Shi Jiarong et al. in the past two years using a matrix completion method and the DMF model. The sparse user rating matrix is first filled to reduce its sparsity, and deep matrix factorization is then applied, yielding better recommendation results.
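As a rough illustration of the item-based neighbor selection that an ItemKNN-style baseline relies on, the sketch below predicts a rating as a similarity-weighted average over the k most similar items the user has rated. It is an assumption-laden sketch: the cosine similarity over co-rating users, the data layout, and the function names are illustrative choices and are not taken from [28].

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two items over the users who rated both."""
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    dot = sum(ratings_a[u] * ratings_b[u] for u in shared)
    norm_a = math.sqrt(sum(ratings_a[u] ** 2 for u in shared))
    norm_b = math.sqrt(sum(ratings_b[u] ** 2 for u in shared))
    return dot / (norm_a * norm_b)

def predict_rating(user, item, item_ratings, k=30):
    """Similarity-weighted average of the user's ratings on the k nearest items.

    `item_ratings` maps item -> {user: rating} (hypothetical layout,
    ratings assumed to be positive). Returns None when the user has
    rated no item similar to the target.
    """
    neighbors = []
    for other, ratings in item_ratings.items():
        if other != item and user in ratings:
            sim = cosine_similarity(item_ratings[item], ratings)
            if sim > 0:
                neighbors.append((sim, ratings[user]))
    neighbors.sort(reverse=True)  # most similar items first
    top = neighbors[:k]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)
```

Note that the neighbor set here is determined by item similarity ranking alone, which is the one-sided selection that the FCA-based methods aim to improve on.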
For training the DMF model, the number of hidden layers is set to 2 for the ML100K dataset and 3 for the ML1M dataset, the maximum number of training iterations is 500 in each case, and the other hyperparameters are the default parameters of the DMF model. The experimental results can be found in Table 5.
As shown in Table 5, the SRKG and SRKGFCA algorithms proposed in this paper, which combine external knowledge graph semantics with internal implicit semantics, achieve excellent recommendation performance. Note that MAE and RMSE are error indicators, for which lower values indicate better recommendation results, while HR and NDCG are recommendation accuracy indicators, for which higher values indicate better results. In particular, compared with several existing semantic recommendation algorithms (NeuMF, DMF, TransHR+SVD, MC+DMF), SRKGFCA achieves the best results on both datasets, except on RMSE, where it shows little advantage. The reason for the lack of advantage in RMSE is that some movie titles are not matched in DBpedia and some movie entities correspond to too few triples.
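For readers unfamiliar with the two accuracy indicators, the following sketch computes HR and NDCG for a single test case. It assumes the common leave-one-out convention with one held-out relevant item per user and a top-n cutoff; the section does not state the evaluation protocol, so this convention and the names used are assumptions.

```python
import math

def hit_ratio(ranked_items, target, n=10):
    """1.0 if the held-out item appears in the top-n list, else 0.0."""
    return 1.0 if target in ranked_items[:n] else 0.0

def ndcg(ranked_items, target, n=10):
    """NDCG with a single relevant item: 1 / log2(position + 1), position 1-based."""
    top = ranked_items[:n]
    if target not in top:
        return 0.0
    rank = top.index(target)  # 0-based position in the list
    return 1.0 / math.log2(rank + 2)
```

Averaging these per-user values over all test users gives the dataset-level HR and NDCG; both grow as the held-out item is ranked higher, which is why higher values indicate better recommendations.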
Our experimental results with UserKNN+FCA and ItemKNN on the ML100K dataset confirm the effectiveness of previous work on FCA-based improvements to collaborative filtering recommendation algorithms. The main improvement stems from the fact that the k-nearest neighbor algorithm considers only the item factor in nearest neighbor selection and simply determines the nearest-neighbor set of an item through item similarity ranking. In contrast, FCA-based nearest neighbor determination clusters users and items simultaneously, improving both the rating prediction error (MAE, RMSE) and the accuracy of the recommendation results (HR, NDCG). Since UserKNN+FCA cannot build a concept lattice for the ML1M dataset in a short time, we did not test the four metrics for this algorithm on that dataset.
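The simultaneous clustering of users and items can be made concrete with a toy formal context. The sketch below enumerates formal concepts, i.e., pairs (extent, intent) in which the extent is exactly the set of users who share all items of the intent and vice versa. It is a naive, exponential-time enumeration for illustration only, not the heuristic construction used in this paper; the context and all names are hypothetical.

```python
from itertools import combinations

def common_items(users, context):
    """Items rated by every user in `users` (all items if `users` is empty)."""
    result = set().union(*context.values())
    for user in users:
        result &= context[user]
    return result

def common_users(items, context):
    """Users who have rated every item in `items`."""
    return {user for user, rated in context.items() if items <= rated}

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every item subset."""
    all_items = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_items) + 1):
        for subset in combinations(all_items, size):
            extent = common_users(set(subset), context)
            intent = common_items(extent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Toy context: user -> set of movies the user has rated.
toy_context = {
    "u1": {"m1", "m2"},
    "u2": {"m1", "m2", "m3"},
    "u3": {"m3"},
}
toy_concepts = formal_concepts(toy_context)
```

In this toy context, the concept with extent {u1, u2} and intent {m1, m2} groups two users together with the two movies they share, so a neighbor set drawn from it reflects user and item factors at once.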
Moreover, comparing the implicit semantic-based recommendation algorithms (NeuMF, DMF) with the knowledge graph-based algorithm (TransHR+SVD), we can conclude that the implicit semantic-based approaches are better at ranking-based recommendation: their HR and NDCG values are relatively high, which means that the content they recommend is more consistent with user preferences. Knowledge graph-based methods, in contrast, are better at the rating prediction task, since their rating errors (MAE and RMSE) are relatively low. Comparing DMF and NeuMF with MC+DMF, SRKG, and SRKGFCA shows that effectively filling the model input vectors increases recommendation accuracy, i.e., it mitigates the problem of deep neural networks falling into local optima during training. Because SRKG and SRKGFCA combine the advantages of both types of algorithms, they perform excellently in rating prediction and also perform well in ranking-based recommendation tasks.
Table 5 also reveals a clear shortcoming of our proposed semantic recommendation algorithm when its MAE and RMSE are compared with those of the TransHR+SVD algorithm on the two datasets. On both datasets the MAE is not significantly improved, and on the ML1M dataset the RMSE is even worse than that of the TransHR+SVD algorithm. When analyzing the triples from DBpedia, we found that the main reason is that some of the movies in the data have few or no triples.
To check whether the lack of significant improvement in MAE and RMSE is due to the small number of triples for some movies, we deleted the records of movies with fewer than 10 occurrences in the DBpedia triples from the ML100K and ML1M datasets, which left 1544 movie records in ML100K and 3460 movie records in ML1M. We then reran the SRKG and SRKGFCA algorithms and found that both MAE and RMSE were reduced, as shown in Table 6.
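The filtering step described above can be sketched as follows. The sketch assumes that ratings are (user, movie, rating) tuples, that knowledge graph triples are (head, relation, tail) tuples, and that movie identifiers have already been linked to DBpedia entities; these layouts and the function names are assumptions made for illustration.

```python
from collections import Counter

def count_triples_per_entity(triples):
    """Count, for each entity, the triples in which it appears as head or tail."""
    counts = Counter()
    for head, _relation, tail in triples:
        for entity in {head, tail}:  # count a triple once per distinct entity
            counts[entity] += 1
    return counts

def filter_ratings(ratings, triples, min_triples=10):
    """Keep only ratings whose movie occurs in at least `min_triples` triples."""
    counts = count_triples_per_entity(triples)
    return [(user, movie, rating)
            for user, movie, rating in ratings
            if counts[movie] >= min_triples]
```

With `min_triples=10`, movies that are missing from the knowledge graph, or that appear in only a handful of triples, are dropped before the algorithms are rerun.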
After this processing, the MAE and RMSE values are further reduced, which verifies that missing movie entities, or movie entities with too few corresponding triples, degraded the recommendation performance of the SRKG and SRKGFCA algorithms in the above experiments. In addition, we find that the main reason for the missing movie entities and the sparse triples is that this work uses a public encyclopedic knowledge graph instead of a domain knowledge graph such as the one constructed by the authors of the TransHR+SVD algorithm. Exploring more accurate knowledge graph representation methods and constructing domain knowledge graphs is therefore our main research direction for the future. Meanwhile, the semantic recommendation algorithm proposed in this paper still leads in recommendation accuracy as measured by HR and NDCG.

VI. CONCLUSION
In this paper, we propose a semantic recommendation algorithm based on a knowledge graph and formal concept analysis (FCA) to address the shortcomings of semantic recommendation algorithms based on collaborative filtering. We address the problem that collaborative filtering considers only user factors or only item factors when selecting nearest neighbors by using formal concepts, which cluster objects (users) and attributes (items) simultaneously, and thereby further reduce the rating prediction error. At the same time, we address the problem that the concept lattice of FCA cannot be generated for large-scale datasets. Building on previous work, we propose a method to calculate the fuzzy concept area and a heuristic to construct formal concepts, so that the FCA-integrated semantic recommendation algorithm in this paper can run on different datasets. We also selected several representative recommendation algorithms as benchmark methods for comparison and verified that our proposed semantic recommendation algorithm based on a knowledge graph and FCA achieves the best results on various metrics. Our future work will address how to use FCA effectively for better nearest neighbor discovery, how to construct an intra-domain knowledge graph effectively, e.g., a knowledge graph for the movie domain, and how to use deep neural networks effectively for representation learning on large-scale knowledge graphs to further improve the accuracy of semantic recommendations.