Topic and knowledge-enhanced modeling for edge-enabled IoT user identity linkage across social networks

The Internet of Things (IoT) spawns increasingly diverse social platforms and online data at the network edge, propelling the development of cross-platform applications. To integrate cross-platform data, user identity linkage is envisioned as a promising technique that detects whether accounts from multiple social networks belong to the same identity. However, the profile and social relationship information of IoT users may be inconsistent, which deteriorates the reliability of identity linkage. To this end, we propose a topic and knowledge-enhanced model for edge-enabled IoT user identity linkage across social networks, named TKM, which represents features of user-generated content at both the post level and the account level. Specifically, a topic-enhanced method is designed to extract features at the post level. Meanwhile, we develop an external knowledge-based Siamese neural network for user-generated content alignment at the account level. Finally, we show the superiority of TKM over existing methods on two real-world datasets. The results demonstrate the improvement in prediction and retrieval performance achieved by utilizing both post-level and account-level representations for identity linkage across social networks.


Introduction
The exponential growth of the Internet of Things (IoT) and mobile edge computing (MEC) empowers social networks [1], infusing social media posts with dynamic and diverse characteristics [2,3]. Concurrently, the number of social platforms centered around IoT devices is steadily increasing [4,5], and approximately 80% of Internet users register multiple accounts on different social networks to access various online services [6]. Social networks are progressively meeting users' escalating demands for self-promotion with IoT devices through cutting-edge media, such as 3D images and augmented reality videos, which impose heavy computational demands [7][8][9]. The evolution of MEC meets users' real-time needs by offloading tasks to nearby nodes, enhancing the responsiveness and interactivity of MEC applications [10,11]. For instance, with edge AI, users can swiftly summarize video content, edit photos, and optimize content using prompts [12]. Additionally, MEC applications can analyze user behavior in real time, providing personalized services and recommendations [13]. For example, leveraging location data from users' IoT devices (e.g., vehicles or gaming devices) [14][15][16], social network applications can suggest nearby activities, businesses, or friends [17]. Thus, the diversity of platforms and online data brought by IoT and MEC applications presents huge potential for cross-platform applications, such as analysis of social network structure [18][19][20], cross-domain topic detection [21], and multi-layer rumor influence minimization [22,23]. These applications are hungry for the comprehensive amalgamation of user data from diverse social networks [24,25]. However, due to the heterogeneity of cross-platform online data and the diversity of posts produced by the myriad IoT devices, integrating a user's separate data from diverse social networks is challenging.
In light of this, as illustrated in Fig. 1, cross-social-network identity linkage is envisioned as a promising technique to amalgamate separated IoT user data for the construction of comprehensive social profiles [26], serving as a vital prerequisite of the above cross-platform applications. Driven by MEC, identity linkage can operate on edge nodes for real-time cross-network product recommendations and advertising placements, enhancing user experience and delivering economic value [27]. In particular, driven by the diversity of user attributes (e.g., user profiles, social relationships) and data generated by edge-enabled IoT users, a complete view of an IoT user's characteristics can be modeled to identify accounts across multiple social networks [28,29]. Some studies have used users' attribute information for identity linkage, such as social relationships [30][31][32] and user profiles [33][34][35]. However, users often make their posts public while keeping their social relationships private, and their profiles can change dynamically. From the User Generated Contents (UGCs) of edge-enabled IoT users, a variety of user features can be extracted (e.g., writing style, spatial-temporal features) without these issues. From the perspective of UGCs [6,36], e.g., posts, tweets, and publications, capturing correlations between posts can characterize user behavior with low acquisition difficulty, in contrast to user profiles and social relationships [37,38].
Although using UGCs for identity linkage reduces the inconsistency of accessing user data, accurately modeling IoT user features remains challenging because of cross-platform distribution disparities and the abundant semantic information of UGCs (e.g., text). First, since latent semantic information contributes to the similarity of UGCs, it is necessary to find hidden correlations among different semantic features. Users may post texts with different content but describing the same event on different social networks. Meanwhile, because of the extensive social network data and complex natural language semantics, it is important to represent different deep semantic information to capture user features and identify the corresponding user accounts without text annotations on multiple social networks. Second, the granularity of individual posts is too limited to calculate the correlation between different accounts [39]. If posts are presented differently in accounts belonging to the same user, post-level comparison may miss the target account, which increases the apparent difference between identities of the same user and degrades the accuracy of identity linkage. Therefore, it is essential to represent macro user characteristics (e.g., account-level features) and let them reinforce the post similarity representation [40]. Furthermore, temporal factors, which play a vital role in the feature representation of UGCs, should be considered.
Fig. 1 Illustration of the user identity linkage task
Thus, in this paper, we propose a topic and knowledge-enhanced edge-enabled IoT user identity linkage model, named TKM. First, topic information enhances the shallow semantic information represented by BiLSTM in post-level feature representation. Then we use an account-level feature representation, which introduces external knowledge alignment to reduce the discrepancy of data distributions among different platforms. When generating similarity distributions at different levels, we use the attention mechanism to incorporate the topic and shallow semantic features at the post level, while using the encoder structure of the Transformer at the account level to incorporate temporal factors. Finally, we evaluate our work with datasets from real social platforms: Twitter, Instagram, and Flickr.
Our contribution is summarized as follows.
• We propose a UGC-based approach named TKM for identity linkage across social networks, incorporating post-level and account-level information to uncover hidden correlations among user features, particularly enhancing semantic information at the post level using topic information.

The organization of this paper is as follows. In "Related work" section, related work is reviewed regarding user identity linkage, topic representation models, and external knowledge bases. "Preliminaries" section introduces basic concepts, definitions, and the problem formulation. In "Methodology" section, a topic and knowledge-enhanced identity linkage method is elaborated. "Performance evaluation" section presents the experimental results. "Conclusions and future work" section concludes the paper.

User identity linkage
Existing works use user profiles, user relationships, UGCs, and combinations of such information for user identity linkage across networks. Traditional methods usually adopt the user's profile information [41,42]. Goga et al. [43] focused on profile attributes for analyzing social network users, and investigated how profile attributes, such as usernames, location, and friends, affect the overall matching reliability. Nevertheless, users' profile information could be fictitious. To characterize users more comprehensively, Zhou et al. [44] addressed the challenges arising from incomplete user information and sparse user pairs by proposing TransLink. This approach utilizes the user's social relationships to generate embedding vectors, which are then projected into a uniform low-dimensional space.
However, in recent years, an increasing number of users are choosing to conceal their social relationships and dynamically update their profiles, which can affect the performance of user identity linkage. Different from the above works, several efforts have been made to address this challenge using users' published content, i.e., UGCs. Generally, UGCs contain rich user characteristics and remain public and unaltered. User features, including events, hobbies, attitudes, and other characteristics, can be inferred by analyzing the textual information within UGCs. Chen et al. [36] considered the textual information of posts and used GloVe and BiLSTM to generate user features; a key observation is that similarity between pairs of user posts in adjacent time periods contributes more to the user similarity distribution. The location information in each post can also generate rich user representations. Based on users' physical presence, Feng et al. [45] proposed an end-to-end deep learning framework that utilizes the spatial-temporal locality of user activities to extract representative features from trajectories. They also demonstrated that network-access-related information can be translated into location, and thus help complete the user identity linkage task. To alleviate the limitation of using absolute locations, Chen et al. [26] proposed HFUL, which generates location information in user posts based on kernel density estimation. Additionally, they developed an index structure over the spatio-temporal data and employed pruning strategies to reduce the search space. With the help of the Bayesian personalized ranking (BPR) framework, Song et al. [46] investigated the relationship between multimodal information and used latent compatibility to unify the different complementary kinds of information. In addition, there exist models that use a multilayer perceptron to fuse the similarity scores of different modalities [47], as well as a model based on adversarial learning that reduces the distances between information distributions across social platforms [48]. When using heterogeneous user information, the effectiveness of integrating different modalities indirectly affects the effectiveness of the model. Moreover, users are gradually becoming more aware of their personal information, and it is increasingly difficult to obtain their multimodal information [49].
However, existing works based on user-generated contents (UGCs) lack a comprehensive representation of textual features, particularly overlooking the latent semantic information embedded within textual content and neglecting the challenge of semantic distribution disparities across networks. Therefore, in this paper, we concentrate on utilizing textual information from UGCs to comprehensively represent user characteristics, specifically delving into latent textual representations.

Topic representation model
In recent years, topic models have achieved prominent success in natural language processing tasks. Topics can be represented using latent variable generation models [50]. For example, Kingma et al. [51] proposed the variational auto-encoder (VAE), using a deep learning model to approximate the probability distribution parameters of the latent vector layer and thereby extract a low-dimensional representation of the latent variables in high-dimensional information. Nan et al. [52] proposed a topic model based on the Wasserstein autoencoder (WAE) structure to address the challenge of distribution matching and avoid the problem of posterior collapse. Furthermore, for the short text posts typically available in social networks, Li et al. [53] clustered the sentiment of comments into a single document and adopted topic information to generate a summary. However, the limitation is that the topic information they used consists of tags given by the user, rather than the latent topic information in the text.
Beyond this, aiming to detect topic information in social networks, Pathak et al. [54] proposed a sentiment analysis model for topic modeling at the sentence level, which used latent semantic indexing constrained by regularization. At the same time, short text posts on social networks usually have an informal style and might contain spelling mistakes, Internet buzzwords, and informal grammar. Kolovou et al. [55] proposed a sentiment analysis framework called Tweester, which incorporates several models, including a topic model, a semantic sentiment model, and a word embedding model, to solve the problems of tweet polarity classification and tweet quantification. In particular, they demonstrated that topic modeling can improve the performance of semantic analysis tasks on informal, short-text posts like tweets.

External knowledge base
Recently, knowledge graphs have attracted increasing research attention as an approach to introducing external knowledge. Lehmann et al. [56] extracted structured knowledge from different language versions of Wikipedia and mapped it to a single shared ontology consisting of different classes and properties, as a combination of different knowledge sources. Beyond this, to explore event-centric knowledge graphs, Sap et al. [57] focused on inferential knowledge, which is expressed in the form of If-Then relations with variables.
Recent developments in language representation have heightened the need for introducing external knowledge. Liu et al. [58] explored knowledge-driven challenges in specific domains by integrating BERT with a knowledge graph. Wang et al. [59] proposed a model named KEPLER to address the challenge of knowledge embedding and pre-trained language representation, which not only integrates factual knowledge into pre-trained language representation models but also generates effective knowledge embeddings. In addition, Sun et al. [60] proposed a contextualized language and knowledge embedding model, named CoLAKE, to reduce the heterogeneity between relevant knowledge contexts and language representations by constructing a word-knowledge graph (WK graph). Moreover, among the approaches that introduce external knowledge to describe the global characteristics of users, Karidi et al. [61] proposed a followee recommendation method that models followers and potential followees based on the same external knowledge and the topics of interest to users.

Preliminaries
In this section, we first introduce the necessary definitions of the identity linkage across social networks and then formulate the research problem.

Basic concepts and definitions
Before introducing our methodology, we define the key terms used in this paper, which are listed in Table 1. Some of these terms are described only in the context of social network SN^X, since they can be defined similarly for social network SN^Y.
Definition 1 (Post-level and account-level representation). Given a social media network SN^X or SN^Y, each user in the network has her own vector space to represent her different characteristics. In our paper, the user vector space consists of a post-level vector representation and an account-level vector representation. For each user, post-level representations focus on the detailed features and connections of each post, whereas account-level representations are coarser-grained and focus on the overall features of the user. More specifically, the post-level representation consists of the BiLSTM-based textual representation and the VAE-based topic representation. The account-level representation refers to the global features of the user account, which are generated by introducing an external knowledge base. Moreover, the BiLSTM-based textual representation captures shallow semantic information, while the topic vector representation captures deep semantic information.

Definition 2 (Identity linkage). Given two different users u_i^X and u_j^Y, we design representation learning models to generate their user vector spaces from UGCs. Thereafter, we aim to determine whether u_i^X and u_j^Y in different social media networks are accounts belonging to the same user identity. If the matching result indicates that u_i^X and u_j^Y do belong to the same user identity, a linkage between them is established, as formalized in Eq. (1).

Problem formulation
Our proposed model tries to tackle two main questions: "Is it possible to determine whether two user accounts refer to the same user identity using only the users' text posts?", and "Can topic information and comprehensive knowledge graph-based user features enhance the shallow semantic information of users for the identity linkage task?". Given two arbitrary social media networks SN^X and SN^Y with their user sets, and without loss of generality, we only use the content of users' textual posts, which is a common component of mainstream social media networks and has the advantage of being easily accessible. Furthermore, the social media networks SN^X and SN^Y in our model can be arbitrary.
Each user u_i^X has G posts. Each post has its own properties: t_g^i refers to the content of the g-th textual post of the i-th user, and p_g^i refers to its timestamp. Two levels of vector representations are generated from these posts: the post-level vector representation level_H and the account-level vector representation level_U. level_H includes two vectors: the textual vector representation of post t_g^i, which carries its shallow semantic information, and the topic latent vector representation z_g^i. level_U corresponds to the account vector representation m_i, which is based on the knowledge graph KG used to perform alignment operations across social media networks.
In addition, our paper focuses on the linkage of user accounts across two social networks, while our model can also be extended to the multi-social-network setting as follows.
Given social media networks SN^X, SN^Y, and SN^Z, if u_i^X and u_j^Y refer to the same user identity, and u_j^Y and u_f^Z also refer to the same user identity, then we can establish a linkage between user u_i^X and user u_f^Z, which represents that u_i^X, u_j^Y, and u_f^Z all belong to the same user identity.

Methodology
In this section, we detail the proposed topic and knowledge-enhanced identity linkage method with attentive modeling. In essence, the purpose of utilizing topic information is to enhance the shallow semantic information of the posts, while the application of an external knowledge base performs alignment of UGCs. Accordingly, we can tackle the user identity linkage task by using different representations from multiple levels.

The overall design of TKM
As illustrated in Fig. 2, our proposed model consists of two key components, post-level representation generation and account-level representation generation, to address the challenges in the problem formulation. In particular, two kinds of information are included in post-level representation learning: one is the information generated with the topic model to represent the deep

Table 1 Key terms and descriptions

Terms | Description
SN^X | The social media network named X
SN^Y | The social media network named Y
u_i^X | The i-th user in SN^X
— | The set of posts of u_i^X
— | The top-K most similar triples of t_g^i
semantic features in the post, and the other is the shallow semantic features generated with the BiLSTM model. Simultaneously, we use the attention mechanism, integrated with temporal post correlation, to fuse the similarity distributions of the two post-level representations. In the account-level representation, we resort to the knowledge graph to obtain the top-K triples for each post, and generate the embedding vectors of their knowledge representations with the help of the attention mechanism.
In particular, the encoder structure of the Transformer is utilized to generate the account representation. Moreover, we use a fusion strategy to combine post-level similarity and account-level similarity.

Post-level vector representation

VAE-based topic latent vector representation
Undoubtedly, the topic is fundamental to the analysis of UGCs in social media, and it is also a significant component of post representation learning. In fact, not all users add topic tags to their posts, so we need to generate topic features from high-dimensional text information. Meanwhile, although each post appears to be independent, users may use multiple posts to describe similar topics. Intuitively, unlike formal articles where sentences are correlated with each other, earlier posts are unlikely to depend on subsequent posts on social networks.
Towards this end, we resort to TodKat [62], which designed a topic model to encode the latent topic vectors of utterances in dialogue. In particular, due to the characteristics of posts in social networks, we propose to use the VAE-based topic representation model with a sequential structure for accurate topic latent representation learning. For simplicity, we omit the superscript i, which refers to the i-th user. To generate the latent topic vector for each post's text content t_g, we use the internal loop structure of z_g to handle time-series information. A topic layer is added to the RoBERTa model Ra, where Ra_φ is the part before the topic layer and Ra_θ is the part after the topic layer [63]. Here, the variational approximate posterior can be calculated as

q_φ(z_g | x_{≤g}, z_{<g}) = N(f_φ^μ(x_g^R), f_φ^σ(x_g^R)),    (2)

where x_g^R refers to the output of Ra_φ(t_g), and f_φ^μ(·) and f_φ^σ(·) correspond to two multilayer perceptrons. More specifically, the multi-headed attention mechanism can be treated as answering the query "which parts of the context in the post cue the latent topic representation". It is worth noting that the multi-headed attention mechanism has been proven to capture features effectively in the Transformer model [64]. Thereafter, we can obtain f_τ as

f_τ = MultiHead(z_{g−1}, x_{g−1}^L, x_{g−1}^L),    (3)

where z_{g−1} is the query, and x_{g−1}^L corresponds to the keys and values. To represent the dependencies between z_{g−1} and z_g in posts, we can represent the prior of z_g as

p_θ(z_g | x_{<g}, z_{<g}) = N(f_γ^μ(f_τ), f_γ^σ(f_τ)),    (4)

where f_γ^μ(·) and f_γ^σ(·) symbolize two multilayer perceptrons similar to those in (2). In fact, a natural posterior p_θ(z_g | x_{≤g}, z_{<g}) of z_g does not exist, so the posterior of z_g is approximated by a neural network, represented as q_φ(z_g | x_{≤g}, z_{<g}). Moreover, we adopt the VAE model to process each post, where we need to reconstruct the post text based on the latent topic vector z_g. Consequently, a language model based on the encoder-decoder architecture can boost the reconstruction of the post and generate the topic more accurately. Accordingly, the process of reconstructing x_g^R from z_g is modeled as

p_θ(x_g^R | x_{<g}, z_{≤g}).    (5)

In addition, following the VAE [51], we can construct the Variational Lower Bound (VLB) L_t as the sum of the reconstruction loss and the regularization loss. In particular, the reconstruction loss represents the similarity of the generated latent topic vector z_g to the post content x_g^R, while the regularization loss refers to the difference between the probability distribution of z_g and the prior probability distribution (i.e., a Gaussian distribution). Thereafter, we can formulate L_t as

L_t = E_{q_φ}[log p_θ(x_{≤G} | z_{≤G})] − D_KL(q_φ(z_{≤G} | x_{≤G}) || p_θ(z_{≤G})),    (6)

where D_KL denotes the KL divergence, and both p_θ(z_{≤G}) and q_φ(z_{≤G} | x_{≤G}) are Gaussian. Thereafter, we can generate a latent topic representation with sequential structure for each post, and we obtain a language model fine-tuned on post content, which is adopted later in the knowledge-based representation.

Fig. 2 The overview of the TKM model, including two components: post-level representation and account-level representation
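As a concrete illustration of the VLB in L_t, the objective combines a reconstruction term with a KL regularizer between two Gaussians. A minimal sketch, assuming diagonal covariances as in standard VAEs (the function names here are illustrative, not part of TKM):

```python
import math

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) between two diagonal Gaussians, summed over dims."""
    kl = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        kl += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return kl

def variational_lower_bound(reconstruction_loss, mu_q, sigma_q, mu_p, sigma_p):
    """L_t-style objective: reconstruction term plus the KL regularizer."""
    return reconstruction_loss + kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p)
```

When the approximate posterior matches the prior exactly, the KL term vanishes and only the reconstruction loss remains, which is the behavior the regularizer is meant to enforce.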

BiLSTM-based textual vector representation
To address the weak semantic information in short texts, we adopt the BiLSTM framework to process users' historical posts. In our work, we regard the text features generated by this method as shallow semantic information. BiLSTM plays a vital role in processing text information due to its explicit modeling of semantic relations within sentences. Despite the tremendous success of applying BiLSTM in natural language processing (NLP) tasks and the identity linkage task [36,65], there is scarce work exploiting the incorporation of shallow semantic information with latent topic information for identity linkage. First, for a user's post, its textual content t_g is composed of Υ words across multiple sentences, which can be represented as t_g = (Word_1, Word_2, ..., Word_Υ). To generate the embedding vector of each word, we utilize Global Vectors (GloVe) [36]. In particular, GloVe is a word embedding model that learns word vectors from the statistics of global lexical co-occurrence, combining the advantages of both global statistical information and local context window approaches. The use of BiLSTM then provides complete modeling of the semantic information of posts. Specifically, for each word Word_υ, υ = 1, ..., Υ, the embedded vector is e_υ ∈ R^{D_e}. The update gate and reset gate are calculated as

u_υ = σ(W_u[e_υ, f_{υ−1}] + b_u),  r_υ = σ(W_r[e_υ, f_{υ−1}] + b_r),

where W_u and b_u are the weight matrix and bias vector of the update gate, W_r and b_r are those of the reset gate, and σ(·) denotes the sigmoid activation function. The memory cell state m_υ and the vector f⃗_υ generated by the forward LSTM can be represented as

m_υ = r_υ ⊙ m_{υ−1} + u_υ ⊙ tanh(W_m[e_υ, f_{υ−1}] + b_m),  f⃗_υ = tanh(m_υ),

where W_m and b_m are the corresponding weight matrix and bias vector, and ⊙ denotes element-wise multiplication. Similarly, we can obtain the backward LSTM vector f⃖_υ. The vector representation generated by BiLSTM for Word_υ can then be expressed as f_υ = con(f⃗_υ, f⃖_υ). Consequently, the BiLSTM-based vector representation containing all the information of the post text can be defined as the sequence (f_1, f_2, ..., f_Υ).
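The bidirectional encoding described above can be sketched in miniature. This is a toy one-dimensional gated recurrent step (the scalar weights `w` and the exact gating form are illustrative assumptions, not TKM's actual BiLSTM); what matters is the pattern: run the cell forward and backward over the word embeddings, then concatenate the two hidden states per word.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def recurrent_step(e, f_prev, w):
    """One toy 1-D gated step (illustrative scalar weights in dict w)."""
    u = sigmoid(w["wu"] * e + w["uu"] * f_prev + w["bu"])      # update gate
    r = sigmoid(w["wr"] * e + w["ur"] * f_prev + w["br"])      # reset gate
    cand = math.tanh(w["wm"] * e + w["um"] * (r * f_prev) + w["bm"])
    return (1.0 - u) * f_prev + u * cand                       # new hidden state

def bidirectional_encode(embeddings, w):
    """Run the step forward and backward, then concatenate per-word states."""
    fwd, h = [], 0.0
    for e in embeddings:
        h = recurrent_step(e, h, w)
        fwd.append(h)
    bwd, h = [], 0.0
    for e in reversed(embeddings):
        h = recurrent_step(e, h, w)
        bwd.append(h)
    bwd.reverse()
    return list(zip(fwd, bwd))  # con(forward, backward) for each word
```

In the real model each state is a vector rather than a scalar, but the forward/backward concatenation that yields con(f⃗_υ, f⃖_υ) is the same.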

Similarity fusion of post-level representations
After feature representation, we incorporate the above two vector representations to generate the post-level similarity distribution. Users are likely to post content with similar topics on different social networks within closely adjacent time periods [66]; intuitively, we therefore need to incorporate a temporal correlation factor when generating similarity distributions. Towards this end, we resort to UserNet [36], with the key modification that the image representation is replaced by the topic representation. In particular, we propose to use the attention mechanism to incorporate the topic latent vector representation z_g with the textual vector representation t_g to generate the post-level similarity distribution. The similarities between the different types of semantic information of the posts of u_i^X and u_j^Y, together with the temporal weights, are computed as follows: S_{g,n}^t and S_{g,n}^z denote the shallow semantic similarity and topic similarity between the g-th post of u_i^X and the n-th post of u_j^Y, and p̂_{g,n} denotes the temporal relevance weight between the posts, where p_g and p_n are their timestamps. Then, Ŝ_{g,n}^t = p̂_{g,n} S_{g,n}^t and Ŝ_{g,n}^z = p̂_{g,n} S_{g,n}^z denote the pair-wise similarities that incorporate temporal factors.
In addition, if the textual features (e.g., word associations) of users' posts on different social networks are the dominant features, more confidence should be given to the shallow semantic information. In fact, because posts in social networks are informal, shallow semantic information alone may not accurately identify the association between users. Intuitively, we need to set different confidences for different representations. Accordingly, the attention mechanism for incorporating the two post-level similarities can be expressed with weight matrices W_t, W_z and bias vectors b_t, b_z, where con(·) denotes the concatenation operation, and α_t, α_z denote the confidences of the different semantic information; the post-level similarity is then calculated as the confidence-weighted combination of Ŝ^t and Ŝ^z. Meanwhile, the post-level similarity distributions are defined over all pairs of posts of u_i^X and u_j^Y, where G and N refer to their total numbers of posts. The post-level similarity space can consequently be denoted as ỹ_H = sigmoid(w^T d + b), and the loss function is the cross-entropy loss, denoted as L_H.
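The temporal weighting and confidence-weighted combination described above can be sketched as follows. The exponential decay form of the temporal weight and the time scale `tau` are assumptions for illustration (the paper does not give the exact formula); the confidences stand in for the attention-derived α_t and α_z.

```python
import math

def temporal_weight(p_g, p_n, tau=86400.0):
    """Hypothetical decay: post pairs closer in time receive a larger weight.
    tau is an assumed time scale (one day, in seconds)."""
    return math.exp(-abs(p_g - p_n) / tau)

def fused_post_similarity(s_text, s_topic, p_g, p_n,
                          alpha_text=0.5, alpha_topic=0.5):
    """Weight both similarities by temporal relevance, then combine them with
    (assumed) attention-derived confidences alpha_text and alpha_topic."""
    w = temporal_weight(p_g, p_n)
    return alpha_text * (w * s_text) + alpha_topic * (w * s_topic)
```

Posts published far apart in time thus contribute little to the pair-wise similarity, matching the intuition that the same user posts similar content in adjacent periods.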

Commonsense knowledge retrieval and embedding
Having obtained the similarity distributions of the different textual features, we can move forward to model the account-level representation. In fact, the same user on different social networks exhibits data distribution disparities; intuitively, without data alignment across social networks, the effectiveness of identity linkage suffers [45]. Towards this end, we introduce an external knowledge base, which has been successfully used to describe user information [67], to perform the alignment of UGCs. The source of external knowledge is the ATOMIC knowledge graph [57], an event-centric knowledge graph, from which we use the If-Event-Then-Mental-State structure (e.g., "If X gives Y a gift, then Y will likely show appreciation"); it has shown promising performance in utterance representation tasks. More specifically, this structure contains three kinds of information: xIntent ε^xI, the likely intents of the event; xReact ε^xR, the likely reactions of the event's subject; and oReact ε^oR, the likely reactions of others. For example, given an event "x gives o a gift", ε^xI could be "x wants to get along with o", ε^xR could be "x feels nervous", and ε^oR could be "o feels grateful".
To retrieve the events most relevant to the textual information t_g, we use the SBERT model [68], which has achieved great success in computing textual semantic similarity. Here, we select the MEAN pooling strategy, which computes the average of all token output vectors.
We denote the most relevant events extracted from the knowledge graph KG as {ε_{g,k}^xI, ε_{g,k}^xR, ε_{g,k}^oR}, k = 1, ..., K, i.e., the top-K most similar triples for the g-th post. We then use the language model Ra, fine-tuned during topic latent vector representation, to generate the embedding vectors of the retrieved knowledge.
Here, we can generate u_g by Ra_CLS(t_g). Moreover, based on the attention mechanism, we generate representations of the posts from the retrieved event triples. Thereafter, the embedding vector R̃_g of a post can be calculated by attending over the retrieved triples. Then, based on the self-attention mechanism, R̃_g is aggregated by event relation types to generate R̂_g.
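The retrieval step above, MEAN pooling followed by a top-K similarity ranking, can be sketched as follows. The cosine metric is a standard assumption for SBERT-style retrieval; vectors here stand in for SBERT sentence embeddings.

```python
import math

def mean_pool(token_vectors):
    """MEAN pooling: average all token output vectors into one sentence vector."""
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(v[d] for v in token_vectors) / n for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_triples(post_vec, triple_vecs, k=3):
    """Return indices of the k knowledge triples most similar to the post."""
    order = sorted(range(len(triple_vecs)),
                   key=lambda i: cosine(post_vec, triple_vecs[i]),
                   reverse=True)
    return order[:k]
```

In practice the triple embeddings can be precomputed once for the whole knowledge graph, so retrieval per post reduces to one similarity pass plus a partial sort.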

Account representation learning
Having generated a knowledge-based representation for each post, we next consider how to use these representations to derive account features. A naive approach is to stack the obtained vectors chronologically into a new matrix. However, this method cannot model the inherent relationships between posts. In fact, the semantic relatedness among different posts plays a pivotal role in account features, and we need to preserve semantic information while incorporating the sequential characteristics of posts. To this end, we embed the knowledge-based vector representations Rg of a user's historical posts using the encoder structure of the Transformer [64], feeding each Rg into the token sequence chronologically. Given the set of Rg of all posts of the user, we embed the sequential factor of posts using the positional encoding [64], which can be calculated as

    PE(pos, 2k) = sin(pos / 10000^(2k/D)),  PE(pos, 2k+1) = cos(pos / 10000^(2k/D)),

where D is the dimension of Rg and pos is the position of the currently processed post. The Transformer encoder is then utilized to derive the account representation vector m_i for the i-th user. In particular, self-attention and multi-head attention explore the semantic connections between posts more effectively.
In addition, we use a Siamese neural network to generate the similarity distributions of different account representations. Accordingly, we formulate the objective function for classification as

    ỹ_U = softmax(W_t [m_i ; m_j]),

where m_i and m_j denote the account representations of the i-th user on social network SN_X and the j-th user on social network SN_Y, respectively, and W_t denotes the weight matrix. The cross-entropy loss function is then used to train the model, denoted as L_U. Finally, the loss function of our identity linkage model is defined as

    L = L_H + L_U.

To generate the final probability ỹ of user similarity, we incorporate ỹ_H and ỹ_U with different strategies, where ỹ_H and ỹ_U are the probabilities that the two accounts belong to the same identity. We experiment with three fusion strategies: the geometric mean of ỹ_H and ỹ_U, the arithmetic mean of ỹ_H and ỹ_U, and the maximum of ỹ_H and ỹ_U. The default configuration is the geometric mean.
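The sinusoidal positional encoding from the Transformer [64] can be sketched as follows (a minimal NumPy sketch; the post count and dimension values are placeholders, not settings from the paper):

```python
import numpy as np

def positional_encoding(num_posts: int, d: int) -> np.ndarray:
    """Sinusoidal positional encoding as in the Transformer.

    Row `pos` encodes the chronological position of the pos-th post;
    even columns use sine, odd columns use cosine.
    """
    pos = np.arange(num_posts)[:, None]         # shape (num_posts, 1)
    k = np.arange(0, d, 2)[None, :]             # shape (1, d/2)
    angle = pos / np.power(10000.0, k / d)      # shape (num_posts, d/2)
    pe = np.zeros((num_posts, d))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

# Each knowledge-based post vector Rg would be summed with its row of `pe`
# before entering the Transformer encoder.
pe = positional_encoding(num_posts=200, d=64)
```

Adding these fixed sinusoids lets the encoder distinguish the chronological order of posts without learned position embeddings.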
The procedure of TKM is summarized in Algorithm 1. TKM features a nested-loop structure for pairwise similarity calculation, with an overall complexity of O(G·N), where G and N denote the numbers of posts of the two users currently being compared.
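The nested-loop structure behind the O(G·N) complexity can be sketched as follows (illustrative only; cosine similarity stands in for TKM's attention-fused post-level similarity, and the function name is ours):

```python
import numpy as np

def pairwise_post_similarity(posts_x: np.ndarray, posts_y: np.ndarray) -> np.ndarray:
    """Compute a G x N similarity matrix between two users' post vectors.

    posts_x: (G, D) post representations of the user on SN_X.
    posts_y: (N, D) post representations of the user on SN_Y.
    The explicit double loop makes the O(G*N) cost visible.
    """
    G, N = len(posts_x), len(posts_y)
    sim = np.zeros((G, N))
    for g in range(G):            # iterate over the first user's G posts
        for n in range(N):        # iterate over the second user's N posts
            a, b = posts_x[g], posts_y[n]
            sim[g, n] = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return sim

sim = pairwise_post_similarity(np.random.rand(5, 8), np.random.rand(7, 8))
```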

Experiment settings
Datasets
To allow comparison with existing methods, two publicly available user identity linkage datasets, TWIN and TWFL, are utilized. Unlike synthetic datasets, the data in these datasets are collected from real social networks, namely Twitter, Instagram, and Flickr. Specifically, each dataset pairs a microblogging platform with an image-sharing platform, with timestamps for each post.
TWIN [36]: The TWIN dataset collects each user's latest 200 posts with timestamps from two heterogeneous platforms, i.e., Twitter and Instagram, based on the mapping pairs obtained from "#mytweet via Instagram". Posts from Instagram contain both text and images, while those from Twitter contain only text. The dataset comprises posts collected from 2009 to 2018. Users with low post counts are excluded.
TWFL [69]: The TWFL dataset collects user pairs between Twitter and Flickr in 2013 by using the "Friend Finder" mechanism, which is available on major social platforms. We used the image URL provided by TWFL for each Flickr post to complete the dataset. Users with low post counts are excluded.
Table 2 shows the details of two original datasets we utilized in our experiments.

Baselines
First, to evaluate the effectiveness of our method for user identity linkage, five baselines are selected as follows.
• DPM [41] is a model based on homogeneous UGCs that treats all text content as a whole when dealing with user posts. In the experiments, DPM merges textual posts together and generates textual representations with Doc2Vec. Representations are then projected to fixed dimensions using principal component analysis, and user similarity is generated using an MLP.
• GLM [45] is a model that considers the temporal factors among posts. In the experiments, textual posts are embedded with GloVe, a word embedding model that learns word vectors from the statistical information of global lexical co-occurrence. A BiLSTM is then employed to generate textual representations, and the similarity between user pairs is generated by an MLP.
• TPA [70] is a topic-aware model based on tBERT. In the experiments, the similarity between user pairs is the average similarity of pairwise user posts, calculated by the topic-informed BERT-based architecture.
• UserNet-T [36] is a model with time-aware similarity generation. In the experiments, only the textual information of posts is considered, and GloVe and BiLSTM are used to generate user features. Specifically, the similarity of pairwise user posts in closely adjacent time periods contributes more to the user similarity distribution.
• UserNet [36] is an extension of UserNet-T. It explores users' image and text features generated by pre-trained models to tackle the user identity linkage task. In addition, it utilizes an attention mechanism to integrate the similarities of different modalities with temporal factors.
In addition, to evaluate the effectiveness of different model components, two derivations of TKM are proposed as follows.
• TKM-NoK removes the account features, which represent the global features of user text, and only uses the other two vector representations for user identity linkage.
• TKM-NoZ analogously restructures TKM by removing the latent topic vector representation.

Evaluation metric
To comprehensively evaluate the effectiveness of TKM, both prediction metrics and ranking metrics are utilized to compare matching and retrieval performance with the baseline models. Specifically, the prediction metrics, including Accuracy, Precision, Recall, and F1-score, evaluate matching performance. For retrieval performance, we use the Hit-precision based on the top-k candidates of the identity linkage results [41], which is defined as follows:

    h(x) = (k − (hit(x) − 1)) / k,

where hit(x) is the position of the correctly linked user in the returned top-k candidate list (h(x) = 0 if the correct user is not in the list). The Hit-precision is then averaged over all user pairs:

    Hit-precision = (1/n) Σ_{i=1}^{n} h(x_i),

where n is the number of user pairs. Hit-precision measures the retrieval performance of identity linkage; it indicates a method's ability to accurately retrieve the users most relevant to a given social network user.
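Under the standard definition of Hit-precision from [41], the metric can be computed as follows (a minimal sketch; the function and argument names are ours):

```python
def hit_precision(hit_positions, k=5):
    """Average Hit-precision over n user pairs.

    hit_positions: 1-based rank of the correct user in each returned
                   top-k list, or None when the correct user is absent.
    h(x) = (k - (hit(x) - 1)) / k for a hit, and 0 for a miss.
    """
    scores = [(k - (pos - 1)) / k if pos is not None and pos <= k else 0.0
              for pos in hit_positions]
    return sum(scores) / len(scores)

# Rank 1 scores 1.0, rank 5 scores 0.2, and a miss scores 0,
# so three pairs with ranks [1, 5, miss] average to roughly 0.4.
score = hit_precision([1, 5, None], k=5)
```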

Implementation details
In our experiments, TKM and the baseline models are run on a server with an Intel i9-12900K CPU (5.1 GHz, 16 cores), 64 GB DRAM, and two NVIDIA RTX 2080Ti GPUs.
Before the experiments, data pre-processing is performed, including removing URLs and processing emojis and tags. We use known matched user pairs as positive samples and randomly generate negative (unmatched) samples based on these known matched pairs. The experiments are then set up as follows. The numbers of positive and negative samples are equal, and the dataset is split into an 80% training set, a 10% validation set, and a 10% test set. The Adam optimizer is utilized with a learning rate of 0.0001. For hyperparameters such as the learning rate and batch size, we use grid search to determine the optimal values. We train our method for 200 epochs on the training and validation sets and evaluate performance on the test set. The result of each experimental instance is reported as the average over 10 independent repetitions.
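The balanced negative sampling and 80/10/10 split described above can be sketched as follows (illustrative only; the pair format, function name, and fixed seed are our assumptions):

```python
import random

def build_samples(matched_pairs, seed=0):
    """Create balanced positive/negative samples and an 80/10/10 split.

    matched_pairs: list of (account_x, account_y) known identity links.
    For each positive pair, one negative is generated by pairing
    account_x with a randomly chosen non-matching account_y, so the
    two classes stay balanced.
    """
    rng = random.Random(seed)
    ys = [y for _, y in matched_pairs]
    positives = [(x, y, 1) for x, y in matched_pairs]
    negatives = []
    for x, y in matched_pairs:
        wrong = rng.choice(ys)
        while wrong == y:                 # ensure the pair is truly unmatched
            wrong = rng.choice(ys)
        negatives.append((x, wrong, 0))
    data = positives + negatives
    rng.shuffle(data)
    n = len(data)
    return (data[: int(0.8 * n)],          # 80% training
            data[int(0.8 * n): int(0.9 * n)],  # 10% validation
            data[int(0.9 * n):])               # 10% test

train, val, test = build_samples([(f"x{i}", f"y{i}") for i in range(10)])
```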

Overall performance
In this section, we compare the performance of TKM against the baselines (i.e., DPM, GLM, TPA, UserNet-T, and UserNet). Tables 3 and 4 present the overall performance of all methods on the two datasets, TWIN and TWFL. The proposed TKM model outperforms the other baselines on both TWIN and TWFL. The experimental results also show that dataset quality affects model performance: both TKM and the baselines perform better on TWIN than on TWFL. For example, Hit-precision (k = 5) is higher on TWIN by 3.38% for DPM, 7.12% for GLM, 2.51% for TPA, 0.77% for UserNet-T, 1.2% for UserNet, 0.5% for TKM-NoK, 3.46% for TKM-NoZ, and 1.14% for TKM. This is mainly because Flickr is aimed at photographers, who focus more on posting professional photos than on sharing their lives, so the hidden semantic relevance of posted content is stronger between Instagram and Twitter than between Flickr and Twitter. For convenience, we adopt k = 5 in the subsequent comparisons of Hit-precision unless otherwise stated.
Group 1, consisting of DPM, GLM, TPA, and TKM-NoK, evaluates the effectiveness of the topic component. As shown in Tables 3 and 4, the average Hit-precision of DPM and GLM, which lack topic information, converges to 0.2140 and 0.3413 on TWIN, respectively. TPA, which takes the average similarity of pairwise posts' topics, outperforms the models without a topic component in this group by at least 18.19% in Hit-precision and 7.15% in accuracy on TWIN. The best-performing model in this group is TKM-NoK, whose average Hit-precision converges to 0.6759, outperforming DPM, GLM, and TPA in retrieval performance by 3.16X, 1.98X, and 1.29X on TWIN, respectively. Moreover, even without external knowledge information, the accuracy of TKM-NoK exceeds that of DPM, GLM, and TPA by 1.42X, 1.25X, and 1.13X on TWIN, respectively. The main reason is that topic representations provide additional signals beyond the semantic information, improving feature extraction for the short texts in posts. Additionally, TKM-NoK, with its integrated attention mechanism and temporal post correlation, comprehensively models semantic correlations.
Group 2, consisting of DPM, GLM, and TKM-NoZ, evaluates the effectiveness of the knowledge component. As shown in Tables 3 and 4, TKM-NoZ, which introduces an external knowledge base to characterize the semantic features of posts, increases both the average accuracy and the Hit-precision. On TWIN, the average Hit-precision of TKM-NoZ is improved by 44.36% and 31.63% compared with DPM and GLM, respectively, and the average accuracy is improved by 22.68% and 14.67%. Therefore, the data distribution across different social platforms is an important factor, and introducing external knowledge can reduce its limiting effect.
Group 3, consisting of DPM, GLM, UserNet-T, and TKM, evaluates the effectiveness of temporal modeling. By using temporal modeling with an attention mechanism, UserNet-T's average Hit-precision converges to 0.5952, outperforming DPM and GLM in retrieval performance by 2.78X and 1.74X on TWIN, respectively. Besides, the accuracy of UserNet-T also exceeds that of DPM and GLM by 1.28X and 1.42X on TWIN, respectively. The reason is that UserNet-T uses temporal modeling of paired posts to generate similarity distributions, while a BiLSTM analyzes the global correlation between posts when extracting semantic information. Similar modeling is utilized in TKM, because users are likely to post similar content or topics on different social networks within closely adjacent time periods. As observed in Tables 3 and 4, TKM outperforms the other three algorithms, converging to 0.7012 in Hit-precision and 0.8601 in accuracy on TWIN. This demonstrates the effectiveness of using the attention mechanism to incorporate topic and shallow semantic features at the post-level and the Transformer encoder structure to incorporate temporal factors at the account-level.
In addition, TKM performs 1.16% and 2.9% better than UserNet in terms of Hit-precision and accuracy on TWIN, respectively. This confirms that the dominant role of textual information in user representations can be further strengthened by exploring multi-level latent semantic information.

The effect of different post counts
In this section, we examine the performance of TKM and the baselines under different post counts. The number of posts affects the completeness of the user representation: in general, a larger number of posts contains a more diverse set of user characteristics. For this evaluation, the post count is set to 60, 90, 120, and 150, with other parameters at their default settings. Table 5 presents the performance of all methods with different post counts on the two datasets, TWIN and TWFL. The Hit-precision of all methods except DPM shows an upward trend as the post count increases from 60 to 150. Real social network posts often carry varying levels of semantic information; consequently, methods relying on basic semantic feature representations, such as DPM and GLM, perform worse than the others. As shown in Table 5, over several experiments the average performance of TPA improves by 11.09% on TWIN, outperforming DPM and GLM by 8.73X and 1.03X, respectively. Meanwhile, the curve of DPM on TWFL fluctuates, with Hit-precision at post counts of 120 and 150 lower than at 90, which indicates that methods based on simple post embeddings are unstable. As for GLM, which effectively models the connections between different posts with a BiLSTM, its Hit-precision shows an upward trend on both datasets.
In addition, TKM-NoZ achieves a larger improvement in Hit-precision as the post count increases from 60 to 90, improving by 12.14% and 10.54% on TWIN compared with DPM and GLM, respectively. This suggests that global text features can capture enough information for identity linkage in settings with few posts. The average performance of TKM, TKM-NoK, and TKM-NoZ improves by 15.78%, 16.46%, and 15.53% on TWIN, respectively. These results demonstrate that methods using topic information benefit more from increasing post counts, while introducing external knowledge improves performance when posts are scarce.

The contribution of different components of TKM
To evaluate the contribution of the latent topic features and the account features, we compare TKM with its two derivations, TKM-NoK and TKM-NoZ. As seen in Table 5, TKM outperforms both variants across all post counts. This section uses post counts of 60 and 150 as examples to evaluate the contributions of the different TKM components. As shown in Table 5, with a post count of 60, TKM achieves 4.11% higher Hit-precision than TKM-NoZ on TWIN and 3.11% higher on TWFL; with a post count of 150, TKM achieves 4.36% higher Hit-precision than TKM-NoZ on TWIN and 6.68% higher on TWFL. These evaluations demonstrate the effectiveness of using latent topic information to enhance the shallow semantic information represented by the BiLSTM in post-level feature representation.
In addition, TKM outperforms TKM-NoK in Hit-precision by 3.21% on TWIN and 1.04% on TWFL with a post count of 60, and by 2.53% on TWIN and 1.89% on TWFL with a post count of 150. Utilizing coarse-grained account-level feature representation thus benefits user identity linkage by reducing the limitations of post-level similarity calculation. Table 5 also shows that TKM-NoK performs better than TKM-NoZ. This suggests that latent topic features, which represent deep semantic information, play a more significant role in user identity linkage than the account representation. External knowledge alignment reduces platform-related disparities in data distribution, but sacrifices the detailed user features that capture deep semantic information.

The effect of different fusion strategies
First, we investigate the contribution of the post-level attention mechanism, whose purpose in our model is to fuse the similarities derived from the different post-level features. We construct TKM-NoA by removing the attention mechanism used in post-level feature similarity fusion, instead taking the average of the post-level similarities and using the Geometric Mean strategy to fuse the similarities generated at different levels. Table 6 shows that the attention mechanism improves the performance of our model. Because posts in social networks are informal, TKM-NoA cannot accurately identify the association between users; intuitively, different representations need to be assigned different confidence.
In addition, for the fusion of post-level and account-level user similarity, we employ three strategies to produce the final user similarity. Throughout all three strategies, all post-level features are retained, and the attention mechanism is preserved during the fusion of these features. The performance of the various similarity fusion strategies is presented in Table 7. We observed that the Max strategy performed significantly worse, while the other fusion strategies performed better with only marginal differences. Furthermore, aside from the Max strategy, the choice of similarity fusion strategy had a minimal impact on model performance compared with the removal of different representations or of the attention mechanism. This indicates that the Max strategy is not suitable for our identity linkage model, and that the choice between the Geometric Mean and Arithmetic Mean strategies matters little. The decisive factors for the model are the different types of user representations and the method by which these representations interact with each other.
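The three fusion strategies compared in Table 7 can be sketched as follows (a minimal sketch; the function name and strategy labels are ours):

```python
import math

def fuse(y_h: float, y_u: float, strategy: str = "geometric") -> float:
    """Fuse the post-level (y_h) and account-level (y_u) match probabilities."""
    if strategy == "geometric":       # the default configuration in TKM
        return math.sqrt(y_h * y_u)
    if strategy == "arithmetic":
        return (y_h + y_u) / 2
    if strategy == "max":
        return max(y_h, y_u)
    raise ValueError(f"unknown strategy: {strategy}")

# Geometric mean penalizes disagreement between the two levels more than
# the arithmetic mean does, while max ignores the weaker signal entirely.
final = fuse(0.64, 0.25, "geometric")
```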

Conclusions and future work
In this paper, we focus on cross-social-network identity linkage using different features generated from edge-enabled IoT users' text posts. In particular, we propose a topic and knowledge-enhanced identity linkage method with attentive modeling, which uses only the textual information of users while combining different levels of information for complete modeling of user characteristics. We also explore the problem of reducing the semantic disparities caused by different data distributions across platforms. The experimental results demonstrate the effectiveness of combining latent topic features with external knowledge bases for the cross-social-network identity linkage problem. When given more textual content published by users, our approach can introduce additional latent semantic signals, enhancing the representational capacity of edge-enabled user information. Furthermore, we find that semantic representation and fusion techniques exert a more significant influence on the model than the similarity fusion strategies. Although using publicly available user posts for identity linkage can enhance data acquisition efficiency, our method does not yet exploit the image information within user posts. In future work, we plan to establish associations between different levels of text and image representations, particularly latent semantic correlations, to further enhance identity linkage performance with the abundant UGCs generated by MEC applications.

Table 2
A brief description of two datasets

Table 3
Comparison with the baselines in terms of accuracy, precision, recall and F1-score (%)

Table 4
Comparison with the baselines in terms of Hit-precision (%)

Table 5
Comparison with the baselines in terms of post numbers (Hit-precision@Top-5) (%)

Table 6
Comparison with different fusion strategies in terms of accuracy, precision, recall and F1-score (%)