A Two-Stage Highly Robust Text Steganalysis Model

Abstract: With the development of natural language processing, deep learning, and related technologies, text steganography is developing rapidly. At the same time, adversarial attack methods have emerged that give text steganography the ability to actively spoof steganalysis. If terrorists use text steganography to spread terrorist messages, social stability will be seriously disturbed. Steganalysis methods, especially those that resist adversarial attacks, therefore need to be further improved. In this paper, we propose a two-stage highly robust model for text steganalysis. The proposed method analyzes and extracts anomalous features at both the intra-sentential and inter-sentential levels. In the first phase, every sentence is transformed into word vectors. To obtain a high-dimensional sentence vector, we use Bi-LSTM to obtain feature information for all words in the sentence while retaining strong correlations. In the second phase, we input multiple sentence vectors into the GNN, from which we extract inter-sentential anomaly features and judge whether the text contains secret messages. In addition, we add adversarial examples to the training set to improve the robustness and generalization of the steganalysis model. Theoretically, our proposed method is more robust and more accurate in detection than existing methods.


Introduction
Steganography is a technique that embeds secret messages into carriers for covert communication. In contrast, steganalysis is used to detect the presence of secret messages in a carrier. The two technologies have always developed together in competition. Steganography methods try to preserve the characteristics of the carrier by making undetectable changes so that no one other than the recipient can perceive the secret messages. Media used for steganography include images, text, audio, and video. With the development of social networking platforms, textual information such as e-books, online news, news reviews, and user data has increased dramatically, which has prompted research on text steganography.
From the perspective of the carrier, text steganography methods can be divided into two main classes: modification-based and synthesis-based. Modification-based methods embed secret messages by modifying the original text. Specific methods include synonym substitution [1][2], syntactic transformation, which changes the sentence structure according to syntax rules [3], and changing the expression according to the original semantic knowledge [4][5]. However, it is difficult for such methods to balance a reasonable and natural stego text against a high embedding capacity. With the rapid development of deep learning in natural language processing, it has become simple to generate large amounts of high-quality text, and researchers use these techniques to synthesize text for steganography [6][7][8][9][10]. Such methods take advantage of the powerful feature extraction capabilities of neural networks to learn the statistical distribution of text. During steganography, they generate text that contains secret information and is consistent with the statistical features of the training text.
The development of social networks has provided a rich medium for text steganography. If terrorists use text steganography to transmit terrorist messages, the security of society would be greatly endangered. Therefore, it is necessary to study text steganalysis methods. Traditional text steganalysis methods [11][12][13][14][15] usually extract hand-crafted features based on domain knowledge and determine whether the text contains secret messages according to whether those features are anomalous. These methods rely on human experience, and the features they extract are limited. As a result, they adapt poorly to increasingly complex text steganography methods. Current deep learning-based text steganalysis methods [16][17][18][19][20][21][22][23] achieve better performance than hand-crafted methods by automatically extracting and learning high-dimensional features.
Neural networks have shown great potential in the field of steganography, but neural network-based text steganalysis still faces many challenges. Most text steganalysis methods analyze whether the text contains secret messages only at the level of individual sentences and ignore inter-sentential correlation features. Text produced by synthesis-based steganography may not exhibit unusual features within sentences but may instead exhibit semantic anomalies between sentences. At the same time, with the application of adversarial examples in steganography [24][25], steganography methods can already actively deceive steganalysis models. Text steganalysis methods need to be further improved to resist adversarial attacks. In this paper, we propose a two-stage highly robust text steganalysis model based on long short-term memory (LSTM) [26] and graph neural network (GNN) [27]. The proposed method analyzes and extracts anomalous features at both the intra-sentential and inter-sentential levels. We generate adversarial examples based on existing neural network-based text steganography methods and add them to the training set to improve the robustness and generalization of the steganalysis model.

Text Steganalysis
In recent years, text hiding in the form of watermarking and steganography has been widely used in covert communication, copyright protection, content authentication, and other fields. In contrast to text steganography, text steganalysis is the process and science of identifying whether a given carrier text contains hidden information. For example, Yang et al. [19] proposed a fast and efficient steganalysis method for stego text generated automatically by neural networks. The correlation between words is mapped to a semantic space, and the hidden layer then analyzes and extracts the correlation between words in the text. Niu et al. [20] introduced one-dimensional asymmetric convolution and a residual module on top of Bi-LSTM [28], which discriminates inter-sentential relations while extracting the semantic information within each sentence. A shortcut block is introduced to alleviate, to some extent, the loss of feature relations between sliding windows caused by the convolution operation, which slightly improves the model. Xiang et al. [21] proposed a steganalysis model based on a two-stage CNN for synonym substitution steganography. It is composed of a sentence-level CNN and a text-level CNN, which correspond to sentence-level detection and full-text detection of the text under test, respectively. This kind of staged method is not common in steganalysis; it has distinctive characteristics and is worth further study. Therefore, our work builds on this method.

Adversarial Examples
There has been a great deal of research on adversarial examples in the field of machine vision [29][30][31][32]. Research on adversarial examples is also very important in natural language processing. An adversarial example spoofs a neural network by making small changes to the original sample. Designing training methods that resist adversarial examples can improve the robustness and generalization of the model.
In 2013, Szegedy et al. [29] first proposed an algorithm based on L-BFGS to generate adversarial examples in the field of deep learning. The paper pointed out that adversarial examples may be caused by the nonlinear expression and overfitting of the model. In 2014, Goodfellow et al. [30] argued that the linear behavior of deep neural networks in high-dimensional space is the fundamental cause of adversarial examples and, based on this theory, designed FGSM, a fast and efficient method for generating adversarial examples. For a given network N, let C be the input text, θ the parameters of the network, and y the target label. Let $L_N(\theta, C, y)$ denote the loss of the network and η the gradient of $L_N$ with respect to the input:
$$\eta = \nabla_C L_N(\theta, C, y)$$
It can be obtained by backpropagation. Multiplying η by a coefficient gives the perturbation; in FGSM the coefficient is applied to the sign of η.
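As a minimal sketch of the gradient-based perturbation described above, the following Python example applies FGSM to a toy logistic-regression detector. The detector, its parameters `w` and `b`, the input vector `x`, and the step size `epsilon` are illustrative assumptions and are not part of the proposed model. For text, `x` would stand for a continuous representation such as word embeddings, since discrete words cannot be perturbed directly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_input_gradient(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    q = sigmoid(w @ x + b)
    loss = -(y * np.log(q) + (1 - y) * np.log(1 - q))
    eta = (q - y) * w  # dL/dx; a deep network would obtain this by backpropagation
    return loss, eta

def fgsm(w, b, x, y, epsilon):
    """Return x plus the FGSM perturbation epsilon * sign(eta)."""
    _, eta = loss_and_input_gradient(w, b, x, y)
    return x + epsilon * np.sign(eta)

# Toy setup (all values are hypothetical).
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.1
x = rng.normal(size=8)
y = 1.0  # true label: stego

x_adv = fgsm(w, b, x, y, epsilon=0.05)
loss_clean, _ = loss_and_input_gradient(w, b, x, y)
loss_adv, _ = loss_and_input_gradient(w, b, x_adv, y)
print(f"clean loss {loss_clean:.4f}, adversarial loss {loss_adv:.4f}")
```

Each input component moves by exactly `epsilon` in the direction that increases the loss, so the perturbation stays small while the detector's loss on the true label grows.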
In text steganography, the same method is used to find the words that have the highest impact on the loss function. By sorting these words and then replacing, deleting, or inserting them in order, the input can be transformed into adversarial examples that trick the steganalysis model. Therefore, text steganalysis methods need to be further improved to resist adversarial attacks.
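The ranking step can be illustrated with a short Python sketch. Note the substitution: instead of the gradient, this sketch scores each word by how much the loss rises when the word is deleted (a leave-one-out score), which is simpler to show on a toy model. The detector (logistic regression on the mean word embedding), the embeddings, and the example words are all hypothetical.

```python
import numpy as np

def loss(embeddings, w, b, y):
    """Binary cross-entropy of a toy detector on the mean word embedding."""
    pooled = embeddings.mean(axis=0)
    q = 1.0 / (1.0 + np.exp(-(w @ pooled + b)))
    return -(y * np.log(q) + (1 - y) * np.log(1 - q))

def rank_words_by_impact(words, embeddings, w, b, y):
    """Sort words by how much the loss rises when each one is deleted."""
    base = loss(embeddings, w, b, y)
    impact = []
    for i, word in enumerate(words):
        reduced = np.delete(embeddings, i, axis=0)  # sentence without word i
        impact.append((loss(reduced, w, b, y) - base, word))
    impact.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in impact]

# Hypothetical sentence and random embeddings.
rng = np.random.default_rng(0)
words = ["the", "river", "carried", "quiet", "letters"]
embeddings = rng.normal(size=(len(words), 6))
w = rng.normal(size=6)
b = 0.0

ranked = rank_words_by_impact(words, embeddings, w, b, y=1.0)
print(ranked)
```

An attacker would then try replacements, deletions, or insertions starting from the front of `ranked`, since those words move the detector's loss the most.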

Graph Neural Networks
Deep learning is good at working with regularly structured data such as audio, images, and text. However, some data, such as knowledge graphs and social network relationships, cannot be represented as sequences or grids. This prompted the emergence and development of graph neural networks.
In 2017, Kipf et al. [27] proposed the graph convolutional network, which carries the convolution used in image processing over to graph structures. The structure is shown in Fig. 1. In each layer of the network, each node combines the features of its neighbor nodes and then applies a linear transformation. Multiple such layers are stacked and followed by tasks such as node classification and prediction.
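The per-layer update described above can be sketched in a few lines of NumPy. This is a minimal illustration of one graph convolution layer with the symmetric normalization $D^{-1/2}(A + I)D^{-1/2}$ and a ReLU activation; the example graph, feature sizes, and weights are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ H @ W)  # aggregate neighbors, then transform

# Path graph with three nodes: 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))  # node features
W = rng.normal(size=(4, 2))  # layer weights

H_next = gcn_layer(A, H, W)
print(H_next.shape)
```

Stacking several calls to `gcn_layer` lets information travel further than the immediate neighbors, one hop per layer.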

Figure 1: The structure of graph neural networks
However, GNN needs to know the structure of the entire graph, including the nodes to be predicted, which is not consistent with real tasks. In addition, GNN also has problems such as high memory consumption. To solve these problems, further graph neural network variants have been proposed [33][34][35].

The Overall Architecture
The two-stage highly robust text steganalysis model analyzes and extracts anomalous features at both the intra-sentential and inter-sentential levels. In the first phase, every sentence is transformed into word vectors $[x_1, x_2, \ldots, x_n]$, and we use Bi-LSTM to obtain feature information for all words in the sentence while retaining strong correlations. In the second phase, we input multiple sentence vectors into the GNN, from which we extract inter-sentential anomaly features and judge whether the text contains secret messages. To improve the robustness and generalization of the model, we generate adversarial examples based on existing neural network-based text steganography methods and add them to the training set. The framework of the two-stage highly robust text steganalysis model is illustrated in Fig. 2.

Figure 2: The framework of the proposed network

Sentence Feature Extraction
Text is first transformed into word vectors $[x_1, x_2, \ldots, x_n]$ using word embedding. Since the network cannot process text directly, we transform the words and phrases in each sentence into low-dimensional continuous vectors. In the field of natural language processing, such a technique is known as word embedding.
Features of multiple word vectors are extracted and transformed into feature vectors of sentences using Bi-LSTM. Similar to RNN [36], LSTM makes use of context-sensitive information in the mapping between input and output sequences. It also compensates for a shortcoming of RNN: the limited range of contextual information it can access. However, LSTM cannot encode back-to-front information, which is not conducive to more fine-grained classification. Therefore, we use Bi-LSTM to extract the features of the sentence as a whole. The extracted feature vector of the $i$-th sentence is denoted by $S_i$.
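To make the sentence encoding step concrete, here is a minimal NumPy sketch of a Bi-LSTM that turns a sequence of word vectors into one sentence vector. It assumes that the sentence vector is the concatenation of the final hidden states of the forward and backward passes; the source does not specify the pooling, and the dimensions and random weights below are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(d_in, d_h, rng):
    """Randomly initialized parameters for one LSTM direction."""
    return {"W": rng.normal(scale=0.1, size=(d_in, 4 * d_h)),
            "U": rng.normal(scale=0.1, size=(d_h, 4 * d_h)),
            "b": np.zeros(4 * d_h)}

def lstm_last_hidden(X, p):
    """Run an LSTM over the word vectors X (n_words, d_in); return the final hidden state."""
    d_h = p["U"].shape[0]
    h, c = np.zeros(d_h), np.zeros(d_h)
    for x in X:
        z = x @ p["W"] + h @ p["U"] + p["b"]
        i, f, o, g = np.split(z, 4)              # input, forget, output gates; candidate
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                        # update the cell state
        h = o * np.tanh(c)                       # update the hidden state
    return h

def bilstm_sentence_vector(X, fwd, bwd):
    """Concatenate the forward pass over X with the backward pass over reversed X."""
    return np.concatenate([lstm_last_hidden(X, fwd),
                           lstm_last_hidden(X[::-1], bwd)])

rng = np.random.default_rng(0)
d_in, d_h = 5, 3
X = rng.normal(size=(7, d_in))                   # seven word vectors of one sentence
fwd, bwd = init_lstm(d_in, d_h, rng), init_lstm(d_in, d_h, rng)
S = bilstm_sentence_vector(X, fwd, bwd)
print(S.shape)
```

The backward pass is what gives the encoder access to back-to-front context that a single LSTM direction lacks.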

Text Classification
Most text steganalysis models use CNN to further refine the features and classify the text after obtaining the feature vectors of the sentences. These approaches focus only on anomalous features within sentences and rarely consider semantic correlations between sentences across the text as a whole. We input multiple sentence vectors $[S_1, S_2, \ldots, S_m]$ into the GNN, from which we extract inter-sentential anomaly features and make judgments. Each node in the graph represents a sentence vector, and the edges represent sentence-to-sentence correlations. This is a graph-level task that does not depend only on the properties of individual nodes or edges. Each update of the graph features incorporates correlations between neighboring nodes, and the text is classified from the overall structure of the graph.
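The graph-level classification can be sketched as follows. This is an illustration under stated assumptions, not the model's actual configuration: edges are created between sentences whose cosine similarity exceeds a threshold, a single graph convolution layer updates the nodes, and mean pooling produces the graph-level feature. The edge rule, the readout, and all weights are hypothetical choices that the source does not specify.

```python
import numpy as np

def cosine_adjacency(S, threshold=0.0):
    """Assumed edge rule: connect sentences whose cosine similarity exceeds threshold."""
    unit = S / np.linalg.norm(S, axis=1, keepdims=True)
    A = (unit @ unit.T > threshold).astype(float)
    np.fill_diagonal(A, 0.0)                      # no self-edges; added back as self-loops below
    return A

def classify_text(S, W1, w_out, b_out, threshold=0.0):
    """Return the probability that the text (a graph of sentence vectors) is stego."""
    A_hat = cosine_adjacency(S, threshold) + np.eye(len(S))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    H = np.maximum(0.0, A_norm @ S @ W1)          # each node mixes in its neighbors
    g = H.mean(axis=0)                            # graph-level readout (mean pooling)
    return 1.0 / (1.0 + np.exp(-(g @ w_out + b_out)))

rng = np.random.default_rng(0)
S = rng.normal(size=(5, 6))                       # five sentence vectors of one text
W1 = rng.normal(size=(6, 4))
w_out = rng.normal(size=4)
p_stego = classify_text(S, W1, w_out, b_out=0.0)
print(f"P(stego) = {p_stego:.3f}")
```

Because the readout pools over all nodes, the decision depends on the whole graph rather than on any single sentence.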

Training Framework
During the training process, we update the network parameters using the backpropagation algorithm. The cross-entropy loss is used as the loss function of the network, which for binary labels can be written as:
$$Loss = -\frac{1}{N}\sum_{i=1}^{N}\left[p_i \log q_i + (1 - p_i)\log(1 - q_i)\right]$$
where N is the batch size of the texts, $p_i$ is the ground-truth label of the text, $q_i$ is the predicted label of the text, and i indexes the i-th sample in each batch. Thus, the entire training is a supervised learning process. The network is brought to an optimal state by minimizing the loss function.
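As a small worked example of the loss, the following Python function computes the batch-averaged cross entropy, assuming the binary form in which $p_i \in \{0, 1\}$ is the true label and $q_i$ is the predicted probability of stego. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero.

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """Mean binary cross-entropy over a batch of labels p and predictions q."""
    p = np.asarray(p, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(p * np.log(q) + (1.0 - p) * np.log(1.0 - q)))

labels = [1, 0, 1, 0]
uncertain = [0.5, 0.5, 0.5, 0.5]   # detector has no idea
confident = [0.9, 0.1, 0.8, 0.2]   # detector is mostly right

print(cross_entropy(labels, uncertain))
print(cross_entropy(labels, confident))
```

The uninformative predictions give a loss of $\log 2$, and the loss falls as the predicted probabilities move toward the true labels, which is what minimization during training exploits.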

Experimental Settings
The experiments were conducted on Gutenberg [37]. T-Lex is a typical text steganography algorithm, and we select it to generate the stego text. We use 8000 pairs of cover text and stego text as the training set and 2000 pairs of cover text and stego text as the test set. In the training phase, we chose Adam as the optimization method and cross-entropy loss as the loss function. The learning rate was set to 0.001 and the batch size to 64.
It is worth noting that, in order to improve the robustness of the text steganalysis model, we randomly replace some stego text in the training set with the corresponding adversarial examples. The experiments were conducted on an NVIDIA GeForce RTX 2080Ti GPU with 11 GB of memory.
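The random replacement of stego samples can be sketched as below. The function name, the `replace_ratio` parameter, and the fixed seed are hypothetical; the source does not state what fraction of the stego text is replaced.

```python
import random

def mix_in_adversarial(stego_texts, adversarial_texts, replace_ratio, seed=0):
    """Replace a random fraction of stego samples with their adversarial counterparts."""
    if len(stego_texts) != len(adversarial_texts):
        raise ValueError("each stego text needs exactly one adversarial counterpart")
    rng = random.Random(seed)
    k = round(replace_ratio * len(stego_texts))           # number of samples to swap
    chosen = set(rng.sample(range(len(stego_texts)), k))  # indices to swap
    return [adversarial_texts[i] if i in chosen else text
            for i, text in enumerate(stego_texts)]

# Placeholder strings stand in for real texts.
stego = [f"stego-{i}" for i in range(10)]
adversarial = [f"adv-{i}" for i in range(10)]
train_stego = mix_in_adversarial(stego, adversarial, replace_ratio=0.3)
print(train_stego)
```

The replaced samples keep the stego label, so the model learns that the adversarially perturbed text still carries a hidden message.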

Experimental Analysis
To evaluate our proposed two-stage highly robust steganalysis model, we use metrics commonly used in steganalysis to measure its performance. The three evaluation indicators are Precision, Recall, and Accuracy. Original text is denoted by cover, and text containing secret messages is denoted by stego.

Precision
Precision denotes the ratio of samples correctly classified as stego to all samples classified as stego. It is defined as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
where TP (True Positive) is the number of correctly predicted stego sentences and FP (False Positive) is the number of cover sentences incorrectly predicted as stego sentences. Since we use graph neural networks to learn the correlations between sentences, fewer cover sentences would be erroneously predicted as stego, i.e., FP would be smaller. Therefore, our proposed method should theoretically have a higher Precision than other methods.

Recall
Recall represents the ratio of samples correctly classified as stego to all samples that are actually stego. It is defined as follows:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
where FN (False Negative) is the number of stego sentences incorrectly predicted as cover sentences. TN (True Negative), which appears in the Accuracy metric, is the number of correctly predicted cover sentences. Applying graph neural networks to text steganalysis exploits the ability of graph convolution to aggregate feature information from nearest-neighbor nodes. It is therefore theoretically possible to determine more accurately whether a text contains secret messages. As a result, fewer stego sentences would be missed, and Recall would theoretically improve slightly.

Accuracy
Accuracy represents the ratio of all correctly classified samples to the total number of samples. It is defined as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where FN (False Negative) is the number of stego sentences incorrectly predicted as cover sentences. We used adversarial training for our text steganalysis model: we added adversarial examples to the training data to deal with the possibility of adversarial attacks in real situations. Thus, fewer stego sentences would be incorrectly predicted as cover sentences, i.e., FN would be smaller. In summary, our method will theoretically have higher accuracy. We will further refine our experiment.
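The three metrics can be computed from the confusion counts as in the following Python sketch. It uses the standard definitions, with Recall taken as TP / (TP + FN), and encodes stego as label 1 and cover as label 0; the example labels are made up for illustration and are not experimental results.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN with stego = 1 and cover = 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0

def accuracy(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

# Made-up labels for eight sentences.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(precision(tp, fp), recall(tp, fn), accuracy(tp, fp, tn, fn))
```

In this toy case there is one false positive and one false negative, so all three metrics come out to 0.75.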

Conclusion
In this paper, we propose a two-stage highly robust text steganalysis model based on LSTM and GNN. The proposed method analyzes and extracts anomalous features at both intra-sentential and inter-sentential levels. Text is first transformed into word vectors using word embedding, and features of multiple word vectors are extracted and transformed into feature vectors of sentences using Bi-LSTM. We input multiple sentence vectors into the GNN, from which we extract inter-sentential anomaly features and make judgments. To improve the robustness of the model, we generate corresponding adversarial examples based on existing neural network-based text steganography methods. The generated adversarial examples are added to the training set to improve the robustness and generalization of the steganalysis model.

Conflicts of Interest:
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.