Steganalysis of neural networks based on parameter statistical bias

Malicious developers can imperceptibly hide malware or other harmful information in a pretrained model, which does harm to the computer society. However, steganography on a neural network modifies the statistical distribution of the model. We propose steganalysis methods for steganography on neural network parameters by extracting statistics from benign and malicious models and building classifiers based on these statistics.


Introduction
Deep learning has seen considerable growth in the past decade [1−4]. State-of-the-art deep learning models such as LeNet [5], VGG [6], GoogLeNet [7], ResNet [8], and EfficientNet [9] have achieved superior performance in computer vision applications, such as object recognition [10], face recognition [11], and image classification [12−13]. In addition, numerous deep learning frameworks have been released to help engineers and researchers easily develop deep-learning-based systems or conduct research.
Although these frameworks allow for easy deployment of neural networks in real-world applications, training neural network models is still a daunting task, as it requires a considerable amount of data and computational resources. Therefore, pretrained models are provided on websites to facilitate the reproduction of the results of a study. Currently, the sharing of well-trained models is essential for both the research and the development of deep neural network systems. Numerous pretrained models have been uploaded by developers to websites such as PyTorch Hub① and Papers With Code②.
Owing to the significant progress made in deep learning, neural networks are now being used as steganography cover media. Song et al. made the first attempt to embed messages in neural network parameters [14] and proposed three methods: least significant bit (LSB) encoding, which simply embeds the message in the lower bits of the parameters of well-trained models; correlated value encoding, which forces the parameters to be highly correlated with the embedded message by means of a malicious regularization term; and sign encoding, which encodes the message in the signs of the parameters.
Consequently, malicious developers can use neural networks to exchange messages imperceptibly. Liu et al. proposed StegoNet [15], which turns a deep learning model into stegomalware by using the model parameters as a payload injection channel. There is no significant decrease in accuracy with StegoNet, and the triggers are connected to the physical world by input specification. StegoNet focuses on embedding methods that are effective on both uncompressed and deeply compressed models. Deeply compressed models, such as VGG-16 Compressed and AlexNet Compressed [16], shrink in size by reducing both the number and the data precision of the parameters of a general uncompressed model. Liu et al. showed that, similar to image steganography, LSB substitution in deeply compressed models leads to significant bias in the statistics of the parameters. Therefore, LSB substitution in a deeply compressed model is easily detected by traditional image steganalysis methods such as primary sets [17], sample pairs [18], chi-square [19], and RS analysis [20−21], while LSB substitution on uncompressed models is challenging to detect. For these scenarios, Liu et al. proposed three improved embedding methods based on LSB substitution: resilience training, value-mapping, and sign-mapping. However, the approaches proposed by Liu et al. mainly focus on selecting the positions with the minimum distortion to embed the secret message. Therefore, the messages embedded by these approaches rely on a considerable amount of auxiliary information, such as the positions stored in other parts of the stegomalware, to ensure extraction. In contrast, the approaches proposed by Song et al. are more universal and can be leveraged to harm society.
Details about models can be easily obtained because developers usually share the model settings, such as the structure and dataset, when they publish the model. However, as the training process initializes the parameters almost randomly, it is still impossible to determine the cover model even with the information developers provide. Therefore, it is necessary to design a specific model detection method. In this paper, we demonstrate that the methods proposed by Song et al. [14] modify the statistical distribution of a model. An analysis of LSB encoding reveals that the randomness of bits systematically increases during the embedding of the secret message; therefore, we can build discriminative features by measuring the randomness of the bit plane. Our analysis also reveals that, with correlated value encoding and sign encoding, the distribution of parameters varies during embedding; therefore, we can determine statistical features of the parameters to measure the differences between the varying distributions. Thereafter, a logistic regression is used to capture the bias of the statistical features for classification. The experimental results reveal that our methods are effective for detecting models with embedded messages even when the payload of the stego model is low.
The remainder of the paper is structured as follows. Section 2 reviews related work on neural networks and steganography of neural networks. In Section 3, we propose methods for detecting the presence of steganography in a given model. In Section 4, we present the results of experiments that validate the effectiveness of our proposed methods. Section 5 concludes this paper and discusses future research directions.

Neural network
Machine learning can be divided into supervised learning and unsupervised learning. Although we focus on supervised learning in this paper, our approaches can be applied to unsupervised learning as well. Let X be the input space, where x_i is the i-th instance, and let Y be the output space, where y_i is the true label of the i-th instance and K is the number of classes. Given a set of data points D = {(x_i, y_i)}, this set is partitioned into two subsets, training data D_train and testing data D_test. A machine learning model is a function f_θ parameterized by parameters θ. When it is difficult to state explicitly how a function should be calculated, deep neural networks are most often used. In deep neural networks, f_θ is composed of layers of nonlinear transformations that map the input to a series of intermediate states and then to the output. The parameter θ is the weight used in each transformation. As the depth of the network increases, the number of parameters increases. To find the optimal set of parameters θ for the function f_θ, an objective function that penalizes the mismatch between the true label y and the predicted label generated by f_θ is minimized. Empirical risk minimization is a general framework that uses the following objective function over D_train: min_θ (1/|D_train|) Σ_{(x_i, y_i) ∈ D_train} L(f_θ(x_i), y_i) + Ω(θ), where L is the cross-entropy loss function and Ω(θ) is a regularization term that prevents the model from overfitting.

In the general deep neural network model, the parameters are 32-bit floating point numbers, following IEEE standard 754 [22]. According to IEEE standard 754, floating-point data are represented in the form (−1)^s × 2^(E−bias) × 1.f, where s is the sign bit, E is the biased exponent (the actual exponent is computed by subtracting the bias), and f is the trailing significand field, which represents the fraction with an implicit non-zero leading bit. A representation of 32-bit floating-point data in the binary interchange format, with 1 sign bit, 8 exponent bits (bias 127), and 23 fraction bits, is given in Fig. 1. For steganography methods that embed in well-trained models and focus on the binary form of the parameters, the secret message is embedded directly in the lower bit planes. For steganography methods that embed during training and focus on the decimal form of the parameters, the secret message is mapped into the values or the signs of the parameters.
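The field decomposition above can be illustrated with a short Python sketch that unpacks a float32 into its sign, biased exponent, and fraction bits (the helper name `float32_fields` is ours, not from the paper):

```python
import struct

def float32_fields(x: float):
    """Decompose a float32 into its IEEE 754 fields: sign, biased exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # biased exponent (bias = 127)
    fraction = bits & 0x7FFFFF          # 23-bit trailing significand
    return sign, exponent, fraction

sign, exp, frac = float32_fields(-6.5)
# -6.5 = (-1)^1 * 2^(129 - 127) * 1.625
assert sign == 1 and exp == 129
assert 1.0 + frac / 2**23 == 1.625
```

Steganography on the binary form operates directly on the `bits` integer shown here, which is why the lower fraction bits are the natural embedding site.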

Neural network steganography
Neural network steganography (NNS) is a timely issue, as neural networks are extremely redundant and are widely spread over the Internet. Malicious developers publish their white-box models with detailed structures and parameters, and receivers decode the secret message from the parameters. Such a scenario can arise on a third-party platform where people can publish and use models. Even if the platform is secure, the model supplied by a developer may not be trustworthy. The security of the model is often overlooked by platforms and users.
NNS was proposed by Song et al. [14] to embed messages in uncompressed models and was developed further by Liu et al. [15] to embed secret messages in both compressed and uncompressed models. The approaches proposed by Liu et al. synthesize deep learning models with stegomalware [23]. Compressed models shrink in size by reducing the number and precision of the model parameters; to maintain the accuracy of such a model, the secret message that can be embedded must be short, and a significant amount of auxiliary information is required for extraction. Song et al. [14] proposed three techniques: LSB encoding, correlated value encoding (COR), and sign encoding (SGN).
LSB encoding: In this method, the secret message is embedded directly in the least significant (lower) bits of the model parameters. First, malicious developers train a benign model. Then they post-process the model parameters by replacing the lower bits of the parameters with the secret message, producing modified parameters. At the receiver end, the receiver extracts the message by reading the lower bits of the parameters and interpreting them as bits of the secret message.
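A minimal sketch of this embed/extract round trip, assuming float32 parameters viewed at the bit level via `struct` (function names are illustrative, not from the paper):

```python
import struct

def embed_lsb(params, message_bits, k=1):
    """Replace the k least significant bits of each float32 parameter
    with bits of the secret message (a sketch of LSB encoding)."""
    stego = []
    it = iter(message_bits)
    for p in params:
        bits = struct.unpack(">I", struct.pack(">f", p))[0]
        for j in range(k):
            try:
                b = next(it)
            except StopIteration:
                break
            bits = (bits & ~(1 << j)) | (b << j)
        stego.append(struct.unpack(">f", struct.pack(">I", bits))[0])
    return stego

def extract_lsb(params, n_bits, k=1):
    """Receiver side: read the message back from the k lowest bits of each parameter."""
    out = []
    for p in params:
        bits = struct.unpack(">I", struct.pack(">f", p))[0]
        for j in range(k):
            if len(out) == n_bits:
                return out
            out.append((bits >> j) & 1)
    return out

msg = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb([0.1, 0.2, 0.3, 0.4], msg, k=2)
assert extract_lsb(stego, len(msg), k=2) == msg
```

Because only the lowest fraction bits change, each parameter value is perturbed by a negligible amount, which is why model accuracy is preserved.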

Correlated value encoding: This approach gradually embeds the secret message during training. The secret message is embedded by forcing the parameters to be highly correlated with it. In detail, a malicious correlation term C = −λ_c · |Σ_{i=1}^{l} (θ_i − θ̄)(s_i − s̄)| / (√(Σ_{i=1}^{l} (θ_i − θ̄)²) · √(Σ_{i=1}^{l} (s_i − s̄)²)) is added to the loss function. In the above expression, λ_c controls the level of correlation, s is the secret message with length l, and θ̄ and s̄ are the mean values of the parameters θ and the secret message s, respectively. During training, the malicious term drives the gradient direction towards a local minimum where the secret message and the parameters are highly correlated. Therefore, the larger λ_c, the more correlated θ and s. Recovering the secret message from the model only requires mapping the parameters back to the feature space, because correlated parameters are approximately a linear transformation of the secret message.
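Assuming the correlation term takes the form of a negative absolute Pearson correlation (a common reading of this method; the exact scaling in the original may differ), it can be sketched as:

```python
import math

def correlation_penalty(theta, s, lam=1.0):
    """Malicious regularization term: lam times the negative absolute
    Pearson correlation between parameters theta and secret message s.
    (Sketch only; assumes len(theta) >= len(s).)"""
    l = len(s)
    t_mean = sum(theta[:l]) / l
    s_mean = sum(s) / l
    cov = sum((theta[i] - t_mean) * (s[i] - s_mean) for i in range(l))
    t_std = math.sqrt(sum((theta[i] - t_mean) ** 2 for i in range(l)))
    s_std = math.sqrt(sum((x - s_mean) ** 2 for x in s))
    return -lam * abs(cov) / (t_std * s_std)

# Perfectly correlated parameters minimize the penalty (correlation = 1).
pen = correlation_penalty([0.1, 0.2, 0.3], [10, 20, 30], lam=1.0)
assert abs(pen - (-1.0)) < 1e-9
```

In training, this penalty is added to the task loss, so gradient descent simultaneously fits the data and drags the parameters toward a linear image of the message.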
Sign encoding: Sign encoding is another method that can be used to encode a secret message as a model is trained. Similar to correlated value encoding, sign encoding adds a malicious correlation term to the loss function; however, the secret message m is embedded in the signs of the parameters. In detail, a positive parameter represents 1 and a negative parameter represents 0. The malicious correlation term is defined as P = (λ_s / l) Σ_{i=1}^{l} max(0, −θ_i b_i), where b_i = 2m_i − 1 is the i-th message bit mapped to ±1. In the above expression, λ_s controls the level of correlation, and l is the length of the secret message m. Recovering the secret message from the model only requires reading the signs of the parameters and then interpreting them as bits of the secret message.
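A sketch of the sign-encoding penalty and the receiver-side decoding, under the assumption that the penalty is a hinge on mismatched signs (function names are illustrative):

```python
def sign_penalty(theta, message_bits, lam=1.0):
    """Hinge-style penalty that pushes sign(theta_i) to match bit i
    (sketch: bit 1 -> positive parameter, bit 0 -> negative)."""
    l = len(message_bits)
    targets = [2 * b - 1 for b in message_bits]   # map {0, 1} -> {-1, +1}
    return lam / l * sum(max(0.0, -theta[i] * targets[i]) for i in range(l))

def decode_signs(theta, l):
    """Receiver side: read bits from parameter signs."""
    return [1 if theta[i] > 0 else 0 for i in range(l)]

theta = [0.5, -0.2, 0.7, -0.1]
msg = [1, 0, 1, 0]
assert sign_penalty(theta, msg) == 0.0   # all sign constraints satisfied
assert decode_signs(theta, 4) == msg
```

Note that a mismatched parameter contributes a penalty proportional to its magnitude, which is exactly why training drags mismatched parameters toward 0; our steganalysis later exploits the resulting distribution shift.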
Unlike in LSB encoding, in correlated value encoding and sign encoding, the secret message cannot be decoded completely and correctly.However, correlated value encoding and sign encoding have better robustness against fine-tuning.

Proposed methods
Even though NNS methods [14] can embed secret messages without performance degradation, biases inevitably occur in the model statistics. Moreover, it is easy to obtain information about a model, as developers need to specify settings such as the model structure and dataset when the model is released. With this information, we can build benign and malicious models, and train classifiers to detect the malicious models. In this section, we first present the detection framework for the steganalysis of NNS. Then, we describe the model distribution bias generated by each method and design effective features for detection.

Framework of neural network steganalysis
Different biases are caused by the steganography methods proposed in Ref. [14]. For LSB encoding, the randomness of a benign bit plane differs from that of a malicious bit plane. For correlated value encoding and sign encoding, the distribution of parameters in each layer of a benign model differs from that of a malicious model. However, as the steganography method applied by the malicious developer is unknown, we design an overall framework for our neural network steganalysis to facilitate comprehensive detection. Fig. 2 illustrates the overall framework of our proposed steganalysis methods.
Our framework can be summarized as follows: (Ⅰ) Feature extraction: Since the NNS method used by the malicious developer is unknown, we extract features from the injected model for each NNS method to ensure comprehensive detection. For LSB encoding, we capture the distribution of the bit plane as features. For correlated value encoding and sign encoding, we capture the distribution of the parameters as features. All feature notation in this paper follows the convention φk, where φ represents a feature and k is its sequence number. The features captured by the i-th steganalysis method jointly constitute the multidimensional feature vector F_i, where i ∈ {1, 2, 3}: the 1st method is LSB encoding steganalysis, the 2nd is correlated value encoding steganalysis, and the 3rd is sign encoding steganalysis.
(Ⅱ) Classification: Essentially, steganalysis is a binary classification task. We use logistic regression [24−26] as our classifier. Let X be the input space, where x_i is the feature vector of the i-th model extracted by one of the steganalysis methods; Y the output space, where y_i ∈ {0, 1} is the true label of the i-th model; w the parameter of the classifier; and n the number of models used to train the classifier. The objective function of our classifier is min_w (1/n) Σ_{i=1}^{n} L(g(x_i; w), y_i) + α‖w‖, where α denotes the coefficient of the regularization term. The loss function L is the logistic loss on the prediction g(x_i; w), where g is the logistic function g(x; w) = 1 / (1 + e^{−wᵀx}). The ensemble reaches its decision by fusing all the decisions of the subclassifiers using a voting process: the ensemble judges an injected model as benign only when all the subclassifiers consider it a cover model.
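The voting rule can be sketched as follows; the weights below are hypothetical stand-ins for trained subclassifier parameters:

```python
from math import exp

def logistic(z):
    return 1.0 / (1.0 + exp(-z))

def predict(x, w):
    """A single subclassifier: logistic regression on one feature vector.
    Returns 1 for 'malicious', 0 for 'benign' (cover)."""
    score = logistic(sum(wi * xi for wi, xi in zip(w, x)))
    return 1 if score >= 0.5 else 0

def ensemble_decision(features, weights):
    """Voting rule from the framework: the model is judged benign only
    when every subclassifier considers it a cover model."""
    votes = [predict(x, w) for x, w in zip(features, weights)]
    return 1 if any(votes) else 0

# Hypothetical trained weights for the three steganalysis methods.
weights = [[2.0, -1.0], [0.5, 0.5], [-1.0, 3.0]]
features = [[-1.0, 0.1], [-0.5, -0.5], [0.2, 0.9]]
# Only the third subclassifier fires (score 2.5 > 0), so the ensemble flags the model.
assert ensemble_decision(features, weights) == 1
```

This "any vote convicts" fusion lowers the missing alarm rate at the cost of a somewhat higher false alarm rate, which matches the trade-off reported in the experiments.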

Feature extraction
As these steganography methods lead to biases, the artifacts they generate can be identified by effective features. In this subsection, we describe feature extraction in detail.
Multiple features can be used to identify each bias. For the steganalysis of each NNS method, we evaluate the performance of every candidate feature to decide whether it belongs to the optimal set of features. For each feature, we train a logistic regression classifier and obtain the detection accuracy. For LSB encoding, ResNet34 [8] trained on CIFAR10 [27] is used to evaluate the performance of each feature, and we train 100 benign models and 100 malicious models with a payload of 1.0 bits per parameter. For correlated value encoding and sign encoding, ResNet34 [8], VGG16 [6], and EfficientNetB0 [9] trained on CIFAR10 are used to evaluate the performance of each feature, and for each network, we train 100 benign models and 100 malicious models with payloads of 0.2, 0.6, and 1.0 bits per parameter.
(Ⅰ) Steganalysis of LSB encoding: For security purposes, a secret message is always encrypted before it is embedded; thus, the secret message can be regarded as a random binary message. High performance with LSB encoding does not require high-precision parameters; therefore, modifying the lower bits of the parameters does not lead to significant performance degradation. However, the randomness of benign bit planes is less than that of malicious bit planes. For example, in the binary secret message, 0s and 1s are uniformly distributed, while a bias exists in the distribution of 0s and 1s in a benign bit plane. Therefore, we propose our steganalysis algorithm by detecting such biases in the distribution of the bit plane. The bias detection can be implemented by randomness tests. Thus far, numerous studies on randomness detection have been proposed [28−31], and the NIST statistical test suite [29] is one of the most representative methods. Therefore, we design our steganalysis features by referring to the NIST statistical test suite, which includes 14 test statistics.
For an injected model, the lowest bit plane that tends to be modified is extracted and tested for randomness. As the secret message length varies, not all parameters are necessarily changed. Since we detect the bias in the bit plane distribution, when varying the message length we focus on the change in the payload of the specified bit plane.
To find the optimal set of features, we use the 14 test statistics of the NIST statistical test suite to train subclassifiers. For each subclassifier, we test the average detection accuracy over the bit planes from the 14th to the 18th of ResNet34 trained on CIFAR10 for a payload of 1.0 bits per parameter. Fig. 3 shows the average accuracy of each subclassifier. We select the statistics with the top 4 highest accuracies as features: the statistics of the frequency test, serial test, approximate entropy test, and cumulative sums test, denoted by φ1, φ2, φ3, and φ4, respectively.
φ1 evaluates whether the proportion of 0s and 1s in a sequence is similar to that of a random sequence. For a binary secret message, the numbers of 0s and 1s in the sequence should be about the same. However, for a sequence extracted from a benign bit plane, the proportion of 0s and 1s is biased. We convert each 0 in the sequence to −1 and compute the sum S_n of the resulting sequence of length n. Therefore, φ1 is computed as φ1 = |S_n| / √n. φ2 focuses on the frequency of all possible overlapping blocks across the sequence.
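A minimal implementation of the frequency statistic described above:

```python
from math import sqrt

def frequency_feature(bits):
    """Frequency (monobit) statistic: convert 0 -> -1, sum, normalize by sqrt(n).
    Values near 0 indicate a balanced, random-looking bit plane; large values
    indicate the bias typical of a benign (unmodified) bit plane."""
    n = len(bits)
    s = sum(2 * b - 1 for b in bits)
    return abs(s) / sqrt(n)

assert frequency_feature([0, 1] * 50) == 0.0    # balanced sequence
assert frequency_feature([1] * 100) == 10.0     # heavily biased sequence
```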
φ2 evaluates whether the number of occurrences of the overlapping patterns is approximately the same as that expected for a random sequence. We count the frequency of the i-th m-bit block as v_i and the frequency of the j-th (m−1)-bit block as u_j, where i and j index the m-bit and (m−1)-bit patterns, respectively. φ2 is computed as φ2 = ψ²_m − ψ²_{m−1}, with ψ²_m = (2^m / n) Σ_i v_i² − n, where n represents the length of the sequence.
φ3 uses the approximate entropy to compare the frequency of overlapping blocks of two consecutive lengths (m and m + 1) against the expected result for a random sequence. Let the count of each possible m-bit value be described as C_v^m = #v / n, where v represents an m-bit value, #v represents the calculated count of the value v, and n represents the length of the sequence. The approximate entropy presents the difference in frequency between the m-bit overlapping blocks and the (m + 1)-bit overlapping blocks: ApEn(m) = Φ^m − Φ^{m+1}, where Φ^m = Σ_v C_v^m log C_v^m is the entropy of the empirical distribution arising on the set of all possible patterns of length m. Thus φ3 measures the match between the observed value of ApEn(m) and the expected value: φ3 = 2n[log 2 − ApEn(m)].

φ4 evaluates whether the cumulative sum of the partial sequences is approximately the same as the expected cumulative sum of random sequences. We convert each 0 in the sequence to −1 and compute the sums of successively larger subsequences of length k starting from the beginning: S_k = Σ_{i=1}^{k} x_i, where x_i is the i-th value of the converted sequence, 1 ≤ k ≤ n, and n is the length of the sequence. We use the largest excursion from the origin of the cumulative sums as the feature, thus φ4 = max_{1≤k≤n} |S_k|. The total feature vector of LSB encoding steganalysis is F₁ = (φ1, φ2, φ3, φ4).
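The approximate entropy and cumulative sums features can be sketched as follows (a NIST SP 800-22-style implementation; the block length m is a free parameter, and the wrap-around block counting follows the NIST convention):

```python
from math import log

def approx_entropy_feature(bits, m=2):
    """Approximate entropy statistic 2n(ln 2 - ApEn(m)), comparing
    overlapping m- and (m+1)-bit block frequencies."""
    n = len(bits)

    def phi(m_len):
        ext = bits + bits[:m_len - 1]                 # wrap around (NIST convention)
        counts = {}
        for i in range(n):
            pattern = tuple(ext[i:i + m_len])
            counts[pattern] = counts.get(pattern, 0) + 1
        return sum(c / n * log(c / n) for c in counts.values())

    apen = phi(m) - phi(m + 1)
    return 2.0 * n * (log(2) - apen)

def cusum_feature(bits):
    """Largest excursion from the origin of the cumulative sum of the
    +/-1-converted sequence."""
    s, peak = 0, 0
    for b in bits:
        s += 2 * b - 1
        peak = max(peak, abs(s))
    return peak

assert cusum_feature([1, 1, 1, 0, 0]) == 3   # partial sums 1, 2, 3, 2, 1
```

For a truly random bit plane, ApEn(m) approaches ln 2 and the statistic stays small; the deterministic structure of a benign bit plane inflates it.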
(Ⅱ) Steganalysis of correlated value encoding: Correlated value encoding constrains the value range of the parameters through the malicious correlation term. In this situation, a trade-off between the embedding degree and the model accuracy is made during training. We observe that the parameters of a malicious model present a distribution different from those of a benign model owing to the constraints on the parameter values. Therefore, the deviation in the distribution of the parameters is used to detect correlated value encoding in our steganalysis algorithm.
To detect the bias caused by correlated value encoding, we use moments as our features to describe the shape of the distribution. The most commonly used moments are the first-order, second-order, third-order, and fourth-order moments, and we design our steganalysis features by referring to them. As the distribution of parameters may vary across the layers of a neural network, a more detailed description of the parameter distribution is necessary; we therefore describe the distribution of the parameters in each layer separately.
φ5 is the expectation, which is used to measure the average value of the parameters and is defined as φ5^(j) = (1/n_j) Σ_{i=1}^{n_j} θ_i^(j), where n_j is the number of parameters in the j-th layer.
φ6 is the variance, which is used to measure the degree of deviation between the parameters and their expectation. φ6 is computed as φ6^(j) = (1/n_j) Σ_{i=1}^{n_j} (θ_i^(j) − φ5^(j))².
φ7 is the skewness, which is used to measure the direction and degree of skew of the parameter distribution. φ7 is computed as φ7^(j) = (1/n_j) Σ_{i=1}^{n_j} ((θ_i^(j) − φ5^(j)) / √(φ6^(j)))³.
φ8 is the kurtosis, which measures the tailedness of the parameter distribution. φ8 is defined as φ8^(j) = (1/n_j) Σ_{i=1}^{n_j} ((θ_i^(j) − φ5^(j)) / √(φ6^(j)))⁴ − 3.
Each feature is a vector whose dimension equals the number of layers in the model. To find the optimal set of features, we use the 4 features to train subclassifiers. For each subclassifier, the average accuracy is obtained over ResNet34, VGG16, and EfficientNetB0 with payloads of 0.2, 0.6, and 1.0 bits per parameter. Fig. 4 shows the average accuracy of each subclassifier. Using the settings mentioned in Ref. [14], gray-scale images are embedded as the secret message; hence, for each parameter, 8 bits of the secret message are embedded. As shown in Fig. 4, two of the four features give accuracies above 0.95, higher than those obtained with the other two; we choose these two as the features for detecting the models, and together they constitute the total feature vector F₂ of correlated value encoding steganalysis.
(Ⅲ) Steganalysis of sign encoding: Sign encoding can also be used to embed a secret message through the malicious correlation term during training; however, in this case, the message is embedded in the signs of the parameters. Theoretically, under the malicious correlation term, each parameter and the corresponding secret bit have the same sign. However, we observe that in practice not all sign constraints are met. The malicious term penalizes the mismatched parameters by bringing them close to 0, which leads to a distribution of parameters that differs from that of the benign model. Therefore, the difference in the model parameter distribution is used to detect the secret message in our steganalysis method. As sign encoding disrupts the parameter distribution, we select features using an approach similar to correlated value encoding steganalysis. To detect the bias caused by sign encoding, we again design our steganalysis features by referring to the first-order moment (expectation), second-order moment (variance), third-order moment (skewness), and fourth-order moment (kurtosis). Similarly, the distribution is described for each layer separately. As in the experiment for correlated value encoding steganalysis, for each subclassifier, the average accuracy is obtained over ResNet34, VGG16, and EfficientNetB0 with payloads of 0.2, 0.6, and 1.0 bits per parameter. Fig. 4 shows the average accuracy of each subclassifier. As shown in Fig. 4, two of the features provide accuracies above 0.95, higher than those achieved with the other two; we choose these two as the features for detecting models, and together they constitute the total feature vector F₃ of sign encoding steganalysis.
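The four per-layer moment features can be computed as follows (a plain-Python sketch; in practice the layer parameters would come from a model's weight tensors):

```python
from math import sqrt

def layer_moments(layer_params):
    """Per-layer distribution features used for COR/SGN steganalysis:
    mean, variance, skewness, and excess kurtosis."""
    n = len(layer_params)
    mean = sum(layer_params) / n
    var = sum((p - mean) ** 2 for p in layer_params) / n
    std = sqrt(var)
    skew = sum(((p - mean) / std) ** 3 for p in layer_params) / n
    kurt = sum(((p - mean) / std) ** 4 for p in layer_params) / n - 3.0
    return mean, var, skew, kurt

mean, var, skew, kurt = layer_moments([-2.0, -1.0, 0.0, 1.0, 2.0])
assert mean == 0.0 and var == 2.0
assert abs(skew) < 1e-12    # a symmetric distribution has zero skewness
```

Computing these per layer, rather than over the whole model, yields one feature vector entry per layer and captures layer-wise distribution shifts.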

Experiments
In this section, we introduce the setup of our experiments. Then, we discuss the experimental results, including the detection of the three steganography methods under different embedding rates.

Experimental setup
Table 1 summarizes the datasets and networks we used in our experiments.
The image datasets used in our experiments are the well-known CIFAR10 [27] and Tiny-ImageNet [32]. ResNet34 [8], VGG16 [6], EfficientNetB0 [9], and MobileNet [33] are selected as the models to be detected. Table 1 shows the number of parameters of each model and the accuracy of the benign model. Our implementation and its corresponding initial architectures are based on PyTorch. In all our experiments, we set the mini-batch size to 128, the initial learning rate to 0.1, and the number of training epochs to 100. For networks trained on CIFAR10, we decrease the learning rate by a factor of 0.1 at epoch 60 for better convergence. For models trained on Tiny-ImageNet, we decrease the learning rate by a factor of 0.1 at epochs 40 and 60. In each experiment, models are validated and saved every epoch, and the model with the best validation accuracy is selected as the final model.
To measure the impact of the secret message length on detection, we design detection tasks with different payloads for all the steganography methods. For LSB encoding, the payloads are set to 0.05, 0.4, 0.8, and 1.0 bits per parameter. For correlated value encoding and sign encoding, as not all the secret bits can be embedded successfully, we set the payloads to 0.2, 0.6, and 1.0 bits per parameter. For all the steganography methods with a payload of 1.0, we train 100 benign models and 100 malicious models. For LSB encoding with payloads of 0.05, 0.4, and 0.8, we train 100 benign models and 300 malicious models, uniformly composed of three embedding strategies: sequential embedding, sequential embedding from a random position, and random embedding. For correlated value encoding and sign encoding with payloads of 0.2 and 0.6, we train 100 benign models and 100 malicious models, which are embedded randomly.

For the detection of LSB encoding, correlated value encoding, and sign encoding, we evaluate the performance of our detection methods by 5-fold cross validation. For each cross validation, 80% of the models are selected to form the training set and 20% to form the testing set. We use logistic regression [24−26] as our classifier. In our experiments, we set the number of training epochs to 100, use liblinear to optimize the loss function, and apply a norm regularization term. For the detection of LSB encoding, correlated value encoding, and sign encoding, we measure the average detection accuracy of the 5-fold cross validation: Acc = (1/5) Σ_{i=1}^{5} (TP_i + TN_i) / (TP_i + TN_i + FP_i + FN_i), where TP is the number of true positives, TN the true negatives, FP the false positives, FN the false negatives, and i indexes the i-th cross validation.
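The averaged 5-fold accuracy can be sketched as follows (the confusion counts below are hypothetical, purely to illustrate the computation):

```python
def fold_accuracy(tp, tn, fp, fn):
    """Accuracy of one cross-validation fold from its confusion counts."""
    return (tp + tn) / (tp + tn + fp + fn)

def average_cv_accuracy(folds):
    """Average detection accuracy over 5-fold cross validation, where each
    fold supplies its own (TP, TN, FP, FN) counts."""
    return sum(fold_accuracy(*f) for f in folds) / len(folds)

# Hypothetical confusion counts for 5 folds of 40 test models each.
folds = [(18, 18, 2, 2), (20, 20, 0, 0), (19, 17, 3, 1),
         (20, 18, 2, 0), (17, 19, 1, 3)]
assert average_cv_accuracy(folds) == (0.9 + 1.0 + 0.9 + 0.95 + 0.9) / 5
```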

Detection performance of LSB encoding
We train the neural networks from scratch on CIFAR10 and Tiny-ImageNet with different model structures and payloads, and record the results. Table 2 shows the lowest bit plane that can be embedded without significantly reduced accuracy. For all models, embedding in the 14th bit plane does not lead to a significant drop in model performance. In our experiments, we detect the bit planes from the 14th to the 18th. Table 3 shows the performance of LSB encoding steganalysis for the models trained on CIFAR10 and Tiny-ImageNet.
The results in Table 3 reveal that, except for a few cases, our approach can effectively detect all models from the 14th to the 18th bit plane. For all model architectures at the 14th bit plane with payloads of 0.4, 0.8, and 1.0, our method achieves 100% detection accuracy. However, as the bit plane index increases and the payload decreases, the performance of our method decreases. For example, for EfficientNetB0 trained on CIFAR10 with a payload of 1.0, our method fails to detect the 18th bit plane, as the accuracy is 62.75%; when the payload decreases to 0.05, even for the 14th bit plane, the accuracy is 77.72%. For models trained on CIFAR10, our method works better on ResNet34 and VGG16 than on EfficientNetB0. For models trained on Tiny-ImageNet, our method works better on VGG16 than on MobileNet. This result can be attributed to the number of parameters: for a specified bit plane, sequences formed by benign models with fewer parameters have greater randomness, increasing the difficulty of detection. The results in Table 3 indicate that, in most cases, the proposed method is effective for detecting LSB encoding.

Detection performance of correlated value encoding
Following the setup mentioned in Ref. [14], we use gray-scale images as the secret message to be embedded. We train the neural networks on CIFAR10 and Tiny-ImageNet with different model structures, coefficients λ_c, and payloads. Table 4 shows the appropriate coefficients λ_c and model accuracies. We use the mean absolute error (MAE) to measure the embedding degree. Given the embedded parameters θ and the secret message s of length l, the MAE is the mean absolute difference between the message decoded from θ and s, where an MAE of 0 means the decoded and secret messages are identical. Table 5 shows the performance of our steganalysis method for models trained on different datasets under payloads of 0.2, 0.6, and 1.0. From Table 5, it can be seen that our approach can effectively detect all the models with different λ_c and payloads. For all the models with a payload of 1.0, our method achieves its highest detection accuracies. With a decrease in payload, except for a few cases, the detection performance decreases: for ResNet34 trained on CIFAR10, the accuracy at a payload of 0.2 is lower than that at a payload of 0.6. However, for VGG16 trained on Tiny-ImageNet, the detection accuracy at a payload of 1.0 is lower than that at a payload of 0.2. Even at a low embedding payload of 0.2, our approach is still valid.

Detection performance of sign encoding
We train the neural networks from scratch on CIFAR10 and Tiny-ImageNet with different model structures, coefficients λ_s, and payloads. Table 6 shows the appropriate coefficients λ_s for the correlation term. Given the embedded parameters θ and the binary secret message m of length l, the embedding degree is measured by (1/l) Σ_{i=1}^{l} 1[sign(θ_i) ≠ 2m_i − 1], where 1[·] is the indicator function (1 for a true statement and 0 for a false statement), l is the length of the secret message m, and sign(·) is the sign function. Table 7 shows the detection performance under payloads of 0.2, 0.6, and 1.0. The results in Table 7 reveal that our approach can effectively detect all the models with different λ_s at payloads of 1.0, 0.6, and 0.2. For VGG16 trained on Tiny-ImageNet, our method achieves high detection accuracy at payloads of 0.2, 0.6, and 1.0. Furthermore, the decrease in payload has no significant effect on the detection accuracy. For example, for EfficientNetB0 trained on CIFAR10, the accuracy at a payload of 1.0 is lower than those at payloads of 0.6 and 0.2. Even when the payload is low, our detection method remains valid.

Detection performance of the overall framework
As the steganography method and the payload used by the malicious developer for an injected model are both unknown, we need to apply all three detection methods and then fuse their results. For simplicity, we specify the payloads of the model parameters, detect models by the three steganalysis methods separately, and fuse their decisions.
Neural networks are trained on CIFAR10 and Tiny-ImageNet with different model structures. For each detection method, 60 benign and 60 malicious models are selected to train the classifier. A total of 40 benign and 120 malicious models, uniformly composed of the three steganography methods, are used for validation. For LSB, the secret message is embedded in the 14th bit plane, and the payload is set to 0.05. For COR, the payload is set to 0.2, and the coefficient λc is set to the maximum appropriate value; for example, the maximum appropriate coefficient is 0.1 for ResNet34 and 0.03 for EfficientNet. For SGN, the payload is 0.2, and the coefficient λs is 50.0. The missing alarm rate (the fraction of malicious models misclassified as benign) and the false alarm rate (the fraction of benign models misclassified as malicious) are used to evaluate the effectiveness of detection.
Table 8 shows the detection results of our overall framework. LSB detection, COR detection, and SGN detection mean detecting all models with the specific steganalysis method, and framework detection means detecting models with our framework. It can be seen that our overall framework effectively detects injected models. The missing alarm rate for framework detection is lower than that for any specific detection such as LSB detection. Compared with LSB detection, framework detection also has a lower false alarm rate because of the increased number of true positives. However, for COR detection and SGN detection, the false alarm rate is lower than that for framework detection owing to the voting rule of our framework.
To validate the detection performance of our framework under unknown payloads, the classifiers in our framework are trained on models with lower payloads and used to detect models with higher payloads. As in the framework detection experiment, neural networks are trained on CIFAR10 and Tiny-ImageNet with different model structures. For LSB, the secret message is embedded in the 14th bit plane. For COR, the coefficient λc is set to the maximum appropriate value. For SGN, the coefficient λs is 50.0. In our framework, the classifiers for LSB detection are trained with a payload of 0.05, and the classifiers for COR detection and SGN detection are trained with a payload of 0.2. We then validate our framework on models embedded by LSB with payloads of 1.0, 0.8, and 0.4, by COR with payloads of 1.0 and 0.6, and by SGN with payloads of 1.0 and 0.6. For each embedding method under each payload, 40 benign models and 40 malicious models are used for evaluation.
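As a concrete illustration of the LSB threat model evaluated here, the sketch below embeds a bit string into one low-order bit plane of a flattened float32 weight vector. The helper name and the counting convention (planes numbered from 1 at the least significant bit of the 23-bit significand) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def embed_lsb(weights: np.ndarray, bits, plane: int = 14) -> np.ndarray:
    """Embed `bits` into one bit plane of a flattened float32 weight vector.

    `plane` counts from 1 at the least significant significand bit, so
    plane=14 mirrors the 14th-bit-plane setting of the experiments.
    """
    raw = weights.astype(np.float32).view(np.uint32).copy()
    mask = np.uint32(1) << np.uint32(plane - 1)
    for i, b in enumerate(bits[: raw.size]):
        # Set or clear the chosen bit of the i-th parameter.
        raw[i] = (raw[i] | mask) if b else (raw[i] & ~mask)
    return raw.view(np.float32)
```

Because only a low significand bit changes, each weight is perturbed by at most a small fraction of its magnitude, which is what makes the embedding imperceptible to ordinary accuracy checks.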

Table 9 shows the detection results obtained using the framework trained on models with lower payloads to detect models with higher payloads. It can be seen that our framework, trained on the lower-payload models, can effectively detect most of the higher-payload models. For ResNet34 trained on CIFAR10 and embedded by SGN at a payload of 0.6, the accuracy of our framework is . However, for EfficientNet trained on CIFAR10 and embedded by COR at a payload of 0.6, the accuracy is only . The results in Tables 8 and 9 reveal that, in most cases, our overall framework trained on lower-payload models can effectively detect models with higher payloads.
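The two evaluation metrics used in Tables 8 and 9 can be computed as below; the standard definitions are assumed (missed malicious models over all malicious models, and misclassified benign models over all benign models), with label 1 denoting a malicious model.

```python
def alarm_rates(y_true, y_pred):
    """Return (missing alarm rate, false alarm rate).

    y_true/y_pred are sequences of labels with 1 = malicious, 0 = benign.
    A missing alarm is a malicious model classified as benign; a false
    alarm is a benign model classified as malicious.
    """
    mal = [p for t, p in zip(y_true, y_pred) if t == 1]
    ben = [p for t, p in zip(y_true, y_pred) if t == 0]
    missing = sum(1 for p in mal if p == 0) / len(mal)
    false = sum(1 for p in ben if p == 1) / len(ben)
    return missing, false
```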

Conclusions
In this paper, we propose steganalysis methods to detect steganography on neural networks. First, we analyze the statistical bias caused by these steganography methods. As there are multiple features that can describe each statistical bias, we compare the detection accuracies of the methods in order to find an optimal set of features. Finally, we use the optimal set of features for classification. Various experiments are conducted to show the effectiveness of our framework in detecting neural network steganography. The results in Table 3 reveal that, for LSB encoding steganalysis, our method fails to detect bit planes higher than the 18th bit plane. Methods for detecting higher bit planes need to be explored.

Table 9. Results of using frameworks trained on models with a lower payload to detect models with higher payloads. In the framework, the classifiers of LSB detection are trained with a payload of 0.05, and the classifiers of COR and SGN detections are trained with a payload of 0.2.

ϕ8 is the kurtosis, which is used to measure the flatness of the parameter distribution.
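A minimal sketch of this feature, assuming the conventional fourth standardised moment (whether the paper uses the excess-kurtosis variant is not stated):

```python
import numpy as np

def kurtosis(params) -> float:
    """Fourth standardised moment of a parameter vector.

    A Gaussian-like weight distribution gives a value near 3; flatter
    (more uniform) distributions give smaller values.
    """
    x = np.asarray(params, dtype=np.float64)
    centred = x - x.mean()
    return float((centred ** 4).mean() / (centred ** 2).mean() ** 2)
```

Embedding operations that disturb the tails of the weight distribution shift this statistic, which is why it can serve as a steganalysis feature.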

Fig. 3. Average accuracy over the bit planes from the 14th to the 18th for ResNet34 trained on CIFAR10 with a payload of 1.0 bit per parameter, for each subclassifier.

Fig. 4. Results for the detection of ResNet34, VGG16, and EfficientNetB0 trained on CIFAR10 using different statistics as features: (a) results for correlated value encoding steganalysis; (b) results for sign encoding steganalysis.
An example of the binary interchange format for a float32 number, where the green part is the sign, the yellow part is the biased exponent, and the red part is the trailing significand field.

of auxiliary information must be extracted successfully. Although the approaches proposed by Liu et al. and Song et al. follow the same logic, those proposed by Song et al. focus more on uncompressed neural networks. As the methods proposed by Liu et al. have more limitations and require more auxiliary information for extraction than those proposed by Song et al., we focus on the steganalysis of uncompressed neural networks; specifically, we focus on the approaches proposed by Song et al.
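The binary32 layout referred to above can be inspected directly. This helper (an illustrative, standard IEEE 754 decomposition, not code from the paper) splits a float into its 1-bit sign, 8-bit biased exponent, and 23-bit trailing significand:

```python
import struct

def float32_fields(x: float):
    """Return (sign, biased exponent, trailing significand) of a binary32.

    The sign bit is the field modified by sign encoding, and the low bits
    of the 23-bit trailing significand are the planes used by LSB encoding.
    """
    (u,) = struct.unpack('<I', struct.pack('<f', x))
    return u >> 31, (u >> 23) & 0xFF, u & 0x7FFFFF
```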

Table 1. Models and datasets used in our experiments.

Table 2. Results of the LSB encoding in our models. b is the lowest bit plane that can be embedded without significantly reducing accuracy.

Table 3. Results of the LSB encoding steganalysis for models trained on CIFAR10 and Tiny-ImageNet at payloads of 1.0, 0.8, 0.4, and 0.05 on bit planes from the 14th to the 18th.

Table 4. Results of the correlated value encoding in our models. λc is a suitable coefficient for the correlation term. The mean absolute error (MAE) is used to measure the embedding degree. Test ACC indicates the performance of malicious models trained with different λc.

Table 5. Results of the correlated value encoding steganalysis for models trained on CIFAR10 and Tiny-ImageNet.

Table 6. Results of the sign encoding steganalysis for models trained on CIFAR10 and Tiny-ImageNet.

Table 7. Results of the sign encoding steganalysis for models trained on CIFAR10 and Tiny-ImageNet at payloads of 1.0, 0.6, and 0.2.

Table 8. Detection results of our overall framework. LSB detection means detecting models by the LSB encoding steganalysis; the definitions of COR detection and SGN detection are similar. Framework detection means detecting models by the overall framework.