Artificial Intelligence-Based Digital Image Steganalysis

Department of Electrical and Computer Engineering (ECE), King Abdulaziz University, Jeddah 21589, Saudi Arabia
Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University, Jeddah 21589, Saudi Arabia
Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia
Aarhus BSS, Aarhus University, Aarhus, Denmark
Department of Nuclear Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Department of Mathematics, King Abdulaziz University, Jeddah 21589, Saudi Arabia


Introduction
With the advancement of Internet technology and communication, a substantial number of images are transferred over public networks. Recently, it has been found that many criminal groups use images to transfer dangerous data by hiding it inside them. Generally, they use steganography approaches to hide harmful content in the images [1]. Therefore, researchers have started using steganalysis models to recognize images that contain embedded data. Thus, image steganalysis is an approach for recognizing data embedded in images: it classifies a given image as either a stego-embedded image or a normal image [2].
Zhou et al. [3] designed an ensemble learning model (ELM)-based image steganalysis approach. SRNet and RESDET were used as base models, and their outputs were fused to classify the embedded images. Zhang et al. [4] designed a CNN model using 3 × 3 kernels, with the convolution kernels optimized in the preprocessing layer. The small convolution kernels were used to minimize the number of initial parameters. Spatial pyramid pooling was also used to integrate the local features. Gowda et al. [5] designed an ensemble color space model (ECSM) to evaluate a weighted activation map. It can extract features specific to each color space. Levy-flight grey wolf optimization was used to minimize the number of features selected in the map.
Boroumand et al. [6] proposed a deep residual model (DRM) to reduce the use of heuristics and externally enforced elements. This model computes the noise residuals with pooling disabled, to avoid suppressing the stego signal. Yedroudj et al. [7] designed a truncation activation-based ensemble model (TREM) trained with Rich features. It uses a truncation activation function and batch normalization with a scale layer. Ye et al. [8] used a high-pass filter-based CNN (HCNN) for steganalysis. The weights of the initial layer were computed using the high-pass filters that evaluate residual maps in a spatial rich model. This acted as a regularizer that suppresses the image content efficiently. A truncated linear unit was also used. Wu et al. [9] used a CNN-based deep residual network for steganalysis. The network contains a large number of layers, which are important for capturing the complex statistics of images.
Yang et al. [10] designed a thirty-two-layer CNN that improves feature performance by integrating all features to enhance gradient flow. The bottleneck layers improve feature propagation and reduce the number of CNN parameters dramatically. Li et al. [11] designed a novel CNN model to evaluate embedded artifacts efficiently. Information diversity was also achieved. A parallel subnet module was designed using numerous filters, and the subnets were trained independently to improve computational speed. Zhang et al. [12] designed a novel CNN model to enhance the classification accuracy for spatial-domain steganography. Spatial pyramid pooling was used to integrate the local features. Sharma et al. [13] designed an aggregated residual transformation-based CNN model to obtain significant features for steganalysis. This model has a limited number of initial parameters, which enhances the classification rate. Residual skip connections were also used.
Liu et al. [14] showed the similarities and dissimilarities between SRM-EC and CNN models. An ensemble model was designed to integrate SRM-EC with CNN by averaging their output probabilities. Zeng et al. [15] used a CNN with a Rich model feature set. A bottom-up strategy was used to train from the output of each subnetwork up to the actual output. Yang et al. [16] designed a max CNN for steganalysis. It assigns significant weights to features learned from complex texture regions. Yang et al. [17] proposed image steganalysis using a transfer learning model with structure preservation. A discriminant projection matrix was used to build the model. Frobenius norm-based regularization was also used to achieve better results. Ren et al. [18] designed an efficient selection channel network and a steganalysis model. The steganalysis model, combined with the trained selection channels, estimates the final steganalysis outcomes.
From the extensive review, it has been observed that deep learning-based models can be used for steganalysis [19]. However, deep learning models suffer from overfitting and hyperparameter tuning issues. Therefore, in this paper, an efficient θ NSGA-III-based densely connected convolutional neural network (DCNN) model is proposed for image steganalysis. This is the principal difference from the existing models available in the literature. The main contributions of this paper are as follows:
(1) An efficient θ NSGA-III-based DCNN model is proposed for image steganalysis.
(2) θ NSGA-III is used to tune the initial parameters of the DCNN model.
(3) Accuracy and f-measure performance metrics are used as a multiobjective fitness function.
(4) Extensive experiments are conducted on the STEGRT1 dataset. The proposed model is also compared with competitive steganalysis models.
The remainder of the paper is organized as follows: Section 2 presents the proposed θ NSGA-III-based DCNN model for steganalysis. Experimental results and comparative analysis are presented in Section 3. Section 4 concludes the paper.

Proposed Model
In this paper, an efficient θ NSGA-III-based DCNN model is proposed for image steganalysis. The following sections discuss the working of DCNN and θ NSGA-III.

Densely Connected Convolutional Neural Network
The diagrammatic flow of the DCNN is shown in Figure 1. Assume a stego/normal image $a_0$ is passed to the CNN. The model has $N$ layers, each of which applies a nonlinear transformation $I_n(\cdot)$, where $n$ is the layer index [20]. $I_n(\cdot)$ is a composition of operators such as pooling, rectified linear units (ReLU), convolution (Conv), and batch normalization (BN). $a_n$ denotes the output of the $n$th layer. A conventional CNN feeds the output of the $n$th layer as the input of the $(n+1)$th layer, so the layer transition is $a_n = I_n(a_{n-1})$. ResNets add a skip connection that bypasses the nonlinear transformation with an identity operator:

$$a_n = I_n(a_{n-1}) + a_{n-1}. \tag{1}$$

ResNets achieve better gradient flow than a conventional CNN. However, summing the identity operator with the output of $I_n$ may hinder the data flow in the model. Therefore, to enhance the data flow, DenseNet was designed. It contains direct links from a given layer to every subsequent layer. The $n$th layer takes the feature maps of all previous layers, $a_0, \ldots, a_{n-1}$, as input:

$$a_n = I_n([a_0, a_1, \ldots, a_{n-1}]). \tag{2}$$

Here, $[a_0, a_1, \ldots, a_{n-1}]$ denotes the concatenation of the feature maps obtained from layers $0, \ldots, n-1$.
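The three connectivity rules above can be contrasted with a small sketch. It is illustrative only: feature maps are stood in for by flat lists of numbers, and `halve_relu` and `growth_layer` are hypothetical toy transformations, not the BN-ReLU-Conv operators of the actual model.

```python
from typing import Callable, List

Vector = List[float]                 # toy stand-in for a stack of feature maps
Layer = Callable[[Vector], Vector]   # toy stand-in for I_n


def plain_forward(a0: Vector, layers: List[Layer]) -> Vector:
    """Conventional CNN: a_n = I_n(a_{n-1})."""
    a = a0
    for layer in layers:
        a = layer(a)
    return a


def residual_forward(a0: Vector, layers: List[Layer]) -> Vector:
    """ResNet, equation (1): a_n = I_n(a_{n-1}) + a_{n-1}."""
    a = a0
    for layer in layers:
        out = layer(a)
        a = [o + x for o, x in zip(out, a)]
    return a


def dense_forward(a0: Vector, layers: List[Layer]) -> Vector:
    """DenseNet, equation (2): a_n = I_n([a_0, ..., a_{n-1}]).

    Returns the concatenated state [a_0, ..., a_N] after the last layer.
    """
    outputs = [a0]
    for layer in layers:
        concatenated = [v for a in outputs for v in a]
        outputs.append(layer(concatenated))
    return [v for a in outputs for v in a]


def halve_relu(x: Vector) -> Vector:
    """Hypothetical size-preserving layer: scale by 0.5, then ReLU."""
    return [max(0.0, 0.5 * v) for v in x]


def growth_layer(x: Vector, j: int = 2) -> Vector:
    """Hypothetical layer that emits j values from an input of any length."""
    mean = sum(x) / len(x)
    return [max(0.0, mean)] * j
```

In the dense case the state grows by a fixed amount per layer, whereas the plain and residual cases keep a fixed-size state; this is the property that the growth rate discussed later in this section quantifies.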
$I_n(\cdot)$ is defined as a composite operator consisting of BN, ReLU, and a 3 × 3 Conv. The concatenation used in equation (2) is not viable when the sizes of the feature maps differ, yet the downsampling layers of a CNN change the size of the feature maps. To allow downsampling, the model is divided into several densely connected dense blocks. The layers between the blocks are called transition layers. In this paper, the transition layer uses BN and a 1 × 1 Conv followed by a 2 × 2 average pooling layer. There are no links across dense blocks except through the transition layer.
If every $I_n$ generates $J$ feature maps, then the $n$th layer has $J_0 + J \times (n - 1)$ input feature maps, where $J_0$ is the number of channels of the input layer.
An important advantage of DenseNet over a conventional CNN is that it can have very narrow layers, e.g., $J = 12$. $J$ is the growth rate of the DenseNet. Every layer adds its own $J$ feature maps to the global state. The growth rate regulates how much new information each layer contributes to the global state. Because the global state is accessible from every layer, it does not need to be redefined in every layer.
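As a worked example of the input-width formula, the following sketch evaluates $J_0 + J \times (n - 1)$ for a few layers. The function name is hypothetical, and the values $J_0 = 16$ and $J = 12$ are used here only as an illustration.

```python
def input_feature_maps(n: int, j0: int, growth_rate: int) -> int:
    """Number of input feature maps seen by the n-th layer of a dense block.

    n           -- 1-based layer index inside the block
    j0          -- channels entering the block (J_0)
    growth_rate -- feature maps produced by every layer (J)
    """
    if n < 1:
        raise ValueError("layer index n must be >= 1")
    return j0 + growth_rate * (n - 1)


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, input_feature_maps(n, j0=16, growth_rate=12))
```

The input width grows linearly with depth even though every layer produces only $J$ new feature maps.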
Every layer produces $J$ output feature maps, but it typically has many more inputs. A 1 × 1 Conv is used as a bottleneck layer before every 3 × 3 Conv to reduce the number of input feature maps and improve computational speed. This design is efficient for DenseNet. The DenseNet with a bottleneck layer, i.e., the BN-ReLU-Conv (1 × 1)-BN-ReLU-Conv (3 × 3) version of $I_n$, is referred to as DenseNet-B. In this paper, each 1 × 1 Conv produces $4J$ feature maps.
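A rough weight count illustrates why the bottleneck helps once a layer has many inputs. This is a sketch under simplifying assumptions: only convolution weights are counted (no biases or BN parameters), and the function names are hypothetical.

```python
def conv_weights(kernel: int, c_in: int, c_out: int) -> int:
    """Weights of a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * c_in * c_out


def plain_layer_weights(c_in: int, growth_rate: int) -> int:
    """A single 3 x 3 Conv mapping c_in inputs directly to J outputs."""
    return conv_weights(3, c_in, growth_rate)


def bottleneck_layer_weights(c_in: int, growth_rate: int,
                             width_factor: int = 4) -> int:
    """1 x 1 Conv to width_factor * J maps, then 3 x 3 Conv to J maps."""
    mid = width_factor * growth_rate
    return conv_weights(1, c_in, mid) + conv_weights(3, mid, growth_rate)


if __name__ == "__main__":
    for c_in in (16, 256):
        print(c_in,
              plain_layer_weights(c_in, 12),
              bottleneck_layer_weights(c_in, 12))
```

Under these assumptions the bottleneck costs more weights for narrow inputs and fewer for wide ones; with a 1 × 1 output width of $4J$ the two counts are equal at $7.2J$ inputs, so the saving appears in the deeper layers of a block.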
To make the model more compact, the number of feature maps is reduced at the transition layers. If a dense block has $c$ feature maps, then the following transition layer produces $\lfloor \theta c \rfloor$ output feature maps, where $0 < \theta \le 1$ is the compression factor. If $\theta = 1$, the number of feature maps passing through the transition layer stays unchanged.
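The effect of a transition layer on tensor shape can be sketched as follows. The function name is hypothetical, and the 2 × 2 average pooling is assumed to use stride 2, which the text does not state explicitly.

```python
import math
from typing import Tuple


def transition_output_shape(channels: int, height: int, width: int,
                            theta: float) -> Tuple[int, int, int]:
    """Shape after a transition layer with compression factor theta.

    The 1 x 1 Conv keeps floor(theta * channels) feature maps; the
    2 x 2 average pooling (stride 2 assumed) halves each spatial side.
    """
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must satisfy 0 < theta <= 1")
    out_channels = math.floor(theta * channels)
    return out_channels, height // 2, width // 2


if __name__ == "__main__":
    print(transition_output_shape(64, 32, 32, theta=0.5))
    print(transition_output_shape(64, 32, 32, theta=1.0))
```

Because `theta * channels` is evaluated in floating point, a product that should be an exact integer can land slightly below it for some values of `theta`; an implementation that needs exact results could use rational arithmetic instead.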
DenseNet contains four dense blocks, each with an equal number of layers. Initially, a Conv with 16 output channels is applied to the input images. For Conv layers with a 3 × 3 kernel, every side of the input is zero-padded to keep the feature map size fixed. A 1 × 1 Conv followed by 2 × 2 average pooling is used between two consecutive dense blocks. Finally, global average pooling is applied, and a softmax activation function is used. The feature map sizes in the dense blocks are 32 × 32, 16

θ-Nondominated Sorting Genetic Algorithm-III
θ NSGA-III [21] has been extensively utilized to optimize many engineering applications. It has achieved good convergence speed, and it does not suffer from the premature convergence issue [22][23][24]. Table 1 represents the nomenclature of θ NSGA-III. Algorithm 1 illustrates the generation of an initial population of θ NSGA-III-based DCNN. Initially, a random population is computed by utilizing the normal distribution.
The computed solutions are then mapped to the group of initial parameters of DCNN.
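A minimal sketch of this initialization step is given below. It is not the authors' Algorithm 1: the parameter names, their ranges, the mean and standard deviation of the normal distribution, and the clipping to [0, 1] are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical DCNN parameters and ranges; the text does not list them.
PARAM_RANGES: Dict[str, Tuple[float, float]] = {
    "growth_rate": (8, 32),
    "layers_per_block": (4, 16),
    "compression": (0.3, 1.0),
}
INTEGER_PARAMS = {"growth_rate", "layers_per_block"}


def sample_gene(rng: random.Random, mean: float = 0.5,
                std: float = 0.15) -> float:
    """Draw one normally distributed gene and clip it to [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mean, std)))


def decode(genes: List[float]) -> Dict[str, float]:
    """Map genes in [0, 1] linearly onto the parameter ranges."""
    params: Dict[str, float] = {}
    for gene, (name, (low, high)) in zip(genes, PARAM_RANGES.items()):
        value = low + gene * (high - low)
        params[name] = round(value) if name in INTEGER_PARAMS else value
    return params


def initial_population(size: int, seed: int = 0) -> List[Dict[str, float]]:
    """Generate `size` random solutions, each decoded to DCNN parameters."""
    rng = random.Random(seed)
    n_genes = len(PARAM_RANGES)
    return [decode([sample_gene(rng) for _ in range(n_genes)])
            for _ in range(size)]
```

Keeping the genes in a normalized range and decoding them separately lets the crossover and mutation operators act on a uniform representation, whatever the units of the underlying parameters.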
Algorithm 2 demonstrates the proposed θ NSGA-III-based DCNN model. Initially, the DCNN is trained and tested on a chunk of the steganography dataset using each solution of the random population. The fitness of each solution is then obtained. Dominated and nondominated groups are then evaluated.
Thereafter, mutation and crossover operations are used to compute the child solutions. Nondominated sorting is used to sort the obtained nondominated solutions. If the number of fitness evaluations exceeds the maximum allowed, the tuned parameters of DCNN are returned. Finally, the θ NSGA-III-based DCNN is trained on the steganalysis dataset.
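The split into dominated and nondominated groups can be illustrated with ordinary Pareto dominance over the two objectives, accuracy and f-measure, both maximized. This sketch deliberately uses plain Pareto dominance rather than the θ-dominance relation of θ NSGA-III, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Fitness = Tuple[float, float]  # (accuracy, f_measure), both maximized


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def split_by_dominance(
        fitnesses: Sequence[Fitness]) -> Tuple[List[int], List[int]]:
    """Return (nondominated indices, dominated indices)."""
    nondominated: List[int] = []
    dominated: List[int] = []
    for i, fi in enumerate(fitnesses):
        beaten = any(dominates(fj, fi)
                     for j, fj in enumerate(fitnesses) if j != i)
        (dominated if beaten else nondominated).append(i)
    return nondominated, dominated
```

For example, with fitness values (0.90, 0.88), (0.92, 0.85), and (0.89, 0.87), the first two solutions trade accuracy against f-measure and neither dominates the other, while the third is dominated by the first.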

Dataset.
Rezaei et al. [25] designed a reference dataset for image steganalysis, called Real version 1 (STEGRT1). It contains both JPEG and BITMAP images, with 8000 cover and stego images of different sizes and characteristics. These images were obtained using various steganographic approaches, payloads, and quality factors.

Experimental Set-Up.
The experiments on the proposed and the existing models are conducted on the MATLAB online server using the Deep Learning Toolbox. Additionally, BitMix data augmentation [26] is applied to increase the size of the dataset. The performance of the proposed model is compared with HCNN [8], TREM [7], CNN [4], ELM [3], ECSM [5], and DRM [6].

Comparative Analysis.
In this section, the comparison between the proposed and the existing CNN-based steganalysis models is presented. Figure 2 shows the performance analysis of the proposed model. The best performance is obtained at epoch 8 and the 47th iteration. Therefore, the proposed model converges efficiently with good convergence speed. Figures 3 and 4 show the confusion matrices obtained using the proposed model with and without θ NSGA-III. The majority of the obtained results lie in the true classes (i.e., on the diagonals of the matrices). This leads to good performance in terms of accuracy, f-measure, precision, recall, and area under the curve (AUC). In Figure 4, each diagonal value gives the number of images of the corresponding class that were classified correctly, which helps in evaluating the various performance metrics. Here, the stego-embedded image is taken as the positive class, so the normal image belongs to the negative class. Overall, the analysis indicates that the proposed model with θ NSGA-III achieves better performance than the model without θ NSGA-III. Figures 5 to 9 show the comparative analysis between the existing and the proposed models as notched boxplots. The box shows the interquartile range (IQR). The red line shows the median of the computed performance. The notch indicates a confidence interval around the median, computed as median ± IQR/√n, where n is the number of experiments. Here, n = 30. A smaller notch indicates a tighter confidence interval, i.e., more consistent results for the steganalysis model. To evaluate the significant improvement or reduction, we select the average computed value of the proposed model and that of the best-performing existing steganalysis model (i.e., the one with the best average value among the existing models).
Thereafter, we evaluate their absolute difference, which gives the average improvement or reduction. To express it as a percentage, we divide the absolute difference by the maximum possible value and multiply the result by 100. Figure 5 compares the existing and proposed steganalysis models in terms of accuracy. It reveals that the proposed model achieves better accuracy than the existing steganalysis models, outperforming them by 1.2643%. Figure 6 shows the precision analysis of the proposed model and the existing steganalysis models. The proposed model achieves more consistent precision values than the existing models and outperforms them by 1.1438%. Figure 7 shows the recall analysis of the proposed steganalysis model. The proposed model outperforms the competitive models in terms of recall, with an average enhancement of 1.2832%. Figure 8 shows the f-measure analysis of the proposed model and the existing steganalysis models. The proposed model achieves more consistent f-measure values than the existing models and outperforms them by 1.0245%. Figure 9 shows the AUC analysis of the proposed steganalysis model. The proposed model outperforms the competitive models in terms of AUC, with an average enhancement of 1.2913%.

ALGORITHM 1: Generate initial population.
τ′ ← Optimal number of layers.
τ″ ← π_1, π_s, π_{s−1}, ..., π_2
κ′ ← Implement an optimal number of layers based on the DCNN model.
κ″ ← ∅
R ← ζ(τ′, κ″), ζ(τ″, κ″)

Security and Communication Networks
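The evaluation arithmetic described in this section can be sketched as follows. The function names are hypothetical; the metric definitions are the standard ones for a two-class confusion matrix with the stego class as positive, and all counts are assumed to be positive. The `scale` argument of the notch function is an assumption: the text gives median ± IQR/√n, while common plotting libraries multiply the half-width by a constant of about 1.57.

```python
import math
from typing import Dict, Tuple


def metrics_from_confusion(tp: int, fp: int, fn: int,
                           tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and f-measure from a 2 x 2 matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


def notch_interval(median: float, iqr: float, n: int,
                   scale: float = 1.0) -> Tuple[float, float]:
    """Notch limits: median +/- scale * IQR / sqrt(n)."""
    half_width = scale * iqr / math.sqrt(n)
    return median - half_width, median + half_width


def percent_improvement(proposed_mean: float, best_existing_mean: float,
                        max_value: float) -> float:
    """Absolute difference of the two means as a percentage of max_value."""
    return abs(proposed_mean - best_existing_mean) / max_value * 100.0
```

Because the difference is divided by the maximum possible value of the metric rather than by the baseline, the reported figures are percentage points of the metric's scale, not relative gains over the best existing model.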

Conclusion
From the extensive review, it has been found that deep learning-based models have been extensively used for steganalysis. However, these models suffer from overfitting and hyperparameter tuning issues. Therefore, a θ NSGA-III-based DCNN model was proposed for image steganalysis. θ NSGA-III was used to optimize the initial parameters of the DCNN model. Accuracy and f-measure were used to design a multiobjective fitness function. Extensive experiments were conducted on the STEGRT1 dataset, and the proposed model was compared with competitive steganalysis models. Performance analyses have shown that the proposed model outperforms the existing steganalysis models in terms of accuracy, f-measure, precision, recall, and AUC by 1.2643%, 1.0245%, 1.1438%, 1.2832%, and 1.2913%, respectively. The results show that the proposed model can capture even small changes in image features.
In the near future, one may extend the proposed work by designing a novel deep learning model to enhance the results further. Additionally, one may test the proposed model on other steganography datasets.

Data Availability
No data were used to support this study.