Image classification adversarial attack with improved resizing transformation and ensemble models

Convolutional neural networks have achieved great success in computer vision, but they often output incorrect predictions when deliberately crafted perturbations are applied to the original input. These human-indistinguishable replicas are called adversarial examples, and this property makes them useful for evaluating network robustness and security. The white-box attack success rate is considerable when the network structure and parameters are already known, but in a black-box attack the success rate of adversarial examples is relatively low and their transferability remains to be improved. This article draws on model augmentation, which is derived from the data augmentation used in training generalizable neural networks, and proposes the resizing invariance method. The proposed method introduces an improved resizing transformation to achieve model augmentation. In addition, ensemble models are used to generate more transferable adversarial examples. Extensive experiments verify that this method outperforms other baseline methods, including the original model augmentation method, improving the black-box attack success rate on both normal models and defense models.


INTRODUCTION
Convolutional neural networks (CNNs) (LeCun et al., 1989) are widely used in image processing, such as image classification (Krizhevsky, Sutskever & Hinton, 2017), object detection (Szegedy, Toshev & Erhan, 2013) and semantic segmentation (Milletari, Navab & Ahmadi, 2016), and nowadays most of them outperform average human capability (He et al., 2016). Owing to their better performance than other traditional deep neural networks, CNNs have spawned a variety of network types applied to different scenarios (Alzubaidi et al., 2021; Naushad, Kaur & Ghaderpour, 2021). However, when unnoticeable perturbations are overlaid on the original input, CNNs will mostly output incorrect predictions. This makes networks vulnerable to intended attacks and brings security problems to related models, even as CNNs greatly facilitate our lives. These inputs with specific added perturbations are called adversarial examples. Misleading models into incorrect outputs was first found in traditional machine learning (Biggio et al., 2013), and later in deep neural networks. Initial studies mostly focused on image classification, and subsequent research found that adversarial examples also exist in other settings, such as the physical world (Kurakin, Goodfellow & Bengio, 2018). For instance, Sharif et al. (2016) demonstrated that face recognition models can also be evaded by physically realizable attacks: they made an adversarial eyeglass frame to impersonate an identity, so that the face recognition model is fooled by an unauthorized person. While adversarial examples pose a great threat to neural networks, they can also be used to evaluate network robustness. Furthermore, adversarial examples can serve as an extra training set to train more robust networks (Madry et al., 2017).
Existing methods of generating adversarial examples have already achieved a considerable success rate in white-box attacks, such as L-BFGS, C&W (Carlini & Wagner, 2017) and the fast gradient sign method (FGSM) (Goodfellow, Shlens & Szegedy, 2014). But the black-box attack success rate remains to be improved. FGSM provides a convenient way to generate adversarial examples with less computation and better transferability than methods like L-BFGS and C&W. To further improve the black-box attack success rate, researchers derived more complicated methods from FGSM, drawing on gradient optimization (Zheng, Liu & Yin, 2021), data augmentation (Catak et al., 2021) and ensemble models (Chowdhury et al., 2021), which are common techniques for enhancing the generalization of neural networks. Iterative versions came first, such as I-FGSM (Kurakin, Goodfellow & Bengio, 2018) and PGD (Madry et al., 2017). Momentum was then introduced into FGSM to relieve gradient oscillation and accelerate convergence. Later, image affine transformation was used to manually augment the original input and reduce overfitting in FGSM (Xie et al., 2019). Model ensembling has been discussed as a way to enhance adversarial example performance in the black-box condition (Liu et al., 2016). Table 1 shows the abbreviations used in this work.
The main contributions of this article are as follows. Drawing on the idea of model augmentation, we propose the resizing invariance method (RIM) to enhance adversarial example transferability. The proposed resizing transformation is verified to possess the invariance property required for model augmentation. RIM utilizes the improved resizing transformation when generating adversarial examples to further increase input diversity. Like other data/model augmentation methods, RIM is readily combined with gradient optimization methods; in this article, momentum is used as the baseline gradient optimization.
RIM can generate adversarial examples with ensemble models to realize a higher black-box attack success rate than with a single model. In this article, we integrate four normal models to generate more transferable adversarial examples with RIM and test them on three defense models, since defense models are more robust than normal models and better evaluate black-box attack performance.
Experiments on the ImageNet dataset indicate that RIM has better transferability than current benchmark methods on both normal and adversarially trained models. Additionally, the proposed method further improves the black-box attack success rate when generating adversarial examples with ensemble models. This method is expected to help evaluate network robustness and build more secure applications.

RELATED WORKS
Image classification models were found to be evadable, outputting incorrect results under a simple gradient-based algorithm (Biggio et al., 2013). But that study was conducted merely on traditional models like support vector machines, decision trees and shallow neural networks. For deeper neural networks like CNNs, later work applied hardly noticeable perturbations to original images by maximizing the model's prediction error, forcing the model to misclassify those images. That method requires relatively large computation compared with Biggio et al. (2013), as a linear search is required during each iteration. To reduce the computation cost, Goodfellow, Shlens & Szegedy (2014) proposed the fast gradient sign method (FGSM), with only a single step and a single gradient query. FGSM can generate adversarial examples much faster than former methods, and most later methods are derived from it.
Adversarial training is one of the effective ways to mitigate adversarial attacks (Bai et al., 2021). An intuitive thought is to directly put generated adversarial examples into network training. Xie et al. (2017) utilized their own diverse input method (DIM) to generate adversarial examples for adversarially trained models. Madry et al. (2017) applied their projected gradient descent (PGD) to an expanding training set. But manually expanding a training set requires stronger attack methods than FGSM and larger model capacity. Tramèr et al. (2017) proposed the ensemble adversarial training method, using multiple pre-trained models for an ensemble adversarial training model to improve model robustness. A more convenient way to mitigate adversarial attacks is to purify and expunge adversarial perturbations from the original input, requiring less computation than training a new network. Liao et al. (2018) introduced a denoiser to remove adversarial noise. Guo et al. (2017) compared multiple image transformation methods like image quilting, cropping, total variance minimization, bit-depth reduction and compression. However, generally speaking, the defensive effect of image transformation is not as good as adversarial training. Thus, this article adopts three adversarially trained models to verify the black-box attack performance of the proposed method.

The process of generating adversarial examples for a transferable image attack can be described as follows. Let x be an original image with x ∈ X, the original input set; y is the corresponding true label with y ∈ Y, the label set of the original inputs. L(x, y) is the model loss function, usually cross-entropy. x* is the adversarial example generated from x, and y* is its corresponding label. The aim is to find x* within a maximum perturbation ε such that y* ≠ y. Moreover, x* and x should satisfy the L∞ norm bound, i.e., ||x* − x||∞ ≤ ε, to ensure the adversarial perturbation is hardly noticeable to human eyes.
For textual convenience, all the following methods obey this constraint. Considering the characteristics of the CNN loss function, the problem above can be transformed into a constrained optimization:

arg max_{x*} L(x*, y), s.t. ||x* − x||∞ ≤ ε. (1)

The L-BFGS method transforms the problem above into:

minimize c·||x* − x||_2 + L(x*, y), s.t. x* ∈ [0, 1]^m.

The purpose is to find the minimum c > 0 by linear search together with a minimal x* − x. The problem thus becomes a convex optimization that yields an approximate optimal solution. Due to the linear search in every iteration, the computation is relatively large considering the numerous parameters in both the image and the network. Later, Carlini & Wagner (2017) proposed a similar method named C&W that fulfills both targeted and non-targeted attack conditions, but it shares the same shortcoming as L-BFGS.
To reduce computation, Goodfellow, Shlens & Szegedy (2014) studied the CNN structure and the decision boundary between clean images and adversarial examples. They utilized gradient information instead of a global search, greatly reducing computation. The resulting fast gradient sign method (FGSM) proved equally effective in generating adversarial examples. The process is:

x* = x + ε · sign(∇_x L(x, y)),

where sign is the function deciding the perturbation direction. The white-box attack success rate is not ideal due to the single-step process, but this method is a benchmark from which many improved versions derive.
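The single-step update above can be sketched in a few lines. The snippet below uses a toy logistic-regression "model" with an analytic input gradient as a stand-in for a CNN; the weights `w` and the helper `loss_grad` are illustrative assumptions, not part of the original method.

```python
import numpy as np

# Toy differentiable model: logistic regression, so the input gradient
# of the cross-entropy loss is analytic. w stands in for a trained CNN.
rng = np.random.default_rng(0)
w = rng.normal(size=8)

def loss_grad(x, y):
    """Gradient of the cross-entropy of sigmoid(w @ x) w.r.t. the input x."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

def fgsm(x, y, eps):
    """Single-step FGSM: x* = x + eps * sign(grad_x L(x, y))."""
    return x + eps * np.sign(loss_grad(x, y))

x = rng.normal(size=8)
x_adv = fgsm(x, y=1.0, eps=16 / 255)
```

Because only the sign of the gradient is used, every coordinate moves by exactly ε, so the L∞ bound holds by construction.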
Kurakin, Goodfellow & Bengio (2018) added perturbations iteratively and proposed the iterative fast gradient sign method (I-FGSM). This modification substantially increased the white-box attack success rate to almost 100%. The update equation is:

x*_{n+1} = Clip_x^ε { x*_n + α · sign(∇_x L(x*_n, y)) },

in which Clip_x^ε limits x* to within the ε-range of x, and α = ε/N is the per-step perturbation over N iterations. However, the pure pursuit of white-box attack success causes overfitting, leading to a massive decline in the black-box attack success rate.
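A minimal I-FGSM sketch follows, again over a toy analytic model (the `w`/`loss_grad` names are illustrative); the key addition over FGSM is the per-step projection back into the ε-ball, implementing the Clip operator.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=8)          # toy linear model standing in for a CNN

def loss_grad(x, y):
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

def i_fgsm(x, y, eps, n_iter):
    """I-FGSM sketch: N steps of size alpha = eps / N, each followed by
    the Clip operator projecting back into the eps-ball around clean x."""
    alpha = eps / n_iter
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv + alpha * np.sign(loss_grad(x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)   # L-infinity projection
    return x_adv

x = rng.normal(size=8)
x_adv = i_fgsm(x, y=1.0, eps=16 / 255, n_iter=10)
```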
Referring to gradient optimization in network training, Dong et al. (2018) introduced momentum into adversarial example generation and proposed the momentum iterative fast gradient sign method (MI-FGSM). Momentum relieves overfitting to a certain degree, helping to escape poor local maxima and to reduce gradient oscillation. By introducing momentum, gradient information is memorized and reused during every iteration. The update equations are:

g_{n+1} = μ · g_n + ∇_x L(x*_n, y) / ||∇_x L(x*_n, y)||_1,
x*_{n+1} = Clip_x^ε { x*_n + α · sign(g_{n+1}) },

in which g_n is the accumulated momentum and μ is the momentum decay factor.
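The two update equations map directly to code. As before, the model is a toy analytic stand-in; the structure to note is the L1-normalized gradient being accumulated into `g` before the signed step.

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=8)          # toy model weights (illustrative only)

def loss_grad(x, y):
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

def mi_fgsm(x, y, eps=16 / 255, n_iter=10, mu=1.0):
    """MI-FGSM sketch: accumulate the L1-normalized gradient into the
    momentum g, then take a clipped signed step along g."""
    alpha = eps / n_iter
    g = np.zeros_like(x)
    x_adv = x.copy()
    for _ in range(n_iter):
        grad = loss_grad(x_adv, y)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

x = rng.normal(size=8)
x_adv = mi_fgsm(x, y=1.0)
```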
Xie et al. (2019) drew on data augmentation and proposed the diverse input method (DIM). DIM transforms the input image with a certain probability once per iteration, and the transformation process is shown in Eq. (7). The transformation includes random resizing and random padding.
T(x*_n; p) = { T(x*_n) with probability p; x*_n with probability 1 − p }, (7)

and the corresponding momentum update becomes

g_{n+1} = μ · g_n + ∇L(T(x*_n; p), y) / ||∇L(T(x*_n; p), y)||_1.

Lin et al. (2019) further proposed the scale-invariant method (SIM), which averages the loss over scaled copies S_i(x) = x / 2^i, where S_i(x) denotes the i-th transformed copy and m is the number of copies in each iteration.
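A DIM-style transformation can be sketched without any deep-learning framework. The sizes below (resize into [299, 330), pad to 330 × 330) are assumptions borrowed from the resizing ranges reported later in this article, and the nearest-neighbour indexing is a dependency-free stand-in for proper interpolation.

```python
import numpy as np

def dim_transform(x, p=0.5, rng=None):
    """Sketch of DIM input diversity: with probability p, resize the
    H x W x 3 image to a random rnd x rnd and randomly zero-pad it back
    to 330 x 330; otherwise return the input unchanged."""
    if rng is None:
        rng = np.random.default_rng(0)
    if rng.random() >= p:
        return x
    rnd = int(rng.integers(299, 330))
    h, w, _ = x.shape
    # nearest-neighbour resize via integer index mapping
    resized = x[np.arange(rnd) * h // rnd][:, np.arange(rnd) * w // rnd]
    top = int(rng.integers(0, 330 - rnd + 1))
    left = int(rng.integers(0, 330 - rnd + 1))
    out = np.zeros((330, 330, 3), dtype=x.dtype)
    out[top:top + rnd, left:left + rnd] = resized
    return out

img = np.random.default_rng(0).random((299, 299, 3))
t = dim_transform(img, p=1.0)
```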
Additionally, those data/model augmentation methods mentioned above are all readily combined with gradient optimization methods. In their works, MI-FGSM is used to conduct comparison experiments.

RESIZING INVARIANCE METHOD
Overfitting is one of the main factors restricting the generalization of neural networks. Data augmentation, gradient optimization, ensemble models and early stopping (Ali & Ramachandran, 2022) are common approaches to relieving overfitting. Similarly, we believe the low black-box attack success rate is also caused by adversarial examples overfitting the generating model, and those approaches can likewise be applied to improve transferability. The baseline methods introduced above are realizations of those approaches. In addition, SIM proposed model augmentation, a derived version of data augmentation, and achieved better black-box performance. Figure 1 shows an example of an adversarial attack on two typical image classification models. The original image and the adversarial image are hardly distinguishable by human eyes, and both models correctly classify the original image as 'Lion', but the adversarial image is misclassified as 'Sheepdog' and 'Persian Cat'.
Considering that SIM utilized an unvarying scale transformation and that many other image transformations can also be used for model augmentation, we propose RIM, which uses a random resizing transformation to realize model augmentation and improve adversarial example transferability. The outline of RIM is shown in Fig. 2, and the image transformation function R(x*_n; p) randomly enlarges or reduces the image:

R(x*_n; p) = { IE(x*_n) or IR(x*_n), chosen at random, with probability p; x*_n with probability 1 − p }.

In particular, IE(x*_n) enlarges the image to rnd × rnd × 3 with rnd ∈ [299, 330), and the surroundings are randomly padded to 330 × 330 × 3. IR(x*_n) reduces the image to rnd × rnd × 3 with rnd ∈ (279, 299], and the surroundings are randomly padded to 299 × 299 × 3. The image is transformed m times in each iteration, and w_i is the weight of each copy's loss.
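The IE/IR pair can be sketched as below. This is an illustrative reading of R(x; p): the equal 50/50 choice between IE and IR is an assumption (the text only says the transformation is chosen at random), and nearest-neighbour indexing again stands in for real interpolation.

```python
import numpy as np

def _resize_nn(x, rnd):
    # nearest-neighbour resize of an H x W x 3 array to rnd x rnd x 3
    h, w, _ = x.shape
    return x[np.arange(rnd) * h // rnd][:, np.arange(rnd) * w // rnd]

def _rand_pad(x, target, rng):
    # randomly place x inside a zero canvas of size target x target x 3
    rnd = x.shape[0]
    top = int(rng.integers(0, target - rnd + 1))
    left = int(rng.integers(0, target - rnd + 1))
    out = np.zeros((target, target, 3), dtype=x.dtype)
    out[top:top + rnd, left:left + rnd] = x
    return out

def rim_transform(x, p=0.5, rng=None):
    """R(x; p) sketch: with probability p, apply enlargement IE (rnd in
    [299, 330), padded to 330) or reduction IR (rnd in (279, 299], padded
    to 299), each assumed equally likely; otherwise return x unchanged."""
    if rng is None:
        rng = np.random.default_rng(0)
    if rng.random() >= p:
        return x
    if rng.random() < 0.5:                                   # IE
        return _rand_pad(_resize_nn(x, int(rng.integers(299, 330))), 330, rng)
    return _rand_pad(_resize_nn(x, int(rng.integers(280, 300))), 299, rng)  # IR

img = np.random.default_rng(0).random((299, 299, 3))
t = rim_transform(img, p=1.0)
```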
According to Lin et al. (2019), a loss-preserving transformation T should satisfy L(T(x), y) ≈ L(x, y) for any x ∈ X, and the label of T(x) should be the same as that of x, in order to achieve model augmentation. Besides scale transformation, we find that resizing transformation is also loss-preserving, which is empirically verified in 'Experiments'. The improved resizing transformation is deployed multiple times in each iteration to meet the model augmentation requirement, and the transformation includes not only image enlargement but also image reduction, which are the main differences from the resizing transformation in DIM.

Single model generation
Using a known model to generate adversarial examples is a typical way to conduct an image attack, transferable or not. In this article, we use four normal models as generating models, and the adversarial examples' attack success rate is verified on both normal models and defense models. When verified on the generating model, the condition is white-box; on other models it is black-box. Like other data/model augmentation methods, RIM is readily combined with gradient optimization methods. Algorithm 1 shows how RIM combines with momentum, for comparison convenience.
Because white-box attacks have already achieved a considerable success rate, computation can be reduced and generation efficiency improved by adopting PGD or MI-FGSM instead of our method. RIM can also degenerate into other methods by adjusting some parameters. For example, RIM degrades to DIM by setting m = 1 and IR(x) = x, and degrades to MI-FGSM by setting rnd = 299.
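Putting the pieces together, Algorithm 1 (momentum plus m transformed copies per iteration, each weighted by w_i = 1/m) can be sketched as follows. The toy `loss_grad` and the scalar `transform` are illustrative stand-ins: the real R(x; p) changes the spatial size and requires a CNN that accepts both 330- and 299-pixel inputs.

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.normal(size=16)         # toy model weights (illustrative)

def loss_grad(x, y):
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (p - y) * w

def transform(x, p):
    # stand-in for R(x; p): a random small rescale of the input, NOT the
    # paper's resize-and-pad operator (which needs an image-shaped input)
    return x * rng.uniform(0.9, 1.1) if rng.random() < p else x

def mi_rim(x, y, eps=16 / 255, n_iter=10, mu=1.0, m=5, p=0.5):
    """Algorithm 1 sketch: per iteration, average the gradients of m
    transformed copies with weights w_i = 1/m, feed the L1-normalized
    average into the momentum, and take a clipped signed step."""
    alpha = eps / n_iter
    g = np.zeros_like(x)
    x_adv = x.copy()
    for _ in range(n_iter):
        grad = sum(loss_grad(transform(x_adv, p), y) for _ in range(m)) / m
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

x = rng.normal(size=16)
x_adv = mi_rim(x, y=1.0)
```

Setting m = 1 and making `transform` the DIM operator recovers DIM, matching the degeneration described above.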
Besides MI-FGSM, RIM can be combined with other gradient optimization methods, such as Nesterov momentum or an adaptive gradient optimizer, simply by replacing steps 4 and 5 in Algorithm 1 with their own update equations. When attacking ensemble models, generation is based on the models' logits and the corresponding label. For a network using the cross-entropy loss function, the logits relate logarithmically to the prediction, as Eq. (12) shows, where l(x) are the logits and 1_y is the one-hot code of label y.
L(x, y) = −1_y · log(softmax(l(x))). (12)

Similar to Algorithm 1, RIM with ensemble models is summarized in Algorithm 2, also combined with momentum for comparison convenience.
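Eq. (12) over fused logits can be sketched as below. Logit-level fusion with weights w_k is one common ensembling choice; the exact fusion in Algorithm 2 may differ, so treat this as an assumed reading.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - z.max())
    return e / e.sum()

def ensemble_loss(logits_list, weights, y):
    """Combine K models' logits l_k(x) with weights w_k, then apply
    Eq. (12): L(x, y) = -1_y . log(softmax(l(x)))."""
    fused = sum(wk * lk for wk, lk in zip(weights, logits_list))
    one_hot = np.eye(len(fused))[y]
    return float(-np.sum(one_hot * np.log(softmax(fused))))

# equal weights w_k = 1/K over K = 4 hypothetical models
logits = [np.array([2.0, 0.5, -1.0]) for _ in range(4)]
loss = ensemble_loss(logits, [0.25] * 4, y=0)
```

With identical logits and equal weights, the fused loss reduces to the single-model loss, which makes the weighting easy to sanity-check.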

EXPERIMENTS
Relatively comprehensive experiments are conducted to verify the effectiveness of our method. Besides the basic indicator, namely attack success rate, we consider different attack conditions and the different parameters influencing adversarial example performance. The main purpose is to further enhance transferability and relieve overfitting while maintaining a considerable white-box success rate.
One thousand images are randomly selected from the ImageNet dataset (Russakovsky et al., 2015), belonging to 1,000 different categories; in other words, each image has a unique label. These images can all be correctly classified by the models we use. After being processed by the attack algorithm, 1,000 corresponding adversarial examples are generated. When a model misclassifies an adversarial example, a successful attack is achieved, and a higher attack success rate indicates better performance of the method. Preprocessing converts the original inputs to PNG format at 299 × 299 × 3 pixels in RGB mode. Figure 3 compares original images and adversarial examples; the difference is hardly noticeable by human eyes.
Algorithm 1 MI-RIM by single model. Input: original image x and corresponding label y, maximum perturbation ε, iteration number N, momentum decay factor μ, number of image transformations per iteration m, loss weights w_i, and transformation probability p.
To prove the advantage of our method, we choose several baseline methods for comparison, namely MI-FGSM, DIM and TIM. Parameters are set to the defaults given in their references. The maximum perturbation is ε = 16/255 and the iteration number is N = 10. For MI-FGSM, the momentum decay factor is μ = 1. For DIM, the transformation probability is p = 0.5. For our method RIM, images are transformed m = 5 times in each iteration, with loss weight w_i = 1/m. For ensemble models, the model weight is w_k = 1/K.

Invariant property
To verify that the resizing transformation is loss-preserving and fits the invariant property, we apply the proposed transformation to the 1,000 images mentioned above, sweeping the resizing scale from 220 to 400 in steps of 10. Figure 4 shows the average loss of the transformed images on three normal models, namely Inc-v3, IncRes-v2 and Res-101. The loss curves are relatively smooth and stable when the resizing scale is between 270 and 330, while the curve on Res-101 oscillates severely outside that range. Thus, we presume the resizing invariant property holds between 270 and 330 for the tested models.
As shown in Fig. 5, the accuracy on these transformed images stays relatively high on the three normal models; in particular, when the resizing scale is between 260 and 340, the average accuracy exceeds 96%. So, we utilize the proposed resizing transformation within this range.

Algorithm 2 MI-RIM by ensemble models.
Input: original image x and corresponding label y, maximum perturbation ε, iteration number N, momentum decay factor μ, number of image transformations per iteration m, loss weights w_i, transformation probability p, number of ensembled models K, and model weights w_k.
When generated from IncRes-v2 and Res-101, RIM achieves more than a 50% success rate on Inc-v3-ens3 and Inc-v3-ens4. However, RIM does not always outperform the other baseline methods: TIM performs better on IncRes-v2-ens when generating from IncRes-v2, and SIM performs better on all three defense models when generating from Inc-v4. Thus, different methods can be adopted to fool a model depending on whether the model is perfectly known or hardly known, and attack efficiency can be improved by choosing the proper method. If the model is entirely unknown, we hope our method can be the first choice in the black-box condition.

Supposing the three defense models are still unknown, we integrate the four normal models as ensemble models. Adversarial examples are tested on both the four normal models and the three defense models; the condition is white-box when testing on the four normal models individually, which is marked with an asterisk (*) in Table 3. As with single-model generation, MI-FGSM holds better performance in the white-box condition, while our method provides better transferability. Compared with single-model generation, adversarial examples generated by ensemble models have a better black-box attack success rate. For example, TIM reaches a 67.2% average rate with ensemble models, nearly twice its single-model average, and SIM's 36.6% single-model average is almost half of its 70.7% with ensemble models. On all three defense models, our method holds the best performance and proves to generate more transferable adversarial examples.
As shown in Lin et al. (2019) and Yang et al. (2023), different data/model augmentation methods can also be combined to form a stronger method. Lin et al. (2019) combined three data augmentation methods (namely DIM, TIM and SIM in Table 3) with Nesterov momentum to achieve 93.5% average rate on defense models by ensemble models. Yang et al. (2023) extended this method by introducing another data augmentation method and an adaptive gradient optimizer. The average rate was enhanced to 95.3%. Given the fact

Statistical analysis
The transformation probability p in RIM controls the diversity of model augmentation. We vary p from 0 to 1 with a step size of 0.1 while keeping the other parameters fixed; the effect is shown in Fig. 6. Figure 6A gives the results of single-model generation by Inc-v3 and Fig. 6B of ensemble-model generation. The generating model in Fig. 6A is randomly selected among the four normal models, whose curves show similar trends, and those trends also match ensemble-model generation, meaning the transformation probability has a similar effect on the success rate of both single-model and ensemble-model generation. When p = 0 or p = 1, RIM degenerates to a constant model augmentation method. The white-box attack success rate hardly changes with the transformation probability, but the black-box rate drops noticeably when p is 0 or 1. For other values of p, the transformation probability has little effect in single-model generation, and in ensemble-model generation the transferability likewise differs little as long as the transformation is random. This shows the importance of the stochastic process in data/model augmentation; similar results can be found in Xie et al. (2019) and Yang et al. (2022). Thus, to ensure both kinds of transformation always appear within one iteration, we finally set p = 0.5.

The image enlargement and reduction scales are 330 and 270, respectively. We change one value at a time while keeping the other fixed, with the remaining parameters at the defaults of this section. Figure 7A shows the effect of the image enlargement scale on the success rate in single-model generation by Inc-v4, and Fig. 7B in ensemble-model generation. The generating model (Inc-v4) in Fig. 7A is randomly selected among the four normal models, whose curves show similar trends. The first half of the black-box attack curve shows logarithmic growth, and the second half tends to be stable.
The turning point occurs at approximately 320 pixels; thus, we choose 330 pixels as the image enlargement scale for redundancy. Figure 7C shows the effect of the image reduction scale on the success rate in single-model generation by IncRes-v2, and Fig. 7D in ensemble-model generation. The generating model (IncRes-v2) in Fig. 7C is randomly selected among the four normal models.

Original images are transformed five times during each iteration. We increase m from one to ten with a step size of one to study the effect of resizing frequency. Figure 8 shows the relationship between attack success rate and the number of transformation copies in each iteration: more copies improve the black-box rate but increase the computation cost. To balance computation cost and black-box rate, we set m = 5, considering the limitation of our experimental equipment.

CONCLUSIONS
In this article, we propose the resizing invariance method (RIM) for generating adversarial examples. RIM draws on data augmentation in training neural networks and introduces an improved resizing transformation to achieve the invariant property and model augmentation.
Our main purpose is to improve adversarial example transferability and relieve overfitting, which is directly reflected in the black-box attack success rate. Ensemble models are introduced to further enhance transferability. Experiments on the ImageNet dataset verify the effectiveness of our method. Compared with baseline methods, RIM has a higher black-box rate on normal models. On the more challenging defense models, RIM dominates on most models and has the highest average success rate; the advantage is more obvious with ensemble models, reaching a 74.6% average success rate. Our work demonstrates the feasibility of model augmentation in transferable image attacks, and other methods for enhancing network generalization may likewise enhance adversarial attack transferability. Due to the multiple random transformations in each iteration, RIM consumes more computation resources and takes longer than the other baseline methods. Also, although RIM achieves a better success rate on adversarially trained models, it remains to be seen whether the adversarial examples generated by RIM can be directly used for adversarial training. We hope our attack method and model ensembling can help develop more robust networks.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
This work was supported by the National Key Research and Development Program of China under Grant No. 2017YFB0801904. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors: National Key Research and Development Program of China: 2017YFB0801904.