Potential auto-driving threat: Universal rain-removal attack

SUMMARY

Severe weather conditions pose a significant challenge for computer vision algorithms in autonomous driving applications, particularly regarding robustness. Image rain-removal algorithms have emerged as a potential solution by leveraging the power of neural networks to restore rain-free backgrounds in images. However, existing research overlooks the vulnerability concerns in neural networks, which exposes a potential threat to the intelligent perception of autonomous vehicles in rainy conditions. This paper proposes a universal rain-removal attack (URA) that exploits the vulnerability of image rain-removal algorithms. By generating a non-additive spatial perturbation, URA significantly diminishes scene restoration similarity and image quality. The imperceptible and generic perturbation employed by URA makes it a crucial tool for vulnerability detection in image rain-removal algorithms and a potential real-world AI attack method. Experimental results demonstrate that URA can reduce scene repair capability by 39.5% and image generation quality by 26.4%, effectively targeting state-of-the-art rain-removal algorithms.


INTRODUCTION
As more autonomous driving companies embark on commercial deployment projects and more passenger cars and trucks are equipped with assisted and autonomous driving features that play an active role in real-world scenarios such as online ride services,1 intercity delivery,2 and long-distance3 and high-speed deliveries,4 the public is expressing high expectations for autonomous driving and demonstrating its great commercial potential.5 While artificial intelligence (AI) has made excellent progress in ADAS and autonomous driving tasks, its robustness and safety in deployment environments are still not widely recognized by society, and research and validation of AI for complex environment awareness have never stopped.6

Of concern is the heavy reliance on deep learning (DL) models for environment perception in autonomous driving systems. The RGB images captured by on-board cameras are the critical perceptual path for the intelligent perception of autonomous driving, but perception capabilities are often affected by rain in the driving environment.7 As a complex atmospheric process, rainfall during vehicle driving results in unpredictable visibility degradation. Raindrops/streaks close to the optical sensor may obscure or distort the scene content in the image, while rain streaks deeper in the scene often produce large areas of tiny occluding streaks or fog, which in turn blur the image content.8 This reduced visibility creates a potential safety threat to AI systems deployed in autonomous driving, such as pedestrian trajectory prediction,9 scene understanding,10 target recognition,11 and semantic segmentation,12 increasing the likelihood of safety incidents. Therefore, removing rain streaks and raindrops from environment-aware RGB images to restore scene information, i.e., image rain-removal, has become a key pre-processing tool to enhance AI-aware safety. As a key research topic for ensuring AI safety during autonomous driving, many DL-based image rain-removal studies have yielded excellent results and attracted significant attention in the fields of computer vision and pattern recognition.13

However, related research has shown that adversarial examples initially designed to influence generic DL models can also be used to cause failures in autonomous driving tasks.14 In related cases, imperceptible image perturbations are added to the perceptual image of the camera to reduce the accuracy of the DL model or to increase the functional error rate of other AI systems.15 Such adversarial threats may also be present in image rain-removal algorithms. From an autonomous driving perspective, any inaccurate result of the image rain-removal algorithm caused by perturbations may compromise driving safety. More seriously, malicious perturbation attacks will be difficult to identify in cases where humans do not easily detect the perturbations, allowing perpetrators to escape undetected. The hazards of humanly imperceptible perturbations on DL-based AI systems have been well discussed in previous studies, but their impact on DL-based image rain-removal algorithms has not been verified and discussed in depth. This research gap could prevent the application of autonomous driving in countries or regions with high average annual precipitation. This paper aims to analyze potential threats to DL-based rain-removal algorithms, validate their vulnerability, and demonstrate their potential hazards, as shown in Figure 1.
To facilitate this research, we propose a threat validation method for DL-based rain-removal algorithms. The proposed method incorporates a non-additive perturbation generation method based on image spatial transformation. Perturbations constructed under this spatial-transform rule are applied to the rain-observed image to interfere with the rain-removal results, which are analyzed in terms of human observation, pixel distribution, and AI detection, as shown in Figure 2.
Two key technical challenges are addressed in implementing this validation method, as shown in Figure 3 (a minimal sketch of the mechanism follows this list):
(1) Non-additive perturbation construction. Unlike adding noise directly to the image pixels, a non-additive perturbation interferes by swapping and fusing pixel values. The proposed method describes the direction and step size of pixel changes in rain-observed images by generating a flow field comparable to the image size.
(2) Optimization of perturbation generation for DL-based image rain-removal. The proposed method designs an end-to-end generative model and constructs loss functions based on luminance, contrast, and structural differences from the perspective of human visual understanding. The generative model maximizes the loss between the unperturbed and perturbed rain-removal results.
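For illustration, the following minimal PyTorch sketch shows the non-additive mechanism described in challenge (1): an image is warped by a per-pixel flow field with bilinear interpolation instead of having noise added to its pixel values. The function and variable names are illustrative, and the random flow field stands in for a learned one.

```python
import torch
import torch.nn.functional as F

def apply_flow_field(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `image` (N, C, H, W) by a per-pixel `flow` (N, 2, H, W) in pixel units.

    Each output pixel is bilinearly sampled from a displaced location in the
    input, i.e., a non-additive, spatial-transform perturbation.
    """
    n, _, h, w = image.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, h, w, 2)
    # Convert pixel-unit displacements to normalized offsets and shift the grid.
    offset = torch.stack((flow[:, 0] * 2 / w, flow[:, 1] * 2 / h), dim=-1)
    return F.grid_sample(image, base + offset, mode="bilinear", align_corners=True)

# Example: perturb a rain-observed image with a small (sub-pixel) random flow.
rain_img = torch.rand(1, 3, 128, 128)
flow = 0.5 * torch.randn(1, 2, 128, 128)
perturbed = apply_flow_field(rain_img, flow)
```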
To explore the potential threat of such perturbations to intelligent RGB-image-based perception systems in autonomous driving tasks, we propose the universal rain-removal attack (URA), detailed in Table 1, with a neural-network-based generative model. URA learns from trial-and-error data generated in confrontation with state-of-the-art rain-removal solutions to capture knowledge of nonlinear disturbances for rain-removal tasks in the real world. Its perturbation results were subjected to thorough human observation, pixel distribution analysis, and computer vision AI detection. By connecting the experimental results with the corresponding pixel distributions, we demonstrate the vulnerability of DL-based image rain-removal algorithms. Specifically, the contributions of this paper are as follows.
(1) This paper introduces potential security threats, to our knowledge for the first time, to image rain-removal algorithms and gives hypothetical attack scenarios. The experimental results clarify the potential security threat to image rain-removal algorithms, a pain point for AI applications in autonomous driving. The hypothetical attack scenarios elaborately describe the vulnerability detection and exploitation process of image rain-removal algorithms, forming a classic case study for AI-aware security research in autonomous driving.
(2) The possible robustness problem of DL-based image rain-removal algorithms is demonstrated. This paper proposes a universal perturbation generation method for the image rain-removal problem, which generates a unique non-additive perturbation through a neural network. The perturbation will distort the pixel distribution of the image rain-removal dataset. By analyzing the performance difference of the rain-removal algorithm after the perturbation, the robustness problem of the DL-based image rain-removal algorithm is clarified.
(3) The proposed perturbation generation method provides new evidence of the impact of sensor degradation on computer vision systems and the safety of autonomous driving. The unique, unnoticeable perturbation also corresponds to potential sensor degradation. It can be used both for malicious attacks and for assessing the impact of sensor degradation in autonomous driving applications.

RESULTS
This paper develops complete experiments in image rain-removal, attack perturbation generation, and AI performance interference.

Experimental setting

A workstation with two Intel Xeon E5-2620 processors and an Nvidia RTX 3090-24G GPU was used to run all experiments in this work. All code in this paper is based on Python, and all neural networks are built, trained, and deployed with PyTorch. To constrain the experimental cost, this paper takes RainCCN16 and DerainRLNet17 as the attack objects and RainDS,16 which contains 1,000 photos collected in the real world, as the dataset for training and validation. Under the current computing-power conditions, it takes an average of 0.23 s to attack 16 images based on the generated universal perturbations, and the attack frequency can reach about 4.23 Hz. The main hyperparameters used in this work are shown in Table 2. RainCCN and DerainRLNet were trained from scratch as attack targets, using the RainDS training set as input, with 16 images per batch and learning rates of 0.001 and 0.0002, respectively. This paper does not study the training process of the DL-based rain-removal algorithms in depth but focuses only on their final rain-removal performance. URA, a deep neural-network-based generative model, iterates through 200 epochs on the RainDS training set at a learning rate of 0.01, sampling 100 samples from the training set for each update. The Adam optimizer was used to train this neural network with an L2 regularization weight of 0.0001 and momentum gradient calculation constants (b1, b2) of 0.5 and 0.9, respectively. The structural similarity index measure (SSIM) calculation constants c1, c2, and c3 are 0.0001, 0.0009, and 0.00045, respectively. This work employs only RainCCN as the victim rain-removal solution for URA during the training phase. During the subsequent testing phase, URA employs identical flow fields to attack both RainCCN and DerainRLNet, thereby substantiating its acquisition of generalized non-additive perturbation knowledge.
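For reference, the reported training configuration maps directly onto PyTorch's Adam interface. The sketch below is a minimal illustration in which only the hyperparameters come from Table 2; the generator architecture shown is a hypothetical stand-in, since the paper does not specify it here.

```python
import torch

# Hypothetical stand-in for the URA flow-field generator (architecture assumed);
# it maps a 3-channel rain observation to a 2-channel per-pixel flow field.
generator = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 2, kernel_size=3, padding=1),
)

optimizer = torch.optim.Adam(
    generator.parameters(),
    lr=0.01,            # URA learning rate (Table 2)
    betas=(0.5, 0.9),   # momentum gradient calculation constants (b1, b2)
    weight_decay=1e-4,  # L2 regularization weight
)
```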
In addition to using the SSIM and peak signal-to-noise ratio (PSNR) to describe image quality after rain removal, an evaluation system for rain-removal images from an AI perspective helps evaluate the robustness of the rain-removal algorithm and rate its vulnerability. Unlike SSIM and PSNR, image reconstruction quality evaluation from the AI perspective depends more on the restoration of the latent semantics of the image. This paper analyzes the vulnerability of rain-removal algorithms from an AI perspective by quantifying the AI's semantic understanding gap on rain-removal images, taking the outputs of potential AI applications on clean background images as the evaluation benchmark. This paper uses Yolo v5.18 In the corresponding metric, d represents the one-hot code of the image content given by the AI, S is the content of the image as identified by the AI, I is the indicator function, and S and C denote the identification error and the confidence bias, respectively.

Table 3 shows the evaluation results on the RainDS test dataset. RainCCN achieves approximately 81% SSIM relative to clean images in the test set and reaches a PSNR of 27.13; DerainRLNet achieves approximately 51% SSIM and a PSNR of 19.98. Under random spatial-variation attacks, the rain-removal performance of RainCCN is reduced, with SSIM lowered by 17.3% and PSNR by 16.9%, and that of DerainRLNet is reduced, with SSIM lowered by 11.76% and PSNR by 9.15%. Under the URA attack, the rain-removal performance of RainCCN is significantly reduced, with SSIM lowered by 39.5% and PSNR by 26.4%, and that of DerainRLNet is reduced, with SSIM lowered by 15.68% and PSNR by 13.86%.
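As an illustrative sketch of these AI-perspective metrics, the Python fragment below computes an identification error S and a confidence bias C from a generic detector. The `detect` helper and its signature are hypothetical, and the formulation follows the prose definitions above rather than the paper's exact equation.

```python
from typing import Callable, Tuple

# A detector returns (top-1 label, confidence) for an image (signature assumed).
Detector = Callable[[object], Tuple[str, float]]

def ai_perspective_metrics(detect: Detector, clean_img, derained_img):
    """Quantify the AI semantic-understanding gap on a rain-removal result.

    S: identification error -- 1 if the label on the derained image disagrees
       with the clean-background benchmark label, else 0 (indicator function).
    C: confidence bias -- confidence drop relative to the clean benchmark.
    """
    clean_label, clean_conf = detect(clean_img)
    derain_label, derain_conf = detect(derained_img)
    S = int(derain_label != clean_label)
    C = clean_conf - derain_conf
    return S, C
```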

Perturbations undetectable to humans
The rain-removal algorithm analyzes the associations between image pixels to remove the blurring and occlusion of image scene information caused by raindrops/rain streaks. Figure 4 shows the rain-removal and attack results for ten observed images. These images demonstrate the potential threat to DL-based image rain-removal algorithms.
(1) Humans cannot distinguish whether a rain-observed sample has been attacked. The perturbations in the "Adv-Rain" images are so similar to the environmental information that human vision cannot flag them as suspicious.
(2) Humans are unable to distinguish between malicious attacks and sensor degradation problems from the rain-removal results. Human vision can detect the degradation of rain-removal performance in the "Adv-Removal" images caused by the perturbation attack, but the cause of the degradation is not directly observable. The black-box nature of DL models makes this problem particularly difficult to resolve.
The presence of these potential threats greatly hinders the use of autonomous driving in high-rainfall areas and creates insurmountable difficulties in the investigation of possible accidents. The pixel distribution of the samples further validates the concern that the presence of such threats is linked to technological developments. Figure 5 shows the pixel distribution of a sample in the RGB channels. The results show that the perturbation measurably shifts the pixel distribution of the attacked sample even though the change is imperceptible to direct human observation.

AI testing in computer vision
In computer vision, the way AI analyzes and understands images is entirely different from that of humans, and its ability to analyze and process features at the pixel level gives it pattern recognition capabilities that significantly surpass human ones. Table 3 shows the evaluation of potential threats to the DL-based rain-removal algorithm from the perspective of state-of-the-art AI. The rain-removal results without the URA attack can be identified by Tencent AI and Yolo v5 with high accuracy and precision. After the attack, the computer vision AI begins to suffer performance degradation. The random attack is generated by randomly initialized networks based on Gaussian noise, which represents the threat of an unoptimized flow field to the victim rain-removal network. Under the random flow field perturbation attack, the resulting rain-removal output reduced the scene-information recognition ability of Tencent AI by 66.6% and the confidence of correctly identified scene information by 41.3%; under the URA attack, the corresponding reductions were 91.7% and 47.2%, respectively. The results show that Yolo v5 exhibits better robustness but still follows a similar trend.

Figure 6 shows the scene recognition results of Tencent AI on a sample in which the rainy observation was identified as an artificial fountain with high confidence, which is related to the image being obscured by large rain streaks. Tencent AI gives convincing scene recognition results and corresponding confidence on clean background images. Without the attack, RainCCN provided a satisfactory rain-removal image that Tencent AI could correctly identify with a reasonable level of confidence. However, the URA attack significantly reduces confidence in correct identification and leads to confusing identification errors. Even though Yolo v5 showed strong robustness, the potential threat was only mitigated rather than resolved. Figure 7 demonstrates the powerful multi-target object recognition capability of Yolo v5, which is enhanced by the rain-removal results provided by RainCCN in a rainy environment before the attack. However, the URA attack still reduces confidence in its correct identification results. Two conclusions follow.

(1) The robustness of image rain-removal algorithms may significantly impact subsequent image intelligence perception tasks. The AI pixel sensitivity demonstrated by the experimental results implies that a high-performance rain-removal algorithm can effectively improve AI deployment in rainy weather, but robustness issues in the rain-removal algorithm may also lead to poor results.
(2) Potential malicious attacks and sensor degradation may exploit the vulnerability of the image rain-removal algorithm to cause traffic accidents. URA, as a universal perturbation generation method, not only has the potential to be deployed for criminal purposes in the real world but can also be seen as a model of sensor degradation. The experimental results demonstrate that perturbation attacks represented by URA can exploit the vulnerability of the rain-removal algorithm to significantly weaken the AI.

DISCUSSION
This paper validates and explores the possible vulnerability and robustness issues of DL-based rain-removal algorithms by constructing a universal image perturbation. The experimental results show that the proposed perturbation generation method, URA, can significantly reduce image rain-removal performance and exploit this to weaken downstream computer-vision-based AI. This exposes a number of potential threats that computer-vision-based autonomous driving intelligence perception tasks may face.
(1) Malicious attacks in the real world. The proposed perturbation generation method has good potential for practical application and could be replicated in the real world.
(2) Robustness issues of image rain-removal. Inadequate performance of image rain-removal algorithms and sensor degradation issues may cause AI-weakening problems.
(3) Difficulties in accident investigation. Perturbations undetectable to humans in the image rain-removal task of the autonomous driving process can cause serious intelligence perception problems and thus induce safety accidents. Such perturbations are difficult to identify in subsequent accident investigations.
The experimental findings additionally suggest a potential mitigation: advanced AI systems have some capacity to withstand attacked rain-affected images, as Yolo v5 exhibited resilience against the URA in our experiments.

Limitations of the study
Subject to multiple complex practical factors, this study has the following limitations.
(1) Insufficient targets for the perturbation attack. Due to the limitation of computational resources, the lack of relevant open-source projects, and the fact that some studies cannot be reproduced, only RainCCN is used as the attack target during URA training in this paper. Not only does it achieve state-of-the-art (SOTA) performance on the current rain-removal problem, but it is also the best result reproduced in this paper. Adding attack targets will be an important task in subsequent research.
(2) Limited validation data. There is little high-quality real data in image rain-removal datasets, and it is difficult to convincingly develop and validate perturbation attacks on synthetic data. Therefore, constructing rain-removal datasets from natural driving behavior in future research would be effective in developing this research methodology.
(3) No attack examples in the real world. This paper only shows the perturbation results of the proposed attack method on the dataset and indicates its potential for real-world application. Due to the lack of relevant hardware, this paper cannot show an example of an attack occurring in the real world. Therefore, the proposed attack method will be further extended, and real-world attack cases and analysis will be given in future research.

STAR+METHODS
Detailed methods are provided in the online version of this paper.

ACKNOWLEDGMENTS
We would like to thank Dr. Yuqing Lin (Scientific Editor) and two anonymous reviewers for their constructive comments to improve our paper.

DECLARATION OF INTERESTS
The authors declare no competing interests.

DECLARATION OF GENERATIVE AI AND AI-ASSISTED TECHNOLOGIES IN THE WRITING PROCESS
During the preparation of this work, the author(s) used GPT-4 in order to improve the language and readability. After using this tool, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Image rain-removal
Image rain-removal solutions can be divided into physical-attribute-based and deep-learning-based rain-removal algorithms. The detection and removal of raindrops from images based on the physical properties of rainwater has become an increasingly vital research area. Falling raindrops are subjected to many physical conditions and thus deformed, such as surface tension, hydrostatic pressure, ambient illumination, and aerodynamic pressure.19 These irregular perturbations appear as raindrops/rain streaks of different brightness/orientation and contaminate the scene information in the image.20 Beard and Chuang21 proposed the equilibrium shape model of raindrops, which describes a kinematic model of raindrops in images. In addition, physical information such as luminance, chromaticity, and temporal space has been exploited for image raindrop/rain-streak detection and elimination. However, there is room for improvement in the restoration of scene information in physical-property-based image rain-removal methods because no reasonable linkage with scene information has been established.22 To introduce image scene information into physical-property-based rain-removal methods via image filters and thereby enhance scene restoration, Xu et al.23 proposed an enhancement scheme for chromaticity-based rain removal built on guided filters. In addition, sparse-coding-based dictionary learning, histogram-of-oriented-gradients-based,24 and prior-based25 approaches have been proposed successively to enhance the scene restoration capability of rain-removal algorithms.

As a data-driven feature self-learning method, deep learning has a clear advantage in learning image scene information. Aiming to remove dirt and water droplets adhering to glass windows or camera lenses, Eigen et al.26 proposed a convolutional neural network-based (CNN-based) image rain-removal algorithm. However, the method was unable to handle dense or dynamic raindrops and produced blurred output. Qian et al.27 designed a fine-grained generative network to cope with the presence of a large number of raindrops. This work introduced visual attention mechanisms into the design of the generative adversarial network.28,29 The generative network focuses on the raindrop region and its surroundings to generate an image that matches the surroundings of the background image while being free of raindrops, and the discriminative network is mainly used to evaluate the similarity between the rain-removal images and the clear images. Fu et al.30 designed a CNN-based approach, DerainNet, to specifically handle single-image rain-streak removal, which automatically learns a non-linear mapping function between clean and rainy image details from the dataset. The deep residual network (ResNet) extends the structural depth of neural networks with powerful feature learning capabilities.31 Fu et al.32 proposed a deep detail network (DDN) to reduce the range of mappings from input to output through the introduction of ResNet. Similarly, Fan et al.33 proposed a residual-guided feature fusion network (ResGuideNet) that gradually obtains coarse-to-fine estimates of negative residuals as the network progresses. Zhang and Patel34 further proposed a density-aware image de-raining method using a multi-stream dense network (DID-MDN), which can adapt to different rain densities by integrating a residual-aware classifier process. Li et al.35 proposed a recurrent squeeze-and-excitation-based contextual aggregation network (CAN) for single-image rain removal, in which the SE block not only assigns different weights to various rain-streak layers but also gives the CAN large receptive fields, yielding good performance in adapting to the rain-removal task. Previous research has placed great emphasis on exploring the structure of deep neural networks (DNNs), aiming to improve the network structure to achieve better image recovery. Cheng and Hao17 proposed a de-raining method that replaces low-quality features with latent high-quality features using closed-loop feedback from automatic control theory and introduces error detection and feature compensation to address model errors, resulting in superior performance compared with state-of-the-art methods on benchmark and real datasets. Quan et al.16 proposed a complementary cascaded network (RainCCN) that effectively removes raindrops and rain streaks within a unified framework using neural architecture search and introduced a new real-world de-raining dataset (RainDS) with diverse rain types and corresponding ground-truth images, demonstrating superior performance.

Adversarial sample
The fast gradient sign method (FGSM)36 uses gradients to generate adversarial samples. The samples contain small perturbations that are imperceptible to humans, yet DNNs produce highly confident incorrect answers on these inputs. The authors further demonstrated that an attacker can craft adversarial inputs for specific labels. Later researchers showed how scaled gradient iterations could be applied to the original input image: the projected gradient descent (PGD) attack was proposed as an iterative method and showed better attack performance than FGSM.37 The Carlini and Wagner (C&W) attack describes how to generate adversarial samples by efficiently solving an optimization problem.38 Moosavi et al.39 proposed an important work generating universal adversarial perturbations (UAPs) for neural networks. UAP integrates the perturbations learned in each iteration; if the combination fails to mislead the target model, UAP computes a new perturbation and then projects it onto the l2-norm ball to ensure that the new perturbation is sufficiently small and satisfies the distance limit. This method continues to run until the empirical error on the sample set is sufficiently large or the threshold error rate is satisfied.
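For concreteness, a one-step FGSM attack as summarized above fits in a few lines of PyTorch; `model` and `loss_fn` are placeholders for any differentiable victim network and loss.

```python
import torch

def fgsm(model, loss_fn, x: torch.Tensor, y: torch.Tensor, eps: float) -> torch.Tensor:
    """One FGSM step: x_adv = x + eps * sign(dL/dx), an additive perturbation."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # The perturbation is bounded in the L-infinity norm by eps.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```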
Hayes and Danezis40 further developed UAP and proposed the universal adversarial network (UAN), which is composed of deconvolution layers, batch normalization layers with activation functions, and several fully connected layers at the top. The UAN includes a distance-minimization term in the objective function, and the size of the generated noise is controlled by a scaling factor. Poursaeed et al.41 used a ResNet-based generator to produce generative adversarial perturbations (GAP) for semantic segmentation tasks. Mopuri et al.42 proposed an adversary generative model called NAG, whose objective function aims to reduce the confidence of benign predictions and increase the confidence of other categories. A diversity term is introduced into the objective function to encourage diversity in perturbations.
Fawzi and Frossard43 suggest that convolutional neural networks are not robust to rotation, translation, and expansion, which demonstrates the potential threat of pixel-space variation to deep learning models. Xiao et al.44 further supported this view, arguing that the traditional lp constraint may not be an ideal measure of similarity between two images. They therefore proposed an optimization method for spatial transformation that is able to generate perceptually realistic adversarial examples with high deception rates by changing the positions of pixels rather than adding perturbations directly to a clean image. It manipulates the target image according to a pixel substitution rule called a "flow field", which maps each pixel in the image to another location. To ensure the perturbed image is perceptually close to the clean image, they also minimize local geometric distortion in the objective function. Their experimental results show that this non-additive perturbation (flow field) is more difficult for humans to detect than additive perturbations.
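One plausible realization of the local-geometric-distortion penalty described above is a smoothness term over neighboring flow vectors, sketched below; this is an assumption-level simplification, not necessarily the exact objective of Xiao et al.

```python
import torch

def flow_smoothness_loss(flow: torch.Tensor) -> torch.Tensor:
    """Penalize differences between neighboring flow vectors in (N, 2, H, W).

    Keeping neighbor-to-neighbor displacements similar keeps the warped image
    perceptually close to the original, as in spatial-transform attacks.
    """
    dx = flow[:, :, :, 1:] - flow[:, :, :, :-1]  # horizontal neighbors
    dy = flow[:, :, 1:, :] - flow[:, :, :-1, :]  # vertical neighbors
    return dx.pow(2).mean() + dy.pow(2).mean()
```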
In previous research, DNNs have been shown to be potentially vulnerable and exploitable to trigger erroneous output. Generic and non-additive attack methods are more concealable and destructive. These studies provide a valuable reference for the method proposed in this paper, together with its methodological basis and feasibility rationale.

General DL-based rain-removal process
Deep learning has from the outset focused on mining latent information connections between image pixels, and its powerful feature self-learning capability has been demonstrated in a variety of image restoration tasks. In image rain-removal tasks, DL-based methods are able to accurately identify raindrops/streaks and restore key scene elements that are obscured/blurred. This section clarifies the general process of DL-based image rain-removal methods.

Image observation in a rainy scene
Although images captured on rainy days in the real world often contain complex, unpredictable, and incomprehensible blocks (sets) of pixels, under the consensus reached by rain-removal algorithms, the observed image of a rainy day can be represented as:

$$O = B + R \quad \text{(Equation 1)}$$

where O is the observation image of the monocular camera in the rain scene, B represents the image information of the background environment, and R represents the blurred mask of raindrops/streaks. There are three manifestations of rain information in rain scenes.

When it is raining in the background environment and no raindrops adhere to the protective glass of the imaging element, the obtained perceptual image will be covered by rain streaks:

$$O = B + R_s \quad \text{(Equation 2)}$$

where $R_s$ denotes the pixel information of rain streaks in the background.

In the case of a background scene where the rain has stopped and raindrops adhere to the glass, the obtained perceptual image will be obscured by the raindrops, and the scattering of light caused by the raindrops will further obscure the scene information in the image:

$$O = B + R_d \quad \text{(Equation 3)}$$

$$O = h \cdot B + R_d \quad \text{(Equation 4)}$$

In the case of a background scene with rain and raindrops adhering to the glass, the obtained perceptual image will be blurred by the raindrops while being obscured by the rain streaks:

$$O = h \cdot B + R_s + R_d \quad \text{(Equation 5)}$$

where $R_d$ denotes the pixel information of raindrops adhering to the glass and h represents the atmospheric scattering constant; relevant studies have proved that atmospheric light can be simplified to a constant term in image clarity research. All of the variables in Equations 2-5 are matrices of shape (rgb, h, w), where w and h are the width and height of the image, respectively, and rgb denotes the RGB channel dimension (3 in this work). Each value in these matrices is a bounded pixel intensity. In general, the combined effect of rain streaks and raindrops minimises visibility and corrupts scene information in images, negatively impacting the environmental perception and cognitive abilities of humans and AI.
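Under the additive model above, synthesizing a rain-streak observation is a one-line composition. The sketch below assumes images normalized to [0, 1] and a precomputed streak layer R_s.

```python
import torch

def compose_rain_observation(background: torch.Tensor, rain_streaks: torch.Tensor) -> torch.Tensor:
    """O = B + R_s (Equation 2), clipped to the valid pixel range [0, 1]."""
    return (background + rain_streaks).clamp(0.0, 1.0)
```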

DL-based end-to-end rain-removal solution
Compared to traditional machine learning, which requires complex feature analysis and data processing methods, DL has a powerful feature self-learning capability. It can adopt an end-to-end supervised learning model with complex neural network architecture design and hyperparameter selection to achieve perceptual capabilities beyond human vision. In neural-network-based image rain-removal solutions, end-to-end models are widely used for feature learning and restoration with the help of convolutional and deconvolutional neural networks, with a general objective function of:

$$J_q = \arg\min_q L\big(q(O), B\big) \quad \text{(Equation 6)}$$

where q denotes the end-to-end neural network; q(O) denotes the rain-removal sample; L denotes the loss function, used to describe the difference between samples; and $J_q$ describes the optimization objective of the neural network.

The improvement of the loss function is a key approach to maximise the scene-information restoration capability of deep neural networks, where the structural similarity of images (SSIM) is considered the objective function design best matched to the needs of the human visual system. SSIM describes the degree of similarity of image samples by comparing local luminance, contrast, and structure:

$$\begin{cases} \mathrm{SSIM}(x, y) = l(x, y)^{\alpha} \cdot c(x, y)^{\beta} \cdot s(x, y)^{\gamma} \\[4pt] l(x, y) = \dfrac{2\mu_x \mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1} \\[4pt] c(x, y) = \dfrac{2\sigma_x \sigma_y + c_2}{\sigma_x^2 + \sigma_y^2 + c_2} \\[4pt] s(x, y) = \dfrac{\sigma_{xy} + c_3}{\sigma_x \sigma_y + c_3} \end{cases}, \quad x = q(O),\; y = B \quad \text{(Equation 7)}$$

where x represents a rain-removal image sample and y represents a clear ambient image sample; l(x, y) is the luminance comparison function, c(x, y) is the contrast comparison function, and s(x, y) is the structure comparison function. $\mu_x$ and $\mu_y$ are the pixel sample means of x and y, respectively; $\sigma_x$ and $\sigma_y$ are the standard deviations of x and y; $\sigma_{xy}$ is the covariance of x and y; and $c_1$, $c_2$, $c_3$ are the fractional stability constants. In general, the larger the SSIM value, the better the quality of the scene-information reproduction and the easier it is for humans to recognise and understand. Therefore, together with a high-performance gradient optimiser, good rain-removal performance is easier to achieve by using the negative value of SSIM in the objective function for neural network optimisation:

$$L_q\big(q(O), B\big) = L_1\big(q(O), B\big) - \lambda\, \mathrm{SSIM}\big(q(O), B\big) \quad \text{(Equation 8)}$$

where $L_1$ denotes the mean absolute error between the rain-removal sample and the background image sample, and $\lambda$ denotes the importance factor of SSIM. The $L_1$ loss ensures that the neural network optimiser reduces the absolute bias of the pixels, and SSIM ensures that the optimiser attempts to reduce the visual comprehension bias.

In addition to evaluating sample similarity using SSIM, this paper evaluates image rain-removal performance in terms of both image sharpness and semantic understanding. Peak signal-to-noise ratio (PSNR), a metric quantifying the reconstruction quality of images and videos affected by lossy compression, expresses the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation:

$$\mathrm{PSNR} = 10 \cdot \log_{10}\!\left(\frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right)$$

where $\mathrm{MAX}_I$ is the maximum possible pixel value of the image and MSE is the mean squared error between the rain-removal sample and the background image sample.
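As a concrete but simplified reading of Equations 7 and 8, the sketch below implements the loss in PyTorch. It computes SSIM from global image statistics with α = β = γ = 1 rather than over the usual local windows, so it only approximates the full metric; the stability constants follow Table 2.

```python
import torch

def ssim_global(x: torch.Tensor, y: torch.Tensor,
                c1: float = 1e-4, c2: float = 9e-4, c3: float = 4.5e-4) -> torch.Tensor:
    """Simplified SSIM (Equation 7) from global statistics, alpha=beta=gamma=1."""
    mu_x, mu_y = x.mean(), y.mean()
    sd_x, sd_y = x.std(), y.std()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    l = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)  # luminance
    c = (2 * sd_x * sd_y + c2) / (sd_x ** 2 + sd_y ** 2 + c2)  # contrast
    s = (cov_xy + c3) / (sd_x * sd_y + c3)                     # structure
    return l * c * s

def derain_loss(derained: torch.Tensor, background: torch.Tensor,
                lam: float = 1.0) -> torch.Tensor:
    """Equation 8: mean absolute (L1) error minus lambda-weighted SSIM."""
    l1 = (derained - background).abs().mean()
    return l1 - lam * ssim_global(derained, background)
```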
The life cycle of DL-based rain removal

As a classic AI system, the DL-based rain-removal algorithm has the common life cycle illustrated in Figure 1. In an image rain-removal task, data access, storage, and delivery are strictly limited and managed from data acquisition to model evaluation, which is a production-line process. However, once the system is deployed, the DL-based rain-removal algorithm accesses data from external sources. At this point in its life cycle, the closed loop of secure development is broken, and it is formally exposed to external threats.

Potential attack scenarios
By analysing the life cycle of the DL-based rain-removal algorithm, this paper constructs a potential attack scenario, as shown in Figure 2. External malicious adversarial attacks and sensor degradation are common threats to in-vehicle artificial intelligence. In the proposed attack scenario, the normal rain-removal module would generate a high-quality clean background image from the rain observation image. However, under both malicious attack and sensor degradation, an unnoticeable noise can be applied while or after the camera captures an image. This noise can affect subsequent computer-vision-based intelligent perception modules and thus potentially increase the risk of accidents in autonomous driving. The ability to identify or exclude the effect of such noise on rain-removal results is a gap in current research on DL-based rain-removal algorithms. Given noise that is not readily recognisable by humans yet significantly corrupts image rain-removal results, accident investigation becomes far more difficult, as investigators would be unable to determine the presence of the noise or of a potentially malicious attacker.
Based on this potential attack scenario, this paper identifies the basic objectives of the proposed attack method:
(1) Generate perturbations that are difficult for humans to detect.
(2) The perturbed rain observation image can significantly degrade the performance of DL-based image rain-removal.
(3) The perturbation has universal rain-removal attack (URA) capability and has the potential for realistic application.

End-to-end universal non-additive perturbation generation
This section describes an end-to-end attack method for DL-based rain-removal that generates an image-space perturbation undetectable by humans to degrade rain-removal performance. The proposed method consists of two main modules: 1) an image spatial-change module, responsible for updating pixel values along a specified direction of spatial variation with a prescribed interpolation method; and 2) a deep neural-network-based pixel flow field generator, which generates the pixel flow field against a data-driven image rain-removal algorithm. In the general process, w denotes the neural network employed in the proposed method, which maximises the disturbance capacity of the perturbation by learning the dataset features; $U_w$ denotes the universal flow field (UFF) and is the output of the neural network, describing the direction and magnitude of the spatial transformation applied to each pixel.
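Putting the pieces together, one plausible URA-style training step is sketched below, reusing the `apply_flow_field` and `derain_loss` helpers from the earlier sketches. `victim` stands in for a frozen, pretrained rain-removal network (e.g., RainCCN); for simplicity the generator here emits a flow field per batch, whereas the trained URA reuses one universal flow field across images.

```python
import torch

def ura_train_step(generator, victim, rain_batch: torch.Tensor,
                   optimizer: torch.optim.Optimizer) -> float:
    """One update of the flow-field generator against a frozen victim network."""
    with torch.no_grad():
        clean_derain = victim(rain_batch)          # unperturbed rain-removal result
    flow = generator(rain_batch)                   # candidate (universal) flow field
    adv_rain = apply_flow_field(rain_batch, flow)  # non-additive perturbation
    adv_derain = victim(adv_rain)                  # perturbed rain-removal result
    # Maximize the discrepancy between the two results (Equation 8 terms),
    # i.e., minimize its negative.
    loss = -derain_loss(adv_derain, clean_derain)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```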