Machine learning (ML) has become an integral part of modern life, from medical diagnosis and subsequent treatment to finding friends in social networks. In recent years, the amount of available data and computing power has grown, which has led to wider use of ML methods as a convenient tool for data classification, clustering, and other tasks in the medical field. The relevance of this study follows from the growing number of attacks on medical ML systems. It is therefore necessary to analyze existing vulnerabilities and improve known protection methods.

Medical images used for diagnosing and detecting diseases are often black-and-white CT images. For example, using such images, the following diseases can be diagnosed: Parkinson's disease, breast cancer, lung cancer, brain cancer, pneumonia, coronavirus infection, and others.

In [1], an attack is considered in the context of medical systems (a medical image recognition system); the authors note that evasion attacks on medical images can be more successful than attacks on natural (ordinary) images, i.e., a successful attack requires less perturbation. Moreover, the black-and-white computed tomography images used to diagnose diseases can be polluted with noise that is imperceptible to the human eye, which prevents a specialist from setting the sample aside.

Source [2] describes a study by Israeli information security specialists who attacked a medical system that diagnoses lung cancer. They were able to edit the CT images of patients so that neither the doctors nor the programs they use could recognize the forgery. In most cases, the resulting diagnoses were wrong.

The researchers noted that such an attack can have a great impact on people's lives, affecting both the health and the well-being of individuals, and even influencing policy.

1 AML ATTACKS

1.1 Review of the Literature

Works [3, 4] consider the classification of adversarial machine learning (AML) attacks. According to these sources, three types of AML attacks can be distinguished: poisoning, evasion, and exploratory attacks. The intruder's knowledge is also taken into account: they may know the learning algorithm, the data involved in learning, both, or nothing at all. Accordingly, three attack strategies can be distinguished: white box, gray box, and black box. According to [5, 6], attacks can also be classified by the method of impact, the type of security breach, and specificity. The shortcoming of all existing systematizations of AML attacks is the absence of a unified and complete systematization over the most important criteria, as well as of security violations with respect to the components of the system (the input data, the system itself, and the output result).

In [1], an attack is considered in the context of medical systems (a medical image recognition system); the authors note that evasion attacks on medical images can be more successful than attacks on natural (ordinary) images, i.e., less perturbation is required for a successful attack. In [7], an adversarial attack on medical ML systems is considered in the context of insurance payments, where this type of attack is used for profit.

In [8], the authors test two modern neural networks for resistance to ten different types of evasion attacks and demonstrate that these models are indeed vulnerable to this type of attack. In [9], the InceptionV3 neural network is tested on chest X-ray images and histological images of malignant tumors. The authors found that neural networks trained to classify histological images were more resistant to attacks than networks trained to classify chest X-ray images.

Work [10] considers a universal adversarial attack on deep neural networks for medical image classification. The authors found that deep neural networks are vulnerable to both untargeted and targeted attacks of this type. It was also found that vulnerability to this attack depends very little on the model architecture. The authors emphasize that adversarial learning, which is known to be an effective protection method, increases the resilience of deep neural networks to a universal adversarial attack only in very few cases.

A shortcoming of the reviewed studies is that they do not consider various approaches to protection against evasion attacks in the medical field for a specific problem, for example, the classification of CT images. This problem is addressed in the present work.

1.2 Systematization of AML Attacks

Based on the existing classifications [3–6], the following systematization of AML attacks was compiled; it is shown in Table 1. The authors identified two modes of operation of the protected object (the ML system): the operation mode and the training mode. The classification is based on the type of security violation, that is, a violation of the availability, integrity, or confidentiality of the system's objects (the training set, the input data, and the model).

Table 1. Systematization of AML attacks

In what follows, the evasion attack and methods of protection against it are considered, since this type of attack is, according to the authors, the most dangerous. In the medical field, a successful evasion attack can affect medical actions and hence the lives and health of people (unlike exploratory attacks). Unlike poisoning attacks, this method does not require access to the training sample, so it is applicable at the operation stage to systems that were created in a trusted execution environment.

1.3 Features of Medical Image Analysis Systems As an Object of Attack

In medicine, ML algorithms can be applied in various fields. Hereinafter, the field of diagnostics, in which medical images are directly applied, is considered. Medical images used in the diagnosis and detection of diseases are often black-and-white CT images. For example, using such images, the following diseases can be diagnosed: Parkinson's disease, breast cancer, lung cancer, brain cancer, pneumonia, coronavirus infection, and others.

In [1], the authors note that evasion attacks on medical images can be more successful than attacks on natural (ordinary) images. This is due to several reasons:

(A) Some medical images have complex biological textures, resulting in regions of higher gradient that are sensitive to small adversarial perturbations;

(B) Modern neural networks that are part of ML systems and are designed for large-scale natural image processing can be overparameterized for medical imaging problems (too complex for relatively simple classification tasks), which leads to high vulnerability to attacks (a small perturbation produces a “strong” reaction, i.e., a strong influence of the loss function on the final result).

It can also be noted that the black-and-white CT images often used for diagnosing diseases can be polluted with noise that is imperceptible to the human eye, which prevents a specialist from setting the sample aside.

2 EVASION ATTACKS PROTECTION METHODS

All evasion attack protection methods can be divided into two main categories: proactive and active methods [11]. Proactive methods forestall a possible attack, while active ones fight it in real time, trying to eliminate or mitigate its effect. Let us give a brief description of the known solutions.

The adversarial learning method [12, 13] is an attempt to improve the robustness of a neural network by training it on adversarial samples. The disadvantages of this method are its complexity and its inability to cover the full range of possible adversarial samples in all cases.
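Below is a minimal sketch of how such adversarial learning might look for a binary image classifier in TensorFlow, using FGSM-generated samples mixed into each training batch; the epsilon value, loss, and data pipeline are illustrative assumptions rather than the exact setup of [12, 13].

```python
# A minimal sketch of adversarial training for a binary (e.g., COVID/NORMAL)
# Keras classifier; epsilon and the loss are illustrative assumptions.
import tensorflow as tf

loss_fn = tf.keras.losses.BinaryCrossentropy()

def fgsm_perturb(model, images, labels, epsilon=0.02):
    """Generate FGSM adversarial samples for a batch of images in [0, 1]."""
    images = tf.convert_to_tensor(images)
    with tf.GradientTape() as tape:
        tape.watch(images)
        predictions = model(images, training=False)
        loss = loss_fn(labels, predictions)
    gradients = tape.gradient(loss, images)
    adversarial = images + epsilon * tf.sign(gradients)
    return tf.clip_by_value(adversarial, 0.0, 1.0)

def adversarial_training_step(model, optimizer, images, labels):
    """One training step on a mixed batch of clean and adversarial samples."""
    adv_images = fgsm_perturb(model, images, labels)
    mixed_images = tf.concat([images, adv_images], axis=0)
    mixed_labels = tf.concat([labels, labels], axis=0)
    with tf.GradientTape() as tape:
        predictions = model(mixed_images, training=True)
        loss = loss_fn(mixed_labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss
```

The coverage problem mentioned above follows directly from this scheme: the model only sees adversarial samples generated by the chosen attack (here FGSM), not the full range of possible perturbations.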

The randomization method, which includes random noise pollution and/or data modification [14–16], is based on random changes to the input data: for example, changing the image size, padding the image with zeros, compressing the image, and reducing its bit depth before it is fed to the classifier. This protection attempts to transform adversarial effects into random effects, which are not a problem for most neural networks. If the intruder knows how the system is constructed (white box), the protection can be compromised by taking the possibility of its existence into account. Another disadvantage is a decrease in model performance.
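A minimal sketch of such randomized input preprocessing is given below, assuming 224x224 grayscale inputs with values in [0, 1]; the resizing range, centered zero padding, and 5-bit depth reduction are illustrative assumptions, not the parameters of [14–16].

```python
# A minimal sketch of randomized input preprocessing: random resizing,
# zero padding back to the target size, and bit-depth reduction.
import tensorflow as tf

def randomize_input(image, out_size=224):
    """image: float32 tensor of shape (H, W, 1) with values in [0, 1]."""
    new_size = tf.random.uniform([], minval=int(0.8 * out_size),
                                 maxval=out_size + 1, dtype=tf.int32)
    resized = tf.image.resize(image, [new_size, new_size])
    # Pad the randomly resized image back to the classifier's input size.
    padded = tf.image.resize_with_crop_or_pad(resized, out_size, out_size)
    # Reduce bit depth to 5 bits per pixel (32 gray levels).
    return tf.round(padded * 31.0) / 31.0

# Usage: predictions = model(tf.expand_dims(randomize_input(image), 0))
```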

The compression method [17] is an active protection method: a sample is determined to be adversarial by comparing the model's predictions on the original and compressed images. Data cleaning methods (GAN-based methods [18, 19]) aim to remove the perturbation from a sample recognized as adversarial. However, this protection may not be effective against the attack described by Carlini and Wagner in [20].
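The sketch below illustrates the compression-based detection idea, assuming JPEG compression and a simple prediction-disagreement threshold; the quality level and threshold are illustrative assumptions, not the parameters of [17].

```python
# A minimal sketch of compression-based detection: flag a sample whose
# predictions on the original and JPEG-compressed versions disagree.
import tensorflow as tf

def is_adversarial(model, image, quality=75, threshold=0.2):
    """image: float32 tensor (H, W, 1) in [0, 1]; returns True if suspicious."""
    uint8_image = tf.image.convert_image_dtype(image, tf.uint8)
    jpeg_bytes = tf.io.encode_jpeg(uint8_image, quality=quality)
    compressed = tf.image.convert_image_dtype(
        tf.io.decode_jpeg(jpeg_bytes, channels=1), tf.float32)
    p_original = model(tf.expand_dims(image, 0))[0]
    p_compressed = model(tf.expand_dims(compressed, 0))[0]
    return bool(tf.reduce_max(tf.abs(p_original - p_compressed)) > threshold)
```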

The automatic noise suppression method [21] is similar to the previous one: an autoencoder learns the manifold of benign samples, and a detector distinguishes adversarial samples from benign ones based on what has been learned. The reformer, in turn, transforms a malicious sample into a legitimate one. The algorithm in [22] uses a loss function that minimizes the feature-level difference between benign and adversarial samples. The disadvantages of this method are the relative complexity of model building and the fact that it may not always be effective against white-box attacks.
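A minimal sketch of an autoencoder-based detector and reformer in this spirit is shown below; the architecture, input size, and detection threshold are illustrative assumptions, and the autoencoder is assumed to be trained only on benign samples.

```python
# A minimal sketch of a detector/reformer defense built on an autoencoder
# trained on benign CT images; all sizes and thresholds are assumptions.
import tensorflow as tf

def build_autoencoder(input_shape=(224, 224, 1)):
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    x = tf.keras.layers.UpSampling2D(2)(x)
    outputs = tf.keras.layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)
    autoencoder = tf.keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder

def detect_and_reform(autoencoder, image, threshold=0.01):
    """Flag a sample whose reconstruction error is too high (detector);
    otherwise return the reconstructed image (reformer) for classification."""
    reconstruction = autoencoder(tf.expand_dims(image, 0))[0]
    error = tf.reduce_mean(tf.square(image - reconstruction))
    return bool(error > threshold), reconstruction
```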

Table 2 provides a systematization of methods for protecting against evasion attacks (or adversarial attacks).

Table 2. Evasion attack protection methods

3 PROTECTION AGAINST ML EVASION ATTACKS IN MEDICAL IMAGE ANALYSIS

3.1 Pilot Study

The Python 3.7 interpreter and the TensorFlow library [23] were used for the experiment. A simple convolutional neural network was trained; CT images of people diagnosed with COVID-19 were taken as the training and test data [24–27]. The sizes of the training and test sets are presented in Table 3.

Table 3. Number of training and test samples

Table 4 presents the results of neural network training.

Table 4. Neural network training results

The accuracy of the trained model was 98.03%.
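The exact network architecture is not reproduced here; the sketch below shows the kind of simple convolutional COVID/NORMAL classifier described above, assuming 224x224 grayscale CT images organized into class subdirectories. The directory names, image size, architecture, and hyperparameters are assumptions, not the authors' exact configuration.

```python
# A minimal sketch of a simple convolutional COVID/NORMAL classifier in
# TensorFlow; paths, sizes, and hyperparameters are illustrative assumptions.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "ct_images/train", labels="inferred", label_mode="binary",
    color_mode="grayscale", image_size=(224, 224), batch_size=32)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "ct_images/test", labels="inferred", label_mode="binary",
    color_mode="grayscale", image_size=(224, 224), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(224, 224, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=test_ds, epochs=10)
```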

Next, noise of varying intensity was added to each of two samples, one belonging to the COVID class and the other to the NORMAL class. The first type of noise looks like small inclusions, while the second is Gaussian noise and is more homogeneous.

Two images were generated for each type of noise: one with less intense noise and one with more intense noise. To apply the more intense Gaussian noise, several iterations of noise addition were performed. In the tables below, the less intense noise is indicated by the number 1 in brackets and the more intense noise by the number 2 in brackets. The original image was also rotated by 90°.
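A minimal sketch of how such perturbed test images might be generated with NumPy is given below; the impulse probability, noise amplitude, and iteration count are illustrative assumptions, not the exact values used in the experiment.

```python
# A minimal sketch of the two noise types and the rotation applied to the
# test samples; all numeric parameters are illustrative assumptions.
import numpy as np

def add_impulse_noise(image, prob=0.02, seed=0):
    """Scatter small bright and dark inclusions over a float image in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    mask = rng.random(image.shape)
    noisy[mask < prob / 2] = 0.0
    noisy[mask > 1.0 - prob / 2] = 1.0
    return noisy

def add_gaussian_noise(image, sigma=0.05, iterations=1, seed=0):
    """Apply one or several rounds of additive Gaussian noise
    (more iterations give the more intense variant)."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float32)
    for _ in range(iterations):
        noisy = np.clip(noisy + rng.normal(0.0, sigma, image.shape), 0.0, 1.0)
    return noisy

# The third modification, a 90-degree rotation of the original image:
# rotated = np.rot90(image, k=1)
```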

As a result, the images obtained by modifying the COVID-class sample were still attributed to this class (the probability of classifying the sample as NORMAL increased, but only very slightly). The images obtained by modifying the NORMAL-class sample, however, were attributed to the COVID class; here the probability of assigning the sample to the correct class dropped significantly.

The results are shown in Table 5, which presents the classification probabilities, where class 0 is COVID and class 1 is NORMAL. Note that, in this case, the noise had almost no effect on the sample belonging to class 0 but greatly changed the result for the sample originally belonging to class 1. Such results may be due to the specifics of the image data: if one examines the samples, one can see that class 0 is characterized by large bright areas and blurring (of varying intensity), which is not typical of class 1 samples.

Table 5. Model test results on modified images before using the protection methods

After that, the protection methods discussed earlier were applied to this model and data set.

3.2 Analysis of Results

The results are presented in Table 6. The cells show the probability that the ML system attributes the image to each of the classes (COVID and NORMAL). Incorrect classification results are highlighted in red, and borderline results in blue. Table 7 provides a comparative analysis of these protection methods based on this experiment.

Table 6. Results from implementing the protection methods
Table 7. Comparative analysis of the protection methods

As the experimental study shows, all of the considered protection methods can effectively cope with an evasion attack; however, each of them has its own drawbacks. Typically, these are associated with a decrease in model accuracy or an inability to cope with adversarial samples in all cases.

For all of the methods that deal effectively with noisy samples (adversarial learning, noise reduction, autoencoder-based noise suppression, and compression), classifying the rotated image remains a problem. Therefore, these methods should be combined with the randomization method, which can be used to make the training sample more representative.

As the practical results show, the noise pollution method is very sensitive to the choice of coefficients, which complicates the implementation of effective protection. The active method using an autoencoder requires more computing power but is quite efficient. The active method based on input data compression is also efficient and, in contrast, does not require much computing power.

Thus, several protection methods belonging to both the proactive and the active categories should be combined. These should include increasing the representativeness of the training sample, adding some noise to the model layers during the training phase, and compressing the test data or using an autoencoder (as a second protection stage). The disadvantages of using multiple protection methods are, of course, increased system runtime, increased complexity of system design and operation, and reduced model accuracy, depending on the quality of sampling during the development of the protective components.
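As a rough illustration of such a combination, the sketch below chains the helper functions defined in the earlier sketches (randomize_input, is_adversarial, and detect_and_reform); it assumes those helpers are in scope, and the rejection policy (handing suspicious samples to a specialist) is an illustrative assumption.

```python
# A minimal sketch of a combined defense pipeline, assuming the helper
# functions from the earlier sketches are defined in the same module.
import tensorflow as tf

def classify_with_defenses(model, autoencoder, image):
    """Randomized preprocessing, compression-based detection, and
    autoencoder-based reforming applied in sequence before classification."""
    image = randomize_input(image)                 # proactive stage
    if is_adversarial(model, image):               # active stage 1: detection
        return None                                # refer the sample to a specialist
    flagged, reformed = detect_and_reform(autoencoder, image)
    if flagged:
        return None
    return model(tf.expand_dims(reformed, 0))[0]   # active stage 2: classify reformed input
```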

4 CONCLUSION

In this work, AML attacks were investigated and systematized. It was shown that, from the authors' point of view, the most dangerous attacks are evasion attacks. As a result of such attacks, an attacker can influence medical procedures and, therefore, people's health. These attacks allow an attacker to obtain a false positive from the system by introducing small perturbations into the test sample. For this type of attack, the main protection methods were also analyzed, and their features, advantages, and disadvantages were identified, which showed the lack of a universal approach to solving the problem.

One distinctive feature of medical images is that, in most cases, they are black-and-white, which makes it possible to introduce noise pollution that is imperceptible to the human eye. Often, the pathological areas are more “exposed” (which can be seen when comparing images of healthy and sick people), which gives the attacker an advantage: in this way, the desired sample can be attributed to another class.

As part of the study, several protective methods from the classes of active and proactive protection methods were considered, namely, adversarial learning, randomization, and noise reduction. The advantages and disadvantages of each method were highlighted. The experimental results showed that a defensive method either greatly affects the accuracy of the final model or is not able to cope with all types of attacks, which requires combining several methods.

The considered protection methods can effectively cope with an evasion attack when their parameters are accurately selected, as well as when several protection methods are combined. The shortcomings of these approaches were identified: an increase in the size of the training sample and in computing power requirements, a slight decrease in the accuracy of the original model, and the need to change the model architecture to implement a security solution.