Article

MC-ISA: A Multi-Channel Code Visualization Method for Malware Detection

State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(10), 2272; https://doi.org/10.3390/electronics12102272
Submission received: 1 March 2023 / Revised: 9 May 2023 / Accepted: 15 May 2023 / Published: 17 May 2023
(This article belongs to the Section Computer Science & Engineering)

Abstract

Malware detection has always been a hot topic in the cyber security field. With continuous research over the years, many research methods and detection tools based on code visualization have been proposed and have achieved good results. However, in the process of code visualization, the existing methods have some issues such as feature scarcity, feature loss and excessive dependence on manual analysis. To address these issues, we propose in this paper a code visualization method with multi-channel image size adaptation (MC-ISA) that can detect large-scale samples more quickly without manual reverse analysis. Experimental results demonstrate that MC-ISA achieves both higher accuracy and a higher F1-score than the existing B2M algorithm after introducing three mechanisms: image size adaptation, color enhancement and multi-channel enhancement.

1. Introduction

Malware is any program designed to cause damage to computer systems. It is one of the most serious threats to cyber security, both in terms of quantity and threat impact. Malware detection has always been a hot topic in the security field. With continuous research over the years, many research methods and detection tools have been proposed and achieved good results, such as signature-based detection [1,2], code semantics-based detection [3,4], heuristic scanning methods [5], system monitoring methods [6,7,8] and dynamic tracking methods [9].
However, with the emergence of virus-writing engines, it has become possible to automate the mass production of malware, which has caused the types and quantities of malware to explode. New malware is increasing year over year, which poses a great threat to the security of cyberspace. In the face of hundreds of millions of files to be tested, it is not practical for professional security personnel to analyze them manually one by one. At the same time, traditional detection methods based on known feature databases cannot keep up with the rapid growth in the number of new, unknown variants.
To cope with this rapid growth, a series of machine learning-based malware detection and classification methods have been proposed in recent years. However, the results of traditional machine learning-based methods depend on the quality of features, which are extracted manually by researchers. This requires a high level of professional knowledge from analysts, and the over-reliance on manual analysis to extract features may reduce detection efficiency because professionals need to analyze, identify and correct errors repeatedly [10]. With the progress of network attack technology, the anti-analysis ability of malware variants has been enhanced, and the limitations of existing detection methods have gradually been revealed. In recent years, a new research direction based on visualization has been proposed to tackle the task of malware classification. Conti et al. [11] proposed an algorithm to visualize the binary form of code as an image. Nataraj et al. [12] introduced methods from the field of computer vision into malware classification and achieved family classification based on machine learning. Lee et al. [13] found that code visualization technology can speed up the malware detection process. Moussas et al. [14] identified malware visualized as images and proposed a new malware detection system based on a two-level artificial neural network. Darem et al. [15] integrated deep learning, feature engineering, image transformation and processing techniques to detect obfuscated malware. Anandhi et al. [16] visualized malware as Markov images to preserve semantic information, applied a Gabor filter to extract textures from the Markov images and proposed VGG-3 and densely connected network models. Shao et al. [17] proposed a malware visual classification method based on a deep residual network and a hybrid attention mechanism for edge security. Wang et al. [18] proposed a malware detection method that combines convolutional neural networks and generative adversarial networks. Other works have made progress in this area, such as generating RGB images instead of gray-scale images [19,20] or converting binary files into entropy images [21]; most of them then used image features such as texture to achieve malware family classification. However, no matter how the images are generated, the essence is to visually represent binary data; that is, the original binary bytes are mapped to a code image. Most of these works still used the code visualization method proposed in [12] in this step, which maps the original bytes to a two-dimensional grayscale image. According to its mapping principle, this method is abbreviated as the B2M (binary-to-matrix) algorithm. To handle inconsistent file sizes, the B2M algorithm sets a threshold and truncates or pads files that are too large or too small. This may cause features to be lost or altered in the image processing stage and affect the classification result. Furthermore, the majority of research in this field has focused on classifying malware families, with relatively little emphasis on developing methods for malware detection. It should be noted that detection and classification are different tasks. Therefore, how to detect malware efficiently and accurately is also a crucial topic.
To address these issues, in this paper we propose a multi-channel code visualization method named MC-ISA (code visualization method with Multi-Channel Image Size Adaptation) for malware detection. In view of the possible feature loss during the image generation and normalization steps of the B2M algorithm, which is commonly used in existing visualization-based malware classification methods, we propose a code visualization algorithm named ISA with an image size adaptation mechanism. To overcome the limitations of single-channel visualization methods, we propose a multi-channel mechanism. The first-order Markov state transition matrix is used as a local semantic supplement to the byte file, which enhances the stability of the code images and characterizes the correlation between bytes.
The main contributions of this paper are as follows:
(1)
We propose an image size adaptive code visualization algorithm that preserves almost all byte information of the sample to avoid information loss, while better preserving the original spatial distribution properties of the sample.
(2)
We propose a novel multi-channel code image generation method that incorporates instruction information, spatial properties and statistical information of executable files, resulting in more comprehensive feature representation and improved detection performance.
(3)
We leverage pre-trained deep learning models from transfer learning research and apply them to the field of malware detection, partially alleviating the problem of insufficient training samples and achieving encouraging results in malware detection.
(4)
The proposed method extracts features without manual reverse engineering and generates the image automatically to improve the efficiency of large-scale malware detection.
The rest of this manuscript is organized as follows. Section 2 introduces a review of the literature on malware classification based on code visualization from the past few years. In Section 3, the basic theory of the MC-ISA method is introduced. The MC-ISA method is tested and compared in Section 4. Finally, conclusions are given in Section 5.

2. Related Work

With the development of artificial intelligence technology, the detection method combining code visualization and deep learning has received widespread attention over the past few years. Visualizing the code as an image can reflect the structure information and similarity of the code to a certain extent. Compared with the feature vectors extracted by artificial reverse analysis, the malware image contains rich code information. Therefore, this section mainly reviews the related work based on malware visualization and artificial intelligence technology. Table 1 provides a detailed overview.
From the above table, it can be seen that, according to the selected model, intelligent malware classification and detection methods based on code visualization mainly include methods based on traditional machine learning and methods based on deep learning. Therefore, we review the related work in these two categories: malware detection based on machine learning and malware detection based on deep learning.

2.1. Malware Detection Based on Machine Learning

Machine learning algorithms can automatically learn patterns from data and predict and analyze unknown data. After the malware is visualized as an image, a machine learning-based detection model is used for detection. Usually, different types of features are extracted from the image through image processing techniques, then a machine learning algorithm is trained on the image features of labeled samples and a classifier is constructed to perform detection. The key factor determining the detection performance of this kind of work is the quality of the features. As long as the features are representative and robust to interference, the global or local features of the image can be extracted and optimized to achieve good results.
Liu et al. [22] selected two features, grayscale images and n-gram opcodes, and used the Shared Nearest Neighbor (SNN) clustering algorithm as the classification model to evaluate the accuracy of the two single features and their combination, showing a certain ability to recognize new types of malware. Liu et al. [23] extracted and fused global features (GIST) and local features (LBP or dense SIFT), solving the problem of sharply reduced classification accuracy when only global features are used and malware grayscale images have high similarity or significant differences; they used the visualization method in reference [12]. Naeem et al. [24] transformed malware binaries into grayscale images, using texture features as global features and image edges as local patterns to achieve family classification of malware. Li et al. [25] extracted gray-level co-occurrence matrix and color features from gray-scale images, combined them with n-gram text features extracted from assembler files, and used the random forest algorithm for classification. Wang et al. [26] divided the PE file into three parts based on its structure: the data segment, the code segment, and the PE header and other segments. Each part generated a grayscale image, and the three were synthesized into an RGB image; several typical machine learning and deep learning classification models were then selected for validation. Ren et al. [27] converted malware into entropy pixel images and utilized KNN to classify malware based on its visualization.
In summary, the existing work mainly focuses on the family classification of malware, while there is less research on malware detection. With the flourishing of deep learning technology in the field of computer vision, research on visualization-based malware detection and classification has also tended to choose deep learning models, and in recent years research based on traditional machine learning models has become relatively scarce.

2.2. Malware Detection Based on Deep Learning

Deep learning is a branch of machine learning. Compared with traditional machine learning, deep learning can automatically learn and extract features. Through training a large number of hidden layers, it can capture the internal relationships of samples and improve the generalization of the model. Compared with traditional machine learning methods, this kind of work no longer relies heavily on manual feature extraction, which greatly reduces the time and labor cost of manual analysis and offers a higher degree of automation, scalability and flexibility. Therefore, it has been widely used in malware detection and classification in recent years.
The Convolutional Neural Network (CNN) is the most widely used model in visualization-based work. CNN is a typical neural network model developed in the field of computer vision around 2012, which usually includes convolution layers, pooling layers and fully connected layers. The convolution and pooling layers of a CNN enable it to extract two-dimensional features of data well, so it performs well in the field of image recognition. In recent years, related work has also begun to use CNNs for intelligent detection of malware. From Table 1, it can be seen that references [28,29,30,31,32,33,34,35,36,37,38,39,40] all selected CNN or its variants as classification models or as components of the classification model, and implemented malware classification after visualizing the samples to be tested as images. Among them, Zhao et al. [28] proposed MalDeep, a visualization-based deep learning malware classification framework. They visualized the malware as gray-scale images, extracted the texture features of the images, and used a CNN as the model for family classification; the classification accuracy could reach more than 99%. The classification accuracy of the method in [29] reached 98.60% on the Kaggle dataset and 98.82% on the Malimg dataset. Awan et al. [30] proposed a convolutional neural network framework named SACNN based on spatial attention and deep learning for image-based classification. Narayanan et al. [31] proposed a classification system comprising convolutional and recurrent neural networks. Vasan et al. [32] proposed a novel malware classifier called IMCFN, using a CNN-based deep learning architecture. Khan et al. [33] converted EXE files into opcodes, then mapped them onto images and verified their recognition ability for new types of malware on two different models: GoogleNet and ResNet. Daniel et al. [34] visualized malware as gray-scale images and trained a convolutional neural network to achieve malware classification. Cui et al. [35] and Venkatraman et al. [36] proposed malware family detection methods based on deep learning algorithms, which have better detection performance than traditional methods. Cui et al. [35] converted the malware byte data into gray-scale images and implemented the detection of malware families based on a CNN; they applied a sample fine-tuning method to supplement the number of samples in some malware families, improving the classification performance of the detection model in the case of insufficient training samples. Venkatraman et al. [36] combined various visual features with deep learning algorithms to achieve malware detection. Falana et al. [37] proposed an integrated malware detection model named Mal-Detect that combines a CNN and a deep generative adversarial network (GAN). They converted the original binary sample file into RGB images, generated new malware images using the deep GAN and trained a CNN to extract important features from the dataset. Sun et al. [38] generated malware feature images by combining the static analysis of malware with a CNN and an RNN. Bensaoud et al. [39] proposed a multi-task learning framework that generated bitmap (BMP) and PNG images from malware features for malware image classification and detection. Asam et al. [40] proposed a deep boosted hybrid learning-based malware classification framework that extracted features from customized CNN architectures and fed them into conventional machine learning classifiers to improve classification performance.
From this investigation, it can be concluded that existing works, whether based on machine learning or deep learning, mainly focus on the classification of malware families and have achieved good results, while there is relatively little research on malware detection. At the same time, current visualization-based malware classification work mainly focuses on feature extraction methods and model optimization for images, while the image generation step mostly follows the traditional B2M method. Among the papers listed in Table 1, except for [39], which has its own method for generating feature images, all other works used the B2M algorithm during visualization, which poses a risk of information loss during processing that needs to be addressed. In addition, methods with higher accuracy often extract more complex and refined features during the data processing stage, which requires a large amount of manual reverse engineering work; this affects the overall efficiency of the method and cannot effectively meet the need to detect large-scale malware in practice.

3. Methodology

To solve these problems, we propose a multi-channel malware visualization method based on image size adaptation, and adopt a CNN architecture with transfer learning to detect malware, which can detect large-scale samples more quickly without manual reverse analysis; that is, it can judge whether a piece of software is malicious or benign. Our approach is shown in Figure 1. The main steps of this paper are as follows: (1) constructing a dataset containing malware and benign software, (2) adaptive binary code image generation, (3) multi-channel image generation and (4) malware detection based on a deep convolutional neural network.

3.1. Data Preprocessing

In the malware detection method based on deep learning, there are usually restrictions on the input format of the model. The purpose of data preprocessing is to convert the file into a format that meets the requirements of subsequent processes.
We first preprocess all the Portable Executable files in the dataset. Specifically, we use the IDA Pro disassembly tool to batch process all the software in the dataset and convert each sample to obtain the required binary file and assembler file (i.e., files in the “.byte” and “.asm” formats). Then, we convert each byte of the preprocessed file in sequence into a decimal unsigned integer, which ranges from 0 to 255, and take the decimal value as the gray value of a pixel, so that each file yields a corresponding sequence of pixel values after processing. The read sequence is shown in Figure 2. When reading and converting bytes, we count the size of each preprocessed file (the number of bytes it contains).
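To make this preprocessing step concrete, the following minimal Python sketch (the file path and function name are illustrative, not part of the original toolchain) reads a preprocessed “.byte”-style file, converts each byte into an unsigned integer in [0, 255] and records the file size:

import numpy as np

def bytes_to_pixel_sequence(path):
    """Read a preprocessed file and convert each byte to a pixel value in [0, 255]."""
    with open(path, "rb") as f:
        raw = f.read()                               # raw byte string of the sample
    values = np.frombuffer(raw, dtype=np.uint8)      # each byte -> unsigned integer 0-255
    file_size = len(values)                          # number of bytes, used later for image sizing
    return values, file_size

# Example usage (hypothetical file name):
# pixels, size = bytes_to_pixel_sequence("sample.byte")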

3.2. Image Size Adaptive Mechanism

Most of the existing code visualization work is to set a fixed width to generate images according to the standard proposed by [12]. However, due to the different numbers of bytes in different files, the generated images are bound to have different sizes. Deep learning models, especially convolution neural network models, which have good results in computer vision, have strict requirements on the input image size.
To solve this problem, we propose a new technique that adaptively generates square images with equal height and width according to the size of each file. The method takes the floor of the square root of the byte count to obtain the width of the image and uses the same value as the height. The binary byte sequence is rearranged according to the calculated image width and height, and the decimal value in [0, 255] corresponding to each byte is taken as the pixel value of that byte, where “0” corresponds to black and “255” corresponds to white. The rearranged byte matrix positions are mapped one-to-one to their corresponding pixel values to obtain a gray image with equal height and width; the method is shown in Algorithm 1. At this point, we obtain a gray image tailored to each PE file. The gray images generated from different files differ in size, but each contains almost all of the binary byte information of the corresponding file.
Algorithm 1: The image size adaptive code visualization algorithm
Input: B = {b1, b2, b3…bn} represents a binary file of the PE file, where bi represents the ith byte.
Output: Gray-scale image with equal height and width, adapted according to the file size.
1: LENGTH(x) is a function to get number of bytes in the set x.
2: VALUE(x) is a function to convert the binary bytes in the set x to the decimal value.
3: W represents the width of the code image.
4: H represents the height of the code image.
5: V = VALUE(B)
6: L = LENGTH(B)
7: W = math.floor(sqrt(L))
8: H = W
9: for i = 0 → H − 1 do
10: for j = 0 → W − 1 do
11:  A(i, j) = V(i × W + j)
12: end for
13: end for
14: Generate a W × H gray-scale image by using the values in the array A as the pixel values
This method ensures that almost all byte features of each file are retained in the image generation stage, avoids the feature loss caused by truncating longer images in existing work, and lays a foundation for retaining and learning more feature information in subsequent malware detection to achieve a better detection effect.
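For reference, the following Python sketch implements Algorithm 1 under the stated assumptions (NumPy and Pillow handle the array and image output; trailing bytes beyond W × H are discarded, as implied by the floor operation; bytes_to_pixel_sequence is the hypothetical helper sketched in Section 3.1):

import math
import numpy as np
from PIL import Image

def isa_grayscale_image(byte_values):
    """Image size adaptive visualization: map a byte sequence to a square gray-scale image."""
    length = len(byte_values)                 # L = LENGTH(B)
    width = math.floor(math.sqrt(length))     # W = floor(sqrt(L))
    height = width                            # H = W
    # Rearrange the first W * H byte values into an H x W matrix, A(i, j) = V(i * W + j)
    arr = np.asarray(byte_values[:width * height], dtype=np.uint8).reshape(height, width)
    return Image.fromarray(arr, mode="L")     # 0 -> black, 255 -> white

# Example usage:
# pixels, _ = bytes_to_pixel_sequence("sample.byte")
# isa_grayscale_image(pixels).save("sample_isa_gray.png")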

3.3. Image Normalization

Because the images obtained in the previous step have different sizes, to meet the needs of model training we use the bilinear interpolation algorithm to normalize all the constructed initial code images and obtain a code image dataset with a normalized size. Image normalization makes the image resistant to geometric transformations and preserves invariants in the image. After extensive experimental verification and comparison, it can further reduce the amount of input data while balancing time cost and detection effect. This method standardizes all code images into 256 × 256 square images to form a new code image dataset for visual analysis, which can be used as the input of subsequent training models.
As shown in Figure 3, in the bilinear algorithm, assuming that the values of a function f at A11, A12, A21 and A22 are known, in order to obtain the value of f at the point P = (x, y), linear interpolation needs to be performed in the x and y directions. The orange dots A11, A12, A21 and A22 are the four known pixel points. Linear interpolation in the x direction inserts the green point B1 between A11 and A21 and the green point B2 between A12 and A22. The point P is then obtained by linear interpolation in the y direction between B1 and B2. This can be expressed as Formulas (1)–(3).
$$f(B_1) \approx \frac{x_2 - x}{x_2 - x_1} f(A_{11}) + \frac{x - x_1}{x_2 - x_1} f(A_{21}) \quad (1)$$
$$f(B_2) \approx \frac{x_2 - x}{x_2 - x_1} f(A_{12}) + \frac{x - x_1}{x_2 - x_1} f(A_{22}) \quad (2)$$
where $B_1 = (x, y_1)$ and $B_2 = (x, y_2)$; then
$$f(P) \approx \frac{y_2 - y}{y_2 - y_1} f(B_1) + \frac{y - y_1}{y_2 - y_1} f(B_2) \quad (3)$$
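A minimal sketch of this normalization step, assuming Pillow's built-in bilinear resampling is an acceptable stand-in for the interpolation described above:

from PIL import Image

def normalize_code_image(img, size=(256, 256)):
    """Resize a code image to a fixed 256 x 256 square using bilinear interpolation."""
    return img.resize(size, resample=Image.BILINEAR)

# Example usage:
# gray = isa_grayscale_image(pixels)
# normalized = normalize_code_image(gray)    # ready to be fed to the detection model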

3.4. Color Enhancement Mechanism

Although a gray-scale image can present the overall structure and features of program samples to a certain extent, it still has shortcomings in expressing some internal features and in visual interactive analysis. Therefore, we performed color enhancement on the generated code images. Among the colormap categories, we chose Miscellaneous, which is more suitable for the characteristics of discrete instruction byte sequences, for color mapping. Within Miscellaneous, we chose the Rainbow colormap, which has a larger color span and more obvious contrast between boundary values (as shown in Figure 4), to add color attributes to the image. This enhances the color contrast of the generated code image and enriches the image information, which is very helpful for improving the visual discrimination and the effect of coarse-grained manual analysis in practical applications.
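The following sketch shows one way this color enhancement could be applied, assuming the matplotlib “rainbow” colormap (from the Miscellaneous category) is used to map normalized gray values to RGB; the function name is illustrative:

import numpy as np
from matplotlib import cm
from PIL import Image

def color_enhance(gray_img):
    """Map a gray-scale code image to a color image using the rainbow colormap."""
    arr = np.asarray(gray_img, dtype=np.float32) / 255.0    # scale gray values to [0, 1]
    rgba = cm.rainbow(arr)                                   # apply the rainbow colormap
    rgb = (rgba[:, :, :3] * 255).astype(np.uint8)            # drop the alpha channel, back to 8-bit
    return Image.fromarray(rgb, mode="RGB")

# Example usage:
# colored = color_enhance(normalized)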

3.5. Multi-Channel Mechanism

The code visualization method described above is suitable for any file that can be expressed as a binary byte sequence. That is to say, this method is not limited by platform or instruction format, which makes multi-platform compatible detection possible. However, this kind of image is generated from a single file, and the selected program feature data and their representations can only cover part of the features of the corresponding software while losing other useful information; that is, the features contained in a single-channel image are limited. At the same time, most of the existing related research uses the texture features of images to classify malware families, and the results are not ideal when these features are used for detection. To solve these problems and make the image contain more features suitable for detection, we propose a new multi-channel code image generation method for malware detection.
Existing methods that use RGB images for malware classification mostly work on the binary form of a single file: starting from the first byte, every three consecutive adjacent bytes are mapped to the R, G and B channels, respectively, until the last byte. Although this approach uses multi-channel images, it still essentially contains only the features of a single file and loses the advantages that multiple channels could bring. Whether for PE files or Android files, such a 3-byte (24-bit) grouping has no interpretable meaning or association at the instruction level.
Therefore, to make code visualization-based malware detection and screening not only efficient but also more accurate, and to address the problem of insufficient information in a single gray image, we propose a brand-new multi-channel code visualization method. For each sample, the method selects the binary byte file, the assembly language metadata file and the normalized first-order Markov byte state transition matrix file as the input data of the three channels. According to the code image generation method proposed above, the size-adaptive images corresponding to the binary byte file, the assembly language metadata file and the first-order Markov transition matrix file of the binary bytes are generated as the blue, green and red channels, respectively, and finally an RGB three-channel color image is synthesized as the multi-channel code image sample used in subsequent detection. The process is shown in Figure 5.
The reason for choosing a binary byte file is that the binary byte sequence file (“.byte” file) of the software itself is the machine code of the program, which can reflect the underlying semantics of software code. Because our method selects complete files, it has great advantages in information coverage. We use it as the input of the first channel (B channel) in the multi-channel image.
Machine instructions reflect the semantics of the underlying computer code and have contextual relationships; binary code, as their representation, carries meaning in the order and combination of its bytes, and there is a certain correlation between adjacent bytes. As a statistical feature of binary byte files, the similarity of first-order Markov state transition matrices can usually reflect the similarity of binary code functions, so we use the first-order Markov state transition matrix to characterize the correlation between the bytes of a byte file; it serves as a local semantic supplement to the byte file and enhances the stability of the code visualization images. If the byte file is represented as B = {b1, b2, b3, …, bn}, where bi (i ∈ {1, 2, …, n}) represents the value of the i-th byte, the first-order Markov transition matrix is calculated from the transition probabilities between the values of adjacent bytes. According to the Markov property, the probability of occurrence of byte bi is only related to its previous byte bi−1, as shown in Formula (4):
$$P(b_i \mid b_1, b_2, b_3, \ldots, b_{i-1}) = P(b_i \mid b_{i-1}) \quad (4)$$
Since the value range of each byte is [0, 255], there are 256 possible values of bi, so the corresponding first-order Markov transition matrix M is a 256 × 256 matrix (Formula (5)), which is why we normalize the images of the other two channels to 256 × 256. This ensures that the image sizes of the three channels are the same and that no redundant deformation is carried out. The set V denotes the value space of byte bi, so V = {0, 1, 2, …, 255}, and PVa,Vb denotes the probability of transitioning from value Va of byte bi−1 to value Vb of byte bi; M can then be expressed in the form of Formula (5). Here, we estimate the transition probabilities by counting the number of adjacent occurrences of (bi−1, bi) in each file. We use this matrix as a supplement to the original byte file to enrich the features contained in the image and enhance the stability of the code visualization image. At the same time, the computational complexity of the first-order Markov state transition matrix is much lower than that of other algorithms, and no manual reverse analysis is needed, which better ensures the efficiency of processing and detection. Therefore, we use it as the input of the second channel (R channel).
$$M = \begin{bmatrix} P_{0,0} & P_{0,1} & \cdots & P_{0,255} \\ P_{1,0} & P_{1,1} & \cdots & P_{1,255} \\ \vdots & \vdots & \ddots & \vdots \\ P_{255,0} & P_{255,1} & \cdots & P_{255,255} \end{bmatrix} \quad (5)$$
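A minimal sketch of how the normalized first-order Markov transition matrix could be computed from a byte sequence (the function name is illustrative; rows with no outgoing transitions are left as zeros, and the matrix is scaled to [0, 255] so it can be rendered directly as a 256 × 256 gray image):

import numpy as np

def markov_transition_matrix(byte_values):
    """Compute the normalized 256 x 256 first-order Markov transition matrix of a byte sequence."""
    counts = np.zeros((256, 256), dtype=np.float64)
    prev, curr = byte_values[:-1], byte_values[1:]
    np.add.at(counts, (prev, curr), 1)                       # count adjacent pairs (b_{i-1}, b_i)
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, row_sums,
                      out=np.zeros_like(counts), where=row_sums > 0)
    return (probs * 255).astype(np.uint8)

# Example usage:
# r_channel = Image.fromarray(markov_transition_matrix(pixels), mode="L")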
Because the “.asm” file contains assembly instructions, addresses, data and readable strings, which carry a great deal of useful information, we chose the assembler file obtained by disassembly for the third channel. In the method proposed by the champion of the 2015 Kaggle competition, the “.asm” file was truncated and only the first 800 pixels (bytes) of information were retained for visualization. After that, because “.asm” files are generally large, almost all existing malware visual analysis work truncates the “.asm” files, although the truncation thresholds differ. In this article, we keep the file in its entirety. Because instructions, addresses, data and readable strings can be represented as ASCII codes, we treat the file contents as ASCII codes and then map them according to the method proposed above; the resulting assembler file image is used as the input of the third channel (G channel).
The images of these three channels are all generated by the visualization method proposed in the previous section, and the generated images are assigned to the three channels to synthesize the final multi-channel code image. This visualization method completely removes the dependence on manual reverse analysis in the data representation stage, is fully automatic and, at the same time, keeps more comprehensive information and avoids the hidden feature loss caused by excessive manual intervention.
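The channel composition step could look like the following sketch, which merges the three single-channel images produced above into one RGB image (channel assignment follows the paper: byte file → B, assembler file → G, Markov matrix → R; all variable names are illustrative):

from PIL import Image

def build_multichannel_image(byte_img, asm_img, markov_img, size=(256, 256)):
    """Merge the three single-channel code images into one RGB multi-channel code image."""
    r = markov_img.resize(size, resample=Image.BILINEAR)     # R: first-order Markov transition matrix
    g = asm_img.resize(size, resample=Image.BILINEAR)        # G: assembler (.asm) file image
    b = byte_img.resize(size, resample=Image.BILINEAR)       # B: binary byte (.byte) file image
    return Image.merge("RGB", (r, g, b))

# Example usage:
# mc_image = build_multichannel_image(byte_gray, asm_gray, markov_gray)
# mc_image.save("sample_mc_isa.png")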

4. Experimental Results and Discussion

4.1. Dataset and Experimental Configuration

As mentioned above, the existing datasets for malware detection and classification based on code visualization methods still have some research limitations. To support more comprehensive and flexible visualization-based detection, a dataset was constructed for this paper containing 8870 malware files and 8590 benign files, totaling 17,460 samples, which serves as the data foundation for our research on malware detection. In fact, due to the special functional characteristics of malware, only a few websites are able to provide original malware samples. Therefore, the 8870 malware samples in the dataset constructed in this paper were collected online (https://virusshare.com/, accessed on 28 February 2023). We also collected 8590 benign samples, of which 1390 are executable files and dynamic link libraries from personal computers in the laboratory, covering clean operating systems such as Windows 7, Windows 10 and Windows 11, while the other 7200 are Windows application software samples collected online (https://download.cnet.com/windows/, accessed on 28 February 2023). Table 2 shows the composition of our dataset. The experimental environment and configuration information are listed in Table 3.

4.2. Experimental Design and Evaluation

We designed three sets of comparative experiments to evaluate the effectiveness of our proposed MC-ISA visualization method through qualitative and quantitative comparisons. The specific experiments are as follows:
Experiment 1: 
Image size adaptive mechanism evaluation (ISA-Gray vs. B2M)
The purpose of this experiment was to evaluate the impact of the image size adaptation mechanism on the results of malware detection in our method. To ensure the fairness of the experiment, we only selected single-channel data (the .byte file) and used only the image size adaptation mechanism to generate single-channel grayscale images (ISA-Gray for short), rather than the complete MC-ISA visualization method. We chose the images generated by the B2M algorithm (B2M for short) as the baseline for comparison.
In the process of mapping the original binary data to images, the only difference between these two visualization methods was in the mechanism used to set the image size. Therefore, this comparative experiment can evaluate the effectiveness of the image size adaptation mechanism in our proposed method.
Experiment 2: 
Color enhancement mechanism evaluation (ISA-CE vs. ISA-Gray)
The purpose of this experiment was to evaluate the effect of the color enhancement mechanism in our method. The comparison objects were the code images generated by ISA-Gray, which only introduces the image size adaptation mechanism, and the code images generated by adding the color enhancement mechanism on top of it (ISA-CE for short).
To ensure the fairness of the experiment, we still chose the single-channel images extracted from the “.byte” file as the baseline for comparison. Therefore, this comparative experiment can evaluate the effectiveness of the color enhancement mechanism in our proposed method.
Experiment 3: 
Multi-channel mechanism evaluation (MC-ISA vs. ISA-CE)
The purpose of this experiment was to evaluate the impact of the multi-channel mechanism in our method for malware detection. The comparison objects were the single-channel code images generated by the “.byte” file (ISA-CE for short) and the multi-channel code images generated by using the complete visualization method proposed in this paper (MC-ISA for short).
In MC-ISA, image generation has integrated both the image size adaptation mechanism and the color enhancement mechanism in each channel, so we still chose the single-channel images generated by the “.byte” file as the baseline for comparison.
The experimental design for the above three comparative experiments is summarized in Table 4 in detail.
For these three groups of experiments, we used ResNet50, Inception v3, Inception-ResNet-V2 and VGG16 as detection models. For the implementation of the detection models, we adopted the Keras framework with TensorFlow as the back end, with versions Keras 2.3.1 and TensorFlow 1.15.0, respectively. The experimental running environment is shown in Table 3.
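As a sketch of how such a pre-trained model could be set up for binary malware/benign detection with transfer learning (the layer choices, frozen backbone and hyperparameters are illustrative assumptions, not the paper's reported configuration):

from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

def build_detector(input_shape=(256, 256, 3)):
    """Binary malware/benign classifier on top of an ImageNet pre-trained backbone."""
    backbone = ResNet50(weights="imagenet", include_top=False, input_shape=input_shape)
    backbone.trainable = False                        # reuse the pre-trained convolutional features
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dense(256, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)    # malware vs. benign
    model = models.Model(backbone.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# model = build_detector()
# model.fit(train_images, train_labels, validation_data=(val_images, val_labels), epochs=10)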
To evaluate the test results objectively and avoid the randomness of the test results, 10-fold cross-validation was adopted in the experiments in this paper. All the experimental data were randomly divided into ten parts; one part was selected as the test set in turn, and the remaining nine parts were selected as the training set. The final experimental results are the average of all the test results.
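A minimal sketch of this 10-fold cross-validation protocol, assuming scikit-learn's KFold for the splitting and the hypothetical build_detector sketched above as the model builder:

import numpy as np
from sklearn.model_selection import KFold

def cross_validate(images, labels, build_model_fn, n_splits=10):
    """Run 10-fold cross-validation and return the average test accuracy."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    scores = []
    for train_idx, test_idx in kf.split(images):
        model = build_model_fn()                                  # fresh model for every fold
        model.fit(images[train_idx], labels[train_idx], epochs=10, verbose=0)
        _, acc = model.evaluate(images[test_idx], labels[test_idx], verbose=0)
        scores.append(acc)
    return float(np.mean(scores))                                 # average of all test results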
Detection accuracy has always been the most direct indicator of the performance of malware detection and classification work, and it is an important indicator that all related research focuses on, reflecting the proportion of correct predictions. The F1-score is an important metric for evaluating the comprehensive classification performance of a model; it combines precision and recall as their weighted harmonic mean. Therefore, we used Accuracy and F1-score as metrics to evaluate the effectiveness of this method.
The confusion matrix of the classification problem is shown in Table 5, where TP (true positive) represents positive samples predicted as positive, here the number of malware samples correctly identified. FP (false positive) represents negative samples predicted as positive, here the number of benign samples identified as malware. FN (false negative) represents positive samples predicted as negative, here the number of malware samples identified as benign, which is fatal in real-world malware detection, so the smaller this value, the higher the practical feasibility of the model. TN (true negative) represents negative samples predicted as negative, here the number of benign samples correctly identified.
Accuracy, which represents the proportion of samples predicted correctly to all samples in the test set, is shown as Formula (6).
$$\text{Accuracy} = \frac{TN + TP}{TP + FP + TN + FN} \quad (6)$$
Precision, which describes the proportion of samples predicted as positive by the model that are actually positive, is shown as Formula (7).
$$\text{Precision} = \frac{TP}{TP + FP} \quad (7)$$
Recall, which describes the proportion of actual positive samples in the test set that are correctly predicted as positive by the model, is shown as Formula (8).
$$\text{Recall} = \frac{TP}{TP + FN} \quad (8)$$
F1-score is a metric used to describe the weighted harmonic mean of Precision and Recall. It combines the performance of Precision and Recall, and can be used to evaluate the performance of classification models. Its specific calculation method is shown as Formula (9).
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (9)$$
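For completeness, a small sketch computing these metrics directly from confusion-matrix counts, following Formulas (6)–(9) (the example counts in the comment are hypothetical):

def detection_metrics(tp, fp, tn, fn):
    """Compute Accuracy, Precision, Recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

# Example usage (hypothetical counts):
# acc, prec, rec, f1 = detection_metrics(tp=870, fp=12, tn=850, fn=14)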

4.3. Results and Discussion

4.3.1. Evaluating the Effectiveness of the Image Size Adaptive Mechanism

In this experiment, we focused only on the effect of the image size adaptation mechanism. We compared our proposed visualization method, using only the image size adaptation mechanism to generate single-channel grayscale code images, against the code images generated by B2M as the baseline, to evaluate whether the image size adaptation mechanism improves the detection results.
Figure 6 illustrates a comparison between the code images generated by the two methods. Five files were selected from both malware and benign software for demonstration and comparison. In both categories of files, the images generated by the B2M algorithm are shown at the top (B2M for short), while those generated by the proposed method, which only includes the image size adaptive mechanism, are shown at the bottom (ISA-Gray for short).
Through the comparative analysis of the two methods, the following observations can be made:
(1)
It is evident from the images that the grayscale images generated by the B2M algorithm have varying sizes. Even if the input requirements of the detection model are not considered, these image sizes may be used as feature information for the model, which can affect the training effect of the model to some extent.
(2)
From the visual appearance of the generated images, it is apparent that the B2M algorithm uses a fixed width, so the height-to-width differences of the images generated from files of different sizes are significant. As a result, the image shapes are mostly “elongated” or “flat”, which makes it difficult to meet the input requirements of mainstream convolutional neural networks. Standardized deformation of the image is necessary, but the existing methods use compression or truncation, which inevitably results in information loss.
In contrast, the grayscale images generated by our algorithm have equal height and width and contain almost all of the file's byte information captured during generation. Therefore, our method does not artificially discard large pieces of data because of size during the generation stage, which could otherwise result in the loss of sample information. As a result, the proposed method retains more of the original code information and is more suitable for deep learning-based detection. We then evaluated the effect of the image size adaptation mechanism on malware detection quantitatively: we used the above two methods to generate code images and fed them into the four selected convolutional neural networks for malware detection; the detection results are shown in Figure 7.
Figure 7 compares the detection results of code gray images generated before and after adding the adaptive mechanism in different neural network models (ResNet50, InceptionV3, Inception-ResNet-V2 and VGG16).
It can be seen from the results that, compared with the B2M algorithm, the detection accuracy and F1-score of the code images generated by our proposed method with only the image size adaptation mechanism were improved in all four models, indicating that the image size adaptation mechanism is effective. The main reasons are as follows:
(1)
The images generated by our visualization method only with the image size adaptive mechanism retain all the original code information, avoiding information loss and containing more hidden features.
(2)
The code images generated by our method are square, which better preserves the original spatial distribution and texture features of the image during normalization. In contrast, when the rectangular code images generated by the B2M algorithm are normalized to squares, truncation causes feature loss and direct scaling affects the retention of texture features.
(3)
During normalization of the code images generated by the B2M algorithm, the original spatial distribution features are changed because the degree of compression differs between the vertical and horizontal directions. Therefore, the detection results obtained with the images generated by our visualization method are better.
From Figure 7, we can find that the detection accuracy of InceptionResNet-v2 is the highest, followed by ResNet50 and inceptionV3, with VGG16 as the lowest, which is consistent with the performance in the ImageNet dataset. This is determined by the model structure, computational complexity and training time.
We focused on the difference in the detection results of the images generated by the two methods in the same model. After using the proposed method, the detection accuracy of the ResNet50 model can be improved most obviously, because ResNet50 is very sensitive to the texture, the detail information and the spatial property of the images. It also shows that the image size adaptive mechanism in our method is beneficial to retain more image texture features and code spatial distribution information.
For the same models, the improvement in F1-score indicates that the proposed visualization method with the image size adaptation mechanism generates images with higher category discrimination, which also proves that our method retains more features useful for malware detection than B2M.
Combining Figure 7a,b, it can be seen that although the accuracy improvement of VGG16 is not significant compared with the other three models, the F1-score improvement of VGG16 is very significant, indicating that our visualization method is very effective in enhancing the category discrimination of samples.

4.3.2. Evaluating the Effectiveness of the Color Enhancement Mechanism

In this experiment, we evaluated the effectiveness of the color enhancement mechanism introduced by the proposed method. The main purpose of introducing the color enhancement mechanism was to enhance visual expressiveness. Since visualization-based detection methods are usually used for rapid screening, many existing visual analysis systems have designed human interaction interfaces, such as the visual analysis system developed by [41] and the MalwareVis system developed by [42]. Therefore, the introduction of this mechanism gives the code images friendlier visual discrimination during manual analysis.
In addition to comparing the visual presentation of the generated images, we also conducted a quantitative analysis to evaluate whether such a mechanism would have an impact on the effect of detection.
In order to avoid interference from other factors and ensure the rigor and continuity of the evaluation, we still used the single-channel image generated from the “.byte” file as the comparative experimental object. In the comparison experiment, the single-channel grayscale image with only the image size adaptation mechanism and the single-channel image with the color enhancement mechanism added on top of it were used as the comparison objects, to verify the influence of the proposed color enhancement mechanism on the malware detection results.
Figure 8 presents a comparison between the code images generated by the two methods. We selected five files each from the malware and benign software for demonstration and comparison. In these two types of file images, the upper row shows the code images generated by the proposed method using both the image size adaptation and color enhancement mechanisms (ISA-CE for short), and the lower row shows the images generated by the proposed method using only the image size adaptation mechanism from Experiment 1 (ISA-Gray for short).
Through the visual presentation of the code images generated by the two methods, it can be seen that the code images after adding the color enhancement mechanism have stronger visual discrimination ability, enhance the contrast of the image in visual expression and are more in line with human visual experience and more suitable for coarse-grained manual analysis.
Moreover, the method can select different color schemes and mark the corresponding bytecode or field according to the requirements. Compared with traditional gray-scale images, it can help analysts understand the structure of malware more intuitively or locate important areas. Therefore, the introduction of color enhancement mechanism is more suitable for visual interaction and analysis of malware than traditional gray-scale images.
We used quantitative methods to evaluate the effect of the color enhancement mechanism in the proposed method on the malware detection results through experimental data. We used two methods to generate code images and feed them into the selected four convolutional neural networks for detection. The detection results are shown in Figure 9.
Figure 9 illustrates the malware detection results of the code images generated by the two different methods under the four classification models. It can be seen from the figure that the code images generated with the color enhancement mechanism added improve the detection accuracy of all four neural network models.
The main reasons are that the color image generated by our proposed color enhancement mechanism contains more spectral information, and the selected Rainbow colormap is good at enhancing the contrast of the boundary values, which strengthens the boundary discrimination of the grayscale image and enhances the texture features of the image, so it can help the deep learning model learn more texture and boundary features.
From Figure 9a,b, it can be seen that the InceptionV3 model shows the greatest improvement in both detection accuracy and F1-score. This is because the model uses the Inception module, which is very sensitive to the color and brightness features of the image. Since the color enhancement mechanism also enhances the texture features, the results of the VGG model are also significantly improved.

4.3.3. Evaluating the Effectiveness of the Multi-Channel Enhancement Mechanism

In this experiment, we evaluated the effect of the multi-channel mechanism in our proposed method. For comparison, we chose the single-channel images that adopt both the image size adaptation and color enhancement mechanisms and the multi-channel images generated by the complete method proposed in this section.
The file selection for the three channels is shown in Figure 5. For each sample, we selected the corresponding binary byte file (.bytes), assembly language metadata file (.asm) and first-order Markov transition matrix of the binary bytes, and used the proposed method to generate the corresponding single-channel code visualization images. We took these three images as the input of the blue, green and red channels, respectively, to form an RGB three-channel image. The code images generated from all the samples formed a multi-channel code image sample set for subsequent detection, which was fed into the four detection models.
Figure 10 illustrates the visualization results of the multi-channel images. Two files were selected from both malware and benign software for demonstration and comparison. The B channel is the corresponding image of the binary byte file, the G channel is the corresponding image of the assembly language metadata file, the R channel is the corresponding image of the binary byte first-order Markov transition matrix file, and the last column is the final generated multi-channel RGB image.
As can be seen from Figure 10, for the same sample file, the code images corresponding to different channels show different color and texture forms, and each channel has its own characteristics. The synthesized multi-channel image integrates the characteristics of the three channel files so that the image can contain richer features, and correspondingly, more file information is retained. Thus, more useful deep features can be provided for the deep neural network model.
In addition, we compared and observed all the malware images generated from the dataset and found that the byte files and assembler files of malware in the same family show a certain similarity, while there are large differences between families. This is the basis on which existing related work achieves a certain accuracy of family classification with a single channel, but it does not always hold. After observing all the samples in the dataset, we found that, even within the same family, there are binary byte files with different color and texture arrangement styles; however, the first-order Markov transition matrix images of these differently styled byte files are very similar. This was an additional pleasant finding, indicating that our proposed visualization method also has a positive effect on improving the accuracy of malware family classification, which is also of great significance for the family traceability step that follows the detection work in this paper.
We also adopted a quantitative method to conduct the experiment. We fed the images generated by the two methods into four detection models for evaluation. The detection results are shown in Figure 11.
It can be seen from the results that, compared with the single-channel code images using only the binary byte files, the detection accuracy and F1-score of the multi-channel code images generated by our proposed method were improved in each of the four detection models, which proves that our proposed multi-channel mechanism integrates more levels and types of sample features. The code image contains more useful information, which is very beneficial to improving malware detection.
Among the four models, Inception-ResNet-V2 network still had the best detection result (99.36%). This is because InceptionResNetv2 uses both the Inception module and the residual module, so it has better feature extraction ability than other models. After the optimization of the multi-channel mechanism, we can clearly see that we narrowed the gap between the models, indicating that the images generated by our proposed method provided richer file features and optimized the final effect of detection from the feature level.
In addition, in this experiment, the detection results and F1-score of the VGG16 network were the most improved, reaching 5.94% and 5.3%, which is still very considerable. This is because the VGG16 model has the simplest structure and a relatively weak ability to extract image features compared with the other three models, so when facing single-channel images, the extracted deep features are limited. However, the VGG16 model is very sensitive to data augmentation, so its improvement is very significant after the images contain richer features, which also indicates that the multi-channel code images generated by our proposed method retain much more information than the single-channel images.
Combining the previous two experiments, we can conclude that the three main optimization mechanisms in our proposed method help to optimize the detection results, which indicates that the retention, standardization, information enhancement and statistics of as much code information as possible are beneficial to the improvement of detection results in the visualization-based malware detection method. By combining multiple features, this method can help the deep learning network discover more potential features and connections of malware.

5. Conclusions and Future Work

In this paper, a multi-channel code visualization method, MC-ISA, is proposed for malware detection. To overcome the limitations of existing single-feature visualization methods and the representation problems of existing multi-channel images, three optimization mechanisms are proposed: image size adaptation, color enhancement and multi-channel enhancement. We evaluated the three optimization mechanisms of the method and achieved encouraging results, which provide ideas for the rapid and automatic detection of large-scale malware and help security personnel screen samples quickly in the early stage of detection.
The work in this paper also has certain limitations. Although our experiments achieved a high level of accuracy, the underlying principle of the detection method actually depends to a certain extent on the similarity between samples. Therefore, when applied to other application scenarios or problems in the future, if the training set is not large enough or the sample coverage is not comprehensive enough, the detection accuracy may be affected when facing a large number of unknown malware. This is also an aspect that needs improvement in the future, which involves further research on how to enhance the generalization of the method and maintain the stability of detection performance when facing different data compositions.
In the future, on the one hand, since the code visualization method proposed in this paper does not assume any specific system architecture for the sample files and does not use information strongly associated with a system architecture, we will continue to explore cross-architecture malware detection methods based on this approach, which will require further optimization. On the other hand, the current network security situation shows that malware targeting IoT devices will become a new challenge in the future. Therefore, we will also conduct further research on lightweight improvements to the detection model, with the aim of providing valuable references for the detection of IoT malware.

Author Contributions

Conceptualization, X.Q. and Y.T.; methodology, X.Q.; software, X.Q. and R.L.; validation, X.Q. and W.L.; formal analysis, Y.T.; investigation, R.L.; data curation, Q.L.; writing—original draft preparation, X.Q.; writing—review and editing, W.L. and Y.T.; visualization, R.L. and Q.L.; supervision, L.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to copyright reasons.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Perdisci, R.; Dagon, D.; Lee, W.; Fogla, P.; Sharif, M.I. Misleading worm signature generators using deliberate noise injection. In Proceedings of the 2006 IEEE Symposium on Security and Privacy (S&P’06), Berkeley/Oakland, CA, USA, 21–24 May 2006; pp. 15–31. [Google Scholar]
  2. Brumley, D.; Newsome, J.; Song, D.X.; Wang, H.; Jha, S. Towards automatic generation of vulnerability-based signatures. In Proceedings of the 2006 IEEE Symposium on Security and Privacy (S&P’06), Berkeley/Oakland, CA, USA, 21–24 May 2006; pp. 15–16. [Google Scholar]
  3. Feng, Y.; Anand, S.; Dillig, I.; Aiken, A. Apposcopy: Semantics-based detection of Android malware through static analysis. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, China, 19–21 November 2014. [Google Scholar]
  4. Christodorescu, M.; Jha, S.; Seshia, S.A.; Song, D.X.; Bryant, R.E. Semantics-aware malware detection. In Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P’05), Oakland, CA, USA, 8–11 May 2005; pp. 32–46. [Google Scholar]
  5. Jia-fu, T.; Zhen-dong, P.; Jun, G.; Shi-xin, L. Combined heuristics for determining order quantity under time-varying demands. J. Syst. Eng. Electron. 2008, 19, 99–111. [Google Scholar] [CrossRef]
  6. Chow, J.; Garfinkel, T.; Chen, P.M. Decoupling Dynamic Program Analysis from Execution in Virtual Environments. In Proceedings of the USENIX Annual Technical Conference, Boston, MA, USA, June 2008. [Google Scholar]
  7. Willems, C.; Holz, T.; Freiling, F.C. Toward Automated Dynamic Malware Analysis Using CWSandbox. IEEE Secur. Priv. 2007, 5, 32–39. [Google Scholar] [CrossRef]
  8. Egele, M.; Krügel, C.; Kirda, E.; Yin, H.; Song, D.X. Dynamic Spyware Analysis. In Proceedings of the USENIX Annual Technical Conference, Santa Clara, CA, USA, 17–22 June 2007. [Google Scholar]
  9. Father, H. Hooking Windows API-Technics of hooking API functions on Windows. CodeBreakers J. 2004, 1, 1–30. [Google Scholar]
  10. Wagner, M.; Aigner, W.; Rind, A.; Dornhackl, H.; Kadletz, K.; Luh, R.; Tavolato, P. Problem characterization and abstraction for visual analytics in behavior-based malware pattern analysis. In Proceedings of the Eleventh Workshop on Visualization for Cyber Security, Paris, France, 10 November 2014; ACM: New York, NY, USA; pp. 9–16.
  11. Conti, G.; Bratus, S.; Shubina, A.; Lichtenberg, A.; Ragsdale, R.; Perez-Alemany, R.; Sangster, B.; Supan, M. A Visual Study of Primitive Binary Fragment Types. 2010. Available online: http://www.rumint.org/gregconti/publications/taxonomy-bh.pdf (accessed on 28 February 2023).
  12. Nataraj, L.; Karthikeyan, S.; Jacob, G.; Manjunath, B. Malware images: Visualization and automatic classification. In Proceedings of the 8th International Symposium on Visualization for Cyber Security, Pittsburgh, PA, USA, 20 July 2011; ACM: New York, NY, USA; p. 4.
  13. Lee, D.H.; Song, I.S.; Kim, K.J.; Jeong, J.H. A Study on Malicious Codes Pattern Analysis Using Visualization. In Proceedings of the International Conference on Information Science and Applications, Jeju, Republic of Korea, 26–29 April 2011; IEEE Computer Society: Washington, DC, USA; pp. 1–5.
  14. Moussas, V.; Andreatos, A. Malware Detection Based on Code Visualization and Two-Level Classification. Information 2021, 12, 118. [Google Scholar] [CrossRef]
  15. Darem, A.; Abawajy, J.; Makkar, A.; Alhashmi, A.; Alanazi, S. Visualization and deep-learning-based malware variant detection using OpCode-level features. Future Gener. Comput. Syst. 2021, 125, 314–323. [Google Scholar] [CrossRef]
  16. Anandhi, V.; Vinod, P.; Menon, V.G. Malware visualization and detection using DenseNets. Pers. Ubiquitous Comput. 2021. [Google Scholar] [CrossRef]
  17. Shao, Y.; Lu, Y.; Wei, D.; Fang, J.; Qin, F.; Chen, B. Malicious Code Classification Method Based on Deep Residual Network and Hybrid Attention Mechanism for Edge Security. Wirel. Commun. Mob. Comput. 2022, 2022, 3301718. [Google Scholar] [CrossRef]
  18. Wang, Z.; Wang, W.; Yang, Y.; Han, Z.; Xu, D.; Su, C. CNN- and GAN-based classification of malicious code families: A code visualization approach. Int. J. Intell. Syst. 2022, 37, 12472–12489. [Google Scholar] [CrossRef]
  19. Han, K.; Lim, J.H.; Im, E.G. Malware analysis method using visualization of binary files. In Proceedings of the 2013 Research in Adaptive and Convergent Systems, Montreal, QC, Canada, 1–4 October 2013; ACM: New York, NY, USA; pp. 317–321.
  20. El-Ghamry, A.; Gaber, T.; Mohammed, K.K.; Hassanien, A.E.; on behalf of the Scientific Research Group. Optimized and Efficient Image-Based IoT Malware Detection Method. Electronics 2023, 12, 708. [Google Scholar] [CrossRef]
  21. Han, K.S.; Lim, J.H.; Kang, B.; Im, E.G. Malware analysis using visualized images and entropy graphs. Int. J. Inf. Secur. 2014, 14, 1–14. [Google Scholar] [CrossRef]
  22. Liu, L.; Wang, B.; Yu, B.; Zhong, Q. Automatic malware classification and new malware detection using machine learning. Front. Inf. Technol. Electron. Eng. 2017, 18, 1336–1347. [Google Scholar] [CrossRef]
  23. Liu, Y.S.; Wang, Z.H.; Yan, H.B.; Hou, Y.R.; Lai, Y.K. Method of anti-confusion texture feature descriptor for malware images. J. Commun. 2018, 39, 44–53. [Google Scholar]
  24. Naeem, H.; Guo, B.; Naeem, M.R.; Ullah, F.; Aldabbas, H.; Javed, M.S. Identification of malicious code variants based on image visualization. Comput. Electr. Eng. 2019, 76, 225–237. [Google Scholar] [CrossRef]
  25. Li, S.J.; Wang, C.; Shi, Y. Malicious code detection based on multi-feature random forest. Comput. Appl. Softw. 2020, 37, 328–333. [Google Scholar]
  26. Wang, R.Z.; Gao, J.; Tong, X.; Yang, M. Research on malicious code family classification combining attention mechanism. J. Front. Comput. Sci. Technol. 2021, 15, 881–892. [Google Scholar]
  27. Ren, Z.; Chen, G. EntropyVis: Malware classification. In Proceedings of the 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, China, 14–16 October 2017; pp. 1–6. [Google Scholar]
  28. Zhao, Y.; Xu, C.; Bo, B.; Feng, Y. MalDeep: A Deep Learning Classification Framework against Malware Variants Based on Texture Visualization. Secur. Commun. Netw. 2019, 2019, 4895984. [Google Scholar] [CrossRef]
  29. Qianfeng, C.; Gongshen, L.; Zhu, X. Visualization Feature and CNN Based Homology Classification of Malicious Code. Chin. J. Electron. 2020, 29, 154–160. [Google Scholar]
  30. Awan, M.J.; Masood, O.A.; Mohammed, M.A.; Yasin, A.; Zain, A.M.; Damaševičius, R.; Abdulkareem, K.H. Image-Based Malware Classification Using VGG19 Network and Spatial Convolutional Attention. Electronics 2021, 10, 2444. [Google Scholar] [CrossRef]
  31. Narayanan, B.N.; Davuluru, V.S.P. Ensemble Malware Classification System Using Deep Neural Networks. Electronics 2020, 9, 721. [Google Scholar] [CrossRef]
  32. Vasan, D.; Alazab, M.; Wassan, S.; Naeem, H.; Safaei, B.; Zheng, Q. IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture. Comput. Netw. 2020, 171, 107138. [Google Scholar] [CrossRef]
  33. Khan, R.U.; Zhang, X.; Kumar, R. Analysis of ResNet and GoogleNet models for malware detection. J. Comput. Virol. Hacking Tech. 2018, 15, 29–37. [Google Scholar] [CrossRef]
  34. Llauradó, D.G.; Mateu, C.; Planes, J.; Vicens, R. Using convolutional neural networks for classification of malware represented as images. J. Comput. Virol. Hacking Tech. 2018, 15, 15–28. [Google Scholar]
  35. Cui, Z.; Xue, F.; Cai, X.; Cao, Y.; Wang, G.; Chen, J. Detection of Malicious Code Variants Based on Deep Learning. IEEE Trans. Ind. Inform. 2018, 14, 3187–3196. [Google Scholar] [CrossRef]
  36. Venkatraman, S.; Alazab, M.; Vinayakumar, R. A hybrid deep learning image based analysis for effective malware detection. J. Inf. Secur. Appl. 2019, 47, 377–389. [Google Scholar]
  37. Falana, O.J.; Sodiya, A.S.; Onashoga, S.A.; Badmus, B.S. Mal-Detect: An intelligent visualization approach for malware detection. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 1968–1983. [Google Scholar] [CrossRef]
  38. Sun, G.; Qian, Q. Deep Learning and Visualization for Identifying Malware Families. IEEE Trans. Dependable Secur. Comput. 2018, 18, 283–295. [Google Scholar] [CrossRef]
  39. Bensaoud, A.; Kalita, J.K. Deep multi-task learning for malware image classification. J. Inf. Secur. Appl. 2022, 64, 103057. [Google Scholar] [CrossRef]
  40. Asam, M.; Khan, D.H.; Jamal, T.; Zahoora, U.; Khan, A. Malware Classification Using Deep Boosted Learning. arXiv 2021, arXiv:2107.04008. [Google Scholar]
  41. Gove, R.J.; Saxe, J.; Gold, S.; Long, A.; Bergamo, G. SEEM: A scalable visualization for comparing multiple large sets of attributes for malware analysis. In Proceedings of the Eleventh Workshop on Visualization for Cyber Security, Paris, France, 10 November 2014. [Google Scholar]
  42. Zhuo, W.; Nadji, Y. MalwareVis: Entity-based visualization of malware network traces. In Proceedings of the Visualization for Computer Security, Seattle, WA, USA, 15 October 2012. [Google Scholar]
Figure 1. Overview of the proposed method.
Figure 2. Pixel value sequence obtained after data preprocessing.
Figure 3. Schematic diagram of bilinear interpolation method.
Figure 4. The selected colormaps.
Figure 5. The process of the MC-ISA visualization method.
Figure 6. Comparison of bytecode gray images generated by two algorithms.
Figure 7. The impact of the image size adaptive mechanism on the detection results. (a) Accuracy, (b) F1-score.
Figure 8. Comparison of bytecode images generated by ISA algorithm with and without color enhancement.
Figure 9. The impact of the color enhancement mechanism on the detection results. (a) Accuracy, (b) F1-score.
Figure 10. Visualization result of the multi-channel code image.
Figure 11. The impact of the multi-channel mechanism on the detection results. (a) Accuracy, (b) F1-score.
Table 1. Summary of recent malware classification approaches based on visualization.

| References | Year | Models | Technique | Image Type | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|---|---|---|
| [22] | 2017 | SNN | Machine learning | Gray | 98.90% | -- | -- | -- |
| [23] | 2018 | KNN, RF | Machine learning | Gray | 94.98% | -- | -- | -- |
| [24] | 2019 | KNN, SVM, NB | Machine learning | Gray | 98.40% | -- | -- | -- |
| [25] | 2020 | RF | Machine learning | Gray | 97.04% | 95.58% | 95.55% | 95.51% |
| [26] | 2021 | KNN, SVM, RF, VGG, Inception, ResNet | Machine learning and deep learning | RGB | 98.38% | 98.49% | 98.61% | 98.55% |
| [27] | 2017 | KNN | Machine learning | Gray | 95.31% | -- | -- | -- |
| [28] | 2019 | CNN | Deep learning | Gray | 92.5% | -- | -- | -- |
| [29] | 2020 | CNN | Deep learning | Gray, CAM | 98.6% | -- | -- | -- |
| [30] | 2021 | CNN | Deep learning | Gray | 97.62% | 97.68% | 97.5% | 97.2% |
| [31] | 2020 | CNN + RNN | Deep learning | Gray | 99.8% | -- | -- | -- |
| [32] | 2020 | CNN | Deep learning | Gray, RGB | 99.5% | -- | -- | -- |
| [33] | 2018 | GoogleNet, ResNet | Deep learning | Gray | 74.5% | -- | -- | -- |
| [34] | 2018 | CNN | Deep learning | Gray | 98.48% | -- | -- | -- |
| [35] | 2018 | CNN | Deep learning | Gray | 94.5% | 94.6% | 94.5% | -- |
| [36] | 2019 | CNN, RNN | Deep learning | Gray | 96.3% | 91.5% | 91.8% | 91.6% |
| [37] | 2022 | CNN, GAN | Deep learning | RGB | 96.77% | -- | -- | -- |
| [38] | 2018 | CNN, RNN | Deep learning | Gray | 99.5% | -- | -- | -- |
| [39] | 2022 | CNN | Deep learning | RGB | 99.87% | -- | -- | -- |
| [40] | 2021 | CNN | Deep learning | Gray | 98.61% | 96% | 96% | 96% |
Table 2. Composition of the dataset in this paper.

| Sample Type | Category | Number | Total |
|---|---|---|---|
| Malware | Locker | 650 | 8870 |
| | Mediyes | 2900 | |
| | Winwebsec | 7582 | |
| | Zbot | 3626 | |
| | Zeroaccess | 1134 | |
| Benign | System software | 1390 | 8590 |
| | Application software | 7200 | |
Table 3. Software and hardware environment configuration of the experiment.

| Software and Hardware | Configuration |
|---|---|
| CPU | Intel(R) Xeon(R) Silver 4214 CPU @ 2.20 GHz |
| GPU | NVIDIA Quadro RTX 5000 |
| Memory | 64.0 GB |
| Operating System | Windows 10 |
Table 4. Experimental design summary.

| No. | Evaluation Objective | Baseline | Our Proposed Method |
|---|---|---|---|
| 1 | Image size adaptive only | B2M | ISA-Gray |
| 2 | Image size adaptive + Color enhancement | ISA-Gray | ISA-CE |
| 3 | Image size adaptive + Color enhancement + Multi-channel enhancement | ISA-CE | MC-ISA |
Table 5. Confusion matrix of the binary classification problem.

| | True Value: Positive | True Value: Negative |
|---|---|---|
| Predicted Value: Positive | TP (True Positive) | FP (False Positive) |
| Predicted Value: Negative | FN (False Negative) | TN (True Negative) |
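For reference, the metrics reported in this paper can be derived from the four cells of this confusion matrix using the standard definitions. The short Python sketch below shows the computation; the counts in the example call are purely illustrative and are not values from our experiments.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics derived from the confusion
    matrix in Table 5 (malware treated as the positive class)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only, not experimental results from this paper:
print(classification_metrics(tp=870, fp=20, fn=17, tn=840))
```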
