Deep Learning in Skin Disease Image Recognition: A Review

The application of deep learning methods to diagnose diseases has become a new research topic in the medical field. Skin disease is one of the most common diseases, and its visual manifestations are more prominent than those of other types of disease. Accordingly, the use of deep learning methods for skin disease image recognition is of great significance and has attracted the attention of researchers. In this study, we review 45 research efforts on the identification of skin disease using deep learning technology since 2016. We analyze these studies in terms of disease type, dataset, data processing technology, data augmentation technology, model for skin disease image recognition, deep learning framework, evaluation indicators, and model performance. Moreover, we summarize the traditional and machine learning-based skin disease diagnosis and treatment methods. We also analyze the current progress in this field and predict four directions that may become research topics in the future. Our results show that skin disease image recognition methods based on deep learning perform better than dermatologists and other computer-aided treatment methods in skin disease diagnosis; in particular, the fusion of multiple deep learning models achieves the best recognition performance.


I. INTRODUCTION
The skin, the largest organ of the human body, is an important barrier. The main function of the skin is to protect the human body from harmful substances from the outside world and prevent the outflow of various nutrients from the human body [1]. In daily life and work, the health of the skin is affected by many factors, such as solar radiation, smoking, drinking, sports activities, viruses, and the working environment [2]. These factors not only affect the integrity of skin function but also cause damage to the skin, have an adverse effect on human health, and can even threaten human life in severe cases. Therefore, skin disease has become one of the common diseases of human beings. Skin disease occurs in all regions and cultures and at all ages.
Approximately 30% to 70% of people are in high-risk groups [3]. According to the British Skin Foundation report in 2018, approximately 60% of British people suffer from skin disease [4]; 5.4 million new cases of skin cancer are recorded in the United States every year; one in five Americans will be diagnosed with a cutaneous malignancy in their lifetime [43]. Skin disease not only has a significant impact on human beings, such as disruption of daily activities, loss of interpersonal relationships, and internal organ damage, but can also cause death. It can also contribute to mental illness, leading to isolation, depression, and even suicide [5]. Accordingly, skin disease has become one of the major topics in the field of medicine. Figure 1 shows global search interest in skin disease.
The associate editor coordinating the review of this manuscript and approving it for publication was Yudong Zhang.
In the treatment of skin disease, early detection is critical to curing the disease, effectively reducing its impact, and improving the survival rate. Take melanoma as an example. In recent years, malignant neoplasms in skin diseases have increased significantly. Malignant melanoma (the deadliest type) is responsible for 10,000 deaths annually in the United States alone [6]. Melanoma is a highly lethal but not incurable disease. If abnormal proliferation of skin melanocytes is detected at an early stage, then the survival rate is 96%; if it is detected at a late stage, then the survival rate falls to only 5% [7]. Therefore, early diagnosis and treatment of skin disease can minimize the damage caused by skin disease. However, the accuracy of skin disease recognition is not ideal because of the similarity between different skin diseases and the limited number of dermatologists with professional knowledge. The identification of skin disease has become a serious scientific challenge.
FIGURE 1. Global Skin Disease Thermal Map. Panel (a) is a global trend chart of search interest in skin disease, with the year on the horizontal axis. The chart shows that global search interest in skin disease has increased year by year since 2011 and reached its peak in 2020. Panel (b) is a global heat map of skin disease search interest by region, starting from 2011. The map shows that skin disease is a concern in all parts of the world except Greenland in the Arctic Circle, and search interest is high near the equator. The country with the highest search interest in skin disease is France, followed by Monaco and Japan.
To address the issue of skin disease diagnosis and treatment, computer-aided diagnosis was used early on for automatic skin disease recognition based on skin disease images [26]. With the rapid development of artificial intelligence technology, deep learning has advanced quickly in computer vision. Medical image processing of skin disease has become an essential component of, and has received great attention in, the intersection of image processing, machine science, and intelligent medicine. Many experts and scholars have been engaged in the image recognition of skin disease. The recent article published by Dick et al. is a good starting point. This article lists in detail the relevant articles on the diagnosis of melanoma using deep learning [8].
This study investigates the research status of skin disease recognition in recent years, summarizes the datasets used by researchers, and analyzes the studies in terms of image preprocessing, data augmentation, deep learning models, frameworks, and performance indicators. On the one hand, this study provides dermatologists with a reference on deep learning methods. On the other hand, this study helps researchers to quickly and accurately retrieve the literature related to dermatological image recognition. This survey is motivated by the rapidly developing artificial intelligence-based diagnosis technology in the medical field, which has become increasingly popular among researchers. The application of artificial intelligence in other fields has shown its great potential. The fact that at least 45 studies have used deep learning to address skin disease identification and have achieved promising results encouraged the authors to prepare this survey.
This study mainly summarizes the research and application progress of skin disease image recognition based on deep learning. Section II briefly introduces the methods and status of skin disease recognition. Section III introduces the development history of skin disease detection. Section IV focuses on the research progress of skin disease image recognition based on deep learning. Section V summarizes the full text and discusses future research trends in skin disease image recognition based on deep learning.

II. METHOD
In recent years, deep learning has received great attention in skin disease recognition, and research achievements have increased. This study summarizes the relevant literature in the field of skin disease identification from 2016 to 2020. The distribution of the selected papers is shown in Figure 3. The three main steps in analyzing the literature in this field are as follows: (a) use hierarchical search strategies to retrieve and collect relevant literature from each database, (b) conduct a detailed review and analysis of the collected literature, and (c) perform a statistical analysis of the relevant data.
The first step is to search for conference papers or journal articles from CNKI and the scientific databases IEEE Xplore, ScienceDirect, Web of Science, Google Scholar, PubMed, arXiv, and Medline by using hierarchical search strategies. The strategy is as follows. First, the year limit is set, and the search year is from 2016 to the present; then, the first-level keywords are used for the subject search in each database. The first-level keywords are as follows: ["Artificial intelligence" and "skin disease recognition"] or ["deep learning" and "skin disease recognition"] or ["deep learning" and "skin lesion"].
Approximately 2431 documents were retrieved through the first-level keyword search; then, the second-level keywords were used to search within the documents obtained at the first level. The purpose of the second-level search was to add finer granularity (such as melanoma, acne, and pigment lesions) and thereby fill gaps left by the first-level search. In the field of deep learning image recognition, the mainstream method is to use convolutional neural networks (CNNs). Accordingly, the second level uses CNNs as keywords to narrow the search range. The second-level keywords are as follows: ["Convolutional Neural Network" and "Melanoma Recognition"] or ["Convolutional Neural Network" and "Acne Classification"] or ["Convolutional Neural Network" and "Pigmented Skin Disease Classification"].
In this way, 312 papers that are related to deep learning and suitable for the field of skin disease recognition are screened out. These papers are then sorted by "relevance" and "time". We quickly browse the abstract, introduction, and conclusions of the downloaded and cited documents and exclude non-deep-learning and non-dermatological-identification documents. We also check the literature cited in each paper for further studies on skin disease recognition and, if such studies exist, we search for them. During the browsing, if an expert or scholar is found to have unique insights in the field, then we search the literature directly by the author's name. Finally, 45 papers were retrieved through the hierarchical search strategy. The search strategy process is shown in Figure 2.
In the second step, the 45 papers selected in the previous step are analyzed one by one. The following research questions are considered: (1) Which categories of skin disease have been identified in the study? (2) What are the data sources and types used? Are the datasets public, and do they consist of dermoscopic images? (3) Is the data preprocessed or augmented? (4) What are the models used in deep learning, and what frameworks are chosen to build them? (5) What are the innovations of the paper and the main improvements of the model? Is model fusion used? (6) What are the performance indicators used, and how is the overall performance? Did the researchers test the performance of the models on different datasets?
The third step is to sort out the analyzed information, summarize the collected datasets, and classify the standard preprocessing techniques and deep learning neural network models in the literature. The network models are divided into single and multiple models, and the performance indicators used in the articles are summarized. The main findings are discussed in detail in Section IV. Twenty-one studies were published in 2018 and 2020, thereby accounting for more than 86.6%. In terms of content, 91% (41 articles), 7% (3 articles), and 2% (1 article) of the studies involved skin disease identification, data generation, and interpretability of skin disease identification, respectively.

III. DEVELOPMENT OF SKIN DISEASE DIAGNOSIS TECHNOLOGY
A. TRADITIONAL MEDICAL DIAGNOSTIC PROCESS OF SKIN DISEASE
In the traditional medical diagnosis of skin disease, the doctor distinguishes the status of the patient's skin lesions according to his or her knowledge and experience or the characteristics and patterns presented by dermoscopic images. The diagnostic process can be summarized as follows: first, the doctor performs a visual examination to obtain the necessary information about the patient, followed by dermoscopy and histopathological examination. Dermoscopy is a noninvasive, high-definition skin imaging technology that can observe the skin structure at the junction of the lower epidermis and the superficial dermis [23]. Doctors analyze the nature, distribution, arrangement, color, edge and boundary, shape, and appearance of pigmented skin lesions according to dermoscopic assessment methods such as the seven-point checklist [12], the ABCD rule [13], Chaos and Clues [16], the three-point checklist [14], and CASH (color, architecture, symmetry, and homogeneity) [15]. However, only experienced dermatologists can accurately identify the pathological features of pigmented skin disease because skin lesions are similar in color, texture, edge contour, and other features, and pathological tissues differ between patients. Moreover, this method of relying on experience for diagnosis is far from meeting patients' needs for medical resources. The process from a sampling test to the doctor's diagnosis, the histopathological examination, and then the patient's report generally takes 4 to 5 days. This process requires a large amount of time and delays the patient's treatment.
In summary, the traditional dermatological diagnosis has the following shortcomings. First, medical resources are lacking. Dermatologists with professional skin knowledge are limited in number. The mismatch between the growth in the number of dermatologists and the incidence of skin disease has resulted in many patients and few professional dermatologists. Second, the accuracy of diagnosis is low. Dermatologists with professional knowledge have different work experience and may give varying diagnoses for the same patient owing to subjective judgment. Even the same doctor, affected by light, fatigue, and other factors, may give different diagnostic results for the same skin disease image. Third, skin disease images are complex owing to the characteristics of skin disease: differences between categories are small, and differences within categories are large. Accordingly, misdiagnosis and missed diagnosis are likely, which reduces the diagnostic accuracy.

B. SKIN DISEASE IMAGE RECOGNITION BASED ON MACHINE LEARNING
With the development of machine learning, image recognition technology for skin disease based on machine learning was established to address the shortcomings of the traditional skin disease diagnostic process. Image recognition based on machine learning is an interdisciplinary field that integrates medical skin disease imaging, mathematical modeling, and computer technology, and it completes the recognition and diagnosis of skin disease through feature engineering and machine learning classification algorithms. The flow chart is shown in Figure 4.
The earliest study on the automatic classification of skin disease dates back to 1987 [9]. In 2007, Stanley et al. extracted the color characteristics of melanomas, established color histograms, and classified them [10], [11]. In 2012, Rahil et al. used wavelet decomposition to derive texture features, modeled and analyzed lesion boundary sequences to derive boundary features, and used shape indicators to derive geometric features. Finally, four classifiers, namely, Support Vector Machine, Random Forest, Logistic Model Tree, and Hidden Naive Bayes, were used for classification [17]. In 2013, Ballerini et al. proposed a hierarchical K-NN classifier algorithm for melanoma skin disease based on color and texture, which combines three classifiers hierarchically and uses feature selection to adjust each classifier's feature set to suit its task. The recognition accuracy is more than 70% [18]. In the same year, Ning et al. applied three different machine learning algorithms, namely ID3 [19], classification and regression tree [20], and AdaBoost, to the extracted features and compared their performance [21]. AdaBoost performed well among these algorithms [22]. On 400 skin images collected with laser confocal scanning microscopy, the recognition accuracy was 94.75%, the specificity was 93%, and the sensitivity was 96.5%.
The image recognition method for skin disease based on machine learning extracts the features of skin disease with manually designed extractors and classifies them with traditional machine learning methods. This method requires considerable professional medical knowledge to conduct in-depth exploratory data analysis and dimensionality reduction. Results are obtained only after complex parameter tuning, which requires a large amount of time and energy. This method also has low portability because the feature engineering is effective only within the same field. These shortcomings limit the development of machine learning in skin disease recognition. With its advancement, deep learning has achieved good results in image recognition. Compared with traditional image recognition methods, the deep learning method can automatically mine the deep-seated nonlinear relationships in medical images and does not need feature engineering. Its feature extraction is also efficient. Deep learning is adaptable and easy to transfer, and the technology can be more easily adapted to different fields and applications.

IV. SKIN DISEASE IMAGE RECOGNITION BASED ON DEEP LEARNING
A. APPLIED SKIN DISEASE FIELD
The main application of deep learning in skin disease recognition is skin disease classification, that is, quantitative features of lesion tissues are extracted from skin disease images, and the class is then analyzed and judged. This is the mainstream application direction of skin disease recognition, and the main types of skin disease studied are benign and malignant neoplasms. Benign neoplasms are a type of skin disease with a gradually increasing incidence; the differences between lesions are small, and the recognition accuracy is low. Benign neoplasms commonly used for research include nevus (23 articles) and seborrheic keratosis (17 articles).
Malignant neoplasms are another type of skin disease widely used in research. Malignant neoplasms are cellular dysplasia diseases that occur in the skin and are life-threatening because of constant proliferation and metastasis. Malignant tumor identification is exceptionally significant in skin disease identification owing to the high mortality rate of malignant neoplasms. The malignant neoplasms commonly used in research are basal cell carcinoma (13 articles), squamous cell carcinoma (seven articles), and malignant melanoma. Among the retrieved literature, melanoma recognition accounts for the largest number of studies (34). However, studies on non-neoplastic skin diseases are scarce, and only three articles on deep learning-based identification of eczema and psoriasis [46], [50], [73] are collected in this study.

B. DATA SOURCES
Deep learning requires a large amount of data to extract features during training. However, large-scale image data of skin disease are difficult to obtain for several reasons: images of skin disease involve patients' privacy, skin diseases are highly varied, and some diseases are rare. Moreover, skin disease images need to be labeled by experts with appropriate medical knowledge because lesion manifestations are similar among various skin diseases, which limits the size of the skin disease datasets that are publicly available in academia. Currently, skin disease datasets are mainly divided into self-collected and public datasets. Few self-collected datasets are publicly available. Most published dermatological datasets are image data obtained by dermoscopic imaging and collected from dermatological image databases [30], [32]-[34], [40], [96]. Some datasets are also collected by universities in collaboration with renowned hospitals [25], [26], [29], [37], [61]. The pathological sections of basal cell carcinoma and seborrheic keratosis studied by Meifeng et al. were obtained from the Second Affiliated Hospital of Xi'an Jiaotong University [61]. The HAM10000 dataset consists of dermoscopic images collected from the Dermatology Department of Vienna Medical University in Austria and the dermatology practice of Cliff Rosendahl in Queensland, Australia [29].

C. IMAGE PREPROCESSING
In deep learning image recognition, a deep learning model has high requirements for image quality because good image quality can improve the model's generalization ability [107]. Image preprocessing is carried out before model training. Its primary purpose is to eliminate irrelevant information in the image, enhance the detectability of useful and related information, and simplify the data to the greatest extent to improve the model's feature extraction ability and recognition reliability. Among the collected literature, 28 studies apply image preprocessing, which is divided into data cleaning and data conversion. Table 3 shows the details.

1) DATA CLEANING
Data cleaning mainly ensures the integrity of the data features of skin disease images. In skin disease image recognition, the commonly used data cleaning approach is to remove noise to reduce the influence of hair and shadow on skin disease recognition. The quality of an image is affected by the nature of the skin, the environment, the equipment, and the lighting conditions. A low-quality image affects the recognition result, reducing accuracy and increasing computational cost. The commonly used denoising algorithms include spatial domain filtering, transform domain filtering, and partial differential equation methods. In the selected literature, four papers carried out noise removal on the selected datasets. Hameed [57]. Xiaoyu added noise to skin disease images to study the influence of image noise on skin disease recognition [38].
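As an illustration of spatial domain filtering, the following minimal sketch applies a 3 × 3 median filter to a grayscale image stored as a nested list. The function name `median_filter_3x3`, the toy image, and the border handling (border pixels are copied unchanged) are assumptions made for this example; it is not the procedure of any particular reviewed paper.

```python
from statistics import median


def median_filter_3x3(image):
    """Return a copy of a 2-D grayscale image with a 3x3 median filter applied.

    `image` is a list of rows of numbers. Border pixels are copied unchanged
    (an assumption made to keep the sketch short).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the 3x3 neighbourhood centred on (y, x).
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median(window)
    return result


# Hypothetical flat region with one bright outlier (e.g. a reflection).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

The median is used here because a single extreme pixel does not shift it, whereas a mean filter would smear the outlier over its neighbours.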

2) DATA CONVERSION
The purpose of data conversion is to transform data from one format or structure to another according to the requirements of a deep learning model. In the collected literature, the commonly used data conversion technologies are size adjustment (18 articles), normalization (10 articles), and grayscale conversion (three articles). Table 3 shows the details.

a: SIZE ADJUSTMENT
The input of a deep learning network model is constrained, and the size of its input image is mostly fixed. If the input size of the network model is large, then the level of abstraction of the information cannot meet the needs of the network, and the amount of calculation increases; if the input size is small, then image information is lost. Hence, network models in deep learning mostly use a fixed input size of 224 × 224 or 227 × 227. Seventeen papers carried out size adjustment, among which four papers did not describe the method used, while the others mainly utilized scaling (seven articles) and clipping (eight articles). Mahbod used zooming and clipping [52], while Mahbod and Zhang used bilinear interpolation for scaling [47], [49].
VOLUME 8, 2020
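The scaling and clipping operations mentioned in this subsection can be sketched as follows. This is a minimal pure-Python illustration on a nested-list grayscale image; the function names `center_crop` and `resize_bilinear` and the corner-aligned sampling convention are assumptions for the example, and the reviewed papers may use different conventions or library routines.

```python
def center_crop(image, size):
    """Crop a size x size square from the middle of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def resize_bilinear(image, out_h, out_w):
    """Resize a 2-D grayscale image with bilinear interpolation.

    Corner pixels of the input map to corner pixels of the output
    (a convention chosen here for simplicity).
    """
    in_h, in_w = len(image), len(image[0])
    scale_y = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    scale_x = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    output = []
    for i in range(out_h):
        y = i * scale_y
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * scale_x
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        output.append(row)
    return output


# Hypothetical 2 x 2 image enlarged to 3 x 3, then cropped to its centre pixel.
small = [[0.0, 10.0],
         [20.0, 30.0]]
upscaled = resize_bilinear(small, 3, 3)
cropped = center_crop(upscaled, 1)
```

In practice the same two functions would be called with a target such as 224 × 224; the tiny sizes here only keep the example easy to check by hand.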

b: NORMALIZATION
In the selected literature, 11 papers normalized the images. Normalization converts the feature values of the samples to the same scale by mapping the data to the interval [0,1] or [−1,1]; the mapping is determined only by the extreme values of the variables. This method aims to limit the preprocessed data to a specific range and thereby eliminate the adverse effects of singular sample data.
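A mapping that depends only on the extreme values, as described above, corresponds to min-max normalization, x' = low + (x − min)(high − low)/(max − min). The sketch below is one possible implementation; the function name `min_max_normalize` and the sample pixel values are hypothetical.

```python
def min_max_normalize(values, low=0.0, high=1.0):
    """Linearly map `values` onto [low, high] using their minimum and maximum."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant input has no spread; map everything to the lower bound.
        return [low for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


pixels = [0, 51, 102, 255]                         # hypothetical 8-bit intensities
unit_range = min_max_normalize(pixels)             # mapped onto [0, 1]
symmetric = min_max_normalize(pixels, -1.0, 1.0)   # mapped onto [-1, 1]
```

Because the minimum and maximum alone set the scale, a single extreme pixel compresses all other values, which is one reason some pipelines clip outliers before normalizing.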

c: GRAYSCALE CONVERSION
Grayscale conversion operates on individual pixels of an image; its primary purpose is to improve the contrast of the image and to support threshold processing. In the collected literature, Mahbod et al. [47], Singhal et al. [56], and Hasan et al. [69] used grayscale transformation to process skin disease images.
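As a rough illustration of a pixel-wise grayscale conversion followed by thresholding, the sketch below uses the common ITU-R BT.601 luma weights. The weights, the cutoff of 128, the pixel values, and the function names are assumptions for this example and are not taken from the cited papers.

```python
def to_grayscale(rgb_image):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale.

    Uses the ITU-R BT.601 luma weights; other weightings are possible.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def threshold(gray_image, cutoff):
    """Return a binary mask: 1 where the pixel is darker than `cutoff`, else 0."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray_image]


# Hypothetical pixels: light surrounding skin on the left, dark lesion on the right.
rgb = [
    [(230, 200, 190), (60, 40, 30)],
    [(225, 195, 185), (70, 45, 35)],
]
gray = to_grayscale(rgb)
lesion_mask = threshold(gray, 128)
```

Each output pixel depends only on the corresponding input pixel, which is what makes this a point operation.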

D. IMAGE DATA AUGMENTATION
Skin disease data are difficult to collect because of the personal privacy issues and professional equipment involved in the collection of medical skin disease datasets. Accordingly, little skin disease data has been collected. The rarity of some diseases means that even less data is collected for those categories, resulting in an uneven distribution of the collected datasets. In deep learning, small-scale datasets can easily lead to insufficient model learning and overfitting. To address the problem of small skin disease datasets and improve the network model's generalization ability, researchers use data augmentation technology to expand the amount of training data. Data augmentation uses existing data to create new data under the guidance of task objectives. Traditional image data augmentation expands the dataset by applying geometric transformations and image operations to the original data without changing the data label. The leading techniques are rotation, mirroring, adding noise, and dimension reduction. The newer data augmentation technology produces simulated data on the basis of the original data by using generative adversarial networks (GANs) [62]. By learning the internal distribution of the images, a GAN is not limited to within-class information but can also use information between categories to synthesize images.
Among the 45 papers discussed in this study, 17 papers used data augmentation, 14 of which used the traditional data augmentation technology to expand the dataset.
Eleven papers used the rotation method, and the commonly used rotation angles were 90°, 180°, and 270°. Given that the generative adversarial network (GAN) is a new technology, only three papers (Bissoto et al. [65], Albarqouni et al. [67], and Xianzhen et al. [68]) that generate simulated skin disease images were collected. Among these papers, Bissoto et al. and Albarqouni et al. used a GAN to generate data from the ISIC 2017 dataset. See Table 4 for details. Although traditional data augmentation increases the number of images, the diversity of the data is not improved because the characteristics of the images are not learned; the result is just a simple copy transformation. In contrast, the generative model approach generates images by learning the features of images. The generative model has stronger feature learning and expression abilities than traditional image augmentation. The generated image is different from the original image, and the image quality is also higher. This method can effectively address the imbalance in the amount of data across skin disease categories, and the diversity of the generated images is dramatically improved.
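The traditional, label-preserving transformations discussed above (rotation by 90°, 180°, and 270°, and mirroring) can be sketched in a few lines. The function names and the 2 × 2 toy image are hypothetical; the GAN-based approach is not shown.

```python
def rotate_90_clockwise(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def mirror_horizontal(image):
    """Flip a 2-D image left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the original image, its 90/180/270 degree rotations, and its mirror."""
    rot90 = rotate_90_clockwise(image)
    rot180 = rotate_90_clockwise(rot90)
    rot270 = rotate_90_clockwise(rot180)
    return [image, rot90, rot180, rot270, mirror_horizontal(image)]


sample = [[1, 2],
          [3, 4]]
variants = augment(sample)   # five images that all keep the label of `sample`
```

Every variant is a rearrangement of the same pixels, which illustrates why such transformations enlarge the dataset without adding new lesion appearances.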

E. MODEL AND FRAMEWORK
The advantage of deep learning over traditional machine learning is that significant feature representations are automatically learned from raw data, such as image pixels, so that traditional feature engineering is no longer required. Currently, the primary method of skin disease image recognition is to use a CNN; the convolution and pooling operations of the network give it a degree of translation, rotation, and scale invariance in image recognition. The CNN is therefore highly effective at feature representation. All the literature collected in this study is based on the CNN model.
From the number of network models used by researchers, the collected model methods can be classified into two types: the first one is a single model method based on deep learning; the other type is a multimodel method based on deep learning. Twenty-seven papers focused on a single model method and 10 papers on the multimodel method.

2) SKIN DISEASE IMAGE RECOGNITION BASED ON MULTIMODEL FUSION
Multimodel fusion learns through multiple learners and integrates their results by using certain rules. In conventional multimodel fusion, the averaging method, weighting method, linear regression method, simple majority voting (SMV) rule, or maximum probability rule is used to fuse and recalculate the probability that each model outputs for a given class, and the fused result is taken as the final output. The other way is to fuse the features learned by multiple models and then output the final result from the fused features. The flow chart of these two fusion methods is shown in Figure 5. Fourteen papers on multimodel fusion are collected in this study, and the details are shown in Table 6. The main methods are multi-input model fusion (two articles), multimodel feature extraction with SVM classification (two articles), fusion strategies combined with clinical information (four articles), the combination of human and artificial intelligence (two articles), and conventional multiple model fusion (four articles). AlexNet, VGG, Inception, and ResNet are often used for fusion.
A large number of experiments have shown that the multimodel fusion method is better than the single model method [59], [61] in the field of skin disease recognition. Because a single model is limited by objective factors, it commonly encounters a generalization bottleneck when dealing with problems. Multimodel fusion can combine excellent models and integrate each model's advantages to break through the bottleneck of the generalization ability of a single model.
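Two of the conventional decision-level fusion rules named in this subsection, averaging and simple majority voting, can be sketched as follows. The three probability vectors stand for the outputs of three hypothetical CNNs and are invented for illustration; feature-level fusion is not shown.

```python
from collections import Counter


def average_fusion(probabilities):
    """Average per-class probabilities from several models and pick the best class.

    `probabilities` holds one list of class probabilities per model.
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    fused = [sum(model[c] for model in probabilities) / n_models
             for c in range(n_classes)]
    return fused.index(max(fused)), fused


def majority_vote(probabilities):
    """Simple majority voting: each model votes for its most probable class."""
    votes = [model.index(max(model)) for model in probabilities]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical outputs of three models for the classes [nevus, melanoma].
outputs = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.05, 0.95],
]
avg_class, avg_probs = average_fusion(outputs)
vote_class = majority_vote(outputs)
```

In this toy case the two rules disagree: averaging selects class 1 because one model is very confident, while voting selects class 0 because two of the three models weakly prefer it. The choice of fusion rule can therefore change the final diagnosis.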

3) DEEP LEARNING FRAMEWORK
Among the collected literature, only 16 papers reported the deep learning framework used, including Keras [92], TensorFlow [93], MATLAB [94], and PyTorch [95]. Keras has become a popular framework owing to its consistent and concise API, which can significantly reduce the workload of users. Six articles used this framework. TensorFlow is often used with Keras. It can deploy trained models on various servers and mobile devices without executing a separate model decoder or loading a Python interpreter. PyTorch is a deep learning framework released by Facebook AI Research in 2017. This framework has the advantages of flexibility, ease of use, and speed. PyTorch is a relative newcomer among deep learning frameworks. Five of the collected articles use it. MATLAB has a long history of development and has advantages in annotating image, video, and audio data with ground-truth labels. Five papers use MATLAB. The details are shown in Table 7.

F. EVALUATING INDICATOR
To evaluate the performance of deep learning in skin disease image recognition, several performance indicators are used: accuracy (ACC) represents the percentage of correct predictions in the total sample [99]; mean average precision (mAP) represents the mean of the average precision over all categories [100]; true positive rate (TPR), also known as sensitivity and recall (R) [101], represents the probability that an actual positive sample is predicted to be positive [102]; false positive rate (FPR) refers to the percentage of samples that are actually disease-free but judged to be diseased; true negative rate (TNR) [103], also known as specificity, is the percentage of actually disease-free samples that are correctly judged to be disease-free; the area under the receiver operating characteristic (ROC) curve (AUC) can be interpreted as the probability that, given a randomly chosen positive sample and a randomly chosen negative sample, the classifier scores the positive sample higher than the negative sample; the ROC curve is the receiver operating characteristic curve, which shows the performance of a classification model under all classification thresholds [104]. The specific performance indicators are shown in Table 8.
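The indicators defined above can be computed directly from confusion-matrix counts and classifier scores. The sketch below is a minimal illustration; the function names, the counts (a hypothetical test set of 200 diseased and 200 disease-free images), and the scores are assumptions, and mAP is omitted for brevity.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity (TPR), specificity (TNR), and FPR."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
    }


def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    correct = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positive_scores) * len(negative_scores))


# Hypothetical counts: 200 diseased images (193 detected), 200 disease-free (186 cleared).
metrics = confusion_metrics(tp=193, fp=14, tn=186, fn=7)
# Hypothetical classifier scores for three diseased and three disease-free images.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

With these counts the sensitivity is 96.5%, the specificity is 93%, and the accuracy is 94.75%; FPR is simply one minus the specificity. The pairwise AUC computation is quadratic in the number of samples and is meant only to mirror the probabilistic definition, not to replace a library routine.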

G. MODEL PERFORMANCE AND ANALYSIS
Any two papers are difficult, or even impossible, to compare directly because the datasets, types of skin disease identified, models, parameters, and performance indicators used by researchers in various tasks differ. Therefore, readers should consider the comments in this section with care.
Among the 29 papers with accuracy indicators, four papers have an accuracy rate of less than 70%, 15 papers have an accuracy rate of 80% to 89%, and the other nine papers have an accuracy rate of more than 90%. This finding shows the good generalization ability of deep learning, and the recognition accuracy of the model built by Sarkar et al. is as high as 99.5% [57]. Among the 24 papers that use the AUC as a measure of performance, nine papers have an AUC of more than 90%, and the highest area under the ROC curve reaches 97.5% [51]. Eleven articles have an AUC in the range of 80% to 89%, and only four articles are below 80%. In terms of sensitivity, 28 papers use it as an evaluation indicator: 10 papers have a sensitivity of more than 90%, six papers have a sensitivity of 80% to 89%, and 12 papers have a sensitivity lower than 80%. Twenty-four papers use specificity as an indicator: 11 papers have a specificity higher than 90%, five papers are between 80% and 89%, and the other eight papers are less than 80%. See Table 9 for details. This paper analyzes the collected literature and concludes that deep learning provides excellent performance in most related work. The deep learning-based methods used in the 45 papers are effectively and correctly compared.
FIGURE 5. Two fusion methods of multimodel fusion. Panel A is the conventional multimodel fusion method. After the image is inputted, the features are extracted by multiple CNNs and then categorized by the classifier to obtain the probability value of each category for the sample image. Finally, the classification probability values are fused by the fusion module to output the category with the maximum probability value. Panel B is the multimodel fusion based on feature fusion, which mainly includes the following steps: after the CNNs extract the features, the extracted features are fused, and then the probability value of each category is outputted by the classifier. The output with the maximum probability value is selected.
The papers are difficult to summarize because the datasets, preprocessing techniques, indicators, models, and parameters involved in each article are different. Accordingly, the comparison in this study is strictly limited to the techniques used in each paper. Within these constraints, we observed that deep learning performs better than traditional methods and that the automatic feature extraction of deep learning models is more efficient than traditional feature extraction (such as color [105], histogram [106], statistics, and texture [107]). Hang et al. used deep learning to extract the feature vectors of skin disease and classified them by using an SVM as the classifier [64]. The AUC of this deep learning feature extraction method is 6.17% higher than that of traditional manual feature extraction methods. Multimodel fusion also performs better than single-model skin disease image recognition. Harangi et al. and Xu Meifeng et al. compared the performance of multimodel fusion with that of a single model in their respective experiments [59], [61]. The results show that the performance of multimodel fusion is better than that of a single network model. The multimodel approach combines the advantages of the individual models to obtain the optimal solution. Another fusion approach is to incorporate the patient's clinical information (e.g., gender, age, and location of lesions). Hagerty et al. used this method to incorporate the patient's information alongside the image recognition of skin disease [31]. The area under the ROC curve increased from 83% for deep learning only to 94%.
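The two fusion schemes described in the caption of Figure 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in any reviewed paper: the function names are hypothetical, the fusion module in the decision-level variant is assumed to be a simple average of class probabilities, and the classifier in the feature-level variant is assumed to be a single linear layer followed by softmax.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decision_fusion(prob_lists):
    """Decision-level fusion: average per-model class probabilities.

    prob_lists holds one probability vector per model. Averaging is an
    assumed fusion rule; weighted voting would be another option.
    Returns (predicted class index, fused probabilities).
    """
    n_models = len(prob_lists)
    fused = [sum(col) / n_models for col in zip(*prob_lists)]
    return fused.index(max(fused)), fused


def feature_fusion(feature_lists, weights, biases):
    """Feature-level fusion: concatenate per-model features, classify once.

    weights has one row per class over the concatenated feature vector;
    the linear classifier is a stand-in for whatever classifier is used.
    Returns (predicted class index, class probabilities).
    """
    fused_features = [x for feats in feature_lists for x in feats]
    logits = [
        sum(w * x for w, x in zip(row, fused_features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Hypothetical outputs of three CNNs for one image and three classes.
label_a, fused_probs = decision_fusion(
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]]
)

# Hypothetical features from two CNNs and a two-class linear classifier.
label_b, class_probs = feature_fusion(
    [[1.0, 0.0], [0.5]],
    weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 3.0]],
    biases=[0.0, 0.0],
)
print(label_a, label_b)
```

In the decision-level example, the first model alone would choose class 0, but the averaged probabilities favor class 1, which illustrates how fusion can override the error of a single model.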

V. SUMMARY AND PROSPECT

A. SUMMARY
In this article, we outline the development of skin disease diagnostic technology, describe the process of traditional medical diagnosis and of machine learning-based image recognition of skin disease, and survey the results of deep learning-based research in the field of skin disease recognition. We identified 45 relevant papers and examined the application areas and skin disease types that they address. On this basis, we studied the data sources used, the preprocessing and data augmentation techniques, the technical details of the models, and the overall performance according to the reported indicators. The deep learning models AlexNet, VGG, GoogleNet, and ResNet are widely used in skin disease recognition. Researchers often use multimodel fusion to improve model performance. In future work, we plan to apply the concepts and best practices of deep learning described in this survey to other medical fields that have not fully utilized this technology. This survey aims to encourage more researchers to conduct deep learning experiments and to apply deep learning models to computer vision problems in medicine, thereby promoting smart and convenient development of the medical industry.

B. PROSPECT
Deep learning, a new field of machine learning, has good application prospects in computer vision and provides a new direction for medical image recognition. Currently, deep learning has achieved empirical progress in the field of medical image recognition [97]. However, deep learning is still in its infancy in medical image recognition. Additional developments and comprehensive attempts should be carried out in the field of skin disease recognition in the future. The following summarizes four possible directions for research development:

1) SIMULATION DATA GENERATION OF SKIN DISEASE IMAGE
Deep learning has good generalization ability in skin disease recognition; however, it requires a large amount of data in the learning process. Too little data may cause insufficient feature extraction, which will affect the diagnosis and recognition of lesions. One of the significant problems in skin disease image recognition is the difficulty of obtaining large data sets, owing to the complications of medical image collection and the rare occurrence of individual diseases. Generating simulated images of skin disease with a generative adversarial network is a promising way to address this problem. Such an approach can also improve the performance of the network. This technology will have good development prospects in the field of skin disease recognition in the future.

2) CLINICAL IDENTIFICATION OF SKIN DISEASE
At present, the types of images recognized in the field of skin disease recognition are relatively limited. Most images are produced by dermoscopic imaging technology, and few studies have addressed pictures taken in clinical settings or in real life. In such pictures, the lesion is difficult to locate because of problems such as low resolution and shadows, and because the lesion area occupies only a small part of the image or is incomplete. This aspect needs further study. In clinical recognition models, attention mechanisms or object detection methods can be introduced to focus on the lesion sites, which facilitates feature extraction from those sites and improves the accuracy of clinical recognition models.

3) INTERPRETABILITY STUDY ON SKIN DISEASE IDENTIFICATION
At present, researchers focus mainly on the performance indicators of models. However, various factors in the environment can shift the concepts that a model has learned and change its performance. The factors that drive a model's decisions should therefore be understood. Interpretability research on skin disease identification models is currently a major gap in this field and must be explored. Deep learning models can be applied to identify skin disease and to guide automated clinical diagnosis, but these algorithms remain a "black box" that generates predictions from input data. There is no specific explanation of which skin disease features a deep learning model uses as its basis for judgment. The development of deep learning should be trustworthy and explainable. Making algorithms public and their decision-making transparent will give users a sense of reliability and security. Interpretability research on skin disease identification can help address the bias and auditing problems brought about by artificial intelligence [108]. Interpretability makes artificial intelligence open and transparent in legal, moral, and philosophical respects.
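One simple way to probe which image regions drive a prediction is occlusion sensitivity: mask one region at a time and record how much the model's score drops. The sketch below is an illustration of that general technique and is not drawn from the reviewed papers. The function `occlusion_map`, the toy image, and the stand-in `toy_score` function (used here in place of a trained network) are all hypothetical.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return a grid of score drops, one per masked patch.

    image is a 2D list of pixel intensities, score_fn maps an image to the
    model's score for the class of interest, and each patch x patch block
    is replaced by `fill` in turn. Larger drops mark regions the model
    relies on more.
    """
    height, width = len(image), len(image[0])
    base = score_fn(image)
    heat = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            masked = [r[:] for r in image]  # copy so the input is untouched
            for i in range(top, min(top + patch, height)):
                for j in range(left, min(left + patch, width)):
                    masked[i][j] = fill
            row.append(base - score_fn(masked))
        heat.append(row)
    return heat


def toy_score(image):
    """Stand-in for a trained model: mean intensity of the top-right 2x2 block."""
    return (image[0][2] + image[0][3] + image[1][2] + image[1][3]) / 4


# 4x4 toy image with a bright "lesion" in the top-right corner.
image = [
    [0.1, 0.1, 1.0, 1.0],
    [0.1, 0.1, 1.0, 1.0],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
print(occlusion_map(image, toy_score))  # [[0.0, 1.0], [0.0, 0.0]]
```

Only the patch covering the bright region changes the score, so the map points to the area that the stand-in model depends on. With a real network, the same loop would call the network's predicted probability for the diagnosed class.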

4) AI DIAGNOSIS AND TREATMENT OF SKIN DISEASE
Mobile devices, such as smartphones, PDAs, and tablets, are becoming an essential part of human life [108]. Embedding AI diagnosis and treatment of skin disease in smart devices will be a significant trend in the future. However, most skin disease recognition models currently rely on high-performance graphics processors. The computational complexity of the algorithm should be minimized while its recognition capability is improved, so that it can be easily used on mobile phones and wearable intelligent devices [109]. This line of research is of great significance for AI diagnosis and treatment of skin diseases. In intelligent question answering on skin disease, AI assistants can stand in for dermatologists to question patients and to communicate with them about repetitive content, such as diagnosis, prescription, and health promotion. At present, in addition to technical bottlenecks, some issues related to philosophy, law, and ethics persist in AI diagnosis and treatment. Examples of such questions are as follows: Is the subject of AI diagnosis and treatment a human or a medical device at the legal level? What is the legal standard in clinical application? What are the criteria for judging a medical defect or medical negligence? Who is legally responsible for medical accidents in AI diagnosis and treatment?

He was a Lecturer from 1986 to 1999 and an Associate Professor from 1999 to 2008 with the School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, where he has been a Professor since 2008. He has authored over 30 articles. His current research interest is the application research of RFID and the Internet of Things technology.