Melanoma Disease Detection Using Deep Learning

Abstract: Skin diseases are mostly caused by fungal infections, bacteria, allergies, or viruses. Advances in laser and photonics-based medical technology allow skin diseases to be diagnosed quickly and accurately, but the medical equipment for such diagnosis is limited and expensive. Deep learning techniques therefore help in detecting skin disease at an early stage. Feature extraction plays a key role in the classification of skin diseases. The use of deep learning algorithms reduces the need for human labour, such as manual feature extraction and data reconstruction for classification. A dataset of 10015 images is used for the classification of skin diseases, covering benign melanoma and malignant melanoma. Using a CNN algorithm, 92% accuracy is achieved in the classification of skin disease.


I. INTRODUCTION
1.1 Deep Learning
Deep learning is a subset of machine learning that uses multiple layers of neural networks to process data and perform computations on large amounts of data. Deep learning algorithms are modelled on the structure and function of the human brain. They are capable of learning without human supervision and can be used for both structured and unstructured data. Deep learning is used in industries such as healthcare, finance, banking, and e-commerce.

Deep learning algorithms run data through several layers of neural networks, which are sets of decision-making networks trained to serve a task. Each input is passed through simple layered representations, with one layer feeding the next. Most traditional machine learning algorithms, by contrast, are designed to work fairly well on datasets with up to a few hundred features or columns. Whether the data is structured or unstructured, they tend to fail on inputs such as a simple RGB image of 800x1000 pixels, because handling input of that dimensionality is infeasible for them. This is where deep learning comes in.

Deep learning is a subfield of machine learning in which artificial neural networks, complex algorithms modelled to work in a way similar to the human brain, learn from large sets of data. In simple terms, deep learning focuses on training computers to mimic how people learn. As humans, we perceive information from images, text, and speech using not only our sensory organs (eyes, ears, etc.) but also a network of neurons in our brain. Millions of neurons exchange electrical and chemical signals and transmit information in this way. Artificial neural nets pass information similarly, through multiple interconnected layers of neurons (nodes). In deep learning, the so-called depth typically refers to neural nets with more than three layers; such networks can have more than 100 layers.

Convolutional Neural Network (CNN)
Convolutional Neural Networks belong to deep learning, which is a subdomain of machine learning. Deep learning algorithms process information in a way loosely modelled on the human brain, but on a far smaller scale, since the brain is far more complex (it has around 86 billion neurons). Image classification involves extracting features from images to find patterns in the dataset. Using a fully connected ANN for image classification is computationally very costly because the number of trainable parameters becomes extremely large. For example, if we have a 50 x 50 image of a cat and want to train a traditional ANN with one hidden layer of 100 neurons to classify it as a dog or a cat, the trainable parameters are:

(50 * 50) * 100 weights from image pixels to the hidden layer + 100 hidden biases + 2 * 100 weights to the output neurons + 2 output biases = 250,302

CNNs instead use filters. Many different types of filters exist, each suited to a particular purpose.
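The parameter count in the 50 x 50 example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `dense_param_count` is hypothetical, and the image is treated as 2,500 single-channel pixel inputs, as in the example.

```python
def dense_param_count(n_inputs, n_hidden, n_outputs):
    """Count weights and biases of an ANN with one fully connected hidden layer."""
    hidden = n_inputs * n_hidden + n_hidden      # input-to-hidden weights + hidden biases
    output = n_hidden * n_outputs + n_outputs    # hidden-to-output weights + output biases
    return hidden + output


# 50 x 50 image, 100 hidden neurons, 2 output classes (cat, dog)
total = dense_param_count(50 * 50, 100, 2)
print(total)
```

Doubling the image side length to 100 x 100 quadruples the first term, which is why fully connected networks scale poorly with image size.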
Step 1: Choose a Dataset
Choose a dataset of interest, or create your own image dataset for your own image classification problem. An easy place to find a dataset is kaggle.com.
Step 2: Prepare Dataset for Training
Preparing the dataset for training involves assigning paths, creating categories (labels), and resizing the images.
Step 3: Create Training Data
The training data is an array that contains the image pixel values together with the index of each image's category in the CATEGORIES list.
Step 4: Shuffle the Dataset
The dataset is distributed into folders and shuffled. Our project uses six skin disease datasets consisting of dermoscopic images.
Step 5: Assigning Labels and Features
The features and the labels are separated into two lists, and the shapes of both lists are used in classification with the neural network.
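Steps 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the category names, the 50 x 50 image size, and the helper `make_training_data` are hypothetical, and random pixel arrays stand in for the resized dermoscopic images that would normally be read from disk.

```python
import numpy as np

# Hypothetical category names; the folder names used in this work are not given.
CATEGORIES = ["benign", "malignant"]
IMG_SIZE = 50  # assumed side length after resizing

rng = np.random.default_rng(seed=0)


def make_training_data(n_per_class):
    """Step 3: pair each image's pixel array with its index in CATEGORIES.

    Random pixels are used here in place of real image files.
    """
    training = []
    for label, _name in enumerate(CATEGORIES):
        for _ in range(n_per_class):
            pixels = rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
            training.append((pixels, label))
    return training


training = make_training_data(n_per_class=4)

# Step 4: shuffle so that the classes are interleaved.
order = rng.permutation(len(training))
training = [training[i] for i in order]

# Step 5: separate features (X) and labels (y).
X = np.stack([pixels for pixels, _ in training])
y = np.array([label for _, label in training])

print(X.shape, y.shape)
```

Shuffling the image and label together as pairs, before separating them, keeps every label attached to its own image.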
Step 6: Normalising X and converting labels to categorical data
Categorical data is qualitative; that is, it describes an event using words rather than numbers. Categorical data is analysed using mode and median distributions: nominal data is analysed with the mode, while ordinal data uses both.

Here we briefly review some of the techniques reported in the literature. One study proposes a system for the detection of skin diseases from colour images without the need for doctor intervention. The system consists of two stages: the first detects the infected skin using colour image processing, k-means clustering, and colour gradient techniques, and the second classifies the disease type using artificial neural networks. The system was tested on six types of skin diseases, with an average accuracy of 95.99% for the first stage and 94.016% for the second stage. Another method treats the extraction of image features as the first step in the detection of skin diseases; in this method, the more features are extracted from the image, the better the accuracy of the system. Its author applied the method to nine types of skin diseases with an accuracy of up to 90%.

Melanoma is a type of skin cancer that can cause death if it is not diagnosed and treated in the early stages. One study focused on the various segmentation techniques that can be applied to detect melanoma using image processing, describing a segmentation process that follows the boundaries of the infected spot in order to extract more features. Another work proposed the development of a melanoma diagnosis tool for dark skin using specialized algorithm databases, including images from a variety of melanoma resources. Similarly, the classification of skin diseases such as melanoma, basal cell carcinoma (BCC), nevus, and seborrheic keratosis (SK) using a support vector machine (SVM) has been discussed; SVM yielded the best accuracy among the range of techniques compared.

On the other hand, the spread of chronic skin diseases in different regions may lead to severe consequences. A computer system was therefore proposed that automatically detects eczema and determines its severity. The system consists of three stages: the first performs effective segmentation by detecting the skin, the second extracts a set of features (colour, texture, and borders), and the third determines the severity of eczema using a support vector machine (SVM). Another approach to detecting skin diseases combines computer vision with machine learning: computer vision extracts the features from the image, while machine learning is used to detect the skin disease. That system was tested on six types of skin diseases with an accuracy of 95%.

Title: Skin Disease Classification from Image - A Survey
Author: Tanvi Goswami, Vipul Dhabi, Harshad Kumar B Prajapati
Date: 2020
Skin diseases have been among the most common health problems for ages. The identification of skin disease relies mostly on the expertise of doctors and on skin biopsy results, which is a time-consuming process. An automated computer-based system for skin disease identification and classification from images is needed to improve diagnostic accuracy and to address the scarcity of human experts. Classification of skin disease from an image is a crucial task, and classifying correctly depends heavily on the features of the diseases considered. Many skin diseases have highly similar visual characteristics, which makes the selection of useful features from the image more challenging. Accurate analysis of such diseases from images would improve diagnosis, shorten diagnostic time, and lead to better and more cost-effective treatment for patients. The paper surveys different methods and techniques for skin disease classification, namely traditional (handcrafted feature-based) and deep learning-based techniques.
Title: Malignant Melanoma Classification Using Deep Learning: Datasets, Performance Measurements, Challenges and Opportunities
Author: Shoaib Farooq, Adel Khalif, Adnan Abid
Date: 2020
A pre-trained convolutional neural network (CNN) technique named AlexNet was proposed by Pomponius et al. to produce high-level representations of skin samples for the classification of skin disease. The proposed method derives features from the last three fully connected layers, which were used to train a k-nearest neighbour (kNN) classifier for skin malignancy classification. In comparison, Estevan et al. suggested a pre-trained CNN for skin malignancy diagnosis and used a huge dataset for their analysis (129,450 clinical images). Mahboob et al. used pre-trained CNNs to study the classification of skin lesions; pre-trained AlexNet and VGG-16 architectures were used in their algorithm to extract distinct features from dermoscopic images. Soudan et al., in turn, used a hybrid technique that combines pre-trained architectures (VGG16 and ResNet50) to extract features from the convolutional sections. A classifier was designed with an output layer consisting of five nodes. These nodes represent the classes of the segmentation methods and predict the most effective skin lesion detection and segmentation technique for any image data. A pre-trained deep learning network and transfer learning were proposed by Khalid et al.; transfer learning was applied to AlexNet to identify skin lesions, together with fine-tuning and data augmentation. An automated system for melanoma detection that counters the limitation of datasets was developed by Bisola et al. The proposed method relies heavily on a processing unit that eliminates image occlusions and on a data generation unit for skin lesion classification. Many studies use ensemble deep learning techniques for melanoma classification. For example, Milton et al. use a cooperative deep learning model, tested on the ISIC 2018 benchmark dataset, to identify skin lesions from dermoscopic images. Mahboob et al. propose a cooperative deep learning-based technique designed by unifying intra-architecture and inter-architecture network fusion for convolutional neural networks (CNNs). This was a fully automatic computerized process.

III. SYSTEM ARCHITECTURE
3.1 System Architecture
A system architecture is the conceptual model that defines the structure, behaviour, and other views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviours of the system.

V. CONCLUSION
This work performed experiments using a CNN structure for the image-based diagnosis of three common skin diseases and constructed a dataset consisting mainly of skin disease images. The results demonstrate that CNNs are able to recognize and classify skin diseases. Our experiments also showed that a reasonable network structure can improve the performance of the model. The current network structure can be used to classify some diseases, but its overall performance still needs to be improved.
Step 7: Split X and Y for use in CNN
As good practice, divide the dataset into three parts to avoid overfitting and model selection bias:
• Training set (has to be the largest set)
• Cross-validation set, also called the development set or dev set
• Testing set
The test set can sometimes be omitted. It is meant to give an unbiased estimate of the algorithm's performance in the real world. People who divide their dataset into just two parts usually call their dev set the test set.
Step 8: Define, compile and train the CNN Model
These are the steps used to train the CNN (Convolutional Neural Network):
Step 1: Upload Dataset
Step 2: The Input layer
Step 3: Convolutional layer
Step 4: Pooling layer
Step 5: Convolutional layer and Pooling Layer
Step 9: Accuracy and Score of models
With these 9 steps, you are ready to train your own convolutional neural network model and solve real-world problems. You can practice these skills on platforms like Analytics Vidhya and Kaggle. You can also experiment with different parameters to find the best accuracy and score: try changing the batch size, the number of epochs, or adding and removing layers in the CNN model.

II. LITERATURE REVIEW
Title: Skin Disease Prediction
Author: Mr. T.K. Jagtap, Mr. H.P. Shinde, Mr. O.V. Gaware, Mr. S.R. Maurya
Date: 2021
Several researchers have proposed image processing-based techniques to detect the type of skin diseases.

Fig 3.2 CNN Architecture
A CNN model consists of an input layer composed of artificial input neurons, an output layer, and multiple hidden layers. The hidden layers of a CNN are convolutional layers, pooling layers, and fully connected layers; receptive fields and weights are further building blocks. The CNN model has two components: feature extraction and classification. Feature extraction is performed by the convolution and pooling layers. The fully connected layers then act as a classifier on top of these features and assign class probabilities to produce the final output.
• Input layer: The input layer has nothing to learn; at its core, it only provides the shape of the input image. There are no learnable parameters here, so the number of parameters = 0.
• CONV layer: This is where the CNN learns, so it has weight matrices. To calculate the learnable parameters, multiply the filter width m, the filter height n, and the number of filters d in the previous layer, then account for all k filters in the current layer. With one bias term per filter, this gives ((m * n * d) + 1) * k parameters.
• POOL layer: This has no learnable parameters because it only computes a fixed function of its input; no backpropagation learning is involved. Thus, the number of parameters = 0.
• Fully Connected Layer (FC): This layer has learnable parameters; in fact, compared with the other layers, this category of layers has the highest number of parameters.
• Overfitting: An important consideration in deep learning is whether the estimate of the target function, learned from the training data, generalizes to new data. Generalization works best if the signal or sample used as training data has a high signal-to-noise ratio.
• Regularization: Regularization is a method for steering the deep learning model towards a preferred level of complexity, so that the model generalizes and the overfitting/underfitting problem is avoided.
• Parameter and Hyper-Parameter: Parameters are configuration variables that can be thought of as internal to the model, since they can be estimated from the training data. The hyperparameters of a model are set and tuned based on a combination of heuristics and the experience and domain knowledge of the data scientist.
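The per-layer parameter counts described above can be tallied for a small example network. This is a sketch under stated assumptions, not the architecture used in this work: the 32 x 32 RGB input, the 3 x 3 filters, the filter counts (32 and 64), the unpadded convolutions with 2 x 2 pooling, and the 128-unit dense layer are all illustrative, and the helper names are hypothetical. One bias per filter and one bias per fully connected output neuron are assumed.

```python
def conv_params(m, n, d, k):
    """Learnable parameters of a CONV layer: k filters of size m x n x d, one bias each."""
    return (m * n * d + 1) * k


def fc_params(n_in, n_out):
    """Learnable parameters of a fully connected layer: weights plus one bias per output."""
    return (n_in + 1) * n_out


# Assumed shapes: 32x32x3 input -> conv 3x3 (30x30x32) -> pool 2x2 (15x15x32)
# -> conv 3x3 (13x13x64) -> pool 2x2 (6x6x64) -> flatten (2304) -> FC 128 -> FC 2
layers = [
    ("input 32x32x3", 0),                       # nothing to learn
    ("conv 3x3, 32 filters", conv_params(3, 3, 3, 32)),
    ("pool 2x2", 0),                            # nothing to learn
    ("conv 3x3, 64 filters", conv_params(3, 3, 32, 64)),
    ("pool 2x2", 0),
    ("fc 2304 -> 128", fc_params(6 * 6 * 64, 128)),
    ("fc 128 -> 2", fc_params(128, 2)),
]

for name, count in layers:
    print(f"{name:24s} {count:>8d}")

total = sum(count for _, count in layers)
print(f"{'total':24s} {total:>8d}")
```

In this example the first fully connected layer holds most of the parameters, which matches the observation that FC layers carry the highest parameter count.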

Fig 4.4 Model Accuracy
Our model reaches a training accuracy of up to 92% and a validation accuracy of up to 90.5%.

Fig 4.5 Model Loss
The graph above illustrates the training loss and validation loss of the model.