Optical coherence tomography confirms non‐malignant pigmented lesions in phacomatosis pigmentokeratotica using a support vector machine learning algorithm

Abstract
Introduction: Phacomatosis pigmentokeratotica (PPK), an epidermal nevus syndrome, is characterized by the coexistence of nevus spilus and nevus sebaceus. Within the nevus spilus, an extensive range of atypical nevi of different morphologies may manifest. Pigmented lesions may fulfill the ABCDE criteria for melanoma, which may prompt a physician to perform a full‐thickness biopsy.
Motivation: Excisions result in pain, mental distress, and physical disfigurement. For patients with a significant number of nevi with morphologic atypia, it may not be physically feasible to biopsy a large number of lesions. Optical coherence tomography (OCT) is a non‐invasive imaging modality that may be used to visualize non‐melanoma and melanoma skin cancers.
Materials and Methods: In this study, we used OCT to image pigmented lesions with morphologic atypia in a patient with PPK and assessed their quantitative optical properties compared to OCT cases of melanoma. We implemented a support vector machine (SVM) learning algorithm with Gabor wavelet transformation during post‐image processing to extract optical properties and calculate attenuation coefficients.
Results: The algorithm was trained and tested to extract and classify textural data.
Conclusion: We conclude that applying this post‐imaging machine learning algorithm to OCT images of pigmented lesions in PPK successfully confirmed benign optical properties. Additionally, we identified remarkable differences in attenuation coefficient values and tissue optical characteristics, further distinguishing benign features of pigmented lesions in PPK from malignant features.


INTRODUCTION
Phacomatosis pigmentokeratotica (PPK) is a distinct and rare type of epidermal nevus syndrome characterized by coexisting speckled lentiginous nevus (SLN) of the papular type and nonepidermolytic organoid sebaceous nevus. 1 Patients with PPK also present with extracutaneous symptoms, which may include neurological, musculoskeletal, and ocular disorders, commonly correlating to the limbs affected cutaneously. [1][2][3] A systematic search retrieved 95 cases reported in the literature. 4 PPK is hypothesized to be due to a single dominant heterozygous activating HRAS c.37G>A mutation, which causes the two different types of nevi. The mutation affects a multipotent progenitor cell, which then gives rise to the cutaneous and extracutaneous manifestations seen in PPK. Sebaceous nevus, otherwise known as nevus sebaceous of Jadassohn, is a congenital malformation that involves hamartomas of the pilosebaceous follicular unit. The coexisting SLN, otherwise known as nevus spilus, is described as larger café-au-lait macules with numerous nevi or smaller superimposed darker black or brown melanocytic proliferations. 5 Sizes of the nevi may range from a millimeter up to 10 cm. 2,5,6 Spitz nevi may also be found within speckled lentiginous nevi of PPK patients. 7
Generally, for melanocytic lesions, the gold standard for evaluating a clinical suspicion of melanoma is a full-thickness biopsy of the lesion, which allows for adequate histopathologic interpretation and determination of margins of resection. 13 Atypical nevi are often asymmetric, with irregular borders, varied colors, and diameters >6 mm, and may evolve over time, fulfilling clinical diagnostic criteria for suspicion of melanoma. 14 Moreover, visual inspection has a specificity of only 59%−78% and is highly dependent on physician expertise. 15 Approximately 15−30 benign lesions are biopsied to diagnose one melanoma. 16
Optical coherence tomography (OCT) is an emerging non-invasive imaging technology that generates cross-sectional images of a tissue in real time. [17][18][19][20] It uses a near-infrared low-coherence light source 21 and has an imaging capability of up to 2 mm in depth and up to 6 mm in width. 22 Swept-source OCT has a high spatial resolution of less than 10 μm, which is 10−100 times finer than clinical high-frequency ultrasound. 23
Full-field OCT (FF-OCT) uses wide-field illumination rather than beam scanning. 26 Line-field confocal OCT (LC-OCT) uses a broadband laser coupled with line detection using a line-scan camera, where the focus is continuously adjusted during the scan to achieve confocal spatial filtering. 27 Reflectance confocal microscopy (RCM) is another method for high-resolution skin imaging for diagnostic purposes. RCM also uses confocal illumination to display high-resolution images based on changes in the refractive index of tissue, but its penetration depth is limited to approximately 200−250 μm. 28 Melanin has high absorption in both the broad-spectrum visible and near-infrared light bands. 29

MATERIALS AND METHODS
All imaging procedures and experimental protocols were approved by the Institutional Review Board of the University of Illinois at Chicago College of Medicine (IRB #2021-0249) and carried out in accordance with its guidelines.
Informed consent was obtained from all subjects prior to enrollment in the study.

OCT configuration
The OCT used in this study was a multi-beam, swept-source system

Calculation of attenuation coefficient and analysis
When light waves penetrate tissue, the intensity decays exponentially due to light scattering and absorption by the tissue microstructures under different physiological conditions. 46 This attenuation of light is quantified by the attenuation coefficient (AC) and is governed by the Beer-Lambert law. 47 In the literature, the study of ACs has proven successful in characterizing tissue and structural changes. 46,48,49 In this study, ACs are calculated by converting each pixel in a region of interest (ROI) of an OCT image to a corresponding "optical absorption coefficient pixel." This method allows for improved accuracy in detecting data in homogeneous and heterogeneous tissue without pre-segmenting or pre-averaging. 24,46 The single scattering equation implemented is:
$$I(x) = \rho I_0 e^{-2\mu x}$$
where $I$ is the detected intensity, $I_0$ is the incident light intensity, $\rho$ is the backscattering coefficient, $\mu$ is the AC, and $x$ is the depth. The factor of 2 accounts for the light traveling into the tissue and back to the detector. The AC is calculated by fitting an exponential curve of this form to the measured intensity, from which the decay constant can be extracted. An ROI must be selected within a depth range before fitting the curve. 47
For classification tasks, the supervised nature of the support vector machine (SVM) causes this algorithm to be highly dependent on feature extraction. 67 Thus, the image must be translated into quantifiable numerical and textural data for computer-based analysis. For our feature extraction, we determined wavelet transformation to be ideal. Wavelet transformation is widely used for frequency domain analysis and texture-based feature analysis of an image. 68 In image processing, frequency is defined as change and diversity between pixels; for example, contrast between black and white has a high pixel value diversity and thus a high frequency. 69,70 Wavelet feature analysis allows for the localization of meaningful signals within an image in time and space and separates these signals from noise. 71 We implement the Gabor wavelet filter, which is a group of wavelets, with each wavelet capturing energy at a specific frequency and orientation. 68,72 Textural and edge features can then be constructed from this data set of energy distributions. 68,73 The Gabor filter is a Gaussian kernel function modulated by a sinusoidal component. 74 The Gabor wavelet transformation formula 73
FIGURE 2 Invalid segments of the image that were omitted prior to attenuation coefficient (AC) analysis and mapping. Yellow bracket: stratum corneum (skin entrance signal). Red arrow: hair shaft. Red rectangle: hair follicle.
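The exponential fit described in the attenuation coefficient calculation above can be illustrated with a minimal sketch. It assumes the single scattering form with a decay factor of exp(−2μx) and estimates μ from a straight-line fit to the logarithm of the intensity. This is a simplification for illustration, not the pixel-wise method used in the study: the lateral averaging step, the function names, the 5 μm axial pixel size, and the synthetic ROI values are all assumptions. Only the ROI shape (150 pixels in depth by 10 pixels in width) is taken from the text.

```python
import numpy as np

def fit_attenuation_coefficient(intensity, depth_mm):
    """Estimate mu (1/mm) assuming I(x) = rho * I0 * exp(-2 * mu * x).

    Taking logs gives ln I = ln(rho * I0) - 2 * mu * x, so a straight-line
    fit of ln I against depth has slope -2 * mu.
    """
    intensity = np.asarray(intensity, dtype=float)
    depth_mm = np.asarray(depth_mm, dtype=float)
    valid = intensity > 0  # the logarithm is undefined for non-positive values
    slope, _intercept = np.polyfit(depth_mm[valid], np.log(intensity[valid]), 1)
    return -slope / 2.0

def roi_attenuation(roi, pixel_size_mm):
    """Average an ROI laterally (rows = depth, columns = width), then fit.

    Lateral averaging is an assumption made here to keep the sketch short.
    """
    profile = roi.mean(axis=1)
    depth_mm = np.arange(roi.shape[0]) * pixel_size_mm
    return fit_attenuation_coefficient(profile, depth_mm)

# Synthetic ROI: 150 pixels deep, 10 pixels wide, hypothetical 5 um axial pixel.
rng = np.random.default_rng(0)
pixel_size_mm = 0.005
true_mu = 2.0  # hypothetical attenuation coefficient in 1/mm
depth_mm = np.arange(150) * pixel_size_mm
profile = 0.8 * np.exp(-2.0 * true_mu * depth_mm)
roi = np.tile(profile[:, None], (1, 10)) * rng.uniform(0.95, 1.05, size=(150, 10))

mu_est = roi_attenuation(roi, pixel_size_mm)
print(f"estimated mu = {mu_est:.3f} per mm")
```

A log-linear fit is used here because it reduces the exponential fit to ordinary least squares; a nonlinear fit on the raw intensities would weight the shallow, brighter pixels more heavily.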

FIGURE 3 Regions of interest (ROIs) (red rectangle) selected from an optical coherence tomography (OCT) image. Each segment is 10 pixels in width and 150 pixels in height. All ROIs begin below the stratum corneum (yellow bracket). Yellow bracket: stratum corneum (skin entrance signal). Red vertical rectangle: ROI.
Gaussian major and minor widths, respectively:
We chose SVM because of its capability to create the widest margin, or separation, between our two classes, benign and malignant.
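As an illustration of the Gabor feature extraction described earlier, the sketch below builds a small bank of Gabor kernels, each a Gaussian envelope with major and minor widths multiplied by a sinusoid at one frequency and orientation, and summarizes the response energy of an image for each kernel. It uses a common textbook form of the real Gabor function, which may differ from the formula used in this study. The function names, the parameters (`sigma_x`, `sigma_y`, `wavelength`, `theta`), the kernel size, the bank settings, and the synthetic striped image are all assumptions.

```python
import numpy as np

def gabor_kernel(size, sigma_x, sigma_y, wavelength, theta):
    """Real Gabor kernel: Gaussian envelope times a cosine carrier.

    sigma_x and sigma_y are the Gaussian widths along and across the
    carrier direction; theta rotates the whole kernel.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-0.5 * ((x_rot / sigma_x) ** 2 + (y_rot / sigma_y) ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    return envelope * carrier

def gabor_energy_features(image, wavelengths=(4.0, 8.0), n_orientations=4):
    """Mean squared filter response for every (wavelength, orientation) pair."""
    image = np.asarray(image, dtype=float)
    image = image - image.mean()  # remove the constant offset before filtering
    image_fft = np.fft.fft2(image)
    features = []
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(21, 3.0, 4.0, wavelength, theta)
            # Circular convolution via the FFT; the kernel is zero-padded.
            kernel_fft = np.fft.fft2(kernel, s=image.shape)
            response = np.real(np.fft.ifft2(image_fft * kernel_fft))
            features.append(float(np.mean(response ** 2)))
    return np.array(features)

# Synthetic texture: stripes with a period of 8 pixels along the columns.
columns = np.arange(64)
stripes = np.tile(np.cos(2.0 * np.pi * columns / 8.0), (64, 1))
features = gabor_energy_features(stripes)
print(features.round(3))
```

In this sketch the kernel whose wavelength and orientation match the stripes produces the largest energy, which is the property that makes such a feature vector usable as classifier input.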
SVM can map points into other dimensions through nonlinear relationships to classify data that are not linearly separable. 75 In our methods, because our data are multidimensional (75 × 75, as opposed to the typical 2D or 3D), we use a radial basis function kernel, a real-valued function that depends on the distance between the input and a fixed point, called the center point, which may be the origin or another point. 76 This allows multidimensional data to be classified with greater accuracy. For example, our OCT AC data are collected in an original space, and SVM can map these data into a hyperspace using a nonlinear function of the OCT ACs, such as the sine. A hyperplane is then positioned in this hyperspace to optimally separate benign from malignant. 75 The main goal of our method is to find the optimal hyperplane to classify data in n dimensions, where n corresponds to the number of features extracted from our data. Our algorithm is detailed in the flowchart (Figure 4).
For our binary classification, two classes were assigned: benign or melanoma. The confusion matrix is shown in Table 2. Finally, the F1 score, which combines precision and recall, is calculated.
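The radial basis function kernel described above can be sketched in a few lines. The example shows only the kernel itself, a real-valued function of the distance between an input and a center point, and the pairwise kernel matrix an SVM would work with. It is not the implementation used in this study; the function names, the `gamma` width parameter, and the sample points are assumptions.

```python
import numpy as np

def rbf_kernel(x, center, gamma=0.5):
    """Radial basis function value; depends only on the distance from x to center."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def rbf_gram_matrix(features, gamma=0.5):
    """Pairwise RBF kernel values for the rows of a (samples x features) array."""
    features = np.asarray(features, dtype=float)
    sq_norms = np.sum(features ** 2, axis=1)
    # Squared distance: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

# Three hypothetical feature vectors; the third is far from the first two.
samples = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
gram = rbf_gram_matrix(samples, gamma=0.5)
print(gram.round(4))
```

Nearby samples give kernel values close to 1 and distant samples give values close to 0, so the kernel acts as a similarity measure that lets a linear separator in the mapped space correspond to a nonlinear boundary in the original feature space.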

Attenuation coefficient calculations
The F1 score is calculated as:
$$F_1 = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$
In the testing phase of our SVM model, benign lesions had a precision of 79%, a recall of 82%, and an F1 score of 81%. Melanoma lesions had a precision of 81%, a recall of 78%, and an F1 score of 80%. The data are presented in Table 3.
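The relationship among precision, recall, and the F1 score can be made concrete with a short sketch. The confusion matrix counts below are hypothetical and are not the values from Table 2; the function name and variable names are likewise assumptions.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Precision, recall, and F1 (their harmonic mean) for one class."""
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts, keyed by (true class, predicted class).
confusion = {
    ("benign", "benign"): 40,
    ("benign", "melanoma"): 10,
    ("melanoma", "benign"): 10,
    ("melanoma", "melanoma"): 40,
}

# Treat "benign" as the positive class.
benign_precision, benign_recall, benign_f1 = precision_recall_f1(
    true_positive=confusion[("benign", "benign")],
    false_positive=confusion[("melanoma", "benign")],
    false_negative=confusion[("benign", "melanoma")],
)
print(benign_precision, benign_recall, benign_f1)

# The same formula applied directly to rounded precision and recall values.
f1_from_rounded = 2.0 * 0.79 * 0.82 / (0.79 + 0.82)
print(round(f1_from_rounded, 3))
```

Applying the formula to the rounded percentages reported above gives roughly 0.80 for benign lesions and 0.79 for melanoma, within about one percentage point of the reported F1 scores. A difference of that size would be expected if the reported values were computed from unrounded counts.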

DISCUSSION
PPK is a rare epidermal nevus syndrome characterized by the coexistence of nevus sebaceous and SLN with additional extracutaneous syndromic features that involve multiple organ systems. 2
Machine learning for differentiating melanoma from nevi has previously been applied to dermoscopic and nondermoscopic gross images of lesions with pixel-by-pixel analysis. 84 The application of SVM with Gabor wavelet transformation to melanoma versus nevi has been reported, but with analysis of histopathologic slides and dermoscopic images. 85,86 To our knowledge, this is the first study to apply SVM learning in conjunction with Gabor wavelet transformation to swept-source OCT imaging of melanoma and benign nevi.
Based on our training and testing sets, we conclude that our algorithm successfully predicted a benign diagnosis. Our results suggest that these methods may be applicable to studies with larger sample sizes.
Tissues have intrinsic optical properties that allow for the detection of changes in light scattering, absorption, and attenuation, which affect AC. 87
Swept-source OCT allows for a higher scan rate and fewer motion artifacts, allowing for improved image contrast. 101,102 However, limitations include light signal intensity decay, artifacts such as blood vessels, speckle noise produced in the OCT image, tissue or OCT probe motion, and blurring, all of which could affect pixel values and thus AC calculations. 101 Structures that produce bright signal intensities, such as those with high melanin pigment content, could reflect the light signal, thereby obstructing further light penetration to the regions below. While the resolution of OCT has not been able to display the morphology of single cells, clear architectural changes can be seen. By implementing post-image processing to analyze the optical properties and AC values of melanoma, PPK-atypical nevi, and normal skin, we can further characterize tissue features without biopsy. Often, however, accurately distinguishing dysplastic or Spitz nevi from melanoma requires discrete and subtle histological findings.
Machine learning applications have limitations as well. Because machine learning is highly dependent on the data it is given, images must be clear and accurate. Poor-quality images may cause machine learning to draw inaccurate conclusions, so images must be properly vetted. 103 Additionally, machine learning does not take into consideration other physiological or pathological etiologies that might affect the presentation. Thus, machine learning and artificial intelligence in the scope of dermatologic applications should be used in combination with a physician's clinical interpretation to ensure a comprehensive diagnostic approach to melanocytic lesion management. In our study specifically, given the many options for feature extraction and selection methods, fine-tuning these combinations may further optimize our method's specificity.

CONCLUSION
Swept-source OCT is a promising technological advancement that may help decrease the number of biopsies. While OCT lacks the ability to discriminate detail at the cellular level, the addition of post-image processing and analysis using SVM machine learning, Gabor wavelet transformation, and AC has demonstrated the ability to confirm benign features in pigmented lesions in a patient with PPK. Studies with larger sample sizes are needed to further investigate the utility of this non-invasive post-image analytical approach to pigmented lesions.

ACKNOWLEDGMENTS
The authors would like to thank the Department of Dermatology, University of Illinois at Chicago, Julia May, Juliana Benevides, and Dharm Sodha. They would also like to thank the Melanoma Research Alliance (grant number 624320) for its support.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.