Artificial intelligence in head and neck cancer diagnosis

Introduction Artificial intelligence (AI) is currently being used to augment histopathological diagnostics in pathology. This systematic review aims to evaluate the evolution of these AI-based diagnostic techniques for diagnosing head and neck neoplasms. Materials and methods Articles regarding the use of AI for head and neck pathology published from 1982 until March 2022 were evaluated based on a search strategy determined by a multidisciplinary team of pathologists and otolaryngologists. Data from eligible articles were summarized according to author, year of publication, country, study population, tumor details, study results, and limitations. Results Thirteen articles were included according to inclusion criteria. The selected studies were published between 2012 and March 1, 2022. Most of these studies concern the diagnosis of oral cancer; in particular, 6 are related to the oral cavity, 2 to the larynx, 1 to the salivary glands, and 4 to head and neck squamous cell carcinoma not otherwise specified (NOS). As for the type of diagnostics considered, 12 concerned histopathology and 1 cytology. Discussion Starting from the pathological examination, artificial intelligence tools are an excellent solution for implementing diagnosis capability. Nevertheless, today the unavailability of large training datasets is a main issue that needs to be overcome to realize the true potential.


Introduction
Head and neck squamous cell carcinoma (HNSCC) affects approximately 880 000 new patients each year worldwide and represents a leading cause of mortality in some countries. 1The main risk factors of HNSCC are tobacco smoking, alcohol abuse.and human papillomavirus (HPV) infection. 2The 5-year overall survival is globally 50%-65% at 5 years with combined therapeutic strategy (surgery alone, surgery combined with adjuvant treatment, and exclusive radiotherapy with or without chemotherapy). 2In general, HNC is often diagnosed at advanced stages, and prognosis depends on anatomic site of involvement, stage, and HPV status. 3 Primary and secondary prevention are key points in HNC management.Diagnosis usually requires an otolaryngological examination and eventually an endoscopic evaluation when required.Pre-treatment evaluations include radiological examinations and histopathological analysis of procured tissue. 4However, these diagnostic methods are usually dependent upon the interpretation of human pathologists, 5,6 which can result in inconsistencies in cancer diagnosis, grading, and prognostication. 7,8Diagnostic error can negatively impact patient outcome.
The field of pathology has witnessed significant developments in recent years.From a biological standpoint is the turning point in the therapy of HNSCC that ensued with the introduction of immunotherapy. 9,10From a technological standpoint is the widespread adoption of whole slide imaging (WSI) which has advanced diagnosis and research.Accruement of large digital datasets from scanning pathology glass slides has boosted to the the application of AI in pathology.By using AI-based tools to analyze histopathology whole slide images, we are now able to enhance and possibly automate pathology diagnosis, as well as better interrogate and quantify parameters of the tumor microenvironment.AI algorithms developed using deep learning methods are based on the concept of an artificial neural network trained using a large number of digital images to subsequently classify unknown images.In general, as the volume of training data increases the AI model performance improves. 11This learning process can be "supervised" by utilizing human experts to annotate histopathological images, or it can be "unsupervised" by training the algorithm to deduce data directly from preset information.AI-based tools in pathology have the potential to automate workflow processes, improve diagnosis, standardize reporting, quantify scoring, and provide information that most pathologists today are unable to provide using light microscopy such as reporting about prognosis and predicting therapy response. 12he aim of this article is to provide an overview of the existing scientific literature on the applications of AI for the pathological diagnosis of HNC.

Materials and methods
This study aimed to comply with the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA) guidelines. 13A systematic search was carried out in the electronic databases Pubmed-MEDLINE and Embase including all archived literature until March 1, 2022.The complete search string for Pubmed-MEDLINE and Embase is shown in Table S1 of the supplementary material.Two authors (SB, NS) independently reviewed all article titles and abstracts with the aid of Rayyan QCRI reference manager web application. 14Full texts of the articles that fulfilled the initial screening criteria were acquired and reviewed for subsequent inclusion against the eligibility criteria (Fig. 1).Any disagreement with respect to inclusion of a particular article was resolved by consensus.In the second stage of study selection, the same authors independently assessed full-text reports to obtain a list of relevant articles.Inclusion criteria were as follows: • Studies using AI for histopathological and/or cytological diagnosis (detection, grading.and classification) of malignant HNC.• Studies evaluating diagnostic accuracy of AI and ML algorithms for malignant HNC.• Studies evaluating accuracy of AI and ML algorithms in the differential diagnosis of precancerous and malignant HNC lesions.
Exclusion criteria: • Studies not using WSI.
• Studies not using pathology imaging modalities (e.g., radiology and endoscopy images).• Studies regarding AI/ML not primarily investigating the detection, grading, and classification of HNC (e.g., those predicting progression, prognosis, or treatment efficacy).• Studies using AI/ML for only thyroid, skin, and esophageal cancer detection.• Studies using AI/ML for only benign head and neck lesions.
• Narrative reviews, letters to editors, commentaries, and conference abstracts.
• Studies not available in the English language.
Data from eligible articles were summarized in a standard format that included: authors, year of publication, country, type of study, aim, study population, primary tumor location, type of diagnosis, main study results, and limitations (Table 2).The primary outcome was to assess the histopathological features used for diagnosis and grading of the head and neck lesion under study, in addition to the methods and performance of the proposed AI/ML algorithms.A descriptive analysis was subsequently conducted, and the performance of the algorithms used, when present, was reported.

Results
Of 6225 abstracts evaluated, 13 articles satisfied inclusion criteria and were accordingly included.The flow chart depicting article screening is shown in Fig. 1.The selected studies were published between 2012 and March 1, 2022.Most of these studies concern the diagnosis of oral cancer; in particular, 6 are related to the oral cavity, 2 to the larynx, 1 to the salivary glands, and 4 to HNSCCC not otherwise specified (NOS).As for the type of diagnostics employed, 12 concerned histopathology and 1 cytology.The AI algorithms tested were: U-Net (a convoluted neural network architecture), Support Vector Machine (SVM) (a supervised machine learning algorithm), InceptionV3 (a deep convolutional neural network), Chan-Vese algorithm (an active contour model), Decision Tree, Linear discriminant (a supervised machine learning algorithm), K-Nearest Neighbor (a supervised machine learning algorithm), InceptionV4 (a convoluted neural network architecture), GoogLe Net (a convoluted neural network architecture), MobilNet-v2 (a convoluted neural network architecture), ResNet50 (a convoluted neural network architecture), ResNet101 (a convoluted neural network architecture), Random Forest (a supervised machine learning algorithm), Gaussian naive Bayes (a supervised machine learning algorithm), Logistic regression (a supervised machine learning algorithm), Tree Model (a supervised machine learning algorithm).

Oral cavity
Six papers were about oral cavity HNC.Macaulay et al. evaluated the use of AI for the cytological recognition of suspicious lesions.Cytology samples from 369 patients were included and different algorithms were compared, with the best accuracy of 92.5% achieved in recognizing normal samples and 89.4% in recognizing suspicious ones. 15Das et al. introduced an AI-based tool for diagnosing oral squamous cell carcinoma (OSCC) based on identification of keratinized areas and keratin pearls.Their study population consisted of 10 patients diagnosed with OSCC from a single institution.Thirty digitized images were used as ground truth and the Chan-Vese method was used as a diagnostic algorithm validated by 2 expert pathologists.The Chan-Vese method is an algorithm designed to segment objects without clearly defined edges. 16The average performance of this segmentation method was evaluated by the Jaccard coefficient (77.76%), yielding a first-rate correlation value (0.85) and segmentation accuracy (95.08%). 17Another study by Das et al. described a convolutional neural network (CNN) used to distinguish epithelial, subepithelial, and keratinizing regions as well as detect keratin pearls in OSCC.CNN is an artificial neural network that contains many layers to perform operations through numerous hyperparameters, that can be useful to build a classification model.CNNs permit to learn different high-level features from the set of image patches at different layers and testing then into classifiers.This model is inspired from the animal visual cortex structure.Images from 25 patients with low-grade OSCC, 15 with high-grade, and 2 healthy subjects from a single center were evaluated.The performance of their deep learning algorithm for epithelial layer segmentation was as follows: accuracy 98.42%, sensitivity 97.76%, Jaccard coefficient 90.63%, dice coefficient 95.03%; to distinguish the keratin region the performance was: accuracy 96.88%, Jaccard coefficient 71.87%, dice coefficient 75.19%; and the performance for keratin pearl detection was: accuracy 96.88%. 18In the study by Rahman et al., images (n = 134) with normal tissue and images (n = 135) with malignancy from a multicentric case series were used.These authors' model achieved 89.7% accuracy for the dataset generated by applying t-test and 100% accuracy for the principal component analysis (PCA) generated datasets.T-test and PCA are 2 different methods used to select significant features; PCA, in particular, is statistical data compression technique used to identify a smaller number of uncorrelated variables.Area under the curve (AUC) was 0.92 and 1.0 for the first and second datasets. 19In another paper by Rahman et al., performance of 5 classifiers was further evaluated for the recognition of OSCC based on characteristics such as shape, texture, and color, obtaining an accuracy, specificity, sensitivity, and precision of over 99% with four classifiers except one (K-Nearest Neighbor classifier). 20Dos Santos et al. tested a CNN model to determine OSCC areas within oral mucosal samples.Fifteen WSIs and 1050 image patches extracted from a single center were evaluated.Their proposed model showed an accuracy of 97.6%, precision of 91.1%, Dice coefficient of 92.0%, Jaccard coefficient of 85.2%, specificity of 98.4%, and sensitivity of 92.9%. 21

Larynx
Two studies dealing with the larynx were included.Zhou et al. developed a dual-modality optical imaging microscope combined with machine  learning algorithms for the automatic detection of laryngeal squamous cell carcinoma (SCC).Image patches derived from 20 slides, from 10 patients, from a single center were used.Algorithms used were SVM, Random Forest, Gaussian naive Bayes, and Logistic regression.The highest accuracy was reached using SVM. 22In the study by He et al., a method for diagnosing laryngeal SCC was tested.A pool of 3458 pathological images were taken from 1228 patients underwent AI-aided endoscopy of the upper aerodigestive tract to look for suspicious laryngeal lesions that were biopsied and histologically examined.The pathology images were randomly divided into a training dataset, a validation dataset, and a testing dataset.The algorithm used was Inception V3 and AUC was 0.994 for the validation dataset and 0.981 for the testing dataset. 23

HNSCC NOS
Four studies dealing with HNSCC where included.Halicek et al. used a CNN on 381 WSIs from 156 patients to diagnose HNSCC, both on the primary tumor and at the lymph node level.Results in terms of performance were: accuracy 84.8 ± 1.6%, sensitivity 84.7 ± 2.2%, specificity 85.0 ± 2.2%. 24Mavuduru et al. evaluated the ability of a CNN using U-Net tested on 200 tissue samples from 84 HNSCC patients from a single center.Their study showed an AUC of the testing group of 0.89, with a threshold of 0.2845 and average time of segmentation from WSI to be 72 s. 25 In the study of Rodner et al., a CNN was used to diagnose HNSCC distinguishing between cancer, normal epithelium, background stroma, and other tissue types on 114 images from 12 patients.Average and overall recognition rates for the 4 classes were 88.9% and 86.7%, respectively.Average segmentation time was 113 s, where the best segmentation time was 55 s. 26 In the study by Tang et al., a CNN was used to extract high-dimensional features from hematoxylin and eosin slides to detect HNCSCC in 135 cervical lymph nodes from 20 patients.In their primary model that used all 4 CNNs, the accuracy was between 97.3% and 98.7%, and the AUC was between 0.9957 and 0.9982.In general, performance of the development dataset were 100% for accuracy, sensitivity, and specificity; whereas in the test dataset sensitivity was 100%, specificity 75.9%, and accuracy 86%. 27Only 1 study regarded salivary glands tumors.Lopez-Janeiro et al. used a tree model algorithm to diagnose different histotypes of malignant tumors on 115 samples of salivary glands.The total accuracy achieved was 84.6%. 28

Discussion
Digital pathology has evolved from using static images to the WSI era.Despite technical and diagnostic issues, 29 WSI has shown great diagnostic concordance compared to traditional light microscopy in anatomical pathology, [30][31][32] including cytopathology. 33WSI has enabled the application of AI tools in pathology. 34,35A brief and general description of different AI modalities is summarized in Table 1.
Our review included 13 studies.Tumor site varied among these included papers, reflecting the complexity of the head and neck region.
Despite various primary anatomical sites of involvement, the predominant histotype was SCC in nearly all of the papers, except for Lopez-Janeiro et al. who studied salivary gland tumors.All studies were conducted on images of histology, except for MacAulay et al. who used 369 cytological samples from oral cavity brushings.Slides were scanned using Cyto-Savant, which is an automated quantitative system used largely for cervical cytology and sputum samples, with software that automatically segments and focuses all objects on the slide.This system showed good accuracy to correctly recognize normal and abnormal oral cavity samples.Further application of AI tools in cytology samples is anticipated.
Regarding the performance of AI tools in histology, all studies reported an accuracy over 90%, except for 2 papers. 24,27They also reported excellent performance via AUC calculations, as well as high sensitivity and specificity.Different types of AI algorithms were tested in the included studies, from supervised machine learning systems (e.g., SVM) to deep learning systems such as CNNs.While deep learning is more advanced than simple machine learning algorithms, this AI methodology introduces new challenges.With deep learning larger datasets are typically required for training to develop an accurately performing model.For this reason, if training datasets are limited or of poor quality (e.g., lack heterogeneity), deep learning performance could be hindered.Of note, the quantity of data used in the included studies was restricted and imperfect.In this review, both supervised and unsupervised models showed good performances.All papers included were focused on SCC, with the exception of the study by Lopez-Janeiro et al. that was focused on salivary gland tumors.Salivary gland neoplasms are a heterogeneous group of tumors, and given the fact that they represent a diagnostic challenge for pathologists due to overlapping morphologic features further AI-assisted diagnostic tools in this area would be applauded.

Fig. 1 .
Fig. 1.PRISMA Flow diagram: flowchart of the systematic review according to the PRISMA guidelines.

Table 1
Brief description of the different modalities.

Table 2
Details of the studies included.