The Expanding Role of Artificial Intelligence in the Histopathological Diagnosis of Urological Oncology: A Literature Review

The ongoing growth of artificial intelligence (AI) touches virtually every aspect of oncologic care in medicine. Although AI is in its infancy, it has shown great promise in the diagnosis of urological malignancies. This paper aims to explore the expanding role of AI in the histopathological diagnosis of urological oncology. We conducted a focused review of the literature on AI in urological oncology, searching PubMed and Google Scholar for recent advancements in AI-assisted histopathological diagnosis. Various keyword combinations were used to identify relevant sources published before April 2, 2024. We focused on the impact of AI on common urological malignancies, incorporating the use of different AI algorithms, and examined AI's potential to aid urologists and pathologists in histological cancer diagnosis. Promising results suggest that AI can enhance diagnosis and personalize patient care, yet further refinement is needed before widespread hospital adoption. AI is transforming urological oncology by improving histopathological diagnosis and patient care. This review highlights AI's advancements in diagnosing prostate, renal cell, and bladder cancer. As AI becomes more integrated into clinical practice, it is anticipated to have a greater influence on diagnosis and to improve patient outcomes.


INTRODUCTION
The ongoing growth of artificial intelligence (AI) touches virtually every aspect of oncologic care in medicine. AI is a computer's ability to replicate the cognitive functions of the human mind at a high level, continuously improving its performance over time. This is done through the perception of external stimuli and the determination of an effective strategy for a desired outcome. [1] Over the past two decades, the field of urology has undergone a significant transformation, transitioning from a primarily surgical discipline with limited drug treatments to one marked by remarkable advancements, including the widespread adoption of endoscopic procedures and the continuous expansion of pharmacotherapies for prevalent conditions such as urinary incontinence and erectile dysfunction. [2] Unlike traditional methods that rely heavily on statistics, AI can analyze vast amounts of information while simultaneously identifying highly complex data patterns, facilitating more accurate predictions. Although the use of AI is in its infancy, it has shown great promise in the diagnosis of urological malignancies, and exciting AI developments may aid in the histopathological diagnosis of urological oncology. Embracing the advantages of AI is expected to improve the quality of healthcare facilities and, in turn, patients' quality of life. [3]

AIM
This paper aims to explore the expanding role of artificial intelligence in the histopathological diagnosis of urological oncology. By analyzing the complexities of AI, we evaluate its strengths and weaknesses across the different areas of urologic tumor diagnosis.

METHODS
We conducted a focused, non-systematic review of the published literature concerning the use of AI in urological oncology. The PubMed and Google Scholar databases were searched for recent advancements in AI-assisted histopathological diagnosis and their impact on urological oncology. We analyzed existing diagnostic methods and how AI algorithms have enhanced them, facilitating more efficient diagnosis for pathologists and urologists, especially for specimens that are harder to diagnose and require extensive training and expertise. Various combinations of keywords, such as urological oncology, AI, urology, and histopathological diagnosis with AI, were used to narrow down and identify relevant sources. Only materials published before April 2, 2024, were considered for inclusion. Relevant articles were selected, and information related to the keywords was extracted for analysis.

RESULTS
Understanding the fundamentals of AI is important for evaluating its implementation. AI is a field of computer science that uses machine learning techniques to make predictions.
Machine learning (ML)-based approaches entail machines "learning" from the data provided to them in order to make predictions. Deep learning (DL) is a specific ML approach that has evolved through the advancement of artificial neural networks. DL involves multiple layers of artificial neurons organized into an input layer, an output layer, and multiple hidden layers. The trained algorithm is then used for different tasks, such as classification, segmentation, and detection. [4] Whole-slide imaging is the digital conversion of entire microscopic slides with the help of high-resolution scanners. A convolutional neural network (CNN) is a variant of the DL model designed for processing grid-structured data such as images. It consists of multiple convolutional layers that apply filters to the input data. During training, the filters learn to extract valuable information from the input, enabling the network to perform end-to-end learning while simultaneously utilizing all of its parameters. [5] AI algorithms can identify information in digital radiology and pathology images that the human eye cannot detect, processes known as radiomics and pathomics, respectively. [6]
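To make the convolutional filtering described above concrete, the following minimal sketch (pure Python; the tiny "image" and the hand-picked edge-detection kernel are illustrative inventions, not part of any cited model) slides a single 3×3 filter over a grayscale grid to produce a feature map — the elementary operation a CNN repeats with many learned filters:

```python
# Minimal sketch of the filtering step inside a convolutional layer:
# one 3x3 kernel slides over a grayscale image, producing a feature map.
# Real CNNs learn many such kernels during training; the kernel below is
# a hand-picked vertical-edge detector, purely for illustration.

def conv2d_valid(image, kernel):
    """2-D 'valid' convolution (no padding, stride 1)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

# A 4x4 "image" with a dark/bright boundary in its last column.
image = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
# Vertical-edge kernel (Prewitt-like).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0.0, 3.0], [0.0, 3.0]] — strongest response at the edge
```

Stacking many such filtered maps, interleaved with nonlinearities and pooling, is what lets deeper layers respond to progressively more abstract histologic patterns.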

AI in renal cancer
Between 1983 and 2002, the incidence of renal cancer increased from 7.1 to 10.8 cases per 100,000 patients, with the diagnosis often made incidentally when imaging studies were conducted for unrelated clinical reasons. Despite this progress in diagnosis, patient outcomes have not improved: mortality rose from 1.5 to 6.5 deaths per 100,000 patients over the same interval. [7] Renal cell carcinoma (RCC) is the sixth most commonly diagnosed cancer in men and the tenth in women. It exhibits diverse histopathological features, creating challenges for accurate diagnosis and prognosis. [5] CT imaging has a sensitivity of 90% for renal masses, nearing 100% for lesions greater than 2 cm. [8] The majority of patients undergo surgical intervention; however, a large proportion of tumors are benign or indeterminate, so active surveillance is sufficient for some patients. [9] Consequently, renal mass biopsy (RMB) has been adopted to differentiate benign from malignant tumors. [10] On the other hand, RMB is an invasive procedure and is non-diagnostic in approximately 10%-15% of cases. [11] In some cases, interobserver variability remains a barrier to RMB interpretation. [12] AI can overcome these limitations and help diagnose different clinical groups more efficiently.
In the era of "Big Data", the use of electronic health records, digitized radiology, and virtual pathology has created an abundant source of digital data applicable to AI's data-characterization algorithms. [5] Erdim et al. differentiated benign solid renal masses (oncocytomas and fat-poor angiomyolipomas) from RCCs with a promising area under the receiver operating characteristic curve (AUC) of 0.91. [13] Fenstermaker et al. created a DL-based algorithm to aid in the identification of RCC in histopathological specimens. The model was trained on 3000 normal and 12,168 RCC tissue samples from 42 patients, achieving a high accuracy of 97.5% in differentiating clear cell, papillary, and chromophobe subtypes using only a 100 μm² sample; prediction accuracy for Fuhrman grade was 98.4%. [14] Oncocytoma is the most frequent benign renal mass, and pathologists encounter difficulties differentiating oncocytoma from chromophobe RCC. [15] Previous machine learning applications for kidney cancer involved mainly resection slides and three RCC subtypes, without consideration of benign masses and oncocytoma. Zhu et al. 
showed promising results in analyzing five categories (clear cell RCC, papillary RCC, chromophobe RCC, renal oncocytoma, and normal tissue) on surgical resection slides using a deep neural network model. Recent studies have combined convolutional neural networks with directed acyclic graph-support vector machines to classify three RCC subtypes, and a common approach in digital pathology is to train a diagnostic model without region-of-interest annotations using weakly supervised learning. Zhu et al.'s study differs from these methods in following a more intuitive methodology based on patch-level confidence scores, achieving an AUC of >0.95. The study establishes a solid foundation for future work on RCC subtype classification, as it was evaluated both on tertiary medical institution data and on surgical slides from The Cancer Genome Atlas. Additionally, the model was tested on 79 RCC biopsy slides, of which 24 were diagnosed as oncocytoma. Such a model offers pathologists a great advantage by automatically pre-screening slides to eliminate false negatives, highlighting important regions on the slides, and providing diagnoses that optimize efficiency. [16] With the help of two pre-trained convolutional neural networks (CNNs), Tabibu et al. differentiated between clear cell and chromophobe RCC with classification accuracies of 93.39% and 87.34%, respectively. Furthermore, they augmented the deep network with a directed acyclic graph support vector machine (DAG-SVM) for subtype classification. This not only enhanced the model's performance but also addressed the data imbalances inherent in multi-class classification tasks involving histopathological images. High-probability cancerous regions and tissues of different origins were effectively characterized. [17]
RCC classification is a challenging task because of the complexity of the method, and periodic updates to the classification system further complicate the process. Clear cell papillary RCC is a relatively new subtype, recognized only as of 2016. [18] These tumors combine features of clear cell and papillary RCC, yet differ in immunohistochemistry and genetic profile. [19] Abdeltawab et al. developed a CNN-based computer-aided diagnostic (CAD) system to distinguish clear cell RCC from papillary RCC, achieving an accuracy of 91% on cases from institutional files and 90% in the diagnosis of clear cell RCC on an external dataset. [20] Some researchers have used neural networks initially developed to diagnose other tumors to enhance diagnosis. Faust et al. utilized a CNN originally trained to recognize the histomorphology of brain tumors on 550 digital whole-slide images (WSIs) of RCC, comprising 396 clear cell RCC and 154 papillary RCC cases. Five hundred and twelve features were extracted from the WSIs to perform image-set clustering, allowing the algorithm to evaluate clinical and biological differences between patient subgroups. By employing this methodology, the researchers were able to recognize different subtypes of RCC and predict survival rates within each group; the algorithm could also identify patterns and features specific to kidney cancer. The results of this study demonstrate that CNNs pretrained on large histologic datasets can learn new pathologies, with or without unsupervised applications. [21]
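Many of the results above are reported as AUC values. As a brief aside, the AUC equals the probability that a randomly chosen positive case is ranked above a randomly chosen negative one; a minimal pure-Python sketch with invented labels and scores (not data from any cited study):

```python
# AUC as the probability that a random positive case scores above a
# random negative case (ties count half). The labels and scores below
# are invented purely for illustration.

def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0, 0]                 # 1 = malignant, 0 = benign
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]   # hypothetical model outputs
print(auc(labels, scores))  # 11/12 ≈ 0.917
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values such as 0.91 or >0.95 in the studies above indicate strong discriminative performance.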

AI in bladder cancer
Bladder cancer (BCa) is the most prevalent malignancy of the urinary tract. In 2018, there were 549,393 new cases, of which 199,922 resulted in death. [22] The most common manifestation of BCa is hematuria, either gross or microscopic. According to the best practice panel of the American Urological Association, hematuria is defined as the presence of ≥3 red blood cells per high-power field on 2 of 3 microscopic urinalyses. [23] Cystoscopy, supplemented by cytology, contrast urography, and ultrasound, is the mainstay for the diagnosis and monitoring of bladder tumors. [24] Although white light imaging (WLI) is the main principle of cystoscopy when BCa is suspected, a substantial error rate remains (10%-40%). The immunity of AI to human error may prove valuable in increasing diagnostic accuracy. [25] Shkolyar et al. developed CystoNet, a CNN-based model for automated detection of bladder tumors. The model was built on a dataset of 2335 frames of normal bladder mucosa and 417 frames of histologically confirmed papillary urothelial carcinoma obtained from 95 patients. The effectiveness of CystoNet was assessed on 54 additional patients, showing a per-frame sensitivity of 90.9% and specificity of 98.6%. The model successfully detected 39 of 41 papillary and three of three flat bladder cancers, demonstrating that it could assist in training and diagnostic decision-making while ensuring consistent performance among providers, enhancing the accessibility and quality of cystoscopy. [26] Urologists are trained to identify various cancers during cystoscopy; nonetheless, the diagnosis relies on their expertise, and it is particularly difficult to recognize small-diameter and flat tumors such as carcinoma in situ. A model based on GoogLeNet (Szegedy et al.) was utilized by Ikeda et al. 
to mitigate human error and increase the quality of diagnosis. The dataset consisted of 2102 cystoscopic images: 1671 images of normal tissue and 431 images of tumor lesions. To further expand the data, the authors enlarged the training dataset by rotating and blurring the cystoscopy images, and they used transfer learning to train their model. The original GoogLeNet model, with 22 layers and over seven million parameters, could readily fit the small dataset utilized in this study. Final evaluation of cystoscopic images with AI assistance showed a sensitivity of 89.7% and a specificity of 94.0%. Overall, the use of AI in cystoscopy image evaluation has considerable potential. [27] Lymph node staging is essential in the diagnosis and treatment of patients with bladder cancer. Wu et al. produced a diagnostic model based on whole-slide images of lymph node metastases. This retrospective, multicenter diagnostic study examined effectiveness across different groups: five validation sets, a single lymph node test set, a multi-cancer test set, and a subset for comparing performance between the model and pathologists. The dataset included 8177 images and 20,954 lymph nodes obtained from 1012 patients with bladder cancer who underwent radical cystectomy with lymph node dissection. After the exclusion of 14 patients with non-bladder cancer and 21 low-quality images, the samples included 998 patients and 7991 images. The AUC ranged from 0.978 to 0.998 across the five validation sets. The model achieved an AUC of 0.983, compared with 0.906 for a junior pathologist and 0.947 for a senior pathologist. AI assistance improved the sensitivity of both the junior and the senior pathologist, from 0.906 to 0.953 and from 0.947 to 0.986, respectively. In the multi-cancer test, the model maintained an AUC of 0.943 on breast cancer and 0.922 on prostate cancer images. Micrometastases missed by the pathologists, leading to negative 
results, were detected by the model. The receiver operating characteristic curve demonstrated that the model would allow pathologists to exclude 80% to 92% of negative slides while maintaining 100% sensitivity. The model holds significant potential for clinical application in detecting lymph node metastases, especially micrometastases, enhancing the precision and effectiveness of pathologists' work. [28] Due to the invasive nature of cystoscopy, patients experience discomfort during the procedure, which also poses a financial burden to the healthcare system. Research into AI for efficient and prompt diagnosis from urine is underway. [25] Urine cytology is a simple, inexpensive, and effective mechanism for detection and follow-up. However, this method has limitations: samples necessitate manual screening, involving a first screening by a cytotechnologist followed by review by a pathologist before a conclusive diagnosis is reached. [29] In many cases, confirmation is difficult due to sampling error, urothelial cell degradation, inflammation, and poor interobserver agreement. [30] To address the absence of readily available computed screening systems for urine specimens and the issue of poor interobserver agreement, Sanghvi et al. devised a computational pipeline for analyzing digitized liquid-based urine cytology slides. One thousand six hundred and fifteen whole-slide images (WSIs) were used for training and 790 WSIs for testing the pipeline, which encompassed multiple tiers of CNN models. The algorithm yielded an AUC of 0.88. [31]
The implications of artificial intelligence using the VisioCyt test in the diagnosis of bladder carcinoma were reported by Lebret et al. A large multicenter trial was executed in 14 centers on 1360 patients, divided into two groups: 1) patients with bladder carcinoma of varied histological grades and stages; and 2) a control group of patients with negative cystoscopy. The process included algorithm development and validation, followed by comparison of the VisioCyt results with those obtained by experienced uropathologists. VisioCyt showed an overall sensitivity of 84.9%, compared with 43% for conventional cytology. For high-grade tumors, VisioCyt's sensitivity was 92.6% versus 61.1% for uropathologists; for low-grade tumors, VisioCyt's sensitivity was 77% versus 26.3%. If confirmed in a validation cohort, the algorithm could be beneficial for pathologists. [32] A study by Sokolov et al. explored the use of nanoscale-resolution images of urine cells obtained by atomic force microscopy (AFM). Evaluation of urine samples from 43 patients without evidence of bladder cancer and 25 patients with confirmed bladder cancer yielded a diagnostic accuracy of 0.94 when studying five cells per urine sample, a significant improvement over standard cystoscopy. [33]
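The sensitivity and specificity figures quoted in this section follow the standard confusion-matrix definitions. A minimal sketch with invented counts (chosen only to roughly echo CystoNet's reported per-frame figures, not taken from the study's data):

```python
# Sensitivity = TP / (TP + FN): the fraction of true tumor frames flagged.
# Specificity = TN / (TN + FP): the fraction of normal frames passed.
# All counts below are invented for illustration.

def sensitivity_specificity(tp, fp, tn, fn):
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec

# e.g. 40 tumor frames correctly flagged, 4 missed;
# 950 normal frames correctly passed, 14 falsely flagged.
sens, spec = sensitivity_specificity(tp=40, fp=14, tn=950, fn=4)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
# prints "sensitivity=0.909, specificity=0.985"
```

The trade-off between these two quantities as the decision threshold varies is exactly what the ROC curve (and its AUC) summarizes.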

AI in prostate cancer
With prostate cancer typically diagnosed in men aged 45 to 60 and older, it is associated with one of the highest cancer-related mortality rates in Western countries. [34] Prostate cancer screening with prostate-specific antigen (PSA) has substantially reduced prostate cancer mortality (>50%). [35] On the other hand, it has led to major overdiagnosis and overtreatment of non-aggressive prostate cancer. [36] It is therefore of vital importance to use the latest technologies to enable personalized therapeutic plans for individual patients.
Diagnosis of prostate cancer relies on microscopic assessment of tissue samples; however, limitations remain where sufficiently trained pathologists are lacking. Some researchers have developed an augmented reality microscope (ARM), in which a deep learning system presents real-time feedback to pathologists as they view the histopathology slides. For prostate cancer detection, the algorithm achieved an AUC of 0.93 at the ×10 objective and 0.99 at the ×20 objective. The ARM can be seamlessly incorporated into the existing microscopy workflow, eliminating the requirement for expensive IT infrastructure and whole-slide image scanners. Subjective tasks such as stain quantification can be supported, especially in settings with limited access to highly experienced pathologists. Early experience has been successful, with promising results indicating its utility in clinical medicine as development continues and datasets grow. [37] Facing similar challenges, Ström et al. created an AI system to assist urological pathologists in detection, localization, and Gleason grading. A total of 6682 needle core biopsy slides, mainly from the STHLM3 population-based prostate cancer screening trial, were digitized, and the images were used to train deep neural networks for the evaluation of prostate biopsy samples. The system achieved an AUC of 0.997 for differentiating between benign (n=10) and malignant (n=721) samples. For Gleason grading, the AI system displayed a mean pairwise Cohen's kappa of 0.62, within the range of corresponding values (0.60-0.73) obtained by 23 experienced urological pathologists. With this algorithm, accurate grading of prostate cancer can be obtained alongside a second opinion from a pathologist. [38] The Gleason grading of the biopsy specimen holds significance for case management. Nagpal et al. 
constructed a deep learning system to assist in grading specimens into five categories: non-tumor, GG1, GG2, GG3, and GG4-5. Seven hundred and fifty-two de-identified digitized images of prostate needle biopsy specimens were acquired from three institutions in the United States. By allowing immunohistochemical access to the histologic sections of the biopsy samples, subspecialists were able to minimize diagnostic uncertainty. The deep learning system's rate of agreement with subspecialists (71.7%) was higher than that of 19 general pathologists (58%, p<0.001). For differentiating tumor from non-tumor, the system's rate of agreement with subspecialists was 94.3%, with general pathologists achieving a similar 94.7%. This study demonstrates that AI support is beneficial in enhancing expertise in Gleason grading of prostate needle biopsy specimens. [39] Another computer-aided diagnostic system, for automatic grading of prostate cancer annotated by multiple pathologists, was demonstrated by Nir et al. The workflow was composed of multi-scale features: glandular, cellular, and image-based. The classifier was trained on 333 tissue microarray (TMA) samples, each graded by six pathologists. Each TMA slide consisted of 160 cores (a 10×16 grid) from approximately 80 different radical prostatectomy samples. To ease annotation of individual core images by multiple pathologists, the authors developed an Android-based application known as "PathMarker", which contributed to an extensive, multi-label database. To evaluate the performance of the automatic grading system, a cross-validation study on a dataset of 231 patients was conducted, yielding a kappa value of 0.51 between the system and the pathologists, which falls within previously reported ranges of inter-pathologist agreement (0.45-0.62). This finding reflects positively on the classifier's performance and validates its efficacy, indicating potential 
employment in clinical practice. [40]
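Several of the grading studies above report agreement as Cohen's kappa, which discounts the agreement two raters would reach by chance. A minimal two-rater sketch on synthetic Gleason grade-group labels (not data from any cited study):

```python
# Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e), where p_o is the
# observed agreement and p_e is the chance agreement implied by each
# rater's marginal label frequencies. Labels below are synthetic.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical pathologists grading 10 biopsies.
a = ["GG1", "GG1", "GG2", "GG2", "GG3", "GG3", "GG1", "GG2", "GG4", "GG1"]
b = ["GG1", "GG2", "GG2", "GG2", "GG3", "GG2", "GG1", "GG2", "GG4", "GG1"]
print(round(cohens_kappa(a, b), 3))  # 0.714
```

This chance correction is why a kappa of 0.51 or 0.62 between an algorithm and pathologists is interpreted against the 0.45-0.73 range seen between pathologists themselves, rather than against raw percent agreement.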

Limitations
The use of AI is rapidly evolving; however, several roadblocks impede the growing involvement of these programs in clinical practice. Some studies were based on a retrospective design, resulting in occasional data loss and bias. In addition, small sample sizes hinder reproducibility and clinical applicability: an algorithm trained on small datasets cannot be widely applied to patients from various epidemiological backgrounds. This limitation is reinforced by the use of internal datasets, which further limits cross-validation and potentially introduces bias. [13] A large canvas of cells is imperative for subsequent imaging data; however, with RMB limited by under-sampling, a small patch of a whole-slide image is not dimensionally equivalent to a renal mass biopsy specimen. [14] One major limitation holding back the advancement of AI is the misclassification of biopsy slides. One study reported its model's inability to differentiate between clear cell RCC and normal slides, which is of major concern: this is a fundamental function essential for improving the algorithm, and its failure highlights the model's immaturity and the limitations of the algorithm. In some studies, only one urologist annotated all the lesions, increasing the possibility of overlooked lesions and making accuracy vulnerable. [16] Further validation of the images by multiple doctors was not conducted. Images of inflammatory changes in the bladder were not annotated, making those data unavailable to the algorithm. Withholding certain images from the algorithm suggests either overperformance on the training set or underperformance on new, unseen data. This may reduce effectiveness in real-world application, as the model may struggle to generalize to new data and its predictions may be inaccurate due to the lack of comprehensive data. 
[27] In patients with lymph node metastases from bladder cancer, some studies noted a lack of variability across the different types of bladder cancer and the different treatments patients receive, which hinders the use of the data for other types of bladder cancer. [28] In some cases, the data were obtained from a tertiary academic medical center and included extremely rare and challenging cases, such as polyomavirus-infected transplant specimens.
In everyday applications, such a model may struggle to generalize to more common scenarios, as it was trained on limited information, affecting its effectiveness and utility in broader applications. The urine samples contained both single cells and 3-dimensional cell clusters, which created difficulty because whole-slide images focus on only one Z-plane. [31] One major drawback was the training of the algorithm on one biopsy specimen per case, even though each clinical case involved 12 to 18 biopsy specimens. Furthermore, to provide less subjective insights than a subspecialist evaluation, the relationship between the deep learning system for Gleason grading and clinical outcomes should have been established. The impact of rescanning the specimens on model performance requires further assessment in future research. [39]

Folia Medica I 2024 I Vol. 66 I No. 3

Study | Date of publication | Objective | Algorithm/Method | Study design | Results
Erdim et al. [13] | January 2020 | Detection of benign and malignant solid renal masses | Machine learning algorithm | 79 patients with 84 solid renal masses (21 benign; 63 malignant) | AUC of 0.91 for CT texture analysis in differentiating benign from malignant solid renal masses
Fenstermaker et al. [14] | July 2020 | Identification of RCC and its subtypes in histopathological specimens | Deep learning algorithm | 3000 normal and 12,168 RCC tissue samples from 42 patients | 97.5% accuracy in differentiating clear cell, papillary, and chromophobe subtypes; 98.4% accuracy for Fuhrman grade
Abdeltawab et al. [20] | April 2022 | Distinguishing clear cell RCC from papillary RCC | CNN-based computer-aided diagnostic (CAD) system | Institutional cases plus an external dataset | 91% accuracy on institutional cases; 90% accuracy for clear cell RCC on the external dataset
Lebret et al. [32] | March 2021 | Using the VisioCyt test to improve diagnosis of bladder carcinoma from voided urine cytology | Machine and deep learning algorithm | 1360 patients divided into two groups, with results compared against experienced uropathologists | VisioCyt sensitivity of 92.6% vs. 61.1% for uropathologists for high-grade tumors
Chen et al. [37] | — | Real-time prostate cancer detection with an augmented reality microscope (ARM) | Deep learning system integrated into the microscope | Histopathology slides viewed through the ARM | AUC of 0.93 at the ×10 objective and 0.99 at the ×20 objective