Machine Learning Approaches to Predict Symptoms in People With Cancer: Systematic Review

Background: People with cancer frequently experience severe and distressing symptoms associated with cancer and its treatments. Predicting symptoms in patients with cancer continues to be a significant challenge for both clinicians and researchers. The rapid evolution of machine learning (ML) highlights the need for a current systematic review to improve cancer symptom prediction.
Objective: This systematic review aims to synthesize the literature that has used ML algorithms to predict the development of cancer symptoms and to identify the predictors of these symptoms. This is essential for integrating new developments and identifying gaps in existing literature.
Methods: We conducted this systematic review in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist. We conducted a systematic search of CINAHL, Embase, and PubMed for English records published from 1984 to August 11, 2023, using the following search terms: cancer, neoplasm, specific symptoms, neural networks, machine learning, specific algorithm names, and deep learning. All records that met the eligibility criteria were individually reviewed by 2 coauthors, and key findings were extracted and synthesized. We focused on studies using ML algorithms to predict cancer symptoms, excluding nonhuman research, technical reports, reviews, book chapters, conference proceedings, and inaccessible full texts.
Results: A total of 42 studies were included, the majority of which were published after 2017. Most studies were conducted in North America (18/42, 43%) and Asia (16/42, 38%). The sample sizes in most studies (27/42, 64%) typically ranged from 100 to 1000 participants. The most prevalent category of algorithms was supervised ML, accounting for 39 (93%) of the 42 studies. Deep learning, ensemble classifiers, and unsupervised ML each constituted 3 (3%) of the 42 studies. The ML algorithms with the best performance were logistic regression (9/42, 17%), random forest (7/42, 13%), artificial neural networks (5/42, 9%), and decision trees (5/42, 9%). The most commonly included primary cancer sites were the head and neck (9/42, 22%) and breast (8/42, 19%), with 17 (41%) of the 42 studies not specifying the site. The most frequently studied symptoms were xerostomia (9/42, 14%), depression (8/42, 13%), pain (8/42, 13%), and fatigue (6/42, 10%). The significant predictors were age, gender, treatment type, treatment number, cancer site, cancer stage, chemotherapy, radiotherapy, chronic diseases, comorbidities, physical factors, and psychological factors.
Conclusions: This review outlines the algorithms used for predicting symptoms in individuals with cancer. Given the diversity of symptoms people with cancer experience, analytic approaches that can handle complex and nonlinear relationships are critical. This knowledge can pave the way for crafting algorithms tailored to a specific symptom. In addition, to improve prediction precision, future research should compare cutting-edge ML strategies such as deep learning and ensemble methods with traditional statistical models.


Background
Cancer poses considerable physical and psychological challenges for those diagnosed with the disease. The Global Cancer Observatory estimated that there were 19.3 million new cancer cases and 43.8 million individuals living with cancer within 5 years of diagnosis globally in 2020 [1]. Symptoms such as fatigue, pain, nausea, vomiting, depression, and anxiety often persist beyond treatment [2][3][4][5], detrimentally affecting individuals' quality of life [6]. Moreover, people with cancer frequently grapple with multiple intertwined symptoms [7], intensifying their distress [8]. Unmanaged cancer symptoms can lead to increased health care use, including emergency department visits and unscheduled hospitalizations to address these symptoms; a decline in the quality of life [9]; and even a reduced life expectancy. Providing precision symptom management tailored to the individual at the right moment has the potential to significantly improve outcomes, which is crucial for both people with cancer and their health care providers. Accurately predicting and addressing these symptoms is fundamental to providing such precision in symptom management.
Artificial intelligence, incorporating machine learning (ML) and deep learning (DL) models, excels in handling complex, high-dimensional, and noisy data. It has demonstrated effectiveness in disease diagnosis, predicting disease recurrence, enhancing quality of life, and symptom management [10][11][12][13][14][15][16]. There is a growing interest in ML in the emerging field of predictive analytics for cancer symptoms. ML contributes to the development of robust clinical decision systems, enhancing overall health care delivery [17]. ML algorithms can be broadly categorized into supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning. DL, a subset of ML, addresses complex tasks such as speech recognition, image identification, and natural language processing [18].

Objectives
This study seeks to offer a comprehensive and systematic review of the literature on the application of ML algorithms in predicting symptoms for people with cancer. Conducting this review of a rapidly expanding body of literature is imperative to understand the current state of the science for ML models in symptom prediction for cancer and to guide future research. This research aims to provide a comprehensive understanding of the current state of research; identify areas for improvement; and understand the limitations and gaps in the current literature, such as a lack of specific focus on ML models for patients with cancer. By comparing model performances across diverse symptom prediction tasks, we can identify the best practices, highlight areas for improvement, and offer informed recommendations that will propel the field of predictive analytics in cancer symptom research forward.

Search Strategy and Data Sources
This study was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) protocol [19] and involved a comprehensive database search spanning from 1984 to August 11, 2023, including the PubMed, Embase, CINAHL, and Google Scholar databases. The search terms encompassed cancer, neoplasm, signs and symptoms, neural networks, machine learning, and specific algorithm names. We combined keywords and phrases using Boolean expressions to account for the variability in terminology across studies. Search results were compiled using EndNote 20 (Clarivate Analytics). The detailed search strategy and the PRISMA checklist can be found in Multimedia Appendices 1 and 2.

Inclusion and Exclusion Criteria
To identify relevant research focusing on the application of ML methods in predicting cancer symptoms, we applied the following inclusion criteria: (1) papers published in English, (2) studies that used ML algorithms, and (3) research specifically aimed at predicting cancer symptoms. The exclusion criteria were as follows: (1) nonhuman studies, (2) technical reports, (3) review papers, (4) book chapters or series, (5) conference proceedings, and (6) studies for which full texts were unavailable. Two authors, NZ and NY, independently screened and cross-checked the candidate records. During the screening process, conducted using EndNote 20, any disagreements were resolved by consulting a third reviewer (SGW). The screening process involved an initial review of titles and abstracts, followed by a full-text examination to determine the study's eligibility for inclusion in the review.

Data Extraction and Analysis
In our study, we implemented a systematic, multistep process for data synthesis. Initially, relevant studies were identified and selected based on the predefined inclusion and exclusion criteria. Two independent researchers, NZ and NY, extracted data from 42 selected studies. They worked independently to mitigate bias and enhance the accuracy of the data extraction process. In cases of discrepancies, these were resolved through discussion or consultation with a third reviewer, SGW. The extracted data were aggregated, involving the collation of study characteristics such as research location, sample size, study design, types of ML algorithms, validation metrics, identified significant predictors, cancer types, and the specific symptoms focused on. This comprehensive approach enabled us to reduce the bias and increase the reliability of our findings. For the analysis, we used both quantitative and qualitative methods. Quantitative data, such as frequencies and percentages, were compiled and analyzed using Python. This included the creation of insightful plots and heat maps to identify patterns and trends, illustrating relationships among variables and highlighting key findings in an easily digestible format. Qualitative aspects, such as algorithm implementation or study design, were explored through narrative synthesis. This allowed for a deeper understanding of the context and nuances in the application of ML algorithms for cancer symptom prediction.
We conducted a cross-analysis to compare findings from different studies, assessing the effectiveness of various ML algorithms across different cancer types and symptoms and identifying common predictors of success and the challenges faced. Finally, we interpreted the findings in the context of the existing literature. We discussed how our results align with or differ from previous studies and what new insights our synthesis brings to the field of ML in cancer symptom prediction.
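As an illustration of the quantitative step, the following is a minimal Python sketch of how frequencies, percentages, and a cross-tabulation suitable for a heat map might be computed from extracted study characteristics. The records are hypothetical placeholders rather than data from the included studies, the function names are introduced here for illustration only, and the review does not state which Python libraries were actually used.

```python
from collections import Counter

# Hypothetical extraction records (illustrative only, not the review's data).
studies = [
    {"region": "North America", "symptom": "pain", "algorithm": "logistic regression"},
    {"region": "Asia", "symptom": "xerostomia", "algorithm": "random forest"},
    {"region": "North America", "symptom": "depression", "algorithm": "logistic regression"},
    {"region": "Europe", "symptom": "pain", "algorithm": "random forest"},
]


def frequency_table(records, field):
    """Return {value: (count, percentage)} for one extracted field."""
    counts = Counter(record[field] for record in records)
    total = len(records)
    return {value: (n, round(100 * n / total, 1)) for value, n in counts.items()}


def cross_tabulate(records, row_field, col_field):
    """Count co-occurrences of two fields; a heat map would display this matrix."""
    table = Counter((r[row_field], r[col_field]) for r in records)
    rows = sorted({r[row_field] for r in records})
    cols = sorted({r[col_field] for r in records})
    # Counter returns 0 for combinations that never occur.
    return rows, cols, [[table[(row, col)] for col in cols] for row in rows]


region_table = frequency_table(studies, "region")
rows, cols, matrix = cross_tabulate(studies, "symptom", "algorithm")
```

The resulting matrix could then be passed to a plotting library to draw the heat map; which library to use is left open here.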

Overall Results
A search across the 3 databases produced 1788 papers. After removing 289 duplicates, we screened the titles and abstracts of the remaining records, excluding another 1352 irrelevant records. One study could not be retrieved. We reviewed the full text of the remaining 146 records, omitting 105 due to the absence of ML application in predicting cancer symptoms (69/146, 47.3%), not being a research article (34/146, 23.3%), or not being an English article (1/146, 1%). In the second phase, we searched Google Scholar, which captured an additional 113 articles not found in our main databases, although 1 study was not retrieved. We reviewed the full text of the remaining 99 records, ultimately excluding all of them for reasons such as the lack of ML applications in cancer symptom prediction (89/99, 90%) and not being a research article (10/99, 10%). Eventually, 42 studies met the inclusion criteria, as depicted in Figure 1. Of the 42 studies, 42 (100%) are listed in PubMed, 37 (88%) in Embase, and 18 (43%) in CINAHL. The distribution and overlap of these research articles across the databases are illustrated in Multimedia Appendix 3.
The data extracted from these studies, which include the reference number, research location, year, data type, cancer site, symptoms, significant predictors, ML algorithms, and validation methods, are detailed in Table 1 and in Multimedia Appendix 4. Two researchers (NZ and NY) extracted data from each study independently of each other to reduce bias and increase the accuracy of the data extraction process. Discrepancies between the 2 authors were resolved through discussion or by consulting a third reviewer (SGW).

Primary Database Information
The studies selected were published between 2017 and 2023 and were conducted in North America (

Significant Candidate Predictors of Symptoms
Numerous predictors were frequently used for predicting symptoms, which can be grouped into demographic features and clinical characteristics.

Demographic Features
The demographic features include age, sex, BMI, income, medical insurance, education, marital status, and zip code-level poverty.

Clinical Characteristics
The clinical characteristics include smoking and alcohol use, initial diagnosis, presence of cancer, stage of cancer, cancer course, tumor site, type and number of prior treatments, chemotherapy type, and radiotherapy dose and volume. Health conditions such as comorbidity, diabetes, hypertension, osteoarthritis, and coronary disease also play a significant role. In addition, psychological factors such as depression and anxiety, fatigue, sleep disturbance, and pain are considered. Other influential predictors encompass care fragmentation, polypharmacy, hormone levels, physical activity, diet, heart rate, and social support factors.
The detailed findings on common cancer symptoms across the 42 studies are compiled in Figure 2. Here, we analyze the predictors for the 4 most frequently reported cancer symptoms identified in this study (xerostomia, pain, depression, and fatigue), each of which has its distinct set of influencing factors. For xerostomia, age, gender, chemotherapy type, radiotherapy dose and volume, cancer stage, tumor site, and hypertension are crucial predictors. In the case of pain, factors such as age, BMI, smoking and alcohol habits, cancer site and stage, tumor site, diabetes, hypertension, osteoarthritis, coronary disease, physical activity, psychological factors, sleep disorders, and existing pain conditions emerge as significant. Significant predictors for depression include age; gender; education; cancer site and stage; economic factors such as insurance, income, and poverty level; marital status; initial diagnosis impact; comorbidities (diabetes, hypertension, osteoarthritis, and coronary disease); pain; social support; care fragmentation; polypharmacy; and various scale scores. Finally, for fatigue, the key predictors are existing fatigue and low energy, cancer site, sleep disturbances, age, income, education, chemotherapy type, tumor site, comorbidities, hypercholesterolemia, heart rate, hypoproteinemia, physical and psychological factors, pain, adverse drug reaction history, limited social support, Eastern Cooperative Oncology Group score, platelet distribution width, and erythropoiesis.
When examining the commonalities across these predictors for xerostomia, pain, depression, and fatigue, several factors stand out as particularly influential across multiple symptoms: age; gender; cancer site and stage; treatment-related factors such as the type of chemotherapy and radiotherapy, which demonstrate the impact of cancer treatments on symptom development; comorbidities such as diabetes, hypertension, and coronary disease; physical and psychological factors; and socioeconomic factors such as income and education level. These common predictors underscore the complex, multifactorial nature of symptom manifestation in patients with cancer, necessitating a comprehensive approach to their management and care.

Principal Findings
In this review, we present the first systematic analysis of ML applications for predicting the development of cancer symptoms. We explore the most frequently studied cancer sites and delve into the intricacies of ML procedures. Breast, head and neck, and lung cancers are the most frequently studied sites in current research, with xerostomia, depression, pain, and fatigue being the most prominent symptoms. The application of various ML techniques is on the rise, with data acquisition and preprocessing being pivotal for successful ML models. While a range of algorithms, from traditional methods such as logistic regression and decision trees to advanced ones such as DL, are used, there is a growing emphasis on data quality, external validation, and a standardized approach to model evaluation. The future of ML in cancer symptom prediction looks promising, with a need for collaborative efforts among oncologists, data scientists, and patient groups, combined with more comprehensive research on lesser-studied cancer sites and standardized methodologies.
Regarding the cancer sites covered in the studies, breast, head and neck, and lung cancers emerged as the most frequently researched primary cancer sites. The range of symptoms and side effects that patients experienced varied from one study to another. Some symptoms depended on the specific cancer site and the treatments patients received. For example, xerostomia, which can either arise from the tumor itself or manifest as a treatment side effect, has a significant impact on patients' dental health and compromises antimicrobial functions [61]. However, most symptoms were not directly attributed to a particular cancer site or treatment.
Our review revealed a notable emphasis on predicting xerostomia in 14% (9/42) of the studies, despite head and neck cancers being less prevalent. This emphasis is likely due to advancements in integrating ML with computed tomography (CT) imaging, a pivotal tool in the diagnosis and treatment planning of head and neck cancers. ML techniques, when applied to CT images, can potentially identify patterns and indicators that are not easily discernible by human observers. This capability can lead to earlier and more precise predictions of xerostomia, thereby enabling better preventive measures and treatment planning to mitigate this side effect. Depression, a widespread emotional challenge for people with cancer [62,63], was the focus of prediction in many studies (8/42, 13%). Similarly, pain, a recurrent concern for palliative care patients [64] and survivors of cancer [65,66], was the subject of prediction in 13% (8/42) of the studies. Fatigue, prevalent across all age groups with cancer [67,68], was highlighted in 6 (10%) of the 42 studies reviewed.
In terms of the ML approaches used in the studies, a plethora of techniques were used to construct these predictive models, spanning all phases of the ML process, from data collection and preprocessing to feature and algorithm selection, model training, testing, and evaluation. The process of data acquisition is pivotal for the development of ML models, thereby emphasizing the importance of an adequate sample size. Upon reviewing 42 studies, we discerned that the most frequent sample sizes for ML applications ranged between 100 and 1000 samples. More advanced ML techniques necessitate larger data sets to bolster robustness and mitigate the risk of overfitting. Alarmingly, certain studies in our review used ML with comparably smaller data sets, introducing the risk of model overfitting and potential biases in the subsequent performance metrics [69]. Challenges tied to sample size might impede the creation of sturdy and trustworthy ML models [70]. Data preprocessing is indispensable to yield clean and interpretable data, which is a cornerstone for proficient ML models. Data cleaning approaches encompass addressing missing values, tackling data noise, and data normalization. Within health care data sets, noisy or absent data are frequently a by-product of inaccuracies in manual entries or instrument recordings made by medical personnel or ancillary staff [71]. However, most of the reviewed studies lacked comprehensive descriptions of their data cleaning methodologies or strategies for handling noisy data and normalization, possibly because of word or page limits in publications.
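Because most of the reviewed studies did not describe their cleaning steps, the following is only a generic Python sketch of two of the approaches named above: handling missing values (here by median imputation) and normalization (here by min-max scaling). The choice of these two particular techniques, the function names, and the example values are assumptions made for illustration; they are not taken from any included study.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical ages with two missing entries (illustrative only).
ages = [52, None, 67, 45, None, 71]
ages_imputed = impute_median(ages)
ages_scaled = min_max_normalize(ages_imputed)
```

In a real pipeline, the imputation value and the scaling range would be estimated on the training data only and then applied to the test data, so that no information leaks from the test set into model development.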
Given the crucial importance of data quality in developing ML models, it is essential for researchers to focus equally on effective data preparation and choosing suitable algorithms. Future endeavors would benefit from exhaustive procedural documentation made available on public platforms such as GitHub. In a research context, GitHub can be used for sharing and collaborating on various aspects of a research project, including but not limited to code. It allows researchers to maintain version control of their scripts, data analysis procedures, and even documentation. This feature is particularly beneficial for replicating studies and verifying results, as it provides a transparent view of the methodologies and analyses used.
Overloading an ML model with excessive features can undermine its ability to differentiate between pertinent data and superfluous noise, leading to the challenge often referred to as the "curse of dimensionality." The goal of feature engineering is to mitigate model complexity, expedite the training process, reduce the data's dimensionality, and avert overfitting [72]. By streamlining the model with a curated set of predictors, it becomes more accessible and transparent, emphasizing the importance of feature selection during data preparation. Our review pinpointed the most frequently used significant predictors in cancer symptom prediction. The efficacy of prediction models is heavily influenced by the number and interplay of the relevant predictors. Factors such as age; gender; type and number of previous treatments; cancer location; cancer stage; chemotherapy type; dosage and volume of radiotherapy; chronic conditions such as diabetes and hypertension; concurrent diseases; and symptoms including depression, anxiety, fatigue, pain, and sleep disturbances have consistently featured as determinants in numerous predictive frameworks. Our review of cancer symptom prediction underscored age as a pivotal factor, associated with predominant symptoms such as depression, pain, xerostomia, and fatigue. While numerous elements, from gender to type of treatment and cancer stage, influence the predictive models, it is the prominence of age that consistently emerges as a cornerstone predictor. As we delve deeper into this field, even with the introduction of newer determinants and correlations, the centrality of age in these frameworks remains indisputable.
Regarding algorithm selection, traditional methods often struggle with handling high-dimensional data and processing extensive information. To tackle these challenges, researchers have increasingly shifted toward innovative ML algorithms that are renowned for their robust predictive power and strong generalization capacities. These sophisticated algorithms excel at delving deep into data and discerning intricate interrelationships among variables. To navigate the multifaceted landscape of modeling challenges, it is advantageous for researchers to leverage a diverse array of ML algorithms. Most studies used multiple predictive models, with techniques such as logistic regression, random forest, artificial neural networks, and decision trees consistently delivering strong results. The introduction of advanced ML techniques, such as DL and ensemble classifiers, provides promising opportunities to elevate prediction accuracy in future research.
After their design, the ML models undergo training and testing on different data sets. However, these models can grapple with issues such as overfitting and underfitting. Overfitting occurs when a model becomes overly complex and fits noise in the training data, which increases variance and reduces generalizability. In contrast, underfitting results from an oversimplified model, causing it to overlook key data patterns and diminish its predictive capacity. Therefore, the ideal learning model should strike a balance between bias and variance. To mitigate these issues, the common strategy is to divide the data set into training and testing subsets, followed by internal or external validation. While most studies in our review used internal validation, only 1 study reported external validation [58], which was demonstrated on a small cohort of 25 patients with head and neck cancer. Although performance on external data is typically lower than in evaluations using the original data sets, external validation remains crucial for gauging ML models [72]. It is a crucial step in ensuring that the model's performance is not just limited to the conditions and data it was originally trained on but also applicable and reliable in broader, real-world clinical settings. This approach serves to verify the model's efficacy and generalizability across different patient populations and settings.
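The internal validation strategy described above can be sketched in Python as a hold-out split followed by k-fold cross-validation on the training portion. This is a minimal illustration under assumed settings (100 samples, a 20% test fraction, 5 folds); the function names are hypothetical, and none of these choices are taken from the reviewed studies.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Training data for this round is every fold except the validation fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


train_idx, test_idx = train_test_split_indices(100, test_fraction=0.2, seed=42)
cv_folds = list(k_fold_indices(train_idx, k=5))
```

Resplitting a single data set in this way is internal validation only. External validation requires applying the final model, unchanged, to data collected independently, for example at a different institution.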
Understanding and interpreting ML models continue to pose challenges. Determining the variables that significantly impact symptom prediction can be elusive due to the intricate prediction processes. Many studies gauge the performance of ML models using metrics that examine their ability to distinguish between 2 classes. From our systematic review of 42 studies, the area under the curve emerged as the predominant metric for the prediction models. Other metrics included accuracy, sensitivity, specificity, positive predictive value, root mean square error, and negative predictive value. These metrics provide a holistic view of a model's efficacy, facilitating its refinement and enabling more precise predictions. However, the diverse emphasis on distinct metrics in numerous studies underscores the need for a uniform approach to evaluating ML models in cancer symptom prediction.
As interest grows in using ML for predicting cancer symptoms, there are several areas that merit deeper investigation. A crucial area is broadening the range of studied cancer sites and more comprehensively correlating symptoms with various treatment methods. To fully understand symptom prediction, it is essential that future studies delve into lesser-explored or infrequently studied cancer sites. Furthermore, the methodologies used for data preprocessing and cleaning should be documented more thoroughly, focusing on best practices to ensure data integrity. As data are foundational to ML models, transparent and detailed preprocessing can improve the reliability and repeatability of these models. Although our analysis highlighted common predictors for symptom forecasting, examining potentially underrepresented or emerging indicators could refine these models further. On the algorithmic front, exploring hybrid ML methods that merge the strengths of multiple algorithms might be particularly beneficial for cancer symptom prediction. Standardizing evaluation metrics across studies would also provide clarity and facilitate a more accurate comparison of various ML techniques. To genuinely progress, collaborations among oncologists, data scientists, and patient advocacy groups are vital to ensure that the developed models are technically robust and clinically pertinent. With these insights, ML stands poised to transform cancer care, creating treatment plans based on patient-focused and accurate symptom prediction models.
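To make the metrics listed above concrete, the following Python sketch computes them for a small set of hypothetical labels and predicted scores. We assume here that "area under the curve" refers to the area under the receiver operating characteristic curve, computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels, scores, 0.5 threshold, and function names are illustrative assumptions only.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics; assumes every denominator is nonzero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc_score(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root mean square error for continuous outcomes such as symptom scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical labels (1 = symptom present) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.35, 0.6, 0.2, 0.1, 0.7]
predictions = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(labels, predictions)
auc = auc_score(labels, scores)
```

In this toy example, the threshold-based metrics all equal 0.75, whereas the area under the curve is 0.9375, which illustrates why studies that report different metrics are difficult to compare directly.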

Limitations
This review is not without its limitations. Although we established clear inclusion and exclusion criteria, potential biases in the studies we analyzed could inherently limit our review. We might have missed or excluded relevant studies due to inadequate information or the absence of keywords in their titles or abstracts. Many of the studies we reviewed did not specify the cancer site, potentially limiting the accuracy and applicability of our findings to specific cancer types. The broad range of predictors used across the studies also made it difficult to draw definitive conclusions about the most influential factors in predicting cancer symptoms using ML algorithms. As such, readers should interpret these results cautiously, given this variability.

Conclusions
ML offers an intriguing potential for predicting cancer symptoms, thereby preemptively mitigating the associated challenges. Predicting the symptoms that people with cancer might experience and determining their onset throughout their treatment journey is a pivotal clinical issue that can enhance patients' quality of life. Notably, all studies in our review were published after 2017, highlighting the nascent nature of this research area. Our investigation primarily sought to outline the ML methodologies harnessed for symptom prediction in people with cancer. While ML techniques hold an edge over traditional statistical approaches by virtue of their prowess in analyzing vast data sets and gauging the efficacy of diverse prediction models, certain impediments such as a limited pool of symptoms; suboptimal data preparation; challenges in feature engineering; and complexities in ML algorithm design, validation, and evaluation can constrain the broad applicability of these predictive models. Future research should pivot toward amplifying the efficacy of ML strategies. This enhancement can be achieved by harnessing expansive, high-caliber data sets; tapping into innovative technologies for data refinement; and sculpting refined models. Harnessing ML can potentially free health care practitioners, including doctors, nurses, and clinic personnel, to accentuate the human touch in managing cancer symptoms.

Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart. ML: machine learning.

Table 1. Details of the included studies (n=42).