Biomarker Reproducibility Challenge: A Review of Non-Nucleotide Biomarker Discovery Protocols from Body Fluids in Breast Cancer Diagnosis

Simple Summary
Various studies and techniques have been designed to discover biofluid-derived biomarkers for non-invasive early detection and prognosis of cancers. Despite the importance of non-invasive biomarker discovery in cancer diagnosis and management, the reported markers are often inconsistent and irreproducible across different studies and cohorts. In this article, we review the ongoing trends in non-nucleotide biomarker research (lipidomics, proteomics, and metabolomics) based on body fluids, with a focus on breast cancer, and examine the inconsistencies in biomarker discovery pipelines across the pre-analytical, analytical, and post-analytical phases, covering the diversity of approaches from sample processing to predictive modelling and validation.

Abstract
Breast cancer has now become the most commonly diagnosed cancer, accounting for one in eight cancer diagnoses worldwide. Non-invasive diagnostic biomarkers and associated tests are promising candidates to complement or improve current approaches for screening, early diagnosis, or prognosis of breast cancer. Biomarkers detected from body fluids such as blood (serum/plasma), urine, saliva, nipple aspirate fluid, and tears can detect breast cancer at its early stages in a minimally invasive way. The advancements in high-throughput molecular profiling (omics) technologies have opened an unprecedented opportunity for unbiased biomarker detection. However, the irreproducibility of biomarkers and discrepancies among reported markers have remained a major roadblock to clinical implementation, demanding the investigation of contributing factors and the development of standardised biomarker discovery pipelines. A typical biomarker discovery workflow includes pre-analytical, analytical, and post-analytical phases, from sample collection to model development. Variations introduced during these steps impact the data quality and the reproducibility of the findings.
Here, we present a comprehensive review of methodological variations in biomarker discovery studies in breast cancer, with a focus on non-nucleotide biomarkers (i.e., proteins, lipids, and metabolites), highlighting the pre-analytical to post-analytical variables, which may affect the accurate identification of biomarkers from body fluids.

Figure. Variable factors involved in the biomarker discovery pipeline. This schematic flow chart illustrates the influential factors involved in typical pre-analytical, analytical, and post-analytical stages in proteomic, metabolomic, and lipidomic investigations in breast cancer liquid biopsy.

Pre-Analytical Variables
In biomarker discovery studies, body fluid sources, sample collection procedures, handling, preparation steps, and storage conditions are defined as pre-analytical variables [42,43]. Together they form one of the most error-prone, time-consuming, and laborious parts of biomarker identification; they affect the sensitivity, reproducibility, and selectivity of the analysis and need to be carefully considered during project design [44]. In the following subsections, we outline the intricacy of the pre-analytical phase and its significance for biomarker discovery.

Biofluids Are Excellent Sources of Biomarkers
Recently, different types of body fluids have attracted great attention as sources of biomarkers for the detection and monitoring of breast cancer due to their low complexity, simpler sample collection and processing procedures compared to solid tissues, sustainable accessibility, and the ability to be sampled repeatedly in a minimally invasive way [45]. The major challenge in biomarker discovery from body fluids is the identification of biomarkers specific to the type of cancer. For example, a proteomic analysis of five different body fluids by Zhao et al. suggested that the proteome of body fluids may indicate the holistic functions of the whole body rather than that of adjacent tissues [46]. Therefore, the identification of biomarkers released into the body fluid by cancerous lesions may be difficult. Nonetheless, the metabolic changes that occur in the body due to the onset of cancer can be reflected in the metabolic/proteomic profile of body fluids. Furthermore, daily water intake or microbiome profile may alter the protein or metabolite concentration in a patient's body fluid [47,48], and the biomarker concentration may depend on the sample collection method. Thus, the pre-analytical phase of biomarker discovery workflows should be stringently standardised.
The selection of appropriate body fluids depends on the type of omics study (i.e., proteomics, lipidomics, or metabolomics), as one specimen may be advantageous over the other. For example, urine samples, mainly composed of metabolites and end products of biochemical reactions, are more suitable for metabolomic analysis [49]. Furthermore, compared to saliva, which comprises 99% water and 0.3% protein, serum and plasma are more appropriate for proteomic investigations [50]. In the following sections, we discuss the commonly used biofluids for biomarker discovery.

Serum and Plasma
Blood is believed to have the most complex human-derived circulating biomarkers and therefore has attracted considerable research attention. So far, over 12,000 proteins, 600 lipids, and 300 metabolites have been profiled from blood samples [51][52][53], and the concentrations of many circulating analytes were found to be different in plasma and serum [54][55][56]. For example, Liu et al. revealed that some metabolites, including most amino acids, hypoxanthine, carbohydrates, β-hydroxybutyrate, and glycerol-3-phosphate, were significantly lower in plasma compared to serum. In contrast, other metabolic products such as citrate, fumarate, pyruvate, glycerate, nitrogen metabolites, urate, and hydroxylamine were significantly higher in plasma [54]. Furthermore, studies indicate that the total concentrations of several lipids, including triglycerides (TGs), phosphatidylcholines (PCs), and HDL cholesterol, were higher in serum than in EDTA or citrate plasma [57,58].
Breier et al. reported that the reliability of metabolite measurements was slightly higher in serum samples compared to plasma [59]. The reason for this may be the higher metabolite concentration in serum compared to plasma, which provides greater sensitivity for biomarker identification [60,61]. However, the concentration of some metabolites involved in platelet aggregation will be different from their actual level in serum, as the clotting process causes these metabolite levels to increase. Therefore, such metabolites will need to be measured from plasma [58]. Ishikawa et al. demonstrated that plasma is more suitable than serum for studying lipid biomarkers because the clotting process was found to affect serum lipid levels [62]. Moreover, lipids showed the lowest biological variation in citrate plasma samples, implying the suitability of plasma for quantitative targeted lipidomics [60]. Nonetheless, the method and conditions by which the plasma is prepared need to be standardised to avoid detecting differences due to processing time, temperature, type of tubes, centrifuge used, or how the sample is stored (e.g., 4 °C, −20 °C, −80 °C, or snap frozen).
When the blood clot is removed during serum preparation, the concentration of high-abundance circulating proteins, such as fibrinogen, will significantly decrease in serum, making it much easier to detect low-abundance proteins. At the same time, some proteins are released from platelets during the blood coagulation process. This phenomenon can vary sample-to-sample and may lead to the false positive identification of protein biomarkers from serum [63,64]. A study by Tammen et al. suggested citrate plasma or platelet-depleted EDTA plasma for studying the low-molecular-weight proteome [65]. In 2005, the HUPO's Human Plasma Proteome Project (HPPP) recommended using EDTA plasma as the preferred sample for all proteomic analyses [66]. Therefore, it is not possible to measure the biomarkers of interest from plasma and serum interchangeably. Based on the aims of the study and the target biomarker, either plasma or serum may need to be chosen.
As shown in Figure 2, a greater tendency to use serum over plasma has not been observed in breast cancer metabolomics investigations. The number of metabolomics studies that used serum as the biofluid sample of choice was relatively similar to those that utilised plasma samples. However, plasma was the preferred matrix over serum for breast cancer lipidomic investigations, with approximately 60% of the publications reporting on plasma as opposed to approximately 24% reporting on serum. In contrast, serum samples were used in approximately 44% of the studies focusing on proteomics investigations of breast cancer, which is much higher than plasma selection.

Urine
Urine is one of the most widely used human body fluids for routine testing due to its less complex composition [67,68]. Many studies on urine biomarkers for breast cancer screening and diagnosis are still in the discovery phase; hence, further cohort investigations are needed to validate their sensitivity and specificity [22].
There are several types of urine collection approaches, including random, first-morning, second-morning, and 24-h collections [69]. Each kind has unique advantages and disadvantages for metabolomic, proteomic, and lipidomic investigations. Although a random urine sample is presumably the most straightforward collection approach, it is rarely the preferred choice, as depending on the collection time, urine may be excessively diluted due to water intake, and the patient's diet and exercise would have affected its composition [68,70]. The first-morning urine sample is generally considered appropriate for proteomic studies because it contains the largest amount of total proteins [71] and shows the lowest variation compared to the 24-h urine samples [68,72]. Conversely, the midstream second-morning urine collected after an overnight fast is recommended for metabolomic profiling, as the pattern of metabolites in the first-morning urine may reflect nutrients consumed the day before [69,73]. Although urine collection time is a critical factor, it has been neglected by many studies focused on urinary metabolomics in breast cancer [74][75][76]. However, in a few investigations, it has been indicated that first-morning urine collection was used [77,78].
In terms of lipidomic analysis, there is a lack of information demonstrating the characteristics of each urine sample type based on the time of sampling for lipid biomarker discovery. Furthermore, few investigations have been performed on urinary lipidomics in patients with breast cancer, in which the detailed information of urine collection protocols has not been addressed well [79,80].
Another aspect to consider when using urine as the source of biomarkers is the difference in the microbiome composition of the urinary tract and the vaginal tract in women. Microbiome-host interactions can affect the results: the microbiota may produce and secrete proteins, lipids, and other molecules that confound biomarker discovery, and may also metabolise host-secreted biomarkers in the sample. It has been shown that urinary microbiota composition differs by menopausal status in patients with breast cancer [81]. Moreover, regardless of menopausal status, patients with cancer had increased levels of Gram-positive bacteria, including Corynebacterium, Staphylococcus, Actinomyces, and Propionibacteriaceae [81], which may influence the metabolite and protein content of urine.

Tears
Tear composition, especially protein content, can be substantially affected by the sample collection procedure [82][83][84]. Schirmer's test strips (STSs) and microcapillary tubes (MCTs) are the most popular tear sampling procedures [85]. Pieragostino et al. [86] have previously reviewed the advantages and disadvantages of these collection techniques. STSs have been used in most proteomics studies in breast cancer so far [17,18,87]. Results from the analysis by Nättinen et al. [83] indicated that Schirmer strip samples had a ten-fold greater mean total protein content compared to MCT samples. To date, there is no agreement on how tear sampling procedures impact the proteomic data. Sample handling, such as strip cutting, has been shown to increase the risk of contamination and protein loss, making the results even more variable [84]. Therefore, appropriate and reliable tear sampling approaches are needed for the accurate and repeatable detection of tear biomarkers.

Nipple Aspiration Fluid
Nipple aspirate fluid (NAF) in non-lactating women is a fluid secreted by breast epithelial duct cells and can be collected with various degrees of effectiveness, ranging from 34% to 90%, by utilising a milk-expressing pump, nasal oxytocin spray, and gentle breast massage [31,88,89,90]. Proteins are the main components of NAF, with concentrations up to 170 mg/mL, which can be more than that found in plasma [91]. However, there are some challenges when using NAF as a source of protein biomarkers. Firstly, NAF droplets may not be acquired from the duct where carcinogenesis has occurred [92]. Furthermore, it has been shown that the colour and viscosity of NAF can affect biomarker identification when using spectrophotometry approaches [92]. Li et al. proposed that the notable differences in the resulting spectra between NAF samples within a group may stem from several reasons, including the biological variation in the breast duct's microenvironment and variability of the protein concentration in the samples (equal sample volume was examined rather than equal protein concentration) [14]. Given these variations and challenges in NAF sample examinations, it is difficult to cross-compare the findings of different investigations. Another challenge in using NAF is that the microbiome profile and host-microbiome interaction may interfere with biomarker studies. The study by Chan et al. showed that the microbiota composition of NAF significantly differs in patients with breast cancer compared to healthy women, which may affect the multi-omics profile of NAF [93] for biomarker discovery.

Saliva
The safe, non-invasive, and repeatable collection makes saliva a good target for biomarker discovery. Investigations showed significant differences in the level of metabolites in saliva that can be used as biomarkers for breast cancer diagnosis [37,94,95]. However, the exceptionally diverse composition of saliva arising from age, diet, gender, and time of day of the collection makes it a challenging choice of biofluid for biomarker studies.
Protein degradation is one of the main reasons for the irreproducibility of salivary proteomic analyses. The proteolytic degradation commences just as the proteins enter the oral cavity and continues post-collection of salivary samples, leading to substantial differences in biomarker profiles [70]. Furthermore, salivary biomarkers can be affected by the site of collection. For example, Cui et al. showed that the concentration of several metabolites was different in whole saliva, parotid saliva, and submandibular/sublingual saliva [96]. Moreover, Assad et al. propounded that small variations in the collection and storage procedure affect the free amino acid content of saliva as it comprises proteinases and peptidases [97], resulting in irreproducible results between studies.

Extracellular Vesicles
Extracellular vesicles (EVs) are rich sources of circulating biomarkers in blood that have been of interest in many recent studies, with demonstrated utility in breast cancer diagnosis, as reviewed previously [98]. Continuous production, release, and uptake of existing EVs by different types of blood cells, as well as the delay between blood collection and preparation of plasma or serum, need to be considered when EVs are used for biomarker discovery [99,100]. It is shown that physical activity undertaken prior to sample collection, besides other pre-analytical parameters such as collection tube, centrifugation, and storage time, may influence morphology, size, and stability, as well as the downstream characterisation of EVs [101,102]. EV isolation and enrichment are other discriminatory pre-analytical factors in many studies, as there is no established gold-standard protocol to purify and isolate EVs. For instance, centrifugation is one of the main parameters that impact the reproducibility of EV isolation and purification [102,103]. This may complicate cross-comparison between studies as well as the external validation of biomarkers. This lack of standardised guidelines in EV research has triggered international efforts and consortiums, such as EV-TRACK (https://evtrack.org/index.php, accessed on 2 May 2023), to facilitate the standardisation of EV research through increased systematic reporting [104].
According to a study published in 2020 [105], the concentration and size of microvesicles (MVs), which are a sub-type of EVs, differ between plasma and serum. While MVs have lower overall concentrations in serum than in plasma, small-sized MVs are more abundant in serum than large-sized MVs. In another study by Palviainen et al., the protein profiles of EVs differed between serum and plasma [106]. In order to reduce vesicle release from blood cells, most procedures suggest using plasma rather than serum [101]. EVs and MVs in cancer biomarker discovery have previously been reviewed in detail [99,101,107,108]. In breast cancer studies focusing on EVs, plasma was used more often than serum [108], regardless of the type of EV composition.

Sample Collection and Processing Variables Impact the Discovery of Accurate Biomarkers
In addition to biofluid type, other pre-analytical variables, including anti-coagulants, collection tubes, incubation times (pre-centrifugation processing delay), storage time and temperature, and freeze-thaw cycles, can also influence biomarker levels, thereby affecting the analytical reproducibility [42,43,109]. Some distinguished influential variables that may occur during sample collection and handling are presented in Table 2 to highlight the importance of considering these facets in prospective proteomic, metabolomic, and lipidomic studies. The information presented in Table 2 reiterates that the pre-analytical phase should be meticulously controlled and regulated to prevent unfavourable impacts on biomarker discovery and underscores the need for highly standardised protocols.

Trends in Non-Invasive, Non-Nucleotide Biomarker Discovery for Breast Cancer
As discussed above, in biomarker discovery studies, the ease of sample collection, reproducibility, and the variables that influence results are among the critical factors. Biomarker investigations for the detection and prognosis of breast cancer concentrate more on non-invasive approaches than on tissue biopsy. Figure 2 illustrates the proportion of metabolomic, lipidomic, and proteomic investigations carried out on various biofluid samples of breast cancer between January 2001 and April 2023. It demonstrates that the number of studies exploiting non-nucleotide-based biomarkers from various biofluids has increased in the last ten years. Although proteomics has dominated the field for many years, there has been a shift to metabolomics and lipidomics since 2015. Regarding biofluid sources, although various biofluids have been exploited for biomarker discovery, blood (plasma and serum) continues to be the primary biofluid. Notably, serum was the primary source before 2015, and plasma was the primary source from 2015 to 2019. The preference for blood over other biofluids might arise because fewer variables affect study outcomes in blood than in other biofluids; such variables include exposure to air, possible effects of the microbiome on the abundance and composition of analytes, time of collection, and the high proportion of analytes related to adjacent tissues [137][138][139][140].
Furthermore, protocols and analysis pipelines of plasma and serum may be more standardised compared to other biofluids. Proteomics and metabolomics are emerging fields that have expanded rapidly as a result of parallel improvements in bioanalytical platforms and methods for data analysis [141]. As shown in Figure 2, the trend of research using proteomics to identify biomarkers has been overtaken by lipidomics and metabolomics in more recent years. This may be due to the development of new protocols and methods for metabolome and lipidome purification, advances in analytical techniques, and awareness of their potential use for biomarker discovery.

Analytical Techniques for Biomarker Discovery
Apart from the pre-analytical variables, the wide dynamic ranges, sensitivity, and specificity of analytical methods are major challenges in biomarker discovery, which can affect the reproducibility of biomarker identification. For example, because some biomarkers have a very low abundance in the selected biofluid, the sensitivity of the analytical method can limit the number of discovered proteins [70]. Detailed information on commonly used techniques in proteomic, metabolomic, and lipidomic investigations is included in Table 3 and summarised below.

Proteomic Approaches
Proteomic workflows can be categorised as gel-based and gel-free methods coupled with array-based and mass spectrometry-based (MS) techniques [159]. Mass-spectrometry (MS) is the most commonly used approach in proteomic studies of breast cancer [160]. Time-of-flight, triple quadrupole, and orbitrap mass spectrometers can be coupled with different ionisation procedures, including surface-enhanced laser desorption/ionisation (SELDI), matrix-assisted laser desorption/ionisation (MALDI), and electrospray ionisation (ESI) for proteomic applications [160]. Although most of the investigations utilised the SELDI-TOF-MS method for breast cancer diagnosis as a potential discovery method, the reproducibility was questionable due to the low resolution of SELDI-TOF-MS data and chip-to-chip variation. In contrast, MALDI-TOF-MS shows higher reliability and robustness and is favoured in clinical proteomics [161]. However, it is not without limitations; for example, MALDI-TOF-MS is sensitive to impurities such as salt, causing problems with the reproducibility of the results [68].
Two-dimensional gel electrophoresis (2-DE) is a technique widely used in qualitative proteomic investigations of breast cancer [10,28]. However, this technique has some drawbacks, including weak inter-assay reproducibility, low sensitivity for the detection of proteins with very low (<3) or very high (>10) pH values or with very small (<10 kDa) or very large (>150 kDa) molecular masses, as well as the inability to identify hydrophobic and low-abundance proteins [162]. In contrast, the two-dimensional difference gel electrophoresis (2D-DIGE) approach has demonstrated higher sensitivity and improved reproducibility [155].
Other factors, such as diversity in binding/washing buffer conditions and the chemistry of ProteinChip surfaces, can influence the binding and identification of various proteins, leading to discrepancies in biomarker discovery [27]. For example, IMAC3 (Immobilized Metal Affinity Capture) chips capture proteins via chelation of metal ions, whereas H4 chips absorb by hydrophobic interaction; consequently, the proteins captured by these chips are distinct and would lead to irreproducible results [163,164]. Therefore, analytical procedures should be standardised among research and clinical laboratories for a precise interpretation and interlaboratory comparison of data.

Metabolomic Approaches
Two main analytical techniques are commonly employed in metabolomic investigations: mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy [32]. Although NMR can measure metabolites with high reproducibility in complex samples without prior preparation of biological fluids, it shows low sensitivity [165]. Mass spectrometry techniques used for breast cancer studies include ultrahigh performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-QTOF-MS) [166,167], gas chromatography-mass spectrometry (GC-MS) [168][169][170], liquid chromatography-mass spectrometry (LC-MS) [8,171], and ultra-fast liquid chromatography-tandem mass spectrometry (UFLC-MS/MS) [172]. Of these, LC-MS and GC-MS have been frequently applied to biofluids [173]. LC-MS stands as the most suitable approach for the sensitive identification of biomolecules with high reproducibility [174], while GC-MS shows relatively stronger chromatographic performance with distinct peak separation [175].

Lipidomic Approaches
Technological advancements in liquid chromatography, high-resolution accurate mass spectrometry, and NMR spectroscopy have improved the high throughput analysis of lipid molecules [176]. Many mass-spectrometry-based approaches are used in lipidomic studies, each with unique characteristics, advantages, and disadvantages [177]. Mass spectrometry imaging (MSI), direct infusion or shotgun MS, and MS accompanied by initial chromatographic separation such as GC, LC, and thin-layer chromatography (TLC) are the main three infrastructures of lipidomic investigations [147]. Shotgun MS, in which the analyte is not separated by prior chromatography, performs poorly in detecting less-ionisable and low-abundant lipids due to ion suppression, during which the signals stemming from weakly ionised lipid species are buried in the signal of strongly ionised lipids [178,179]. However, the detection of such lipids can be improved by a pre-separation approach, such as LC-MS, which has demonstrated high sensitivity, specificity, and remarkable separation efficiency for lipids [147].

Data Pre-Processing
Mass spectrometry-based techniques have become the mainstream methods for high-throughput and unbiased proteomics, metabolomics, and lipidomics profiling. Several forms of proprietary and open-source software have been developed for data acquisition and quantification, as discussed elsewhere [180,181]. These tools have different underlying assumptions and algorithms for searching (e.g., database vs. de novo) and molecular species quantification [182], which contributes to the discrepancy of generated data across different studies. A comprehensive benchmarking is required to compare data acquisition and quantification techniques and to provide a guideline for the best practices.
Once quantified, high-throughput spectrometry or spectroscopy data are often subject to multiple pre-processing steps to stabilise variance, reduce systematic bias or technical variations, and impute missing data. The choice of pre-processing approach can substantially affect the data quality and validity of downstream analyses. For instance, Mertens [183] argued in favour of log-transformation to mitigate skewness and standardise spectrometry data, and raised concerns that so-called "closure normalisation" (e.g., normalising data by the sum of the combined expression) introduces spurious biases in the correlations between spectral measures, masking true population associations. Nonetheless, the diversity of the available pre-processing statistical approaches demands benchmarking studies to systematically investigate their effect on the quality of data and the reproducibility of the biomarkers identified. Välikangas et al. [184], for instance, evaluated normalisation methods in quantitative label-free proteomics and demonstrated the variations in outcomes of downstream analyses (e.g., differential expression) depending on the choice of the normalisation method. Despite the importance of pre-processing, we frequently observed unclear and incomplete descriptions of the approaches undertaken in the literature we have reviewed in relation to the non-nucleotide biomarkers of breast cancer (Table 1 and Supplementary Table S2).
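The two pre-processing effects mentioned above (skewness reduction by log-transformation and spurious correlations under closure normalisation) can be illustrated with a minimal sketch. The data are simulated, mutually independent log-normal intensities and do not come from any study; the function names, the pseudocount, and the distribution parameters are assumptions made only for illustration. Because closure forces each sample to sum to one, the covariances of any feature with the remaining features must sum to minus its own variance, so negative associations appear even between independent features.

```python
import math
import random


def log2_transform(values, pseudocount=1.0):
    """Log2-transform intensities; the pseudocount is an assumed guard against log(0)."""
    return [math.log2(v + pseudocount) for v in values]


def closure_normalise(sample):
    """Divide each intensity in one sample by that sample's total signal."""
    total = sum(sample)
    return [v / total for v in sample]


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def pearson(x, y):
    """Pearson correlation coefficient."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def skewness(x):
    """Moment-based sample skewness (close to 0 for symmetric data)."""
    mean = sum(x) / len(x)
    m2 = sum((v - mean) ** 2 for v in x) / len(x)
    m3 = sum((v - mean) ** 3 for v in x) / len(x)
    return m3 / m2 ** 1.5


def simulate_intensities(n_samples=500, n_features=3, seed=0):
    """Simulated, mutually independent log-normal intensities (not real data)."""
    rng = random.Random(seed)
    return [[rng.lognormvariate(10.0, 1.0) for _ in range(n_features)]
            for _ in range(n_samples)]


def column(matrix, j):
    """Extract feature j across all samples."""
    return [row[j] for row in matrix]


raw = simulate_intensities()
closed = [closure_normalise(sample) for sample in raw]

print("skewness of feature 0, raw:", skewness(column(raw, 0)))
print("skewness of feature 0, log2:", skewness(log2_transform(column(raw, 0))))
print("correlation of features 0 and 1, raw:",
      pearson(column(raw, 0), column(raw, 1)))
print("correlation of features 0 and 1, after closure:",
      pearson(column(closed, 0), column(closed, 1)))
```

In the extreme case of only two features, the closed values are perfectly anti-correlated by construction, which makes the artefact easy to recognise; with many features the induced bias is smaller per pair but still present.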

Biomarker Signature Panel Identification (Feature Selection)
From the computational perspective, signature panel identification can be formulated as a feature selection or extraction problem, which implies the selection of a set of molecules (e.g., proteins, lipids, or metabolites) that best stratify the groups of interest (e.g., cancer vs. control) or the extraction of latent features from the entire omics profile (e.g., embeddings derived via dimensionality reduction). Feature selection has been historically performed via differential analysis (i.e., statistical hypothesis tests such as the t-test or Mann-Whitney U test). However, while differential analysis can detect functionally relevant molecules, it is ineffective in selecting features with optimal predictive power [185], as it is a univariate approach that overlooks nonlinear relationships among multiple biomarkers, whose collective effect contributes to the prediction of a phenotype, disease outcome, or treatment response. Several sophisticated machine learning-based methods have been developed by the computer science community for feature extraction or selection of predictive variables from high-dimensional data, which can substantially enhance signature panel identification and the development of predictive models and cancer diagnostics, as previously benchmarked [186]. Despite the proven utility of machine learning and nonlinear, multivariate feature selection in identifying biomarker signatures with high sensitivity and specificity, statistical hypothesis testing has been the dominant approach adopted in non-nucleotide breast cancer biomarker discovery, as outlined in Supplementary Table S2.
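The limitation of univariate testing described above can be made concrete with a deliberately contrived, noise-free toy cohort. All values, marker names, and the decision rule below are hypothetical and chosen only to illustrate the point: each marker has identical distributions in both groups, so a per-marker test finds no difference, yet the two markers taken together separate the groups completely.

```python
import math


def mean(values):
    return sum(values) / len(values)


def welch_t(x, y):
    """Welch's t statistic for two independent samples."""
    var_x = sum((v - mean(x)) ** 2 for v in x) / (len(x) - 1)
    var_y = sum((v - mean(y)) ** 2 for v in y) / (len(y) - 1)
    return (mean(x) - mean(y)) / math.sqrt(var_x / len(x) + var_y / len(y))


# Hypothetical toy cohort: (marker_a, marker_b, label), where label 1 = cancer.
# Cancer samples have exactly one elevated marker; controls have both or neither.
cohort = [(0.0, 0.0, 0), (1.0, 1.0, 0), (0.0, 1.0, 1), (1.0, 0.0, 1)] * 5


def values_by_label(data, marker_index, label):
    """Values of one marker restricted to one group."""
    return [row[marker_index] for row in data if row[2] == label]


def joint_rule(marker_a, marker_b):
    """Multivariate rule: predict cancer when the two markers disagree."""
    return 1 if abs(marker_a - marker_b) > 0.5 else 0


def accuracy(data, rule):
    """Fraction of samples for which the rule reproduces the label."""
    return mean([1.0 if rule(a, b) == label else 0.0 for a, b, label in data])


for index, name in enumerate(["marker_a", "marker_b"]):
    t = welch_t(values_by_label(cohort, index, 1), values_by_label(cohort, index, 0))
    print(name, "Welch t statistic:", t)
print("accuracy of joint rule:", accuracy(cohort, joint_rule))
```

Real omics data are noisy and rarely this clean, so the sketch should be read as an intuition for why multivariate selection can recover panels that differential analysis misses, not as a recommended selection procedure.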

Biomarker Predictive Modelling (Classification)
After feature selection (or extraction), the identified biomarker signature panel can be used as the predictive variables of a classifier that stratifies patients into categories of interest (e.g., cancer vs. normal). A classifier typically implements a mathematical function that maps input data to a category after learning from a training cohort. Different classifiers have been implemented as multivariate cancer diagnostic models, including commonly used algorithms such as random forests, support vector machines, logistic regression, artificial neural networks, and ensemble approaches (i.e., predictive models composed of a weighted combination of multiple classifiers) [187]. For a long time, improving prediction accuracy has been the primary focus of predictive modelling in biomarker discovery. However, biomarker discovery methods should be assessed on robustness, defined as the generalisability of the model to diverse cohorts, as well as on prediction accuracy. In recent years, the stability of biomarker discovery has gained more attention, as reviewed previously [188]. Nonetheless, in breast cancer liquid biopsy studies, the adoption of classifiers as diagnostic models has been limited (Supplementary Table S2), contributing to the lack of highly predictive and robust diagnostic tests.
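As a minimal illustration of a classifier as "a function learned from a training cohort", the sketch below fits a logistic regression by plain gradient descent on a synthetic two-marker panel and reports sensitivity and specificity on a separate synthetic test set. The data-generating assumptions (cases shifted upwards by 1.5 units in both markers), the learning rate, and all function names are illustrative choices and do not come from the reviewed studies.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x, threshold=0.5):
    """Map a marker vector to a class label (1 = cancer, 0 = control)."""
    score = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return 1 if score >= threshold else 0


def make_cohort(n, rng, shift=1.5):
    """Synthetic two-marker panel: cases are shifted upwards in both markers."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(shift * label, 1.0), rng.gauss(shift * label, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(7)
train_X, train_y = make_cohort(300, rng)
test_X, test_y = make_cohort(200, rng)

w, b = fit_logistic(train_X, train_y)
preds = [predict(w, b, x) for x in test_X]

tp = sum(p == 1 and t == 1 for p, t in zip(preds, test_y))
tn = sum(p == 0 and t == 0 for p, t in zip(preds, test_y))
sensitivity = tp / sum(test_y)
specificity = tn / (len(test_y) - sum(test_y))

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

The decision threshold of 0.5 is itself an assumption. In a diagnostic setting it would normally be tuned to the required balance between sensitivity and specificity.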

Clinical Validation
Extensive validation is necessary before the clinical implementation of a diagnostic test. Validation of a predictive model using the dataset at hand (referred to as the development dataset) is often referred to as internal validation, wherein the dataset is divided into training and test sets, with the former used for model development and optimisation and the latter for model validation. In addition, to mitigate model overfitting, particularly in small datasets, data re-sampling techniques such as bootstrapping or cross-validation can be used to account for selection bias and to quantify the stability of the predictive performance [189].
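Cross-validation as a form of internal validation can be sketched as follows. The example assumes a synthetic cohort of 120 samples with five markers, of which only the first two carry signal. A nearest-centroid classifier is used purely for brevity, and the number of folds (k = 5) is an arbitrary choice. None of these settings are prescribed by the cited guidance [189], and all function names are hypothetical.

```python
import random
import statistics


def make_cohort(n, rng, shift=1.5):
    """Synthetic cohort: 5 markers, cases shifted in the first two only."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(shift * label if j < 2 else 0.0, 1.0)
                  for j in range(5)])
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid classifier: store the mean profile of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, X, y):
    """Fraction of samples assigned to their true class."""
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def cross_validate(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and score each held-out fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # The model is refitted from scratch using the training folds only.
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model,
                               [X[i] for i in held_out],
                               [y[i] for i in held_out]))
    return scores


rng = random.Random(3)
X, y = make_cohort(120, rng)
scores = cross_validate(X, y, k=5)

print("fold accuracies:", [round(s, 2) for s in scores])
print(f"mean = {statistics.mean(scores):.2f}, sd = {statistics.stdev(scores):.2f}")
```

The spread of the fold accuracies gives a rough indication of how stable the performance estimate is. Any data-driven step, such as feature selection or normalisation parameters, would need to be repeated inside each training fold for the estimate to remain unbiased.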
Based on our literature review, the majority of breast cancer liquid biopsy studies have only reported the prediction performance of biomarkers upon internal validation, which is not sufficient to confirm model generalisability. In order to progress towards implementation and technology readiness, extensive external validation is required, wherein the model's predictive performance is quantified using data collected from participant cohorts external, temporally and/or geographically, to the development dataset [189].
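The gap between internal and external validation can be illustrated with a deliberately simple simulation. The sketch below assumes a single synthetic marker and a hypothetical external cohort whose measurements carry a constant technical offset (for example, from a different site or sample-handling protocol). The offset of 1.5 units and the cut-off rule are illustrative assumptions, not estimates from the literature.

```python
import random


def make_cohort(n, rng, offset=0.0):
    """Synthetic one-marker cohort; `offset` mimics a site-specific shift."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label + offset, 1.0), label))
    return data


def fit_threshold(data):
    """Cut-off halfway between the class means of the development cohort."""
    cases = [v for v, t in data if t == 1]
    controls = [v for v, t in data if t == 0]
    return (sum(cases) / len(cases) + sum(controls) / len(controls)) / 2


def accuracy(threshold, data):
    """Fraction of samples on the correct side of the cut-off."""
    return sum((v > threshold) == (t == 1) for v, t in data) / len(data)


rng = random.Random(11)

# Development dataset, split into training and internal test sets.
development = make_cohort(600, rng)
train, internal_test = development[:400], development[400:]

# Hypothetical external cohort measured with a systematic technical offset.
external = make_cohort(400, rng, offset=1.5)

cutoff = fit_threshold(train)
internal_acc = accuracy(cutoff, internal_test)
external_acc = accuracy(cutoff, external)

print(f"internal test accuracy: {internal_acc:.2f}")
print(f"external test accuracy: {external_acc:.2f}")
```

The internal test set shares the measurement conditions of the training data, so it cannot reveal this kind of cohort-specific shift. Only an independent cohort exposes it, which is why internal validation alone does not establish generalisability.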
Besides the validation of the prediction models, the analytical parameters should be optimised and then validated according to regulatory guidelines [190,191]. The clinical performance of the test should then be compared to the gold-standard method, e.g., mammography [192,193]. Once the technology is implemented, prospective clinical studies should be conducted to assess whether the assay improves patient outcomes and reduces healthcare costs [192,194].

Conclusions and Future Perspective
Over the last decades, biofluid biomarker discovery for early breast cancer diagnosis has focused mainly on nucleotide-based biomarkers. However, in recent years, the investigation of proteomics, lipidomics, metabolomics, and microbiome profiles, along with EV cargo, has increased, introducing new biomarker profiles not only for blood but also for other types of body fluids, as we have comprehensively reviewed here. We also reviewed the effect of different procedures, from sample collection and processing to data analysis and validation. The lack of standard protocols at different stages of biomarker discovery can be a key factor hindering clinical implementation and the manufacturing of commercialisable assays or clinical tests. Therefore, one of the future efforts in breast cancer biomarker studies should be to standardise liquid biopsy assay procedures and analysis platforms. This will make it easier to combine and compare results from different studies and to develop breast cancer liquid biopsy consortia that advance and validate liquid biopsy technologies, harmonise guidelines, and standardise data for the development of breast cancer biomarkers. Some initiatives have already been implemented by the National Institutes of Health (https://prevention.cancer.gov/majorprograms/liquid-biopsy-consortium, accessed on 2 May 2023), targeting early-stage detection across a wide range of cancer types.
Due to the ongoing advances in non-invasive biomarker discovery, technology, and data analytics, the field is moving towards multi-omics liquid biopsy and non-invasive tests on blood (or other bodily fluids) that simultaneously assess different omics data (e.g., genomics, transcriptomics, and proteomics) for cancer detection and monitoring. Multi-omics approaches could provide complementary information on the dysregulated bodily processes leading to disease, enabling early detection of tumours, and they have demonstrated utility in enhancing the sensitivity and specificity of cancer detection by constructing a fuller picture [195]. Despite its advantages, multi-omics liquid biopsy is facing slow adoption and implementation. So far, only a limited number of studies using this approach for breast cancer identification have emerged over the last few years (Supplementary Table S1). One major obstacle is limited sample availability and/or the technical difficulty of generating complete multi-omics datasets, due to the uneven maturity of different omics approaches. Moreover, the growing gap between the large volumes of data being generated and the available data processing capacity and integrated datasets is of concern. Additional efforts are needed to standardise multi-omics operational procedures, from robust pre-processing and operational guidelines to data integration and validation.

Data Availability Statement: All data generated or analysed during this study are included either as a supplementary file or are publicly available and properly referenced in the manuscript.