Clinical Applications and Future Directions of Minimal Residual Disease Testing in Multiple Myeloma

In recent years, the life expectancy of multiple myeloma (MM) patients has substantially improved thanks to the availability of many new drugs. Our ability to induce deep responses has improved as well, and the treatment goal in patients tolerating treatment has moved from delaying progression to inducing the deepest possible response. As a result of these advances, a great scientific effort has been made to redefine response monitoring, resulting in the development and validation of high-sensitivity techniques to detect minimal residual disease (MRD). In 2016, the International Myeloma Working Group (IMWG) updated MM response categories, defining MRD-negative responses both in the bone marrow (assessed by next-generation flow cytometry or next-generation sequencing) and outside the bone marrow. MRD is an important factor independently predicting prognosis during MM treatment. Moreover, using novel combination therapies, MRD-negative status can be achieved in a fairly high percentage of patients. However, many questions regarding the clinical use of MRD status remain unanswered. MRD monitoring could guide treatment intensity, although well-designed clinical trials are needed to demonstrate this potential. This mini-review will focus on currently available techniques and data on MRD testing and their potential future applications.


INTRODUCTION
The treatment course of multiple myeloma (MM) has improved markedly during the last 20 years: the introduction of modern 3-drug regimens combined with transplantation increased the achievement of deeper responses and the acquisition of minimal residual disease (MRD) negativity in up to 40-50% of patients enrolled in clinical trials (1). Consistently, a large number of studies showed that, among patients achieving a complete response (CR), those with detectable MRD had inferior progression-free survival (PFS) and overall survival (OS) compared to those with undetectable MRD. Moreover, among patients in CR, improved PFS and OS have been significantly associated with undetectable MRD, regardless of disease stage, prior transplant, or cytogenetic risk (2). Therefore, the International Myeloma Working Group (IMWG) recently revised the response criteria and introduced the definition of MRD in CR patients as the persistence or re-emergence of very low levels of cancer cells, equal to about 1 tumor cell in at least 10⁵ normal cells (3). These response criteria are the direct result of the progressive evolution of both imaging and bone marrow MRD techniques in the last 15 years (Figure 1). However, a precise knowledge of when and how to perform MRD detection is required. This review aims to examine the currently available MRD techniques recommended by the IMWG and data from different clinical trials, in order to outline a possible future perspective on the role of MRD testing as a tool for decision making in standard clinical practice.

Bone Marrow Techniques: NGF and NGS
There are two techniques commonly used to detect MRD in the bone marrow (BM): multiparameter flow cytometry (MFC) and next-generation sequencing (NGS) molecular technology. Both techniques have advantages and drawbacks (Table 1). MFC can detect and quantify tumor vs. normal plasma cells using cell surface and cytoplasmic markers. For the identification of plasma cells, the combined use of CD38 and CD138 is recommended, even though both are also expressed on other BM cells. In particular, the aberrant expression patterns of CD19, CD56, CD45, CD38, CD27, CD20, CD28, CD33, CD117 and surface membrane immunoglobulin can characterize the phenotype of monoclonal plasma cells (4). However, antigen expression can vary on plasma cells, and this variability should be considered when interpreting flow data.
Older conventional 4- to 7-color flow cytometry assays have now been replaced by advanced 8-color 2-tube or 10-color 1-tube assays. In this sense, the increased sensitivity of MFC (between 10⁻⁴ and 10⁻⁵) is due to the simultaneous assessment of ≥8 markers in a single tube. In this way, if sufficient cell numbers are evaluated (e.g., ≥5 × 10⁶), it is possible to promptly identify aberrant plasma cell (PC) phenotypes at MRD levels (5).
According to this consensus methodology, it is important to evaluate the limit of quantitation (LOQ) and the limit of detection (LOD) of the NGF-MRD method. The LOQ is calculated as 50 among 10⁷ nucleated cells (based on the identification of ≥50 clonal plasma cells); the LOD as 20 among 10⁷ nucleated cells (based on the identification of ≥20 clonal plasma cells). This evaluation makes it possible to discriminate between positive and negative samples. Interestingly, a baseline sample is not mandatory for MRD evaluation. In a multicenter evaluation of patients with very good partial response (VGPR) or CR, 110 follow-up bone marrow samples showed a higher sensitivity for NGF-MRD, as compared to conventional 8-color flow-MRD: MRD-positive rates were 47 vs. 34% (P = 0.003), respectively. Thus, 25% of patients who were categorized as MRD-negative by conventional 8-color flow were categorized as MRD-positive by NGF. This translated into a significantly longer PFS for MRD-negative vs. MRD-positive CR patients when NGF was used to discriminate between them (P = 0.02). Importantly, NGF can also provide a qualitative assessment of the patient sample by allowing the complete analysis of the normal B-cell compartment and the detection of a significantly decreased number of non-PC BM cells (e.g., mast cells, nucleated red blood cells, myeloid precursors, B-cell precursors, and CD19-normal PC), revealing potentially hemodiluted BM samples. Finally, treatment with CD38 antibodies such as daratumumab and isatuximab can alter the antigen expression in MM cells. This limits the use of CD38 as a marker for the detection of plasma cells during MRD assessments at follow-up. The use of a multi-epitope CD38 antibody in an advanced flow cytometry panel can solve this problem, since this conjugate can bind to a specific site (not covered by daratumumab) of the CD38 antigen.
Nonetheless, in case of CD38 surface downregulation, the solution is the analysis of intracellular CD38 through the same protocol used for intracellular κ- and λ-chain staining (7).
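The LOD and LOQ figures quoted above for NGF reduce to simple ratios of clonal plasma cell events to nucleated cells evaluated. The following Python lines are a minimal sketch of that arithmetic. The function names and the three-way labelling in `classify_sample` are illustrative assumptions and are not part of the consensus method; only the event thresholds (20 and 50) and the 10⁷ denominator come from the text.

```python
def detection_limits(nucleated_cells, lod_events=20, loq_events=50):
    """Return (LOD, LOQ) as fractions of the nucleated cells evaluated.

    The event thresholds (20 and 50 clonal plasma cells) are those quoted
    in the text; the helper itself is hypothetical.
    """
    if nucleated_cells <= 0:
        raise ValueError("nucleated_cells must be positive")
    return lod_events / nucleated_cells, loq_events / nucleated_cells


def classify_sample(clonal_pc_events, lod_events=20, loq_events=50):
    """Label a sample from its clonal plasma cell count (assumed scheme)."""
    if clonal_pc_events >= loq_events:
        return "positive, quantifiable"
    if clonal_pc_events >= lod_events:
        return "positive, below LOQ"
    return "below LOD"


# With 10^7 nucleated cells evaluated, as in the text:
lod, loq = detection_limits(10_000_000)
print(f"LOD = {lod:.0e}, LOQ = {loq:.0e}")  # prints: LOD = 2e-06, LOQ = 5e-06
```

If only 5 × 10⁶ cells were evaluated, the same 20-event threshold would correspond to an LOD of 4 × 10⁻⁶, which illustrates why the number of acquired cells bounds the achievable sensitivity.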
Allele-specific oligonucleotide polymerase chain reaction (ASO-PCR) was first explored to evaluate molecular MRD in MM, but even though its prognostic role was confirmed, several issues limited its use in favor of the NGS technique. First, its applicability ranged from 40 to 60% due to the low rate of diagnostic marker identification, since this technique does not take into account the somatic hypermutation rate of immunoglobulin loci, which translates into sequencing problems. Moreover, the need for patient-specific reagents increased the complexity of this technique (8-11).
NGS was developed to overcome all these disadvantages. The ClonoSEQ® Assay (Adaptive Biotechnologies, Seattle, WA, US) is the most frequently adopted commercial platform in the United States. In this test, DNA is extracted from the patient's BM, a multiplex PCR amplifies VDJ, IgK, and IgL gene sequences, and a common PCR prepares the DNA for sequencing and creates a sequencing library. At the end of the process, a bioinformatic tool is essential to extract and analyze all NGS data.
Using this assay, a "clonotype" is defined by two identical sequencing reads. A clonotype with a frequency >5% at diagnosis is considered a clonal gene rearrangement, thus becoming a target for the detection of MRD in follow-up samples (12,13). In lymphoid malignancies, NGS and ASO-PCR have been compared, showing similar sensitivities and results (13).
In the IFM2009 clinical trial, NGS was compared with 7-color MFC, showing that the higher sensitivity of NGS (10⁻⁶) made it possible to better discriminate outcomes between MRD-positive and MRD-negative patients (3-year PFS: 53 vs. 83%, p < 0.001).
Ongoing clinical trials are evaluating NGS vs. NGF and their correlation: in the CASSIOPEIA trial, a good concordance (83.5% in paired samples) was observed using the same sensitivity (10⁻⁵) regardless of response in patients achieving ≥CR, indicating that both techniques performed similarly in evaluating MRD (14). As illustrated in Table 1, some characteristics can affect the clinician's preference for NGS vs. NGF, such as the higher cost of NGS (∼$1,500 per sample vs. ∼$300 for NGF), and the required time and skills (at least 1 week for NGS vs. 3-4 h for NGF, with a commercial service available only for NGS).
In this regard, ongoing studies are evaluating 'in-house' NGS techniques: recently, Martinez-Lopez et al. described an NGS method that started from 1 µg of DNA and amplified IGH or IGK sequences. The sequencing data were analyzed by specific mathematical and bioinformatic tools to identify and quantify the clonotype present in each sample. A clonotype was identified when at least 400 identical sequencing reads were obtained, or when it was present at a frequency of >1%, with a sensitivity of at least 10⁻⁵ (15).
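The clonotype rule reported for this in-house method (at least 400 identical reads, or a frequency above 1%) can be expressed as a simple counting step. The sketch below is only an illustration under strong assumptions: reads are treated as already-trimmed sequence strings that match exactly, and the function name `call_clonotypes` is hypothetical. It does not reproduce the mathematical and bioinformatic tools used by the authors.

```python
from collections import Counter


def call_clonotypes(reads, min_reads=400, min_frequency=0.01):
    """Return {sequence: frequency} for sequences meeting either threshold.

    A sequence is reported when it has at least `min_reads` identical reads
    or a read frequency strictly above `min_frequency` (thresholds from the
    text; exact string matching is a simplifying assumption).
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        seq: n / total
        for seq, n in counts.items()
        if n >= min_reads or n / total > min_frequency
    }


# Toy input: one dominant rearrangement among 95 unique background reads.
reads = ["CLONE"] * 5 + [f"bg{i}" for i in range(95)]
print(call_clonotypes(reads))  # prints: {'CLONE': 0.05}
```

Passing both thresholds as parameters keeps the sketch usable for other cut-offs, such as the >5% diagnostic frequency mentioned earlier for the commercial assay.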

Imaging Techniques: PET/CT
MM is a patchy disease and BM infiltration may often be heterogeneous. Indeed, ∼60% of MM patients show focal lesions that represent the local accumulation of plasma cells (16). Therefore, the IMWG incorporated imaging in addition to BM evaluation to better characterize MM residual disease (3).
Different studies showed the role of imaging techniques in evaluating focal lesions: magnetic resonance imaging (MRI) is a sensitive, non-invasive imaging technique available to detect the bone involvement in the spine and to provide details regarding the soft tissue disease and the pattern of marrow infiltration (normal, focal, diffuse, or heterogeneous).
Fluorodeoxyglucose positron emission tomography/computed tomography (FDG PET/CT) can be used to analyze the vitality of the focal lesions and is therefore the current standard of care to evaluate the post-therapeutic residual infiltration (17-19).
Different studies showed the prognostic and predictive role of FDG PET/CT (20-22). Interestingly, Moreau et al. compared PET/CT with MRI. Although at diagnosis both techniques performed similarly in the detection of bone lesions, the normalization after therapy of PET/CT, but not of MRI, was predictive of PFS and OS (20). In both responding and nonresponding patients, focal lesions on MRI can still remain positive for many months. As a consequence, conventional MRI is probably not the best technique to evaluate MRD (22-24). On the other hand, functional MRI techniques based on the measurement of the movement of water molecules in the tissue (Diffusion-Weighted MRI, DWI) could be informative on the residual cellularity and the microcirculation of the focal lesions (25). Neither a standardized diagnostic technique nor criteria for interpreting results in MM after therapy are yet available, and no prospective comparison between PET/CT and DWI in a meaningful number of patients has been done. In a small number of MM patients, DWI seemed to be more sensitive in the detection of residual lesions. However, whether this is an advantage or leads to an increased number of false-positive cases still needs to be elucidated (26,27).
Finally, different researchers confirmed the complementarity of PET/CT and BM techniques. Rasche et al. showed that patients who were both Flow-MRD- and PET/CT-negative had the best PFS outcome when compared with those who were Flow-MRD-negative but PET/CT-positive (28). Paiva et al. demonstrated that, even if NGF-negative patients had a long PFS, there was a proportion of subjects who relapsed with extramedullary disease in the presence of a previous negative BM sample, confirming the importance of combining BM and imaging analyses (29).
PET/CT has some limitations, some of which are linked to the tracer used (FDG). Indeed, a low expression of the enzymes responsible for the glycolysis process (e.g., hexokinase 2 gene) in MM cells could lead to false-negative cases with FDG PET/CT (30). Alternative tracers could overcome these limitations. For instance, 11C-Methionine uptake correlates with protein synthesis, a very active mechanism in malignant plasma cells, and can be used as an alternative PET/CT tracer in MM (31).
In a head-to-head prospective comparison in a heterogeneous MM patient population, 11C-Methionine PET/CT was more sensitive than FDG PET/CT in the detection of focal lesions, both within and outside the bone. More data are needed in a homogenous patient population to understand whether this tracer could be an alternative to FDG in the detection of residual disease after treatment. Currently, other tracers targeting lipid membrane (e.g., Choline, Acetate) and CXCR4 are also under study (32).

MRD RESULTS IN THE CLINICAL SETTING: RELEVANT QUESTIONS
In this section we focus on clinically relevant questions regarding MRD, reviewing available data on newly diagnosed MM (NDMM) patients. Single studies are summarized in Table 2. Data on MRD evaluation in relapsed and/or refractory MM patients (59) and high-risk smoldering MM (60) are beginning to emerge as well, and have been recently reviewed elsewhere (61).
In the MM field, a major question concerned the prognostic role of MRD and its ability to perform better than conventionally defined response criteria. As already discussed, there is now compelling evidence coming from multiple studies (Table 2) and two meta-analyses (2,62) confirming that MRD-negative patients have a significantly better PFS and OS compared to MRD-positive patients. The beneficial effect of MRD negativity was also confirmed when focusing on CR patients (2). Using MFC with a sensitivity of 10⁻⁴-10⁻⁵, Lahuerta et al. demonstrated that MRD-negative patients with a conventionally defined CR had better PFS (median, 63 vs. 27 months, p < 0.001) and OS (median, not reached vs. 59 months, p < 0.001) than MRD-positive CR patients (42). Moreover, MRD-positive CR patients had similar outcomes compared to patients achieving a partial response (PR) (median PFS, 27 vs. 29 months; median OS, 59 vs. 65 months, respectively), showing that the prognostic advantage of conventionally defined CR over PR resided in the MRD-negative patient population (42).
The best timing for MRD measurement is another important unanswered question. Usually, MRD is measured at specific timepoints during therapy [e.g., post-induction (39), +100 days post-ASCT (33), post-consolidation (41), pre-maintenance, and during maintenance (46)]. If treatment does not provide for a phase-specific timepoint (as in the case of the continuous treatment strategy commonly adopted for transplant-ineligible patients), MRD testing is usually done at unconfirmed CR/sCR and at fixed timepoints thereafter (50).
Data clearly show that, as we continue to intensify patient treatment, the percentage of MRD-negative patients increases (39,43,53,55,56) and even maintenance treatment can convert a significant percentage of MRD-positive patients into MRD-negative ones [e.g., 27-30% with lenalidomide maintenance in a pooled analysis (9,46)]. Each timepoint can be important for different clinical reasons. For instance, the post-induction timepoint can be used to design clinical trials addressing different intensification regimens, while pre-maintenance or during-maintenance timepoints can be exploited to design clinical trials addressing the intensity and the duration of maintenance. Regarding the prognostic effect of different timepoints, in the Myeloma IX study, which used MFC with a sensitivity of 10⁻⁴, a PFS advantage was found in patients that were MRD-
Frontiers in Oncology | www.frontiersin.org
years for 10⁻⁴), suggesting that MRD level is a continuous rather than a discrete variable (63). Recently, several studies using both flow cytometry-based methods with a sensitivity of 10⁻⁵ (48) or 10⁻⁵-10⁻⁶ (7) and NGS-based methods with a sensitivity of 10⁻⁶ (58,64) demonstrated that lower levels of MRD are associated with better outcomes and that the best possible sensitivity should be pursued. Indeed, in the IFM/DFCI 2009 trial, among 163 patients who were MRD-negative pre-maintenance using MFC with a sensitivity of 10⁻⁴, 84 (56%) were MRD-positive using NGS with a sensitivity of 10⁻⁶ (3-year PFS, 86 vs. 66% in NGS-negative vs. NGS-positive among MFC-negative patients). This is especially important in clinical trials designed to explore treatment interruption based on MRD levels, because a low sensitivity of the technique can lead to an unacceptable risk of undertreating patients.
This observation leads to our last question: if MRD negativity is a major prognostic determinant, do the treatment administered and baseline risk stratification matter as long as MRD negativity is achieved? Many studies demonstrated that even if a more effective regimen induced MRD negativity in a higher number of patients, the prognosis of MRD-negative patients was similar regardless of treatment arm (49,58). However, we do need MRD-driven clinical trials to determine if treatment deintensification in MRD-negative patients is feasible without worsening patient prognosis (65). In this regard, in the Myeloma IX trial, MRD-negative patients (MFC at 10⁻⁴) receiving thalidomide maintenance remained in an MRD-negative state more often than patients not receiving maintenance treatment (96 vs. 68.8%, p = 0.026). Regarding MM patients who are at high risk according to baseline prognostic factors (e.g., high-risk cytogenetics or unfavorable Revised International Staging System score), high-risk patients who were MRD-negative at a low level of sensitivity (10⁻⁴) still showed inferior clinical outcomes compared to standard-risk patients (34). Conversely, reaching MRD negativity at a sensitivity of 10⁻⁵-10⁻⁶ seemed to overcome the inferior outcome observed in high-risk vs. standard-risk patients (48,58). However, it should be noted that high-risk patients require highly intensive regimens in order to achieve a proper level of MRD negativity (47,52,55).

FUTURE PERSPECTIVES
Is MRD a Surrogate Endpoint for Drug Approval?
Improving OS and quality of life is the final aim of MM treatment. In the past years, the PFS endpoint has been used as a surrogate endpoint for OS to speed up the drug approval process. However, following the achievement of longstanding and deep responses (especially in NDMM patients), PFS is becoming an impractically late endpoint. MRD is considered the best candidate as a PFS/OS surrogate marker for provisional drug approval by regulatory agencies. Indeed, the ClonoSEQ® Assay is now authorized by the FDA (66) and MRD negativity with a sensitivity of 10⁻⁵ is the most common primary endpoint of new clinical trials designed for NDMM patients. However, as discussed above, continuous efforts should be exerted to define the optimal sensitivity cut-off (10⁻⁵ vs. 10⁻⁶), the timing of evaluation and the need for a sustained MRD negativity. Moreover, safety should be closely addressed, as demonstrated by the higher MRD-negativity rate (13.4 vs. 1%) but worse OS (HR 2.03, 95% CI 1.04-3.94) in the experimental arm of the BELLINI trial (M14-031) comparing venetoclax-Vd vs. Vd (67,68). Finally, in some settings, the correlation between MRD negativity rates and PFS improvement could be less clear because of technical pitfalls (e.g., early MRD evaluation after myelosuppressive treatments in hypocellular bone marrows).

How to Address Spatial Heterogeneity?
MM is a spatially heterogeneous disease and simultaneous MRD negativization inside and outside the bone marrow showed synergistic predictive values (28).
Moreover, MRD analysis within the bone marrow is done on bone marrow aspirates coming from a single random site and, in some patients, MM cells show a patchy infiltration (69). To overcome this issue and to possibly link the information on residual disease coming from both bone marrow and extramedullary sites, liquid biopsy approaches are beginning to emerge. High-sensitivity detection of circulating tumor DNA (70), circulating plasma cells (71), and M protein peptides (72-74) is currently under exploration. The further optimization of the available techniques will be essential for their future success.
As an example, applying the ClonoSEQ® assay to peripheral blood ctDNA and paired BM samples, Mazzotti et al. showed that residual disease in the peripheral blood was undetectable in 69% of patients with concurrent MRD-positive bone marrow samples (70). This was mainly due to an insufficient sensitivity to detect specific Ig gene rearrangements in the peripheral blood when disease burden was low in the BM (70), underlining the need to improve the technique before we can routinely exploit peripheral blood to monitor MM burden.

MRD-Driven Trials
MRD has not yet entered clinical practice, but it represents an attractive tool to potentially guide treatment choices. To address this hypothesis, many MRD-driven trials are beginning to explore treatment intensification in MRD-positive patients after standard treatment (e.g., NCT03901963) or treatment deintensification in sustained MRD-negative patients (e.g., NCT03710603). Ongoing and future MRD-driven trials will contribute to answering several open questions: is it advisable to consider additional induction cycles until the achievement of MRD negativity in patients who are MRD-positive after 4 induction cycles? Can we perform post-transplant consolidation on the basis of MRD status? Can we stop maintenance after 1 year of sustained MRD negativity?
Ongoing and future clinical trials will evaluate the definition and the role of sustained MRD negativity in treatment decision-making. On the one hand, the achievement of an MRD-negative status does not necessarily mean that treatment should be stopped. Indeed, it should be noted that what we define as "MRD-negative" is an MRD that is undetectable with the current techniques, each of which has a sensitivity limit. This means that we cannot be sure that the disease is eradicated even in MRD-negative cases. On the other hand, an MRD-positive status after treatment raises the question of whether it is necessary to change treatment in order to improve the depth of response. However, before developing response-adjusted treatment strategies based on MRD status (either intensifying or changing treatment for MRD-positive patients, or de-escalating treatment for MRD-negative patients), we need to understand if sustained MRD negativity should be the treatment goal and to define the most appropriate timepoint for its evaluation (after 1 year or after more years).

AUTHOR CONTRIBUTIONS
SO, MD'A, MB, and AL: substantial contributions to the conception or design, acquisition, analysis, or interpretation of data, critical revision for important intellectual content, final approval of the version to be published, and agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. SO, MD'A, and AL: first draft. MB and AL: supervision.