Sentinel lymph node (SLN) biopsy has become standard care for lymph node staging in breast cancer, represents a substantial step in the evolution of breast cancer surgery toward greater conservatism, and is one of the great success stories in contemporary surgical oncology. The most salient surgical questions (feasibility, accuracy, case selection, technique, and morbidity) have been asked and answered, and it is increasingly difficult to generate debate on any of them. In contrast, most aspects of SLN pathology remain controversial and elude consensus. Is intraoperative assessment worthwhile? Which method (frozen section, touch prep, or smear) is best? How should SLN be processed for permanent pathology [single-section hematoxylin and eosin (H&E) and/or serial sections and/or immunohistochemistry (IHC)]? What is the prognostic significance of SLN micrometastases, especially those detected only by IHC, or as pN0i+ disease (≤0.2 mm in size)? Is completion axillary lymph node dissection (ALND) required for all patients whose SLN are positive on final pathology? Is there a low-risk group for whom ALND is unnecessary, and can we reliably identify this group? All of these issues are highly interrelated, but the last remains the most perplexing for surgeons and is the subject of a substantial literature. What have we learned?

Predicting Metastasis to Non-SLN

Non-SLN metastases are present in 40–50% of SLN-positive patients1,2 and are predicted by the same variables that predict metastasis to the SLN (or to axillary nodes in general): the most important are tumor size and lymphovascular invasion (LVI). Non-SLN metastases are also predicted by the characteristics of the SLN metastasis: the most important are method of detection (frozen section, H&E, serial sections, or IHC), size of SLN metastasis [≤0.2 mm (pN0i+), 0.2–2 mm (pN1mi), >2 mm (pN1)], number of positive SLN, presence of extranodal invasion, and number of negative SLN removed. Many papers predict non-SLN metastases on the basis of one or more variables, as previously summarized by Van Zee (20 studies in 26–702 patients).3 A meta-analysis by Degnim et al.4 (11 studies in 60–389 patients) found that non-SLN metastasis was most strongly associated with tumor size, LVI, more than one positive SLN, SLN metastasis >2 mm, and extranodal extension. A meta-analysis by Cserni et al.5 (25 studies) found that non-SLN metastases were present in 20% of patients with low-volume SLN disease and in 9% with SLN metastases detected only by IHC. Finally, a large series by Viale et al.6 of 1,228 SLN-positive patients found that the risk of non-SLN metastasis in patients with the most favorable combination of predictive factors was no less than 13%.

The MSKCC Nomogram

The prediction of non-SLN status on the basis of one or a few variables is problematic, with risk estimates that vary widely between studies. The most logical response is to develop a multivariate model from a large dataset, and to validate it in a separate cohort of patients. Van Zee et al.7 have done so, drawing on our own experience in 1,075 SLN-positive patients who had a completion ALND. Multivariate logistic regression was used in 702 patients to develop a multivariate nomogram, the Memorial Sloan–Kettering Cancer Center (MSKCC) nomogram. The eight variables include tumor size, type/grade, LVI, multifocality, estrogen receptor (ER) status, method of SLN metastasis detection, number of SLN positive, and number of SLN negative. This model was validated prospectively in another 373 patients, and in a calibration plot there was good agreement across a wide range of probabilities between the predicted and the observed rates of non-SLN metastasis. The MSKCC nomogram (along with a more recent nomogram for the prediction of SLN metastasis8) is available online (www.mskcc.org/nomograms) in the form of a simple calculator.

Validating the MSKCC Nomogram

The MSKCC nomogram is simply a test to predict the probability of non-SLN metastasis. The most direct validation is to compare the predicted and observed rates of non-SLN metastasis; in our own calibration plots, the nomogram performs well across a broad range of predicted rates.7 A more comprehensive measure of its performance is the receiver operating characteristic (ROC) curve. ROC curves were developed in World War II as part of a field called signal detection theory, and were used to evaluate the ability of radar operators to distinguish between the signals of enemy and friendly ships. They are now widely used in medicine to assess test performance; we recommend a particularly lucid online discussion of this topic by Tape.9

The ROC curve plots the true-positive rate (sensitivity) on the Y axis against the false-positive rate (1 − specificity) on the X axis (Fig. 1). A test’s accuracy (ability to distinguish between patients with and without a condition) is represented by the area under the curve (AUC). An AUC of 1 (a curve that rises along the Y axis and then runs along the top of the plot) indicates a perfect test with no false-positive results; an AUC of 0.5 (a diagonal line at 45°) indicates a worthless test with equal rates of true-positive and false-positive results, the equivalent of a coin toss.
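The construction described above can be made concrete with a minimal Python sketch. It is an illustration only, not part of the nomogram analysis: the function names `roc_points` and `auc_trapezoid` and the toy scores are assumptions introduced here, and none of the numbers are patient data.

```python
def roc_points(scores, labels):
    """Return ROC points as (false-positive rate, true-positive rate) pairs.

    scores: predicted probabilities from a test.
    labels: 1 if the condition is present, 0 if absent.
    Each distinct score serves as a threshold; a case is called
    positive when its score is at or above the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): complete separation versus identical scores.
perfect = auc_trapezoid(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
worthless = auc_trapezoid(roc_points([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
print(perfect, worthless)  # 1.0 0.5
```

In this toy example, a test that separates the two groups completely traces the left and top edges of the plot (AUC = 1), whereas a test that gives every patient the same score traces the diagonal (AUC = 0.5).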

Fig. 1 A comparison of ROC curves for excellent, good, and worthless diagnostic tests. In a worthless test (AUC = 0.50), there are equal numbers of true-positive and false-positive results; adapted from Tape.9

Most diagnostic tests, including our nomogram, are imperfect and have an AUC somewhere in between. The MSKCC nomogram AUC of 0.77 means that, between two randomly selected SLN-positive patients of whom one has a positive non-SLN, the nomogram would correctly identify that patient 77% of the time. The MSKCC nomogram has been validated by 15 studies worldwide (Table 1),7,10–22 including that of Poirier et al.22 in this issue of the Annals of Surgical Oncology, and other validation series are certain to follow. The nomogram has proved robust despite differences in patient demographics, clinical characteristics, surgical technique, and pathologic processing. The AUC values range from 0.58 to 0.86 and, as one might expect, the highest (0.82 and 0.86) and lowest (0.58) values come from studies with fewer than 100 patients.
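The pairwise reading of the AUC given above (one patient with and one without non-SLN metastasis) can be checked directly by counting pairs. The following Python sketch is illustrative only: `concordance_auc` is a hypothetical helper, the scores are invented, and counting tied pairs as one half is a common convention assumed here rather than something stated in the validation studies.

```python
from itertools import product


def concordance_auc(scores_with, scores_without):
    """Fraction of (with, without) pairs in which the patient who has
    the condition receives the higher predicted probability.
    Tied pairs count as one half (an assumed convention)."""
    pairs = list(product(scores_with, scores_without))
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return wins / len(pairs)


# Hypothetical predicted probabilities, not patient data.
with_non_sln = [0.9, 0.6]      # patients with a positive non-SLN
without_non_sln = [0.7, 0.3]   # patients with negative non-SLN
print(concordance_auc(with_non_sln, without_non_sln))  # 0.75
```

Three of the four possible pairs are ranked correctly, so the toy AUC is 0.75; an AUC of 0.77 has the same meaning over all such pairs in a validation series.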

Table 1 Results of series validating the MSKCC nomogram

Critiquing the Nomogram: Can We Do Better?

Nomogram development lends itself to statistical tweaking, and many groups have raised caveats and proposed improved models. Degnim et al.11 selected subsets of their patients in whom the MSKCC nomogram predicted ≤5% and ≤10% probabilities of non-SLN metastases, and observed false-negative rates of 17% and 11%, respectively; using a modified model, they were able to reduce the false-negative rate for low-risk patients, but without increasing the AUC. Alran et al.17 observed an AUC of 0.72 for all of their patients, but only 0.52 for patients with SLN micrometastases (≤2 mm), and concluded that the nomogram was not reliable in low-probability cases. Kohrt et al.,21 using three different statistical techniques, developed a model based on three variables (tumor size, LVI, and SLN metastasis size) which outperformed the MSKCC nomogram in their own patients, with AUCs of 0.83–0.85 versus 0.77. Of note, only 60% of their 285 patients had complete pathologic data. Coutant et al.23 reported a scoring system based on three variables (tumor size, presence/absence of macrometastasis, and no. of positive SLNs divided by no. of SLNs removed), and achieved an AUC of 0.82. Pal et al.19 added SLN metastasis size to their model and improved the AUC from 0.68 (MSKCC) to 0.84. Finally, Dauphine et al.13 (in 39 patients) applied three different scoring systems, observed AUCs of 0.63, 0.70, and 0.68, and advised caution in the application of each.

It is worth emphasizing that subset analyses of nomogram performance are problematic. Alran et al.17 observed a poor AUC (0.52) for patients with SLN micrometastases, but their own unpublished data (S. Alran, personal communication) indicate good agreement between the observed and predicted rates of non-SLN involvement for patients with SLN micrometastases ≤2 mm (14% observed versus 10% predicted) and for patients with SLN positive only on IHC (11% observed versus 9% predicted). In this issue of the Annals of Surgical Oncology, Poirier et al.22 report that, for their patients with nomogram scores of ≤10% (18% of all cases), the observed rate of non-SLN metastasis was 13% [95% confidence interval (CI) 2–24%]. Based on a small sample size (n = 37) with wide confidence intervals, this degree of variation is not at all surprising. Taken together, these studies demonstrate that: (1) a low AUC in a small subset indicates lack of precision in being able to discriminate between two numbers within a small range (for example 11% versus 9%), a deficiency which is not clinically relevant, and (2) an observed rate of non-SLN metastasis lower or higher than the predicted value neither proves nor disproves the value of the nomogram. In fact, the nomogram performed well in both studies.
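The width of the interval quoted above follows from the sample size alone. As a rough check, the Python sketch below applies the normal-approximation (Wald) interval to the rounded figures given in the text (13%, n = 37). That this is the method Poirier et al. used is an assumption; the function name `wald_interval` and the larger comparison sample of 370 are hypothetical.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion,
    clipped to the range [0, 1]."""
    se = math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Rounded figures from the text: observed rate 13%, n = 37.
low, high = wald_interval(0.13, 37)
print(round(low * 100), round(high * 100))  # 2 24

# Hypothetical sample ten times larger with the same observed rate,
# to show how the interval narrows as n grows.
low_big, high_big = wald_interval(0.13, 370)
print(high - low > high_big - low_big)  # True
```

Under these assumptions the interval reproduces the reported 2–24%, which shows how little a subset of 37 patients can say about whether the true rate lies above or below 10%.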

There is to date no standardized methodology for pathologic analysis of SLN, and this inconsistency may account for at least some of the observed variation in nomogram performance. We began to perform SLN biopsy in 1996, adopted a pathologic protocol using serial sections/IHC,24 and, in the MSKCC nomogram, categorized SLN metastasis by pathologic detection (frozen section, routine H&E, serial sections, IHC-only). Current American Joint Committee on Cancer (AJCC) staging25 categorizes nodal metastases on the basis of size (≤0.2 mm, 0.2–2 mm, >2 mm) and we are now updating the nomogram on this basis. We agree with Turner et al.26 that AJCC lymph-node staging is subject to wide interpretive variation, and that there is substantial room for improvement. Lymph node staging should be simple, reproducible, cost effective, and clinically relevant; in current practice, we have not yet achieved these goals.

The Nomogram in Everyday Practice: What It Can and Cannot Do

First, the MSKCC nomogram was designed to estimate the probability of non-SLN metastases in SLN-positive patients, not to determine with certainty that non-SLN disease is (or is not) present. It is crucial to recognize that the nomogram correctly discriminates between randomly selected patients with and without non-SLN metastases in about three-quarters of cases, i.e., that it is not perfect.

Second, the MSKCC nomogram is superior to clinical judgment. In two separate studies using hypothetical scenarios,27,28 the nomogram outperformed clinician “guesstimates” (in one of these,27 the AUC values for nomogram versus clinicians were 0.72 versus 0.54, P < 0.01). It is again crucial to recognize that the nomogram is a more accurate guess than clinical judgment, but that it remains a guess.

Third, the MSKCC nomogram cannot tell us (or our patients) what to do; there is no cutoff nomogram score which mandates the performance of an ALND. When Poirier et al.22 in this issue of the Annals of Surgical Oncology state that 71% of Quebec surgeons would not perform an ALND for nomogram scores of ≤10%, they imply a 10% cutoff. While we have observed a declining rate of ALND in our SLN-positive patients and lower nomogram scores in SLN-positive patients who did not have ALND compared with those who did (10% versus 37%), the range in nomogram scores for the no-ALND patients was wide (1–89%).29 The decision for ALND in SLN-positive patients should be individualized considering multiple factors (patient age, comorbidities, anxiety level, and implications for systemic therapy, among others), and not based on the nomogram score alone.

Future Directions: Are We Asking the Wrong Question?

At present, the most important reasons for ALND in SLN-positive patients are to guide systemic therapy and to prevent local recurrence. The decision for systemic therapy is multifactorial, and, for SLN-positive patients, is infrequently changed by the discovery of additional positive nodes. For the occasional SLN-positive patient in whom systemic therapy might be changed, completion ALND is reasonable. In our opinion, this assessment is better made by the medical oncologist than the surgeon.

Regarding local control, six series (n = 583, 2003–2007) of selected SLN-positive/no-ALND patients report axillary local recurrence (LR) of 0.5% at a median follow-up of 31 months, results quite comparable to those in 14 series (n = 3,802, 2004–2007) of SLN-negative/no-ALND patients: 0.3% at 47 months’ follow-up.30 In our own series of SLN-positive/no-ALND patients,29 we observed axillary LR as a first event in 1% (3 of 287 patients) at follow-up of 23 months. It seems inconceivable that these very low rates of axillary LR would ever reach 10%, the level at which LR had a detectable adverse effect on survival in the most recent Early Breast Cancer Trialists’ Collaborative Group overview.31 Across the burgeoning literature validating the MSKCC nomogram, we are in effect asking the question, “Which SLN-positive patients do not need ALND?” It is quite clear from the data above that at least some do not and that practice patterns are changing. The American College of Surgeons Oncology Group (ACOSOG) Z0011 trial32 (a randomization of SLN-positive patients to ALND versus observation) closed early due to slow accrual and low event rates, but was ahead of its time in asking a better question: “Which SLN-positive patients, if any, need ALND?” It is time to ask this question again.