1 Introduction

Papillary thyroid carcinoma (PTC) constitutes more than 88% of thyroid cancers diagnosed each year [1]. It has an excellent prognosis after sufficient initial therapy; hence, choosing the best surgical procedure is critically important for effective treatment. However, whether the optimal initial surgical extent, which is neither too little nor too much, is total thyroidectomy (TT) or lobectomy has long been debated for patients with low-risk PTC (without clinical evidence of extrathyroidal extension [ETE] or lymph node metastasis [LNM]).

Lobectomy alone can be considered a sufficient initial surgery for low-risk PTC patients with a tumor ≤ 4 cm [2, 3]. For patients with high-risk characteristics, such as ETE, incidental regional metastases or an aggressive tumor subtype, TT or near-TT with gross removal of all primary tumors should be performed. However, some high-risk characteristics can only be detected upon postoperative histopathological examination. Therefore, completion TT will be recommended for patients who undergo initial lobectomy and are found to have any of those high-risk characteristics; otherwise, they will face a high risk of recurrence. A secondary operation, with the physical, psychological and financial burden that the subsequent completion TT places on the patient, may then be unavoidable. The frequency of completion TT therefore reflects both the short-term and long-term risks of needing a secondary operation, and many guidelines and institutions consider it an important indicator for assessing the initial surgical extent.

Several studies have reported appreciable frequencies of completion TT (43–59%) in patients eligible for lobectomy according to the American Thyroid Association (ATA) guidelines [4,5,6,7]. These frequencies far exceed that reported in a previous study by Nixon et al. [8], cited as evidence in the ATA guidelines, in which only 6% of patients underwent an immediate completion TT after lobectomy. A European specialist group has suggested that the surgical recommendations in the ATA guidelines be reconsidered [9, 10]. It is therefore important to determine the need for completion TT to guide the initial surgical procedure. The aims of this study were (1) to determine the prevalence of postoperative high-risk characteristics that would have led to a recommendation for completion TT after lobectomy in low-risk PTC patients with a tumor ≤ 4 cm and (2) to provide a protocol that other institutions can use for surgical decision-making based on their own definitions of high-risk characteristics.

2 Methods

2.1 Patients

A retrospective analysis was performed for consecutive PTC patients with a tumor ≤ 4 cm who underwent thyroidectomy between January 2007 and December 2020 at the Clinical Research Center for Thyroid Diseases of Yunnan Province, a center that covers more than a quarter of thyroid cancers in Yunnan Province, China.

Patients were selected based on the pathology data, and patients not eligible for an initial lobectomy alone according to the ATA differentiated thyroid cancer management guidelines were excluded. None of the patients included in the statistical analysis had any preoperatively known high-risk characteristics, including (i) gross ETE, (ii) clinically apparent LNM (either preoperatively or intraoperatively), (iii) distant metastases, (iv) a history of head-neck surgery or radiation or a family history of thyroid cancer, or (v) other types of thyroid cancer. Only patients with a tumor size ≤ 4 cm whose surgeries were performed by high-volume surgeons (> 100 thyroidectomies/year) were included. Additionally, patients with suspicious bilateral multifocality (1186/6151) on preoperative ultrasonography (Thyroid Imaging Reporting and Data System [TI-RADS] 4b or above [11]) were excluded because TT is usually recommended for these patients before surgery. For detectable contralateral lesions, the final surgical procedure was guided by frozen pathology. All surgical procedures were performed in accordance with current or previous Chinese guidelines for differentiated thyroid cancer. The surgical procedure was lobectomy and isthmectomy or TT/near-TT with prophylactic ipsilateral or whole central compartment dissection. PTC patients were separated according to tumor size (the ≤ 1 cm group and the 1–4 cm group) for analysis (Fig. 1).
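The selection criteria above can be read as a simple record filter. The following is a minimal sketch only: the record type and all field names are hypothetical and are not taken from the study database, and the thresholds are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreoperativeRecord:
    # Hypothetical field names for illustration; not the study's database schema.
    tumor_size_cm: float
    gross_ete: bool
    clinically_apparent_lnm: bool
    distant_metastasis: bool
    head_neck_surgery_or_radiation: bool
    family_history_thyroid_cancer: bool
    other_thyroid_cancer_type: bool
    suspicious_bilateral_multifocality: bool  # TI-RADS 4b or above on ultrasonography
    high_volume_surgeon: bool  # > 100 thyroidectomies/year


def is_included(record: PreoperativeRecord) -> bool:
    """Return True when none of the exclusion criteria of Sect. 2.1 apply."""
    has_known_high_risk = any((
        record.gross_ete,
        record.clinically_apparent_lnm,
        record.distant_metastasis,
        record.head_neck_surgery_or_radiation,
        record.family_history_thyroid_cancer,
        record.other_thyroid_cancer_type,
    ))
    return (
        record.tumor_size_cm <= 4.0
        and record.high_volume_surgeon
        and not has_known_high_risk
        and not record.suspicious_bilateral_multifocality
    )


def size_group(record: PreoperativeRecord) -> str:
    """Assign the analysis group used in the text (<= 1 cm or 1-4 cm)."""
    if record.tumor_size_cm > 4.0:
        raise ValueError("tumors larger than 4 cm are outside the study population")
    return "<=1 cm" if record.tumor_size_cm <= 1.0 else "1-4 cm"
```

A record with a 0.8 cm tumor, a high-volume surgeon and no exclusion flags would pass `is_included` and fall into the `"<=1 cm"` group; setting any exclusion flag would remove it from the cohort.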

Fig. 1 Flow chart of the included patients

2.2 Indications for completion thyroidectomy

All pathological high-risk characteristics of the thyroidectomy and central lymph node dissection (CLND) specimens were evaluated. Characteristics were recorded if they would affect structural recurrence or lead to a recommendation for completion TT, including aggressive tumor subtypes (tall cell, hobnail, and diffuse sclerosing variants), ETE, LNM, and bilateral multifocal disease. Subsequently, we constructed a completion TT risk stratification system based on both the postsurgical radioactive iodine (RAI) indication and the risk of structural disease recurrence in patients after initial therapy according to the ATA guidelines [2]. In addition to persistent or recurrent structural disease, the need for effective RAI treatment also makes removal of the whole thyroid gland necessary. The system divided the risk of requiring completion TT into three levels based on the intensity of benefit (Table 1). Based on the recommendations of the ATA guidelines, patients in a higher risk stratum are more likely to benefit from completion thyroidectomy.

Table 1 Completion total thyroidectomy risk stratification for PTC patients with lobectomy based on high-risk pathological characteristics

2.3 Completion thyroidectomy risk assessment for guiding the initial surgical procedure

Because the main goal of choosing either initial surgical procedure (TT or lobectomy) in patients with low-risk PTC is to reduce the risk of needing a secondary operation and the incidence of recurrent disease rather than to improve survival [8, 12, 13], the frequency of completion TT can be estimated from postoperative high-risk characteristics, which also indicate the optimal surgical procedure for these patients. Based on the definition used in our center, for patients with either an intermediate or high risk of needing completion TT, the optimal extent is TT, whereas for patients without any high-risk features (i.e., with no low, intermediate or high risk of completion TT), the optimal extent is lobectomy alone. Patients with a low risk of completion TT, for whom the optimal extent remains disputed, were counted in neither group. We use the following notations and equations:

$$\#TT\mathrm{optimal}=\#\mathrm{Patients}\;\mathrm{with}\;\left(\mathrm H.+\mathrm{Int}.\right)\;\mathrm{risk}\;\mathrm{of}\;\mathrm{completion}\;\mathrm{TT}$$
(1)
$$\#LT\mathrm{optimal}=\#\mathrm{Patients}\;\mathrm{without}\;\left(\mathrm H.+\mathrm{Int}.+\mathrm L.\right)\;\mathrm{risk}\;\mathrm{of}\;\mathrm{completion}\;\mathrm{TT}$$
(2)

The intermediate- and high-risk pathological characteristics (shown in Table 1) were used to define the probability of requiring completion TT based on the ATA guidelines. Among patients with preoperative clinical low-risk PTC, the ratio of those for whom TT is the optimal extent to those for whom lobectomy is the optimal extent can be expressed as:

$$\frac{TT\mathrm{optimal}}{LT\mathrm{optimal}}=\frac{\#(\mathrm H.+\mathrm{Int}.)/\#\mathrm{Total}}{\lbrack\#\mathrm{Total}-\#\left(\mathrm H.+\mathrm{Int}.+\mathrm L.\right)\rbrack/\#\mathrm{Total}}=\frac{\mathrm P\left(\mathrm H.+\mathrm{Int}.\right)}{1-\mathrm P\left(\mathrm H.+\mathrm{Int}.+\mathrm L.\right)}=\frac{\mathrm P(\mathrm H.+\mathrm{Int}.)}{\mathrm P(\mathrm{Risk}\;\mathrm{free})}$$
(3)

If the result of Eq. (3) was less than 1, lobectomy as the initial procedure would allow more patients to achieve the optimal surgical extent. In contrast, if the result was more than 1, TT would allow more patients in our real-world cohort to achieve the optimal surgical extent.
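As a minimal sketch of how Eqs. (1)–(3) and the decision rule could be applied to an institution's own counts, the following Python code computes the ratio from the number of patients at each completion TT risk level. The class and function names are illustrative assumptions, and a ratio of exactly 1, which the text does not address, is reported as undetermined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletionRiskCounts:
    """Patient counts per completion TT risk level (illustrative names)."""
    high: int
    intermediate: int
    low: int
    risk_free: int

    @property
    def total(self) -> int:
        return self.high + self.intermediate + self.low + self.risk_free


def optimal_extent_ratio(counts: CompletionRiskCounts) -> float:
    """Eq. (3): P(H. + Int.) / P(Risk free)."""
    if counts.total == 0 or counts.risk_free == 0:
        raise ValueError("the ratio is undefined without risk-free patients")
    p_tt_optimal = (counts.high + counts.intermediate) / counts.total  # Eq. (1) / #Total
    p_lt_optimal = counts.risk_free / counts.total  # Eq. (2) / #Total
    return p_tt_optimal / p_lt_optimal


def preferred_initial_procedure(ratio: float) -> str:
    """Decision rule stated after Eq. (3); a ratio of exactly 1 is left undecided."""
    if ratio < 1:
        return "lobectomy"
    if ratio > 1:
        return "total thyroidectomy"
    return "undetermined"
```

For example, with purely hypothetical counts of 10 high-risk, 5 intermediate-risk, 25 low-risk and 60 risk-free patients, the ratio is 15/60 = 0.25, so lobectomy would be preferred under this rule. Low-risk patients enter the total but neither the numerator nor the denominator of the final ratio.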

2.4 Statistical analysis

Statistical analysis was performed using SPSS 24.0 (IBM Corp., Armonk, NY, USA). Fisher’s exact test and Pearson’s chi-squared test were used to compare categorical variables. Continuous variables were compared with the t test or the Kruskal‒Wallis test.

3 Results

3.1 Patient characteristics

We analysed the pathological characteristics of 4,965 patients who met the criteria for lobectomy alone according to the ATA guidelines and underwent thyroidectomy and prophylactic CLND between January 2007 and December 2020. The median age was 42 (interquartile range [IQR] 35–50) years, and 3,871 (78.0%) patients were women. Of this cohort, 2,444 (49.2%) patients underwent TT/near-TT. Prophylactic CLND was performed in all patients, of whom 2,552 (51.4%) underwent whole central compartment dissection. The median number of harvested LNs was 6 (IQR 4–10). The median tumor sizes were 0.5 (IQR 0.4–0.8) cm and 1.5 (IQR 1.3–2.0) cm in the tumor ≤ 1 cm and 1–4 cm groups, respectively (Table 2).

Table 2 Baseline characteristics in the low-risk patients who were eligible for lobectomy alone

3.2 Risk of completion thyroidectomy

Postoperative pathological examination revealed that 1,981 (39.9%) patients had one or more high-risk characteristics, and 787 (15.9%) patients had an intermediate or high risk of requiring completion TT. The most common high-risk characteristic was incidental central LNM (32.6%), found in 25.9% of PTC patients with a tumor ≤ 1 cm and 54.8% of those with a tumor size of 1–4 cm. However, only 156 (3.1%) patients had more than five LNs involved, of whom 97 (62.2%) had a tumor size of 1–4 cm. As shown in Table 3, 12.0% of PTC patients with a tumor ≤ 1 cm and 28.7% of those with a tumor size of 1–4 cm had an intermediate or high risk of needing completion TT, while 67.2% and 36.6% of patients in the respective groups had no high-risk characteristics. Additionally, the risk of needing completion TT did not differ significantly between the 1–2 cm and 2–4 cm groups (P = 0.449).
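The percentages quoted in Sects. 3.1 and 3.2 can be recomputed from the reported counts. The sketch below is only an arithmetic cross-check under the assumption that each percentage uses the denominator shown; the dictionary labels and function names are illustrative.

```python
TOTAL_PATIENTS = 4965

# label -> (count, denominator, percentage as printed in the text)
REPORTED = {
    "women": (3871, TOTAL_PATIENTS, 78.0),
    "TT/near-TT": (2444, TOTAL_PATIENTS, 49.2),
    "whole central compartment dissection": (2552, TOTAL_PATIENTS, 51.4),
    "one or more high-risk characteristics": (1981, TOTAL_PATIENTS, 39.9),
    "intermediate or high completion TT risk": (787, TOTAL_PATIENTS, 15.9),
    "more than five LNs involved": (156, TOTAL_PATIENTS, 3.1),
    "tumor 1-4 cm among patients with more than five LNs": (97, 156, 62.2),
}


def recomputed_percentage(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as printed in the text."""
    return round(100 * count / denominator, 1)


def find_mismatches(reported: dict, tolerance: float = 0.05) -> list:
    """Return the labels whose recomputed percentage differs from the printed one."""
    return [
        label
        for label, (count, denominator, printed) in reported.items()
        if abs(recomputed_percentage(count, denominator) - printed) > tolerance
    ]


if __name__ == "__main__":
    for label, (count, denominator, printed) in REPORTED.items():
        print(f"{label}: {count}/{denominator} = "
              f"{recomputed_percentage(count, denominator)}% (printed {printed}%)")
```

An empty list from `find_mismatches(REPORTED)` indicates that the printed percentages agree with the counts at one decimal place.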

Table 3 High-risk characteristics and completion total thyroidectomy risks in different tumor size groups

3.3 Initial surgical decision-making based on completion thyroidectomy risk

We then calculated the frequency of optimal surgical extent for the ≤ 1 cm and 1–4 cm groups using Eq. (3):

$$\leq 1\;\mathrm{cm}:\;\frac{TT\mathrm{optimal}}{LT\mathrm{optimal}}=\frac{\mathrm P(\mathrm H.+\mathrm{Int}.)}{\mathrm P(\mathrm{Risk}\;\mathrm{free})}=\frac{10.2\%+1.8\%}{67.2\%}<1$$
$$1\text{–}4\;\mathrm{cm}:\;\frac{TT\mathrm{optimal}}{LT\mathrm{optimal}}=\frac{\mathrm P(\mathrm H.+\mathrm{Int}.)}{\mathrm P(\mathrm{Risk}\;\mathrm{free})}=\frac{21.0\%+7.7\%}{36.6\%}<1$$

Employing lobectomy for the initial surgical procedure could allow more patients to meet the criteria of optimal surgical extent, with 67.2% and 36.6% of low-risk PTC patients with a tumor size ≤ 1 cm and those with a tumor size of 1–4 cm benefiting from the initial lobectomy procedures, respectively. In addition, Fig. 2 allows clinicians to decide on the initial surgical procedure based on their own definition of needing completion TT in their institution.
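The two ratios can be evaluated numerically from the percentages reported for each size group. This is a minimal sketch with illustrative names; only the input percentages come from the text.

```python
# Percentages reported in the text for each tumor size group.
REPORTED_PERCENTAGES = {
    "<=1 cm": {"high_pct": 10.2, "intermediate_pct": 1.8, "risk_free_pct": 67.2},
    "1-4 cm": {"high_pct": 21.0, "intermediate_pct": 7.7, "risk_free_pct": 36.6},
}


def eq3_ratio(high_pct: float, intermediate_pct: float, risk_free_pct: float) -> float:
    """Eq. (3): P(H. + Int.) / P(Risk free), with all inputs in percent."""
    if risk_free_pct <= 0:
        raise ValueError("the ratio is undefined without risk-free patients")
    return (high_pct + intermediate_pct) / risk_free_pct


ratios = {
    group: eq3_ratio(**values) for group, values in REPORTED_PERCENTAGES.items()
}

if __name__ == "__main__":
    for group, ratio in ratios.items():
        print(f"{group}: ratio = {ratio:.2f}")
```

With the reported values, the ratios are approximately 0.18 for the ≤ 1 cm group and 0.78 for the 1–4 cm group, both below 1. An institution with a different definition of the need for completion TT could substitute its own percentages into `eq3_ratio`.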

Fig. 2 Cumulative frequencies of postoperative high-risk characteristics that can be used to develop a stratification system for the risk of needing completion total thyroidectomy. Shown on the left side of the figure are the cumulative frequencies in low-risk PTC patients with a tumor size ≤ 1 cm. Shown on the right side of the figure are the cumulative frequencies in low-risk PTC patients with a tumor size of 1–4 cm

4 Discussion

Controversy over whether TT or lobectomy should be employed as the initial surgical procedure for patients with low-risk PTC exists not only among scholars but also across the recommendations of different practice guidelines [2, 3, 14, 15]. Given the extremely low disease-specific mortality among patients with low-risk PTC, differing views among the guidelines on which patients are at risk of needing completion TT have been the major source of this controversy. Moreover, the discordance in surgical procedure recommendations has decreased guideline adherence and created confusion for surgeons, both of which are associated with poor patient prognoses [16].

According to a general consensus, survival does not differ significantly between low-risk PTC patients who undergo TT and those who undergo lobectomy [8, 10, 12, 13]. Quality of life (QoL) and psychophysical status should be high priorities when evaluating the optimal initial surgical procedure. Therefore, the risk of needing a secondary surgery, whether a completion or recurrence operation, has become a major concern in surgical extent decision-making. Using aggressive postoperative characteristics as the dependent variables, the different risks of both surgical procedures (e.g., the risk of needing a secondary operation after lobectomy versus excessive surgical extent with TT) can be evaluated, and our findings provide a reference for institutions to define their own routine initial thyroidectomy extent.

4.1 Evaluation indexes of completion thyroidectomy risk

The cohort described in this study had a much higher prevalence of high-risk characteristics than those in previous reports, both in the ≤ 1 cm group (32.8%) and in the 1–4 cm group (63.4%). This may be because of the strict inclusion criteria and thorough surgical extent in our cohort: 52.9% bilateral surgeries, 100% prophylactic CLND and surgeries performed only by high-volume surgeons. Given the propensity of PTC for lymphatic metastasis, ruling out occult nodal disease on prophylactic CLND requires the examination of 3 and 4 nodes in patients with T1b and T2 disease, respectively [17] (a median of 6 nodes was examined in this cohort). It is conceivable that the more thorough the surgical extent, the more accurately the true aggressive characteristics can be evaluated. However, this also raises the question of the replicability and generalizability of our findings. Because of the strong association between surgeon experience and the incidence of complications [18, 19], as well as the general scarcity of high-volume surgeons around the world, surgical complications were not included in our equations. Although complications also play an important role in the initial surgical decision, we considered that they depend on the skill of the individual surgeon rather than on an estimated average incidence. We provide a method based on the probability of requiring completion TT to guide the initial surgical decision, which can be used flexibly with different PTC size groups and different definitions of aggressive characteristics.

The parameters selected in the present study differ from those in most studies evaluating the completion risk; that is, we added incidental bilateral multifocality and excluded the involvement of ≤ 5 LNs [5,6,7, 20]. (i) Apparent suspicious lesions do not disturb the surgical procedure because they are readily detected by preoperative ultrasonography with fine-needle aspiration. Only incidental contralateral cancers cause persistent disease, for which the subsequent risk of completion TT (or recurrence) is almost inevitable. Therefore, incidental contralateral cancers should be used as an evaluation indicator in studies that include sufficient bilateral thyroidectomy samples with effective ultrasound data. (ii) Whether all patients who are diagnosed with occult LNM after undergoing lobectomy should undergo completion TT remains debatable. The risk of structural recurrence in patients with ≤ 5 micrometastases is low (4.0%) [21]. Another study of 876 patients who underwent lobectomy with prophylactic CLND showed that the recurrence rate (P = 0.133) and disease-free survival rate (P = 0.065) were not significantly different between the LN-negative and LN-positive groups [20]. In addition, a proportional hazards model with restricted cubic splines showed that more than 5 metastatic LNs can be considered a change point for evaluating survival prognosis [22]. Therefore, the completion risk can be well evaluated by these parameters in our stratification system (Eq. 3). The probability of lobectomy alone being the optimal initial surgical extent can be taken as the proportion of patients free of completion risk among all patients, because these patients rarely face disease recurrence or a secondary operation after lobectomy. In contrast, the probability of TT being the optimal extent can be taken as the proportion of patients with an intermediate or high completion risk among all patients.

4.2 Initial surgical procedure decision-making for patients with ≤ 1 cm low-risk PTC

Currently, lobectomy alone is considered a sufficient initial procedure for low-risk PTC patients with a tumor ≤ 1 cm according to worldwide consensus. High-risk characteristics were found in 32.8% of patients postoperatively, whereas only 10.2% and 1.8% were classified as having a high and intermediate completion TT risk, respectively; this is mainly due to differences in the assessment of LN disease. Consistent with our findings (1.5%), a South Korean study involving 2,735 PTC patients with a tumor size ≤ 1 cm who underwent prophylactic CLND showed a low prevalence of involvement of > 5 LNs (4.0%) [23]. These large cohort studies confirm that employing lobectomy allows more PTC patients with a tumor size ≤ 1 cm to achieve the optimal surgical extent at the initial surgery. Moreover, up to 67.2% of patients were considered to be without any risk of completion TT (risk-free) after the initial lobectomy in the present study, which is consistent with previous findings.

4.3 Initial surgical procedure decision-making for patients with 1–4 cm low-risk PTC

Determining which procedure should be performed as the initial surgery for low-risk PTC patients with a tumor size of 1–4 cm is one of the most contested questions. In our previous study, we summarized 19 guidelines and their core reference evidence [24]. Of these guidelines, 11 recommended TT/near-TT and 4 recommended lobectomy as the extent of the initial surgery. Due to the extremely long natural course of PTC, it is impractical to verify the prognosis with a randomized controlled trial (RCT). In the present study, we provide current evidence from a large cohort showing that, although 28.7% of low-risk PTC patients with a tumor size of 1–4 cm had an overall risk of needing completion TT (21.0% and 7.7% having a high and intermediate risk, respectively), lobectomy would allow more patients (36.6%) to achieve the optimal surgical extent. The prevalence of high-risk characteristics allows clinicians to decide on the initial surgical procedure based on their own institutional risk definition for completion TT. However, deciding the initial surgical procedure may be more than a simple matter of mathematics.

For many years, large database studies and retrospective studies have been the main evidence for evaluating the surgical extent in PTC patients with a tumor size of 1–4 cm [22]. Given the lack of irrefutable evidence, the number of studies supporting either TT or lobectomy appears to increase every year. Clinical guidelines mainly serve the medical staff in their local area; the differences in their recommendations stem from the different emphases that professional associations place on the cited evidence, which reflect regional tendencies in clinical practice and clinical research and result from the long-term effects of local medical policy and the cultural environment. Therefore, although the current derivations show that employing lobectomy would result in more patients achieving an optimal surgical extent (36.6% vs. 28.7%) (Eqs. 1 and 2), one question needs to be discussed: who can be served by our research?

In addition to the probability of completion, two other factors should be evaluated in the initial surgical decision: the patient’s QoL and the surgeon’s surgical experience. (i) A European survey of QoL showed that recurrence (67%) was the highest ranked problem among the 25 most common problems for thyroid cancer patients, and 45% of patients worried about secondary operations [25]. Another study from South Korea showed that 53% of patients were more likely to choose a strategy with a low risk of recurrence, even if that might increase the risk of complications [26]. Individual patients may prefer one-time surgery (e.g., to reduce the risk of needing completion TT to prevent recurrence) to resolve the disease. (ii) There is a direct correlation between surgical experience and patient risk/benefit; for example, TT leads to a lower recurrence rate and an increase in the risk of complications, whereas lobectomy does the opposite. Therefore, the evaluation of the optimal initial surgical procedure should focus on the risk of needing completion TT (including a short- or long-term secondary operation) rather than disease-specific mortality in patients with low-risk PTC. This gives us an opportunity to modify the surgical procedure by comprehensively evaluating intraoperative high-risk characteristics and complications. Most surgeons can self-assess their individual surgical skills and complication risks. With the development of techniques such as intraoperative nerve monitoring and parathyroid recognition [27, 28], intraoperative decision-making after lobectomy also plays an important role through the evaluation of complications, such as impaired neurological function and parathyroid injury after lobectomy [29]. As Fig. 3 shows, whether to perform an additional contralateral lobectomy can be flexibly selected based on the dual evaluations of intraoperative high-risk characteristics (frozen pathology) and complications. In fact, intraoperative evaluation is an important complementary procedure to improve the accuracy of the initial surgical decision-making in patients with a high or intermediate completion risk, not only for surgeons but also for individual patients.

Fig. 3 Intraoperative evaluation process for surgical decision-making after lobectomy

The current study has several limitations. First, the frequency of patients requiring TT may be subject to selection bias, as our cohort was selected from a specialized thyroid centre of a tertiary hospital, and only thyroidectomies performed by high-volume surgeons were included to ensure a standard extent of the surgical procedure. However, it is unrealistic to expect that all thyroidectomies are performed by high-volume surgeons. To preserve the applicability of our results, complications were not considered. In our centre, 0.4–1.0% and 0.2–0.6% of PTC patients experienced permanent hoarseness and permanent hypocalcaemia, respectively, after thyroidectomy performed by high-volume surgeons [30,31,32]. However, a single-centre database may have some advantages in terms of quality control compared to large national databases; for example, 40% of records for thyroid cancer have been reported with inaccurate coding in the Tennessee Cancer Registry [33]. Second, in this retrospective dataset, 47.1% and 48.6% of patients underwent lobectomy and ipsilateral CLND, respectively, which may have led to aggressive characteristics being missed. However, the risks in the residual tissues are small, and this is unlikely to interfere with our findings [34]. Third, the size of positive LNs was not included in our study. Based on the ATA guidelines definition, patients with a low risk of recurrence should have both ≤ 5 positive LNs and all nodal sizes < 0.2 cm [21]. Thus, the rate of completion TT in our study may be underestimated, even though all clinical N1 cases were excluded. Fourth, BRAF mutations, which are identified as imparting an intermediate risk of recurrence according to the ATA guidelines, were not considered since they were not routinely investigated during our study period. Approximately 72.4% of PTC patients carry BRAF mutations in China [35]; therefore, the risk of needing to perform completion TT will most likely increase substantially if this variable is included. Finally, the benefit ratio used to estimate the optimal initial surgical extent may be incomplete. In addition to the risk of needing completion TT, the decision factors for the initial surgery should also include complications, patient preference, the surgeon’s surgical experience, number of hospital stays, and costs. Although RCT validation is optimal, the long natural course of PTC makes this difficult. Therefore, it may be necessary to use available real-world evidence to guide initial surgical procedures, which will benefit more patients. We also hope that other institutions will include more surgical decision indicators in the evaluation formula to provide a more accurate initial surgical evaluation in clinical practice.

We provide a protocol to guide initial surgical decision-making based on different definitions of the risk of needing completion TT for patients who are diagnosed with low-risk PTC and a tumor size ≤ 4 cm. Our findings confirm that lobectomy alone as the initial procedure could allow more PTC patients, both those with a tumor size ≤ 1 cm and those with a tumor size of 1–4 cm, to achieve the optimal surgical extent. Clinicians can decide the initial surgical procedure by performing a completion TT risk assessment based on the definition of high-risk characteristics in their own institution.