Conservative management versus tonsillectomy in adults with recurrent acute tonsillitis in the UK (NATTINA): a multicentre, open-label, randomised controlled trial

BACKGROUND
Tonsillectomy is regularly performed in adults with acute tonsillitis, but with scarce evidence. A reduction in tonsillectomies has coincided with an increase in acute adult hospitalisation for tonsillitis complications. We aimed to assess the clinical effectiveness and cost-effectiveness of conservative management versus tonsillectomy in patients with recurrent acute tonsillitis.


METHODS
This pragmatic multicentre, open-label, randomised controlled trial was conducted in 27 hospitals in the UK. Participants were adults aged 16 years or older who were newly referred to secondary care otolaryngology clinics with recurrent acute tonsillitis. Patients were randomly assigned (1:1) to receive tonsillectomy or conservative management using random permuted blocks of variable length. Randomisation was stratified by recruiting centre and by baseline symptom severity, assessed using the Tonsil Outcome Inventory-14 score (categories defined as mild 0-35, moderate 36-48, or severe 49-70). Participants in the tonsillectomy group received elective surgery to dissect the palatine tonsils within 8 weeks after random assignment and those in the conservative management group received standard non-surgical care during 24 months. The primary outcome was the number of sore throat days collected during 24 months after random assignment, reported once per week with a text message. The primary analysis was done in the intention-to-treat (ITT) population. This study is registered with the ISRCTN registry, 55284102.


FINDINGS
Between May 11, 2015, and April 30, 2018, 4165 participants with recurrent acute tonsillitis were assessed for eligibility and 3712 were excluded. 453 eligible participants were randomly assigned (233 in the immediate tonsillectomy group vs 220 in the conservative management group). 429 (95%) patients were included in the primary ITT analysis (224 vs 205). The median age of participants was 23 years (IQR 19-30), with 355 (78%) females and 97 (21%) males. Most participants were White (407 [90%]). Participants in the immediate tonsillectomy group had fewer days of sore throat during 24 months than those in the conservative management group (median 23 days [IQR 11-46] vs 30 days [14-65]). After adjustment for site and baseline severity, the incident rate ratio of total sore throat days in the immediate tonsillectomy group (n=224) compared with the conservative management group (n=205) was 0·53 (95% CI 0·43 to 0·65; p<0·0001). 191 adverse events in 90 (39%) of 231 participants were deemed related to tonsillectomy. The most common adverse event was bleeding (54 events in 44 [19%] participants). No deaths occurred during the study.


INTERPRETATION
Compared with conservative management, immediate tonsillectomy is clinically effective and cost-effective in adults with recurrent acute tonsillitis.


FUNDING
National Institute for Health Research.


Introduction
Tonsillectomy for patients with recurrent acute tonsillitis is one of the most common adult operations performed by the National Health Service (NHS) in England, with approximately 16 000 procedures per year. 1 In the USA, 102 000 tonsillectomy procedures are done annually. 2 The health-care cost of sore throat episodes in the UK is estimated to be as high as £2·35 billion (US$3·20 billion). 3 In the past 20 years, tonsillectomy rates have reduced by up to 50% in many European countries, whereas hospital admissions for tonsillitis have increased by 136%. 4 Tonsillectomy is a painful procedure which requires an average of 14 days off work. 5,6 The level 1 evidence for tonsillectomy in adults is scarce, 5 which probably contributes to variation in tonsillectomy rates between countries. 7 The 2014 Cochrane Review identified two studies with 156 participants and concluded that the evidence for tonsillectomy in adults was of low quality. 5 A meta-analysis showed that the number of sore throat days was 10·6 days (95% CI 5·8-15·8) fewer in patients receiving tonsillectomy than those treated conservatively during approximately 6 months of follow-up. However, neither study accounted for postoperative sore throat days. NATTINA was commissioned by the NHS National Institute for Health and Care Research (NIHR) with an aim of addressing the evidence gap by assessing the clinical effectiveness and cost-effectiveness of conservative management versus tonsillectomy in patients with recurrent acute tonsillitis.

Study design and participants
This pragmatic multicentre, open-label, randomised controlled trial was conducted in 27 hospitals in the UK. Participants were adults aged 16 years or older who were newly referred to secondary care otolaryngology clinics with recurrent acute tonsillitis. Recruited participants had to meet the UK guidelines for tonsillectomy, which stipulate that sore throat episodes are due to acute tonsillitis; the episodes of sore throat prevent healthy functioning; and that there are seven or more clinically significant sore throat episodes in the preceding year, or five or more episodes in each of the preceding 2 years, or three or more episodes in each of the preceding 3 years. 8 We did not aim to address the evidence rationale for these largely consensus-based guidelines. The full list of inclusion and exclusion criteria is shown in the appendix (p 1). All participants provided written informed consent. The trial protocol 9 was approved by the North East Research Ethics Committee (November, 2014; 14/NE/1144). Oversight was done by independent data monitoring and trial steering committees.

Randomisation and masking
Patients were randomly assigned (1:1) to receive tonsillectomy or conservative management using random permuted blocks of variable length. Randomisation was performed centrally by the Newcastle Clinical Trials Unit Randomisation Service, which is an in-house, bespoke, internet-based system. A statistician (Newcastle University, Newcastle, UK), otherwise not involved in the trial, produced the final allocation schedule. Randomisation was stratified by recruiting centre and by baseline symptom severity, assessed using the Tonsil Outcome Inventory-14 (TOI-14) score (categories defined as mild 0-35, moderate 36-48, or severe 49-70). 10 No masking was done in this study.
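As an illustration only, the following minimal Python sketch shows how a 1:1 allocation schedule with random permuted blocks of variable length could be generated for a single stratum. The block sizes (2, 4, and 6), the seed, and the function name are assumptions made for this example; the trial's actual block lengths and the implementation used by the randomisation service are not described in this report.

```python
import random


def permuted_block_schedule(n_participants, block_sizes=(2, 4, 6), seed=0):
    """Return a 1:1 allocation list for one stratum (hypothetical helper).

    Each block contains equal numbers of both groups in random order, and
    the block length is itself chosen at random so that upcoming
    allocations are harder to predict.
    """
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_participants:
        size = rng.choice(block_sizes)  # assumed block lengths, all even
        block = ["tonsillectomy"] * (size // 2) + ["conservative"] * (size // 2)
        rng.shuffle(block)
        schedule.extend(block)
    # The final block may be cut short if the stratum stops recruiting mid-block.
    return schedule[:n_participants]


# One independent schedule would be kept per stratum, ie, per combination of
# recruiting centre and baseline TOI-14 severity category.
example = permuted_block_schedule(20)
```

Because every completed block is balanced, the imbalance between groups within a stratum can never exceed half of the largest block length.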

Procedures
Tonsillectomy is commonly performed by surgeons, including supervised trainee surgeons; to reflect this and the pragmatic nature of the trial, the recruiting centre was used as a marker of the surgical team. Data on individual surgeons were not collected. Participants in the tonsillectomy group received elective surgery to dissect the palatine tonsils within 8 weeks after random assignment. Surgical methods used were the same as in standard care at recruitment sites and no stipulations were made by the trial team regarding the choice of surgical approach. Participants in the conservative management group received standard non-surgical care during 24 months, comprising self-administered analgesia plus ad hoc primary care prescription of antibiotics or attendance at emergency departments for severe tonsillitis.
For the primary outcome, once per week data were collected mostly with a text message (email and telephone calls were also available; return of data defined as returns). To reflect routine practice and national guidelines, and maximise symptom data capture, throat swabs were not required to report a sore throat. Surgical outcome follow-up after a tonsillectomy occurred by telephone 1-2 weeks after the surgery.
For secondary outcomes, questionnaires (TOI-14 and 12-Item Short Form Survey [SF-12]) were posted at 6 and 18 months after random assignment and participants were followed up in person at 12 and 24 months. TOI-14 is a

Research in context
Evidence before this study
Iterations of the Cochrane Review on the effectiveness of tonsillectomy in adults with recurrent tonsillitis identified low-quality evidence. Two small, randomised trials with short follow-up suggested that the number of sore throat days might be fewer in the first 6 months after a tonsillectomy than after conservative management. Given scarce evidence to support this common intervention, the UK National Institute for Health and Care Research commissioned a pragmatic clinical trial. In line with Cochrane Review recommendations, the brief specified that the primary outcome should be sore throat days during 24 months. We searched MEDLINE for clinical trials published between Jan 1, 1996, and Jan 1, 2022, using the search term "tonsillectomy", limited to randomised trials and adults and found no further relevant clinical trials.

Added value of this study
To our knowledge, NATTINA is the largest randomised trial to assess the clinical effectiveness and cost-effectiveness of a surgical intervention in adults with a sore throat. We showed that compared with conventional medical management, tonsillectomy is clinically effective and cost-effective. The incident rate ratio of sore throat days was 0·53 with tonsillectomy during 24 months on the most conservative estimate and 0·42 when accounting for actual treatment administered.

Implications of all the available evidence
NATTINA quantifies the comparative value of tonsillectomy for health service providers, optimises the basis for shared decision making in adults with recurrent tonsillitis, and thus, has the potential to reduce regional variation in referrals for tonsillectomy. But predominantly, NATTINA encourages timely access to care.
patient-reported, disease-specific, quality of life (QoL) questionnaire reflecting symptoms in the previous 6 months, validated in patients with chronic tonsillitis who had a tonsillectomy. Each item is a 0-5 Likert score with an adjusted total score of 100 (higher scores indicate worse QoL). The SF-12 is a shortened version of the SF-36, comprised of SF-12 Mental Component Summary (MCS) scores and SF-12 Physical Component Summary (PCS) scores. 11 The maximum score for both components is 100 (higher scores indicate better QoL).

Outcomes
The primary outcome was the number of sore throat days collected during 24 months after random assignment. Secondary outcomes were TOI-14 score at baseline and at 6, 12, 18, and 24 months; general QoL measured using the SF-12 at baseline and at 6, 12, 18, and 24 months; the number of adverse events; and the views and experiences of patients and clinicians regarding tonsillectomy and conservative management (reported elsewhere). 12 Economic evaluation outcomes (part of secondary outcomes) were the costs incurred by health-care providers to manage reoccurring sore throat episodes, collected with case report forms and directly from participants using bespoke Health Utilisation Questionnaires (HUQs) at baseline and 6, 12, 18, and 24 months (including a time and travel questionnaire at 18 months); direct and indirect costs incurred by participants collected with self-reported questionnaires; quality-adjusted life-years (QALYs) based on responses to the SF-12 mapped onto the Short Form 6-Dimension (SF-6D) instrument; 13 and incremental cost per QALY gained.

Statistical analysis
The sample size of 510 participants was based on 90% power to detect a standardised effect size of 0·33 (mean sore throat intergroup difference of 3·6 days [pooled estimated SD 10·8]) in a quantitative outcome, assuming a type 1 error rate of 5% (appendix p 1), allowing for 20% loss to follow-up and 5% crossovers (ie, those who switched treatment groups). This effect size was based on previous research. 14 Due to slower than expected recruitment, a substantial amendment was approved by the independent trial steering committee and funder, to reduce the power to 85% and recruitment sample size to 444 (retaining the same detectable difference).
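The sample size reasoning above can be approximated with the standard normal-approximation formula for comparing two means. The sketch below is illustrative only: the function names are hypothetical, and the rule used to inflate for loss to follow-up and crossovers (dividing by the product of the retained fractions) is an assumption rather than the authors' documented method, so the output should be close to, but need not exactly match, the reported targets of 510 and 444.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, power, alpha=0.05):
    """Evaluable participants per group for a two-sided two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the type 1 error rate
    z_beta = z.inv_cdf(power)           # quantile corresponding to the power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def recruitment_target(effect_size, power, loss=0.20, crossover=0.05):
    """Total to recruit after inflating for attrition (assumed inflation rule)."""
    evaluable = 2 * n_per_group(effect_size, power)
    return ceil(evaluable / ((1 - loss) * (1 - crossover)))


original_target = recruitment_target(0.33, 0.90)  # original design: 90% power
amended_target = recruitment_target(0.33, 0.85)   # amended design: 85% power
```

Lowering the power from 90% to 85% while keeping the same detectable difference reduces the required number of evaluable participants, which is why the amendment allowed a smaller recruitment target.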
The primary analysis was done in the intention-to-treat (ITT) population (defined as randomly assigned participants who returned primary outcome data) using mixed-effect multivariable negative binomial regression with outcome variable sore throat days, adjusted for the recruiting centre (as a random effect) and baseline severity (as a fixed effect). A two-sided p value of less than 0·05 was considered as significant.
To account for missing data during 104 weeks of collection, the rate of return of sore throat data was included as an exposure variable in the negative binomial regression model. We assumed that these once per week data were missing at random. The effect of randomisation is summarised with the incident rate ratio (defined as the ratio of predicted numbers of sore throat days in the two treatment groups).
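For orientation, a model of this kind is commonly written as below. This is a sketch of a standard formulation, not the authors' exact parameterisation, which is not given in this report. The symbols are assumptions introduced for illustration: Y_ij is the number of sore throat days for participant i in centre j, t_ij is the exposure (the rate of return of once per week data), T_ij indicates random assignment to tonsillectomy, S_ij is the vector of baseline severity category indicators, u_j is the random effect for recruiting centre, and alpha is the negative binomial dispersion parameter.

```latex
\begin{aligned}
Y_{ij} \mid u_j &\sim \operatorname{NegBin}(\mu_{ij}, \alpha), \\
\log \mu_{ij} &= \log t_{ij} + \beta_0 + \beta_1 T_{ij} + \boldsymbol{\beta}_2^{\top} S_{ij} + u_j,
  \qquad u_j \sim N(0, \sigma_u^2), \\
\mathrm{IRR} &= \exp(\beta_1).
\end{aligned}
```

Under this formulation the exposure enters as an offset with its coefficient fixed at 1, so the model describes sore throat days per unit of returned data, and the incident rate ratio compares the two groups at equal exposure.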
Pre-planned primary sensitivity analyses included a per protocol analysis restricted to participants who had a tonsillectomy within 8 weeks after random assignment compared with those who remained within the conservative group and did not cross over to surgery; and a per treatment analysis of participants who received a tonsillectomy at any time during follow-up compared with those who received conservative management, regardless of the initial randomisation group.
Two unplanned sensitivity analyses were performed. The first analysis assumed that a missing sore throat return meant that there were no sore throat episodes to report that week, which was done by omitting the exposure variable in the negative binomial regression model for the ITT analysis. The second repeated the primary ITT analysis for participants who returned at least 80% of the sore throat data.
TOI-14 scores at each timepoint were compared using descriptive statistics and maximum likelihood computations with repeated measures, adjusted for treatment groups and stratification factors. A two-sample t test quantified the between-group difference at 12 and 24 months. SF-12 scores were analysed using repeated measures in a similar way to TOI-14 scores. Data were processed using the Optum PRO CoRE software (SF-12 version 2) to derive the MCS and PCS scores. In the economic evaluation, primary and secondary health-care costs were based on cost of surgery and self-reported health-care use, collected with the HUQ and combined with unit costs from routine sources. 15 Total health-care cost per participant was estimated and summarised as the average total cost per group.
A time and travel questionnaire administered at 18 months collected data on direct (eg, parking) and indirect (eg, time away from usual activities) costs incurred by participants to attend the health-care appointments and was used to derive a standard set of unit costs for time and travel costs. The questionnaire was used for the first 2 years of recruitment and stopped thereafter to reduce participant response burden. Direct (eg, over-the-counter medication) and indirect (eg, time away from usual activities) costs associated with sore throat episodes were collected with the bespoke HUQ questionnaire administered when a participant reported a sore throat in the previous week. These data were collated when participants reported a health-care contact using the HUQ questionnaire or a sore throat day with return of sore throat data.
Utility values from the SF-6D were used to estimate QALYs using the area under the curve. 16 Costs and QALYs incurred in the second year of follow-up were discounted at 3·5%. 17 Missing cost and utility data were assumed to be missing at random. Chained multiple imputation by predictive mean matching was used for missing data. 18 Missing data were imputed simultaneously using regression models controlled for baseline characteristics (TOI-14, age, sex, and ethnicity). Differences in average costs were estimated using seemingly unrelated regression. 19 Uncertainty in results was estimated using stochastic sensitivity analysis (bootstrapping). 20 The bootstrapping method estimates the difference in costs and QALYs between two participants (one from each group) with resampling, and this method was done for 1000 samples. A cost-effectiveness acceptability curve and plane were used to illustrate uncertainty in the estimation. 21,22 Cost-effectiveness acceptability curves were replicated with total costs converted into US$ and € using the Cochrane Economic Methods Group Cost Converter. 23 The economic analysis was done following best practice guidelines. 24 All statistical analyses were performed using STATA (version 16). This study is registered with the ISRCTN registry, 55284102.
For the Optum PRO CoRE software see https://www.qualitymetric.com/
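To make the area-under-the-curve step concrete, the sketch below computes QALYs for one participant from utility values at the five assessment timepoints using the trapezium rule, and discounts the second year at 3·5%. It is a simplified illustration under stated assumptions: the function name and the example utility values are hypothetical, and applying a single discount factor to the whole of the second year is one common convention that might differ from the authors' implementation.

```python
def qalys_auc(utilities, months=(0, 6, 12, 18, 24), discount_rate=0.035):
    """QALYs for one participant by the trapezium rule (hypothetical helper)."""
    if len(utilities) != len(months):
        raise ValueError("one utility value is needed per timepoint")
    total = 0.0
    for k in range(len(months) - 1):
        years = (months[k + 1] - months[k]) / 12
        # Area of the trapezium between two consecutive assessments.
        area = 0.5 * (utilities[k] + utilities[k + 1]) * years
        # Intervals starting at 12 months or later lie in the second year.
        if months[k] >= 12:
            area /= 1 + discount_rate
        total += area
    return total


# Hypothetical SF-6D utilities at baseline and at 6, 12, 18, and 24 months.
example_qalys = qalys_auc([0.60, 0.70, 0.70, 0.80, 0.80])
```

A participant in full health (utility 1 throughout) would accrue 1 QALY in the first year and slightly less than 1 in the second because of discounting.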

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.

Results
Between May 11, 2015, and April 30, 2018, 4165 participants with recurrent acute tonsillitis were assessed for eligibility and 3712 were excluded (figure 1). 453 eligible participants were randomly assigned (233 in the immediate tonsillectomy group vs 220 in the conservative management group). Univariate and subsequent multivariate analysis showed that sex and educational levels were not independently associated with treatment outcomes. TOI-14 was collected for 493 (13%) of 3712 patients who declined entering the trial. This group had higher baseline TOI-14 scores than those entering the trial (appendix p 2).
Of 231 participants who underwent tonsillectomy, 122 (53%) were cold dissection, 91 (39%) were bipolar diathermy, one (<1%) was coblation, one (<1%) was laser, eight (3%) were mixed (cold and bipolar dissection), and one (<1%) was done outside of the trial site. Details of tonsillectomy type were not received for seven (3%) patients.
Complete sore throat data for each of the 104 weeks was received for 115 (25%) of 453 participants. 15 (3%) did not provide sore throat data. There was a numerically higher return rate of sore throat data in the tonsillectomy group than in the conservative management group (144 [62%] vs 115 [53%] patients completed sore throat data returns for 83 [80%] of 104 weeks). Cumulative return rates are shown in the appendix (pp 8-9).
Of 429 patients, data collected in 42 (10%) were retained in the analysis until the point of withdrawal. Participants in the immediate tonsillectomy group had fewer days of sore throat during 24 months than those in the conservative management group (median 23 days [IQR 11-46] vs 30 days [14-65]; figure 2).
After adjustment for site and baseline severity, the incident rate ratio of total sore throat days in the immediate tonsillectomy group (n=224) compared with the conservative management group (n=205) was 0·53 (95% CI 0·43-0·65; p<0·0001; table 2). The grade of surgeon experience was not statistically significant with negative binomial multivariable regression (appendix pp 9-11). Figure 2 shows ITT incident rate ratio on a log scale with the results of planned sensitivity analyses in the per protocol and per treatment populations and two unplanned sensitivity analyses. The planned and unplanned sensitivity analyses supported the primary ITT analysis with fewer sore throat days reported in participants who were randomly assigned to or received tonsillectomy. The per protocol analysis showed a greater reduction in sore throat days in the tonsillectomy group. The per treatment analysis showed a smaller reduction in sore throat days in the tonsillectomy group compared with no tonsillectomy (incident rate ratio 0·73 [0·59-0·90]). The mean sore throat data in the two study groups and for participants crossing over from randomly assigned groups are shown in the appendix (p 9).
Baseline TOI-14 data were completed by 448 (99%) of 453 participants and SF-12 data were completed by 447 (99%) participants. Completion rates were 239 (53%) for both at the 12-month visit and 199 (44%) for TOI-14 and 200 (44%) for SF-12 at the 24-month visit. TOI-14 scores were similar for both groups at baseline (table 1). Figure 3 shows that TOI-14 scores improved (reduced) during 24 months in both treatment groups, with more pronounced and earlier improvement in the immediate tonsillectomy than in the conservative management group (p<0·0001). SF-12 scores were also higher during 24 months of follow-up (better QoL) in the immediate tonsillectomy group (appendix pp 13-15).
Responses to the HUQ were near complete at baseline, with 231 (99%) participants responding in the immediate tonsillectomy group versus 215 (98%) in the conservative management group, which reduced to 122 (52%) versus 117 (53%) at 12 months and to 99 (42%) versus 100 (45%) at 24 months. Details of health-care resource use by study groups are shown in the appendix (pp 15-20). Based on imputed results, tonsillectomy was non-significantly more costly (mean difference £488 [95% CI 349-626]) and more effective (mean difference 0·12 QALYs [0·09-0·14]) than conservative management. The incremental cost per QALY gained was £4136 (appendix p 23). Results from the stochastic analysis are presented in the appendix (pp 24-26). Over the range of values considered for an additional QALY, tonsillectomy has a high probability of being considered cost-effective. Assuming a £5000 ($5000 or €5000) threshold value for an additional QALY, tonsillectomy had an 85% (17% with $5000 and 63% with €5000) probability of being considered cost-effective. This finding increased to 100% when an additional QALY was valued at £10 000 ($10 000 or €10 000). When participant costs were considered, tonsillectomy was less costly than conservative management (mean difference £889 [95% CI 40-1738]) and more effective (mean difference 0·12 QALYs [0·09-0·14]).
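The relationship between these point estimates can be illustrated with a short calculation of the incremental cost-effectiveness ratio (incremental cost divided by incremental QALYs) and the net monetary benefit at a given threshold. This is a worked sketch using the rounded mean differences quoted above; the function names are hypothetical. Dividing the rounded figures gives a ratio slightly below the reported £4136, which presumably reflects the use of unrounded estimates in the original analysis.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost per QALY gained (hypothetical helper)."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when there is no QALY difference")
    return delta_cost / delta_qaly


def net_monetary_benefit(delta_cost, delta_qaly, threshold):
    """Positive values favour the intervention at the given threshold."""
    return threshold * delta_qaly - delta_cost


# Rounded point estimates from the health-care provider perspective.
delta_cost = 488.0  # GBP, tonsillectomy minus conservative management
delta_qaly = 0.12   # QALYs, tonsillectomy minus conservative management

ratio = icer(delta_cost, delta_qaly)
nmb_5000 = net_monetary_benefit(delta_cost, delta_qaly, 5000)
nmb_10000 = net_monetary_benefit(delta_cost, delta_qaly, 10000)
```

At both thresholds the net monetary benefit computed from the point estimates is positive, which is consistent with the high probabilities of cost-effectiveness reported from the bootstrap analysis.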
There were 191 adverse events in 90 (39%) of 231 participants undergoing tonsillectomy that were deemed related to tonsillectomy.

Discussion
Current guidance on adult tonsillectomy for recurrent acute tonsillitis is the same in the UK and the USA and is translated from the evidence for paediatric tonsillectomy rather than being based on adult-specific clinical trials. 25 NATTINA has shown that tonsillectomy is clinically effective for adults with recurrent tonsillitis meeting the UK guidelines. 8 Over the range of values that society is willing to pay for an additional QALY (assuming a £10 000 threshold value), adult tonsillectomy has 100% probability of being considered cost-effective compared with conservative management. To our knowledge, NATTINA is the largest clinical trial to assess the effect of tonsillectomy in adults and offers clinically important long-term follow-up data. The benefit of immediate tonsillectomy, in terms of the 0·53 incident rate ratio for sore throat days during 24 months versus standard conservative management, is clinically important. Patients considering tonsillectomy for recurrent acute tonsillitis should weigh the benefits of fewer sore throat days against the risks of surgery. NATTINA adds to the evidence to aid patients in shared decision making.
The postoperative bleeding rate of 19% is higher than reported previously, 26 but reflects the two proactive postoperative contact options actively made to elicit this information. The greater number of female participants recruited to NATTINA is consistent with the 70% adult female distribution undergoing tonsillectomy, which was shown in a large audit done in the UK. 27 NATTINA materially augments the level 1 adult tonsillectomy evidence base. The 2014 Cochrane Review identified two studies showing that the number of sore throat days was 10·6 days fewer in those receiving tonsillectomy than those treated conservatively during approximately 6 months of follow-up. However, unlike our trial, neither study accounted for postoperative sore throat days. The anticipated 14·0 days of pain after a tonsillectomy needs to be considered by patients and health-care providers during shared decision making.
Despite existing adult tonsillectomy guidance, variation in practice regarding referrals remains, with many patients in the UK having difficulties accessing treatment. Douglas and colleagues 28 observed that the mean number of tonsillitis episodes before tonsillectomy was 27 during 7 years in 123 patients, which is three times higher than the UK guidelines of three episodes annually for 3 years. Findings from our study support adult tonsillectomy guidelines and should result in a practice change towards timely referral of patients.
Limitations of the study include the high dropout rates and missing primary outcome data (almost 10% higher in the conservative management group than the immediate tonsillectomy group when measured as those returning more than 80% once per week responses). The primary outcome of sore throat days was stipulated by the funder through a commissioned research call and recommended by the Cochrane Review as an appropriate measure. In a previous paediatric tonsillectomy trial 14 we showed that parents and patients found it challenging to recognise distinct episodes of tonsillitis as one bout merged with another, hence the decision to specify sore throat days rather than episodes of tonsillitis. Collection of data on a once per week basis during 104 weeks was challenging, but the continuous data collection approach minimised recall bias. Despite this challenge, 25% of participants completed every week and 57% responded in more than 80% of the 104 weeks. Further limitations include an absence of pain comparison between sore throat days and the postoperative period, and, although cited in previous studies, the TOI-14 questionnaire has not been formally validated in English from the original version in German. 10 The NATTINA statistical analysis accounted for missing primary outcome data. The planned and unplanned sensitivity analyses support the main trial findings and conclusion. The unplanned sensitivity analyses show that the primary ITT results were maintained when restricted to participants returning more than 80% once per week sore throat responses. Because we assumed that any missing returns equated to no sore throat days, this sensitivity analysis offers the most conservative estimate of the effect of tonsillectomy.