Diagnostic performance of shear wave elastography in thyroid nodules with indeterminate cytology: A systematic review and meta-analysis

Purpose Thyroid nodules classified as indeterminate in previous fine-needle aspiration cytology often necessitate additional evaluation to determine their histology, while shear wave elastography (SWE) offers an alternative option in this regard. The objective of this study was to assess the diagnostic effectiveness of SWE in evaluating indeterminate nodules. Methods The PubMed, EMBASE, and Web of Science databases were searched from 1st January 1970 to 1st March 2023. The studies were reviewed and the data was extracted by two separate reviewers. A Bayesian bivariate model was utilized to quantitatively synthesize the diagnostic accuracy and yield of the studies in R. Results A total of seven studies, involving indeterminate thyroid nodules undergoing SWE were included, and the overall malignancy rate was 34.1% (307/900). The summarized estimates of sensitivity and specificity were 0.792 (95% credible interval [CI], 0.727–0.850) and 0.845 (95% CI, 0.797–0.887), respectively. The summarized estimate for the diagnostic odds ratio (DOR) was 17.8 (95% CI, 14.0–22.6). Summarized receiver operating characteristic (SROC) plots indicated a trade-off between sensitivity and specificity, and the estimate of AUC was 0.866 (95% CI, 0.834–0.895). The summary estimates for positive and negative likelihood ratios were 4.67 (95% CI, 3.98–5.85) and 0.26 (95% CI, 0.23–0.28), respectively. Conclusions The overall accuracy of SWE remains satisfactory in indeterminate thyroid nodules. However, it should be noted that the available data are still extremely limited, and more studies or guidelines are required to provide further insights.


Introduction
Thyroid nodules are common in the adult population, with a prevalence of up to 60% and a cancer incidence of approximately 5% [1]. Ultrasound-guided fine needle aspiration cytology (FNAC) is the established diagnostic modality for preoperative evaluation of thyroid nodules [2]. However, this procedure may yield unsatisfactory or indeterminate results in up to 30% of cases [3,4]. Indeterminate categories commonly encompass atypia of undetermined significance or follicular lesion of undetermined significance (AUS/FLUS), follicular neoplasm or suspicious for a follicular neoplasm (FN/SFN), and suspicious for malignancy (SM) [5]. The management of thyroid nodules with indeterminate cytology poses ongoing difficulties in clinical practice. The key challenge lies in accurately identifying patients at a higher risk of cancer while minimizing unnecessary surgical interventions [6]. Ultrasonography (US), molecular studies, and core needle biopsy (CNB) are increasingly employed in conjunction with FNAC to enhance diagnostic accuracy and facilitate optimal therapeutic decision making [7]. Molecular testing can improve risk stratification, but long-term outcome data are currently lacking to fully assess its value in guiding therapeutic decision making [8]. CNB is increasingly being considered a viable alternative to FNAC, particularly in cases of AUS/FLUS and nondiagnostic cytology, such as calcified nodules [9]. However, CNB is conventionally regarded as a more invasive procedure than FNAC and may not be a feasible option for managing small indeterminate nodules, which limits its applicability.
US elastography uses stiffness as an indicator of malignancy, based on the observation that a suspicious nodule is palpably firm or hard in consistency [10,11]. Shear wave elastography (SWE) provides real-time measurement of tissue elasticity, quantified as an elasticity index (EI), along with a qualitative color-coded elasticity map [12]. Compared to other elasticity methods, SWE is less user-dependent because it utilizes acoustic pulses generated by a transducer to assess elasticity rather than relying on external pressure applied by the examiner [12,13]. Many studies have demonstrated that SWE exhibits superior performance in predicting the malignancy of thyroid nodules [14,15], and it is also considered particularly valuable for evaluating indeterminate nodules or nodules with a follicular growth pattern [16,17]. Nonetheless, the usefulness of SWE in distinguishing thyroid nodules with indeterminate cytology remains a topic of controversy. Bardet et al. reported a sensitivity of 0.85 and a specificity of 0.94 with a cut-off value of maximum EI at 65 kPa using SWE in 131 indeterminate nodules [18]. Samir et al. reported a sensitivity of 0.85 and specificity of 0.94 in 35 indeterminate nodules but used a lower cut-off of 22.3 kPa for mean EI [19].
Therefore, we conducted this systematic review and meta-analysis to explore and summarize the diagnostic efficacy of SWE in thyroid nodules with indeterminate cytology, with a focus on its clinical applicability.

Methods
This systematic review and meta-analysis adhered to the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) extension for diagnostic test accuracy statement [20].

Literature search
We conducted a comprehensive search across multiple databases, including PubMed, EMBASE, and Web of Science, covering the period from 1st January 1970 to 1st March 2023. We utilized "intermediate", "indeterminate", "undetermined", "shear wave elastography", "thyroid" and their synonyms as search terms. We also examined the references of eligible studies and review articles to ensure comprehensive coverage of relevant literature.

Inclusion and exclusion criteria
First, studies or their subsets that examined the use of SWE to predict malignant thyroid nodules in patients who had previously received an indeterminate FNAC report were eligible for inclusion. Then, indeterminate nodules that underwent final postoperative histopathologic examination or repeated diagnostic FNAC were included. The exclusion criteria were as follows: a) articles that did not pertain to the field of interest (articles that did not include indeterminate nodules; articles that did not utilize SWE; articles in which indeterminate nodules were not validated by the final gold standard); b) reviews, case reports, editorials or letters, comments, and conference proceedings; and c) articles that were not written in English.

Data extraction
One investigator was responsible for extracting the descriptive data, which were then cross-checked and confirmed by another investigator. The extracted descriptive data included general information about the study, the details of the patients involved, and test characteristics. Two reviewers independently extracted the numerical data. Any discrepancies were resolved through discussion and consensus. In cases where the data were not readily extractable, we contacted the authors for the additional data needed.

Risk of bias
The risk of bias and concerns about applicability were assessed by two independent reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [21]. Discrepancies were resolved through consensus. The inclusion of each study was determined through a comprehensive discussion between the two reviewers.

Data synthesis
In our meta-analysis, we applied a Bayesian bivariate model of diagnostic test studies using integrated nested Laplace approximation (INLA) [22]. INLA is a powerful computational method that allows for efficient and accurate estimation of parameters in Bayesian models and directly provides accurate posterior marginal distributions for sensitivity and specificity, as well as all hyperparameters and covariates, with no need for conventional Markov chain Monte Carlo sampling [23,24]. Furthermore, univariate results of sensitivity and specificity accompanied by 95% credible intervals (CIs), as well as the summarized receiver operating characteristic (SROC) curve, were directly available. Additionally, area under the receiver operating characteristic curve (AUC) values with 95% CI were combined. The summarized positive and negative likelihood ratios (LR+ and LR-, respectively) were calculated from the summarized sensitivity and specificity estimates. The diagnostic odds ratios (OR) and risk difference (RD) were also derived from these estimates. Spearman correlation was utilized to test for the presence of a threshold effect, and a P value less than 0.05 was considered to indicate a significant threshold effect. Forest plots were utilized to visually depict the summarized sensitivity and specificity, along with the individual sensitivity and specificity of each study. Funnel plot asymmetry was also assessed to evaluate the degree and significance of publication and selective reporting bias. Subgroup analyses were performed according to the study design (prospective or retrospective), prevalence of malignancy (above or below the median rate), histopathology (surgery or repeated FNAC & surgery), different SWE systems (Young's modulus [kPa] or shear wave velocity [SWV, m/s]) and different EI (mean or maximum). All the analyses were conducted using R software 4.2.3 (R Foundation for Statistical Computing, Vienna, Austria; https://www.r-project.org) along with the R packages meta4diag (2.1.1) and INLA (22.12.16), and their dependencies.
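For reference, the derived measures named above follow from sensitivity (Se) and specificity (Sp) through the standard definitions below. The symbols Se and Sp are shorthand introduced here, and reading RD as the difference between the true-positive and false-positive rates is an assumption, since the text does not define it. In a Bayesian fit these quantities are typically summarized from the joint posterior rather than by plugging in the summary Se and Sp, so reported values need not equal the plug-in results exactly.

```latex
\begin{align}
\mathrm{LR}^{+} &= \frac{\mathrm{Se}}{1-\mathrm{Sp}}, &
\mathrm{LR}^{-} &= \frac{1-\mathrm{Se}}{\mathrm{Sp}}, \\
\mathrm{DOR} &= \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}}
  = \frac{\mathrm{Se}\,\mathrm{Sp}}{(1-\mathrm{Se})(1-\mathrm{Sp})}, &
\mathrm{RD} &= \mathrm{Se}-(1-\mathrm{Sp}).
\end{align}
```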

Results

Literature search and study characteristics
After removing duplicate results and conducting abstract screening, the potentially eligible full publications were thoroughly reviewed. After exclusions, a total of 7 studies were finally included in this systematic review [16,18,19,25-28]. The study selection process is described in Fig. 1. Of the 7 studies, 5 were prospective, and 2 were retrospective. Among the 919 included patients, 932 thyroid nodules were initially classified as indeterminate by FNAC, and 900 of them were diagnosed by postoperative histopathology (5 articles) or both repeat FNAC and postoperative histopathology (2 articles). Of the 900 indeterminate nodules that were included in the final statistical analysis, 307 (34.1%) were found to be malignant. The prevalence of malignancy varied across the included studies, ranging from 14.2% to 80.8%, with a median prevalence of 29.2%. All details of the included studies are summarized in Table 1.
Detailed information regarding the SWE systems used and the 2 × 2 diagnostic data of the included studies are provided in Table 2.

Risk of bias
Two reviewers conducted an independent assessment of the risk of bias and concerns about applicability based on the QUADAS-2 tool. The results of the quality assessment are presented in Fig. 2. One article only enrolled indeterminate nodules over 15 mm in diameter and therefore had a high risk of bias in patient selection [18]. Overall, the applicability of the included studies was acceptable.

Diagnostic performance
The summarized results for sensitivity and specificity were 0.792 (95% CI, 0.727-0.850) and 0.845 (95% CI, 0.797-0.887), respectively (Fig. 3). The crosshair plot indicated that sensitivity and specificity were consistent across studies (Fig. 4A). The summary results for LR+ and LR- were 4.67 (95% CI, 3.98-5.85) and 0.26 (95% CI, 0.23-0.28), respectively. The summarized results for OR and RD were 17.8 (95% CI, 14.0-22.6) and 0.61 (95% CI, 0.58-0.64), respectively. The estimate of AUC was 0.866 (95% CI, 0.834-0.895), and the corresponding SROC plot is shown in Fig. 4B. In addition, no threshold effect was found by the Spearman correlation test (P = 0.383), and no evidence of publication bias was found according to the funnel plot, as shown in Fig. 4C. The results of subgroup analyses are displayed in Table 3. In terms of AUC, prospective studies, a lower malignancy rate, histopathology obtained entirely from surgery, and SWE systems providing EI rather than SWV offered better diagnostic value.
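To illustrate what the summary likelihood ratios imply for an individual nodule, the sketch below converts a pre-test probability into a post-test probability through Bayes' rule on the odds scale. It assumes, purely for illustration, that the pooled malignancy rate (307/900) serves as the pre-test probability; the function name post_test_probability is hypothetical and not part of the study's analysis.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Assumption: the pooled malignancy rate is used as the pre-test probability.
PREVALENCE = 307 / 900
LR_POSITIVE = 4.67  # summary LR+ reported in the text
LR_NEGATIVE = 0.26  # summary LR- reported in the text

prob_after_positive = post_test_probability(PREVALENCE, LR_POSITIVE)
prob_after_negative = post_test_probability(PREVALENCE, LR_NEGATIVE)

print(f"Pre-test probability:            {PREVALENCE:.3f}")
print(f"Post-test probability, SWE (+):  {prob_after_positive:.3f}")
print(f"Post-test probability, SWE (-):  {prob_after_negative:.3f}")
```

Under these assumptions, a positive SWE result raises the probability of malignancy from about 34% to roughly 71%, and a negative result lowers it to roughly 12%; a different pre-test probability would shift both figures.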

Discussion
The incidence of thyroid nodules with indeterminate cytology varies widely in the literature. Bethesda III and IV can account for up to approximately 50% of the first FNAC results (35% for AUS/FLUS and 22.1% for FN/SFN), while rates as low as 0.78% for AUS/FLUS and 8.1% for FN/SFN have also been reported [29,30]. The Bethesda system also recommends considering repeat FNA, molecular testing, or lobectomy for indeterminate nodules [31]. On the whole, the current management of indeterminate nodules involves individual assessment based on the US risk category of the nodule and the presence of suggestive clinical and historical/anamnestic risk factors [8,32]. If repeat FNAC produces an indeterminate or nondiagnostic result, follow-up is the first consideration, and lobectomy can also be an alternative if there is no absolute contraindication to surgery. Therefore, if an additional noninvasive tool could effectively identify benign nodules among indeterminate nodules, unnecessary invasive diagnostic procedures or lobectomy could be avoided.
In recent years, SWE has been proposed to identify benign and malignant nodules by quantitative elasticity values, which can display the difference in hardness intuitively. Several studies have demonstrated a correlation between lower elasticity of thyroid nodules and a higher likelihood of malignancy. However, it is important to note that this correlation could potentially be attributed to other histological features as well [33,34]. Many studies and an updated meta-analysis have evaluated the diagnostic value of SWE and showed that SWE is a useful tool for characterizing thyroid nodules [11]. In this particular context, a few papers have examined the reliability and effectiveness of SWE in patients with indeterminate thyroid nodules. In this meta-analysis, 7 studies were included, and we obtained promising findings suggesting that SWE is a useful tool for the diagnostic evaluation of thyroid nodules with indeterminate cytology. The summarized results for sensitivity and specificity were 0.792 (95% CI, 0.727-0.850) and 0.845 (95% CI, 0.797-0.887), respectively. In the subgroup analysis, the results did not suggest significantly different sensitivity or specificity in any given subgroup, which might mean that SWE performs consistently across the scenarios in which it was used. However, these findings were lower than expected in comparison with our previous study, where SWE showed a sensitivity of 0.838 and a specificity of 0.872 in distinguishing indeterminate nodules [35]. The previous study only included 3 articles on SWE, and the sample size was not sufficient (317 nodules); thus, we consider it less reliable than our new findings. Although the diagnostic efficiency of SWE has been demonstrated in this paper, a significant controversy currently lies in the choice of the EI and the determination of the cut-off value. We noticed that 5 included studies reported the optimal diagnostic value of mean EI, while 2 chose maximum EI. We suspect this might be due to the fact that indeterminate cytology is often associated with follicular carcinoma, and this could potentially result in the preselection of less stiff lesions, leading to more accurate diagnostic outcomes in the included studies. Therefore, mean EI could be more suitable for indeterminate cytology. On the other hand, studies that specifically excluded patients with indeterminate cytology showed a higher stiffness of malignant lesions. This could be attributed to a higher percentage of papillary thyroid carcinoma, where the maximum EI might be a more informative parameter [36,37]. This could also explain what we found in the subgroup analysis where maximum EI seemed to be more reliable because Zhang et al. reported a very high malignancy rate (80.8%) and more papillary cancers compared to other included articles [16]. In addition, the cut-off values were highly variable in our included articles. At present, the dispute cannot be resolved because there are many factors that affect the cut-off value. Studies that included a substantial number of cases with papillary thyroid carcinoma often reported the highest cut-off points for EI [36,38]. Nodule size is another factor, as a higher EI was found in larger nodules [37,39]. In addition, it has been noted that the presence of microcalcifications can influence the interpretation of SWE [40]. Therefore, further study is warranted to standardize the cut-off values for wider clinical use and provide guidance in interpreting thyroid nodule results.
The updated EFSUMB guidelines recommended using US elastography for characterizing thyroid nodules due to its high positive predictive value and for follow-up of patients with cytologically highly suggestive benign nodules [41]. However, the use of US elastography in evaluating indeterminate thyroid nodules is not currently recommended due to limited available data. In this paper, the role of SWE was further confirmed in indeterminate thyroid nodules. However, we also noticed some conflicts in comparison with other US elastography techniques. Celletti et al. reported both higher sensitivity (0.857 vs 0.571) and specificity (0.941 vs 0.794) for strain ratio elastography than for SWE [26]. Gay et al. conducted a study on the multiparametric assessment of indeterminate thyroid nodules and found no significant correlation between strain ratio elastography and SWE [42]. Our previous study found comparable values for both strain ratio elastography and SWE [35]. Thus, we consider that different US elastography techniques are not strictly interchangeable, and it makes more sense to refer selectively to the results of one single device or to combine them.
Some limitations should be briefly discussed here. Despite the adoption of a Bayesian method to minimize heterogeneity, the limited number of included studies might still amplify heterogeneity. The malignancy rate was 34.1% in our study, which was higher than that expected in indeterminate thyroid nodules. Agreement between repeated measurements is not discussed in this study, which could influence the performance, although SWE is often considered to be user independent.

Conclusion
In conclusion, the overall accuracy of SWE remains satisfactory in indeterminate thyroid nodules. However, the available data are still extremely limited, and further studies are needed. We hope that future guidelines can reach some consensus regarding the role of SWE in indeterminate thyroid nodules.

Fig. 2. Quality assessment of the included studies according to the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria.

Fig. 3. Forest plots of the summarized estimates of sensitivity and specificity.

Fig. 4. A) Crosshair plots show reported prior-point estimates (shown as circles) and confidence intervals (shown as extended lines). B) Summary receiver operating characteristic (SROC) curve showing individual study posterior-point estimates (the size of each circle is proportional to the sample size for each study). The dashed elliptical boundary represents the 95% credible region for the summary estimates (closed diamond). The standard (black) and latent class model analyses based on the conditional dependence model (blue) and the conditional independence model (red) are presented. C) Funnel plot for bias. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Table 1
Study characteristics. The study solely presents the initial number of included patients and does not provide the final count; FNAC, fine-needle aspiration cytology; USA, United States of America.

Table 2
Details for SWE used in the included studies.