Diagnostic accuracy of endoscopic ultrasonography (EUS) for the preoperative locoregional staging of primary gastric cancer

Abstract

Background

Endoscopic ultrasound (EUS) is proposed as an accurate diagnostic device for the locoregional staging of gastric cancer, which is crucial to developing a correct therapeutic strategy and ultimately to providing patients with the best chance of cure. However, despite a number of studies addressing this issue, there is no consensus on the role of EUS in routine clinical practice.

Objectives

To provide both a comprehensive overview and a quantitative analysis of the published data regarding the ability of EUS to preoperatively define the locoregional disease spread (i.e., primary tumor depth (T‐stage) and regional lymph node status (N‐stage)) in people with primary gastric carcinoma. 

Search methods

We performed a systematic search to identify articles that examined the diagnostic accuracy of EUS (the index test) in the evaluation of primary gastric cancer depth of invasion (T‐stage, according to the AJCC/UICC TNM staging system categories T1, T2, T3 and T4) and regional lymph node status (N‐stage, disease‐free (N0) versus metastatic (N+)) using histopathology as the reference standard. To this end, we searched the following databases: the Cochrane Library (the Cochrane Central Register of Controlled Trials (CENTRAL)), MEDLINE, EMBASE, NIHR Prospero Register, MEDION, Aggressive Research Intelligence Facility (ARIF), ClinicalTrials.gov, Current Controlled Trials MetaRegister, and World Health Organization International Clinical Trials Registry Platform (WHO ICTRP), from 1988 to January 2015.

Selection criteria

We included studies that met the following main inclusion criteria: 1) a minimum sample size of 10 patients with histologically‐proven primary carcinoma of the stomach (target condition); 2) comparison of EUS (index test) with pathology evaluation (reference standard) in terms of primary tumor (T‐stage) and regional lymph nodes (N‐stage). We excluded reports with possible overlap with the selected studies.

Data collection and analysis

For each study, two review authors extracted a standard set of data, using a dedicated data extraction form. We assessed data quality using a standard procedure according to the Quality Assessment of Diagnostic Accuracy Studies (QUADAS‐2) criteria. We performed diagnostic accuracy meta‐analysis using the hierarchical bivariate method.

Main results

We identified 66 articles (published between 1988 and 2012) that were eligible according to the inclusion criteria. We collected the data on 7747 patients with gastric cancer who were staged with EUS. Overall the quality of the included studies was good: in particular, only five studies presented a high risk of index test interpretation bias and two studies presented a high risk of selection bias.

For primary tumor (T) stage, results were stratified according to the depth of invasion of the gastric wall. The meta‐analysis of 50 studies (n = 4397) showed that the summary sensitivity and specificity of EUS in discriminating T1 to T2 (superficial) versus T3 to T4 (advanced) gastric carcinomas were 0.86 (95% confidence interval (CI) 0.81 to 0.90) and 0.90 (95% CI 0.87 to 0.93) respectively. For the diagnostic capacity of EUS to distinguish T1 (early gastric cancer, EGC) versus T2 (muscle‐infiltrating) tumors, the meta‐analysis of 46 studies (n = 2742) showed that the summary sensitivity and specificity were 0.85 (95% CI 0.78 to 0.91) and 0.90 (95% CI 0.85 to 0.93) respectively. When we addressed the capacity of EUS to distinguish between T1a (mucosal) versus T1b (submucosal) cancers the meta‐analysis of 20 studies (n = 3321) showed that the summary sensitivity and specificity were 0.87 (95% CI 0.81 to 0.92) and 0.75 (95% CI 0.62 to 0.84) respectively. Finally, for the metastatic involvement of lymph nodes (N‐stage), the meta‐analysis of 44 studies (n = 3573) showed that the summary sensitivity and specificity were 0.83 (95% CI 0.79 to 0.87) and 0.67 (95% CI 0.61 to 0.72), respectively.

Overall, as demonstrated also by the Bayesian nomograms, which enable readers to calculate post‐test probabilities for any target condition prevalence, the EUS accuracy can be considered clinically useful to guide physicians in the locoregional staging of people with gastric cancer. However, it should be noted that between‐study heterogeneity was not negligible: unfortunately, we could not identify any consistent source of the observed heterogeneity. Therefore, all accuracy measures reported in the present work and summarizing the available evidence should be interpreted cautiously. Moreover, we must emphasize that the analysis of positive and negative likelihood values revealed that EUS diagnostic performance cannot be considered optimal either for disease confirmation or for exclusion, especially for the ability of EUS to distinguish T1a (mucosal) versus T1b (submucosal) cancers and positive versus negative lymph node status.

Authors' conclusions

By analyzing the data from the largest series ever considered, we found that the diagnostic accuracy of EUS might be considered clinically useful to guide physicians in the locoregional staging of people with gastric carcinoma. However, the heterogeneity of the results warrants special caution, as well as further investigation to identify the factors influencing the outcome of this diagnostic tool. Moreover, physicians should be warned that EUS performance is lower in diagnosing superficial tumors (T1a versus T1b) and lymph node status (positive versus negative). Overall, we observed large heterogeneity, and its source needs to be understood before any definitive conclusion can be drawn about the use of EUS in routine clinical settings.

Plain language summary

Ultrasound for determining the spread of stomach cancer

Review question

There is much debate on the diagnostic performance of endoscopic ultrasound (EUS) in the preoperative staging of gastric cancer. The aim of this review was to collect the available evidence and then to calculate how well EUS stages stomach cancer.

Background

EUS is a diagnostic test that can be used to determine how far (stage) cancer of the stomach reaches prior to surgery. It consists of an endoscope coupled with an ultrasound device capable of scanning the stomach wall, which shows the different layers of the stomach. Changes from the normal ultrasonographic patterns due to the tumor growth can be used to determine the extent of cancer in the stomach wall (T‐stage) and the lymph nodes related to the stomach (N‐stage). Since the correct staging of the tumor enables physicians to personalize cancer treatment, it is important to understand the reliability of staging devices.

Study characteristics

We conducted a meta‐analysis according to the most recent methods for diagnostic tests. The last literature search was performed in January 2015. We included 66 studies (of 7747 patients) in the review.

Key results

We found that EUS can distinguish between superficial (T1 ‐ T2) and advanced (T3 ‐ T4) primary tumors with a sensitivity and a specificity greater than 85%. This performance is maintained for the discrimination between T1 and T2 superficial tumors. However, EUS diagnostic accuracy is lower when it comes to distinguishing between the different types of early tumors (T1a versus T1b) and between tumors with versus those without lymph node disease.

Quality of the evidence

Overall, EUS provides physicians with some helpful information on the stage of gastric cancer. Nevertheless, in the light of the variability of the results reported in the international medical literature, its limitations in terms of performance must be kept in mind in order to make the most out of the diagnostic potential of this tool. Finally, more work is needed to assess whether some technical improvements and the combination with other staging instruments may increase our ability to correctly stage the disease and thus optimize patient treatment.

Authors' conclusions

Implications for practice

Our findings partly support the use of endoscopic ultrasonography (EUS) for the locoregional staging of people with gastric carcinoma. EUS diagnostic performance, although not optimal, may be considered clinically useful to guide physicians in disease staging and thus in the development of the most appropriate therapeutic strategy on an individual‐patient basis, according to personalized medicine principles. However, physicians should be warned that EUS performance is lower in diagnosing superficial tumors (T1a versus T1b) and lymph node status (positive versus negative). The remarkable heterogeneity of the evidence currently available warrants some caution in interpreting the present results, and its sources need to be understood before any definitive conclusion can be drawn about the use of EUS in a routine clinical setting.

Implications for research

The valid but suboptimal diagnostic accuracy of EUS for the locoregional staging of gastric cancer prompts further investigation to improve the performance of this tool, especially for the diagnosis of superficial tumors (T1a versus T1b) and lymph node status (positive versus negative). Technological improvements, such as the combination of EUS with fine needle aspiration of suspicious lymph nodes (Dumonceau 2011), may lead to the optimization of gastric cancer staging, which should ultimately improve the therapeutic management of these patients. It will also be important to compare the diagnostic performance of different tools (e.g., EUS, CT, MRI) and to investigate the diagnostic potential of combining these tools in order to optimize disease staging and ultimately to personalize patient treatment.

Summary of findings

Summary of findings table

General information

General issue: What is the diagnostic performance of endoscopic ultrasound (EUS) in assessing disease stage in people with gastric carcinoma?

Specific questions:

  1. What is the diagnostic performance of EUS in assessing primary tumor depth? Superficial (T1 ‐ T2) versus advanced (T3 ‐ T4) tumors; early (T1) versus muscular (T2) tumors; mucosal (T1a) versus submucosal (T1b) tumors.

  2. What is the diagnostic performance of EUS in assessing regional lymph node status? Non‐metastatic (N0) versus metastatic (N+) lymph nodes.

Patients: Patients diagnosed with gastric carcinoma

Settings: Pre‐treatment evaluation of disease stage

Index tests: Endoscopic ultrasound (EUS)

Reference standard: Histology of surgical or endoscopic specimen

Importance: Choosing best treatment or treatment sequence of gastric carcinoma

Studies: 66 studies enrolling 7747 patients

Quality concerns

Overall judgement: Good quality

Applicability concerns: None

Patient selection bias: None

Index test interpretation bias: High risk: 5 studies

Reference test interpretation bias: None

Flow and timing selection bias: High risk: 2 studies; unclear risk: 2 studies

T1 ‐ T2 versus T3 ‐ T4 tumors

Studies: 50 (patients enrolled: 4397)

Summary results: Sensitivity: 0.86 (95% CI: 0.81 to 0.90). Specificity: 0.90 (95% CI: 0.87 to 0.93)

Consequences: In a hypothetical cohort of 1000 patients (T1 ‐ T2 prevalence: 50%), correctly classified: 880; overstaged: 70; understaged: 50

T1 versus T2 tumors

Studies: 46 (patients enrolled: 2742)

Summary results: Sensitivity: 0.85 (95% CI: 0.78 to 0.91). Specificity: 0.90 (95% CI: 0.85 to 0.93)

Consequences: In a hypothetical cohort of 1000 patients (T1 prevalence: 70%), correctly classified: 865; overstaged: 105; understaged: 30

T1a versus T1b tumors

Studies: 20 (patients enrolled: 3321)

Summary results: Sensitivity: 0.87 (95% CI: 0.81 to 0.92). Specificity: 0.75 (95% CI: 0.62 to 0.84)

Consequences: In a hypothetical cohort of 1000 patients (T1a prevalence: 70%), correctly classified: 834; overstaged: 91; understaged: 75

N0 versus N+ tumors

Studies: 44 (patients enrolled: 3573)

Summary results: Sensitivity: 0.83 (95% CI: 0.79 to 0.87). Specificity: 0.67 (95% CI: 0.61 to 0.72)

Consequences: In a hypothetical cohort of 1000 patients (N+ prevalence: 50%), correctly classified: 750; overstaged: 85; understaged: 165
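
The "Consequences" figures in the table follow from applying the summary sensitivity and specificity to a hypothetical cohort at the stated prevalence. The following is a minimal sketch of that arithmetic, not the review authors' own code; the function name `staging_consequences` and its return keys are illustrative assumptions. It assumes that `prevalence` refers to the state counted as test‐positive, and it reports false negatives and false positives rather than labelling them as over‐ or understaging, since that label depends on which state is treated as positive.

```python
def staging_consequences(sensitivity, specificity, prevalence, cohort=1000):
    """Expected 2x2 counts in a hypothetical cohort.

    `prevalence` is the proportion of the cohort in the state treated as
    test-positive (an assumption of this sketch).
    """
    positives = round(cohort * prevalence)
    negatives = cohort - positives
    tp = round(sensitivity * positives)   # positives correctly identified
    fn = positives - tp                   # positives missed
    tn = round(specificity * negatives)   # negatives correctly identified
    fp = negatives - tn                   # negatives called positive
    return {"tp": tp, "fn": fn, "tn": tn, "fp": fp, "correct": tp + tn}


# Summary estimates and prevalences taken from the table above.
rows = {
    "T1-T2 vs T3-T4": staging_consequences(0.86, 0.90, 0.50),
    "T1 vs T2": staging_consequences(0.85, 0.90, 0.70),
    "T1a vs T1b": staging_consequences(0.87, 0.75, 0.70),
    "N0 vs N+": staging_consequences(0.83, 0.67, 0.50),
}
for label, counts in rows.items():
    print(label, counts)
```

For the three T‐stage comparisons the more superficial category is the positive state, so a false negative corresponds to an overstaged patient and a false positive to an understaged one.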

Background

Despite its declining incidence in Western countries, gastric cancer is still one of the most common cancers in the world (Ferlay 2010; Shah 2010), the fourth most commonly occurring cancer (9% of all cancers) after cancer of the lung, breast, and colorectum, and the second most common cancer‐related cause of death (10% of all cancer deaths) after lung cancer. In 2002, the incidence of gastric cancer was estimated at 934,000 cases, with 56% of the new cases arising in Eastern Asia (41% in China and 11% in Japan). On the whole, 65% to 70% of incident cases and deaths from gastric cancer occur in less developed countries. In the US, 21,000 new cases of this malignancy were estimated to occur in 2010, leading to 10,500 expected deaths (Jemal 2010).

Radical surgery still represents the mainstay of treatment with curative intent (Dicken 2005; Jackson 2009). However, new approaches are gaining importance in the therapeutic management of these patients. For instance, endoscopic mucosal resection (EMR) is proposed as an alternative to surgery for people with early gastric cancer (EGC) in the presence of favorable prognosis features (e.g. histologically well‐differentiated carcinoma limited to the mucosa, diameter less than 2 cm, absence of ulceration) (Bennett 2009; Hirasawa 2011; Kang 2011; Othman 2011). Moreover, different adjuvant and neoadjuvant chemotherapy regimens (combined or not with radiotherapy) have been shown to provide significant survival advantage to people with advanced gastric cancer (AGC) (House 2008; Jiang 2010; Paoletti 2010; Wagner 2010).

These strategies require reliable disease staging procedures in order to guarantee the most appropriate treatment (i.e. with the highest therapeutic index, the ratio between efficacy and toxicity) for each patient, according to the principles of personalized medicine. As for all solid tumors, the disease stage for gastric cancer is defined by the three categories of the TNM classification: T‐stage (indicating the primary tumor invasion through the layers of the gastric wall; T1: tumor invading mucosa‐submucosa layer; T2: muscularis propria layer; T3: subserosa layer; T4: serosa layer or adjacent organs), N‐stage (indicating the regional lymph node involvement; N0: no metastasis; N1 ‐ 3: presence of increasing number of metastatic lymph nodes) and M‐stage (indicating the presence/absence of distant metastasis, such as hepatic or peritoneal metastasis; M0 ‐ 1) (Edge 2010).

Therefore, after the diagnosis of primary carcinoma of the stomach is made (usually by means of pathology evaluation of tumor biopsies obtained during a standard gastroscopy), staging is assessed both preoperatively (clinical staging) by means of imaging techniques, and postoperatively by pathology examination of the surgical specimen (pathological staging). Knowing the disease stage before surgery (clinical staging) can be extremely useful in providing patients with the best therapeutic option: for instance, AGC (i.e., T3 ‐ T4 tumors or tumors with lymph node metastasis (N+)) can be treated with neoadjuvant (preoperative) chemotherapy (or radiotherapy, or both) (House 2008; Jiang 2010; Paoletti 2010; Wagner 2010). On the other hand, early gastric cancer (T1 tumors) with no lymph node involvement (N0) can be treated with endoscopic rather than surgical resection (Bennett 2009; Hirasawa 2011; Kang 2011; Othman 2011).

Computed tomography (CT) is currently the most frequently used radiological tool for the preoperative staging of gastric cancer (Jensen 2007; Ly 2008); however, CT accuracy is high mainly for distant metastasis (M category, e.g., hepatic metastasis), whereas its accuracy for locoregional staging (i.e., definition of the T and N categories) is much lower, ranging in most series from 65% to 85% (Hur 2006; Kawaguchi 2011; Kim 2005; Kumano 2005; Stell 1996). For instance, a recent meta‐analysis shows that CT scan sensitivity and specificity for the identification of lymph node status are 77% and 78%, respectively (Seevaratnam 2012). No better results appear to be achievable with other techniques such as magnetic resonance imaging (MRI) or positron emission tomography (PET) (Ha 2011; Kim 2011; Seevaratnam 2012). Overall, only a limited proportion of people with locally‐advanced gastric cancer and an even smaller percentage of those with early gastric cancer can be identified preoperatively and can thus benefit from personalized treatments.

Endoscopic ultrasound (EUS) has been proposed as an accurate device for the locoregional staging of gastric cancer (Byrne 2002; Hargunani 2009; Polkowski 2009). Our aim is to systematically review and meta‐analyze the available evidence regarding the diagnostic accuracy of EUS in discriminating between different primary tumor depths of invasion (T‐stage), as well as in identifying metastasis within regional lymph nodes (N‐stage).

A glossary of terms is provided in Appendix 1.

Target condition being diagnosed

This review addresses the preoperative locoregional staging of primary gastric carcinoma to distinguish between EGC, which is suitable for endoscopic resection, and AGC, which is likely to benefit from neoadjuvant therapies. We have not considered other gastric malignancies (e.g., lymphomas, gastrointestinal stromal tumors (GIST)).

Index test(s)

In this review endoscopic ultrasonography (EUS) represents the index test. It consists of an endoscope equipped with an ultrasound probe that can scan the stomach wall in order to detect alterations in its normal layers caused by primary tumor growth, as well as the presence of metastatic lymph nodes. Usually EUS does not require patient sedation and is performed like a standard gastroscopy with the instrument being introduced into the stomach through the mouth, the only difference being the additional time required to scan the stomach wall. For this reason and because complications are virtually absent, it is usually performed on an outpatient basis. The ultrasound transducer, which is integrated in the distal end of the endoscope to allow its positioning close to the gastric wall, comes in two main types: the linear scanner gives a scanning range of 180°, whereas the radial scanner offers the advantage of a full panoramic view (360°).

Clinical pathway

People with suspected gastric cancer, based on history and clinical findings, generally undergo gastroscopy to make the disease diagnosis, usually defined by the pathology evaluation of the biopsy performed during the endoscopy. Then the malignant disease is staged to assess its spread through the gastric wall to adjacent organs/lymph nodes or to distant body sites; this step is crucial to setting up the best therapeutic strategy and thus to maximizing the likelihood of cure. False positive findings from a staging procedure (e.g., classifying an early disease as advanced) might lead to over‐treatment (e.g., unnecessary neoadjuvant chemotherapy); false negative findings might lead to patient under‐treatment.

Prior test(s)

No test is usually performed before EUS.

Role of index test(s)

The index test (EUS) is currently used in clinical practice by many physicians to preoperatively stage gastric cancer. However, there is no consensus on whether EUS should be routinely used for this purpose as part of a standardized approach.

Alternative test(s)

Other diagnostic tools that can be used for gastric cancer staging are computed tomography (CT), magnetic resonance imaging (MRI) and positron emission tomography (PET). None of them is deemed sufficiently accurate to be considered as the optimal imaging technique for the preoperative evaluation of disease spread, although all are widely used in clinical practice. In particular, none of CT, MRI or PET is useful for the definition of early stages of gastric cancer, whereas they are commonly used to diagnose locally‐advanced gastric cancer (T3 ‐ T4 or N+ cases, or both). While the usefulness of these diagnostic tools in the locoregional staging of gastric carcinoma is debated, there is a general consensus about their use in defining the presence of distant metastatic disease, e.g., presence or absence of metastasis in the liver or lungs.

Rationale

With regard to preoperative assessment of disease spread, one of the most promising tools for the locoregional staging of gastric carcinoma is EUS (Byrne 2002; Hargunani 2009; Polkowski 2009). This endoscopy‐based diagnostic device can both distinguish the different layers that compose the gastric wall and visualize the perigastric lymph nodes by means of a miniaturized ultrasound probe. Based on numerous reports published over more than two decades, EUS is often reported as a highly accurate method for the locoregional staging of gastric cancer. However, findings are heterogeneous, e.g., sensitivity and specificity values can range from 50% to 100% (Hizawa 2002; Kelly 2001; Kwee 2007; Kwee 2008; Kwee 2009; Puli 2008; Reddy 2008; Shimoyama 2004; Weber 2004), and although thousands of people with gastric cancer have been enrolled in EUS‐based studies, no formal quantitative review of the available evidence has been published that comprehensively examines the staging performance of EUS using the most appropriate statistical tools for the meta‐analysis of diagnostic accuracy data (a hierarchical approach) (Harbord 2008; Leeflang 2008; Macaskill 2010; Reitsma 2005).

Our review aims to fill this gap in the medical literature by quantitatively summarizing the diagnostic role of EUS in the staging of primary gastric carcinoma.

Objectives

To provide both a comprehensive overview and a quantitative analysis of the published data regarding the ability of endoscopic ultrasonography (EUS) to preoperatively define the locoregional disease spread (i.e., primary tumor depth (T‐stage) and regional lymph node status (N‐stage)) in people with primary gastric carcinoma. 

Secondary objectives

To provide the tools to calculate EUS diagnostic accuracy measures based on pre‐test information, such as gastric cancer T‐stage and N‐stage prevalence (Bayes nomograms).

To assess whether EUS performs differently in different subgroups of patients identified by the following parameters: year of publication, country (Western versus Eastern), EUS technical features (radial versus linear array; ultrasound frequency (MHz)), definition of target condition (for N‐stage: lymph node morphology versus size), gastric tumor site (any site versus cardia region only) and prevalence of target condition.

Methods

Criteria for considering studies for this review

Types of studies

We include studies that meet the following inclusion criteria:

  1. A minimum sample size of 10 patients with histologically‐proven primary carcinoma of the stomach;

  2. Evaluation of endoscopic ultrasonography (EUS) compared with histopathology of primary tumor (T‐stage) and regional lymph nodes (N‐stage);

  3. Sufficient data to construct a two‐by‐two contingency table such that the cells in the table could be labeled as true positive, false positive, true negative, and false negative (see the Target conditions for more details).

Studies of this type typically include both retrospective and prospective series of patients. As long as the above information was available, we did not exclude any specific type of study design.

We excluded studies that had possible overlap with the selected studies (i.e. studies from the same study group, institution, and period of inclusion). We excluded studies reporting on EUS performed before preoperative chemotherapy and/or radiotherapy (neoadjuvant therapy) in order to avoid the confounding effect of disease downstaging by neoadjuvant treatments.

Participants

For this review, patients were people with gastric carcinoma undergoing preoperative locoregional disease staging (T‐stage and N‐stage) by means of EUS and postoperative pathology evaluation of the surgical specimen, including those having early gastric cancer (EGC) or advanced gastric cancer (AGC). We imposed no restrictions by age, gender or any other category.

Index tests

The index test is EUS. We compared the results of EUS to those of pathology evaluation (reference test) in terms of both T‐stage and N‐stage (see Target conditions for more details).

We did not consider any comparator test.

Target conditions

The target condition was gastric cancer locoregional staging, for both primary tumor depth and regional lymph node status.

For lymph node status (N‐stage), we considered a patient either negative if no lymph node was metastatic (N0) or positive if one or more lymph nodes were metastatic (N+), as assessed by pathology evaluation.

For the primary tumor invasion of the gastric wall (T‐stage), we considered two main conditions according to the clinical questions that EUS aims to answer:

  1. In order to identify patients who would best benefit from surgery without preoperative radio‐chemotherapy, EUS was investigated for its ability to distinguish superficial tumors (T1 ‐ T2) versus advanced tumors (AGC, T3 ‐ T4, which are likely to benefit from neoadjuvant preoperative chemotherapy); in this case, a patient was considered either positive if his/her gastric cancer was classified as T1 ‐ T2 by pathology examination, or negative if his/her gastric cancer was classified as T3 ‐ T4.

  2. Within the frame of superficial cancers (T1 ‐ T2), in order to identify patients with superficial tumors amenable to endoscopic resection (T1 tumors), EUS was investigated for its ability to distinguish T1 tumors (EGC) versus T2 tumors; in this case, a patient was considered either positive if his/her gastric cancer was classified as T1 by pathology evaluation, or negative if his/her gastric cancer was classified as T2.

Finally, where the data permitted and within the frame of EGC (T1 tumors), EUS was also tested for its ability to further discriminate between T1a and T1b tumors, since it is believed that the former type of cancers benefit the most from endoscopic mucosal resection (EMR). To this end, a patient was considered either positive if his/her gastric cancer was classified as T1a by pathology evaluation, or negative if his/her gastric cancer was classified as T1b.

Reference standards

The reference standard was routine histopathology evaluation (i.e., microscopic examination of hematoxylin‐eosin stained samples) of primary tumor and regional lymph nodes. Since pathological examination of the surgical specimen is the only way to know precisely the depth of invasion through the gastric wall as well as the status of regional lymph nodes, all eligible patients must have undergone surgery and all tumors must have undergone routine pathology evaluation. According to the pathology report, four T categories (T1 to T4) indicate the extent of gastric wall invasion by the primary tumor; the status of the regional lymph nodes (positive versus negative) was also taken into consideration.

Search methods for identification of studies

We performed a comprehensive search of the literature to identify articles that examined the diagnostic accuracy of EUS (the index test) in the evaluation of primary gastric cancer depth of invasion (T‐stage, according to the AJCC/UICC TNM staging system categories T1, T2, T3 and T4) and regional lymph node status (N‐stage, metastatic versus disease‐free) using histopathology as the reference standard.

Electronic searches

We grouped key words to combine four 'concepts' that must be included in a paper reporting on the subject under investigation in this review:

  1. malignant neoplasm (cancer, carcinoma)

  2. body site (gastric, stomach)

  3. diagnostic method (endoscopic ultrasound, EUS)

  4. disease staging
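
As an illustration only, the four concepts can be combined into a Boolean query by joining synonyms with OR within a concept and joining the concepts with AND. This sketch is not the search strategy used in the review (the actual strategies are given in the appendices); the helper `build_query` is hypothetical, and the single term "staging" for the fourth concept is an assumption.

```python
# Hypothetical concept groups based on the list above; the term chosen for
# "disease staging" is an assumption of this sketch.
CONCEPTS = {
    "malignant neoplasm": ["cancer", "carcinoma"],
    "body site": ["gastric", "stomach"],
    "diagnostic method": ["endoscopic ultrasound", "EUS"],
    "disease staging": ["staging"],
}


def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for terms in concepts.values():
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


print(build_query(CONCEPTS))
```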

We systematically searched the following databases.

  1. The Cochrane Library (the Cochrane Central Register of Controlled Trials (CENTRAL)) (2015, Issue 1) (Appendix 2)

  2. MEDLINE (from 1988 to January 2015) (Appendix 3)

  3. EMBASE (from 1988 to January 2015) (Appendix 4)

  4. NIHR Prospero Register

  5. MEDION (http://www.mediondatabase.nl/)

  6. ARIF (www.arif.bham.ac.uk/databases.shtml)

  7. ClinicalTrials.gov (clinicaltrials.gov/)

  8. Current Controlled Trials MetaRegister (www.controlled‐trials.com/mrct/)

  9. WHO ICTRP (www.who.int/ictrp/en/)

Searching other resources

We searched for additional references by cross‐checking bibliographies of retrieved full‐text papers.

Data collection and analysis

Both review authors (SM and SP) conducted the literature search as well as data collection and management. Review author SM conducted the statistical analyses.

Selection of studies

Both review authors (SM and SP) independently selected the studies, resolving discrepancies by iteration, discussion and consensus. Where we retrieved articles in languages other than English, we were able to assess those in Italian, French and Spanish for eligibility.

Data extraction and management

We extracted relevant data from the articles selected for inclusion in the meta‐analysis. In addition to the accuracy data, we also recorded the following information for each study:

  1. Overall study characteristics, including the first author, country, language, and date of publication;

  2. Study patient characteristics;

  3. Features of the index test, e.g. type of echoendoscope, ultrasound frequency, and EUS criteria for tumor depth and lymph node status.

In case of missing data, we contacted the authors of the study to obtain the missing information. None of the three authors we contacted (Caletti 1993; Dittler 1993; Murata 1988) was able to provide data.

When raw data were presented in three‐by‐three or four‐by‐four tables (e.g., when the tumor depth or lymph node stage are defined by more than two categories), we constructed two‐by‐two contingency tables by considering a given T, or any N‐positive category, as the 'positive' state to be distinguished from the other T categories, or from the N‐negative cases. For instance, if an article presented data in a table reporting the number of N0, N1, N2 and N3 cases, we collapsed the data into a table with N0 and N+ (sum of N1, N2 and N3) cases.
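
The collapsing step described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure; the function name `collapse_to_two_by_two` and the nested‐dictionary layout are assumptions, and the counts in `example` are invented purely to show the mechanics.

```python
def collapse_to_two_by_two(table, positive):
    """Collapse a multi-category cross-tabulation into a 2x2 table.

    table[reference_stage][eus_stage] holds the number of patients;
    `positive` is the set of categories treated as the positive state.
    """
    tp = fp = fn = tn = 0
    for reference_stage, row in table.items():
        for eus_stage, count in row.items():
            ref_pos = reference_stage in positive
            eus_pos = eus_stage in positive
            if ref_pos and eus_pos:
                tp += count
            elif ref_pos:
                fn += count
            elif eus_pos:
                fp += count
            else:
                tn += count
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Invented counts (pathology stage -> EUS stage -> patients), for illustration.
example = {
    "N0": {"N0": 40, "N1": 8, "N2": 2, "N3": 0},
    "N1": {"N0": 6, "N1": 15, "N2": 4, "N3": 0},
    "N2": {"N0": 3, "N1": 5, "N2": 10, "N3": 1},
    "N3": {"N0": 1, "N1": 0, "N2": 2, "N3": 3},
}
result = collapse_to_two_by_two(example, positive={"N1", "N2", "N3"})
print(result)
```

The same function would cover the T‐stage comparisons by passing, for example, `positive={"T1", "T2"}` for superficial versus advanced tumors.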

We extracted data separately on primary tumor depth (T‐stage) and regional lymph node status (N‐stage).

We assembled all data in a dedicated database built within an Excel spreadsheet, where each row corresponded to a single study and variables of interest were recorded in the columns.

Assessment of methodological quality

We assessed data quality using a standard procedure according to the Quality Assessment of Diagnostic Accuracy Studies (QUADAS‐2) criteria (Whiting 2011). When there was at least one 'no' or 'unclear' response to a signaling question for a given domain, we scored the risk of bias as high or unclear, respectively. See Appendix 5 for details of the findings.
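
The scoring rule just described can be written as a small function. This sketch assumes that a 'no' answer takes precedence over an 'unclear' one and that a domain with only 'yes' answers is scored as low risk; neither point is stated explicitly above, and the function name `domain_risk` is illustrative.

```python
def domain_risk(answers):
    """Risk of bias for one QUADAS-2 domain from its signaling-question answers."""
    normalized = [a.strip().lower() for a in answers]
    if "no" in normalized:
        return "high"      # at least one 'no' -> high risk
    if "unclear" in normalized:
        return "unclear"   # no 'no', but at least one 'unclear'
    return "low"           # assumption: all 'yes' -> low risk


print(domain_risk(["yes", "unclear", "yes"]))
```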

Statistical analysis and data synthesis

We performed statistical analysis according to Cochrane guidelines for diagnostic test accuracy (DTA) reviews (Macaskill 2010).

We used coupled forest plots to display the number of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), as well as sensitivity and specificity, with their 95% confidence intervals (CI), for all included studies. Visual inspection of forest plots can provide a clue to heterogeneity within single studies. We also used summary receiver operating characteristic (SROC) plots to display the results of individual studies in a ROC space, each study being plotted as a single sensitivity‐specificity point.
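
For a single study, the quantities shown in a coupled forest plot can be derived from its two‐by‐two table. The sketch below is illustrative; the text does not say which interval method was used for the plots, so the Wilson score interval here is a substituted choice, and the function names are assumptions.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def study_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity of one study, each with its interval."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# Invented counts for illustration only.
acc = study_accuracy(tp=40, fp=10, fn=10, tn=40)
print(acc)
```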

As currently recommended for meta‐analysis of diagnostic accuracy studies (Harbord 2008; Leeflang 2008; Macaskill 2010; Reitsma 2005; Rutter 2001), we used hierarchical models to obtain summary estimates of EUS performance in terms of ability to discriminate primary gastric cancer depth of invasion (T‐stage) and regional lymph node status (N‐stage).

According to the bivariate method (Reitsma 2005), we calculated overall sensitivity and specificity and their 95% confidence intervals (CIs) and predictive intervals, based on the binomial distributions of the true positives and true negatives. Besides accounting for study size and between‐study heterogeneity, the bivariate model adjusts for the frequently observed negative correlation between the sensitivity and the specificity of the index test (threshold effect). An additional advantage of using the bivariate model is that the bivariate nature of the original data can be maintained throughout the analysis, allowing the generation of reliable summary estimates of sensitivity and specificity. For the bivariate model, the summary estimates of sensitivity and specificity represent an 'average' operating point across studies. We analyzed primary tumor depth (T‐stage) and regional lymph node status (N‐stage) separately.

We evaluated the clinical (or patient‐relevant) utility of EUS using likelihood ratios, which we computed directly from the summary estimates of sensitivity and specificity, to enable the calculation of post‐test probability (based on Bayes' theorem) by means of Fagan's nomogram (Deeks 2004). Fagan's nomogram is a graphical tool which in routine clinical practice allows one to use the results of a diagnostic test to estimate a patient's probability of having a disease (post‐test probability) based on two pieces of information: the pre‐test probability (usually the prevalence of the disease/condition) and the test result. In this nomogram, a straight line drawn from a patient's pre‐test probability of disease (left axis) through the likelihood ratio of the test (middle axis) intersects with the post‐test probability of disease (right axis).
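The calculation that Fagan's nomogram performs graphically can be written out as a short sketch: convert the pre‐test probability to odds, multiply by the likelihood ratio, and convert back. The function names are illustrative. The example values (pre‐test probability 50%, positive LR 8.9, negative LR 0.16) are those reported in the Findings for T1 ‐ T2 versus T3 ‐ T4 tumors.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' theorem in odds form: post-test odds = pre-test odds x LR."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Pre-test probability of 50%, with the summary LRs for T1-T2 versus T3-T4.
after_positive = post_test_probability(0.50, 8.9)   # about 0.90
after_negative = post_test_probability(0.50, 0.16)  # about 0.14
```

Note that likelihood ratios obtained from a hierarchical model need not equal the simple ratio computed by `likelihood_ratios` from the rounded summary sensitivity and specificity.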

We conducted statistical analyses using Review Manager 5 software (RevMan 2014) and the Metandi and Midas programs for the STATA software (Stata 2009).

Since currently available imaging tools are associated with sensitivity and specificity values around 80% (see Background section), we considered this a 'desirable' value against which the diagnostic performance of EUS can be compared.

Investigations of heterogeneity

As is common in diagnostic accuracy studies, we anticipated that there would be substantial between‐study variation in reported pairs of sensitivity and specificity values.

Coupled forest plots, which display both sensitivity and specificity of all included studies, provide a visual clue to heterogeneity of the results on a single‐study basis.

In order to formally investigate potential sources of heterogeneity other than the threshold effect, we used subgroup analysis and meta‐regression by including covariates (study sample size, publication year, type of EUS array, study quality, country, stomach site) in the bivariate model, which enabled us to assess the effect of various factors on the diagnostic accuracy of EUS.

Sensitivity analyses

We conducted sensitivity analyses to assess the impact on the summary effects of low‐quality studies, as defined by the identification of a high risk of bias for one or more QUADAS‐2 items, as well as the presence of specific types of primary gastric cancer morphology (e.g., ulcerated tumors) or location (e.g., cardia region of the stomach).

We also used the 'leave‐one‐out' procedure to assess the impact of each study on the meta‐analysis results (leading study effect).
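The leave‐one‐out idea can be illustrated with a deliberately simplified sketch. It recomputes a summary after dropping each study in turn. As an assumption for brevity, the summary used here is a naive pooled sensitivity (summed TP over summed TP + FN) rather than the bivariate model used in the review, and the study counts are hypothetical.

```python
def pooled_sensitivity(studies):
    """Naive pooled sensitivity; each study is a (TP, FP, FN, TN) tuple."""
    tp = sum(s[0] for s in studies)
    fn = sum(s[2] for s in studies)
    return tp / (tp + fn)


def leave_one_out(studies, summary=pooled_sensitivity):
    """Return the summary recomputed with each study omitted in turn."""
    return [summary(studies[:i] + studies[i + 1:]) for i in range(len(studies))]


# Hypothetical studies as (TP, FP, FN, TN).
studies = [(40, 5, 10, 45), (18, 2, 2, 18), (30, 10, 20, 40)]
overall = pooled_sensitivity(studies)
without_each = leave_one_out(studies)
```

A study whose omission shifts the summary markedly relative to `overall` would be flagged as a leading study.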

Assessment of reporting bias

We conducted formal testing for small‐study effects (which include publication bias) by regressing the natural logarithm of the diagnostic odds ratio (DOR), which describes the odds of a positive test result in patients with disease compared with the odds of a positive test result in those without disease, against the inverse square root of the effective sample size (1/√ESS), weighting by ESS. We took P < 0.10 for the slope coefficient to indicate significant asymmetry of the funnel plot (Deeks 2005).
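A minimal sketch of the regression behind this test follows. It assumes the usual definition ESS = 4·n₁·n₂/(n₁ + n₂), where n₁ and n₂ are the numbers of reference‐positive and reference‐negative patients, and adds 0.5 to every cell of a study with a zero count; both choices are assumptions of this sketch. It returns only the weighted least‐squares intercept and slope. The formal test additionally requires the standard error of the slope and a t distribution, which are omitted here. The study counts are hypothetical.

```python
import math


def deeks_regression(studies):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), with weights ESS.

    studies is a list of (TP, FP, FN, TN) tuples.
    Returns (intercept, slope). Needs at least two distinct ESS values.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        n_diseased = tp + fn
        n_non_diseased = fp + tn
        ess = 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)
        if 0 in (tp, fp, fn, tn):
            # Continuity correction so that ln(DOR) is defined.
            tp, fp, fn, tn = (v + 0.5 for v in (tp, fp, fn, tn))
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)
    total_weight = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_weight
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_weight
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical studies sharing the same DOR (16) but differing in size:
# with no small-study effect the fitted slope is essentially zero.
intercept, slope = deeks_regression([(40, 10, 10, 40), (80, 20, 20, 80), (20, 5, 5, 20)])
```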

Results

Results of the search

The literature search identified 2168 potentially relevant studies (Figure 1). By reading abstracts we excluded 2044 articles, and by reading full‐text versions we eliminated another 54 articles. We identified four citations as potentially meeting the inclusion criteria but could not assess them by the time of publication, and will address them in a future update. Ultimately, 66 articles were eligible according to the inclusion criteria. The main characteristics of the eligible studies, which were published from 1988 through 2012, are reported in the Characteristics of included studies section. The main characteristics of the excluded studies are reported in the Characteristics of excluded studies section. Considering the included studies, overall 7747 patients were enrolled in 16 different countries, with a mean of 117 patients enrolled per study (range: 14 to 930). Most of these studies (41/66, 62%) were published after 1999. The available evidence came primarily from retrospective studies (50/66, 76%), which enrolled Asian patients in 39 series (59%). The target condition was gastric carcinoma arising from any site of the stomach in 60 out of 66 studies (91%), whereas in the remaining series the authors focused on tumors arising in the cardia region. Finally, the radial type of endoscopic ultrasound (EUS) array was used more often than the linear array (55/58 articles (95%) reporting the type of array adopted).


Study flow diagram.


Methodological quality of included studies

Overall, the quality of the included studies was good, as illustrated in the QUADAS‐2 results summary (Figure 2) and graph (Figure 3), and also summarized in summary of findings Table. No concerns about applicability or patient selection bias or interpretation of reference test results were raised by the analysis of the available data. However, five studies (Bhandari 2004; Garlipp 2011; Heye 2009; Potrc 2006; Xi 2003) presented a high risk of index test interpretation bias due to the lack of threshold definition for the classification of T‐stage or N‐stage or both. Two other studies (Akashi 2006; Mouri 2009) presented a high risk of selection bias due to the lack of inclusion of all patients: in particular, 37 and 31 patients respectively were not included, due to uninterpretable EUS findings. In another two studies (Hizawa 2002; Yanai 1997) the same issue occurred for only seven and four patients respectively: accordingly we deemed the risk of bias in these cases as unclear. Overall, uninterpretable results were rarely reported, although this is not a guarantee that EUS findings are always easily interpretable; it might just reflect the attitude of the endoscopist to provide a classification 'at any cost'. In most studies the interval between the index test and the reference test was unreported, but we believe that this occurrence is unlikely to undermine the reliability of the results, since the diagnosis of a malignant disease such as gastric carcinoma is usually considered an indication for surgery and thus for pathological evaluation within a very short time (some days/a few weeks). This in turn is unlikely to be sufficient for the disease stage to change.


Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study



Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies


Findings

Primary tumor depth (T‐stage)

We first addressed the issue of EUS accuracy in discriminating T1 ‐ T2 (superficial) versus T3 ‐ T4 (advanced) gastric carcinomas. We therefore carried out a meta‐analysis of the eligible studies reporting relevant data. For this analysis (Data table 1), 50 studies were available, with a total of 4397 patients.

The sensitivity and specificity of the single studies are shown in Data table 1. The summary receiver operating characteristic (SROC) curve along with the summary point and the 95% confidence and prediction regions are illustrated in Figure 4. The summary sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were 0.86 (95% confidence interval (CI): 0.81 to 0.90), 0.90 (95% CI: 0.87 to 0.93), 8.9 (95% CI: 6.8 to 11.6), 0.16 (95% CI: 0.12 to 0.22), and 56 (95% CI: 37 to 85), respectively.


Summary ROC Plot of studies assessing the accuracy of EUS in discriminating T1 ‐ T2 versus T3 ‐ T4 gastric carcinomas. Each study sensitivity/specificity value is represented by an empty circle. The summary point for sensitivity/specificity is represented by a black filled circle. Dotted closed line: 95% confidence region of the summary point. Dashed closed line: 95% prediction region.


As shown in Figure 4 by both confidence and prediction regions, the results indicate a lower variability for specificity as compared to sensitivity, suggesting that EUS might be more reliable in correctly identifying T3 ‐ T4 cases compared to T1 ‐ T2 cases.

Although both summary sensitivity and specificity values were relatively satisfactory, between‐study heterogeneity was substantial, as visually assessable through both the forest plot (Data table 1) and predictive ellipse (Figure 4).

The Fagan plot (Figure 5) illustrates that EUS may be clinically useful because it increases the pre‐test probability of a tumor being T1 ‐ T2 from 50% (average prevalence of T1 ‐ T2 cases) to 90% when positive, and it lowers the same probability to 14% when negative. However, the likelihood ratio (LR) scattergram (Figure 6) shows that the summary point of positive and negative LR is located in the lower right quadrant, suggesting that EUS accuracy ‐ although close to values desirable for a diagnostic tool ‐ is not optimal either for tumor depth confirmation or exclusion.
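The quadrant rule of the LR scattergram can be expressed as a small sketch, using the thresholds stated in the figure legend (positive LR > 10 for confirmation, negative LR < 0.1 for exclusion). The function name and the returned labels are illustrative.

```python
def lr_quadrant(positive_lr, negative_lr, plr_threshold=10.0, nlr_threshold=0.1):
    """Classify a (positive LR, negative LR) pair into a scattergram quadrant."""
    confirms = positive_lr > plr_threshold
    excludes = negative_lr < nlr_threshold
    if confirms and excludes:
        return "left upper: confirmation and exclusion"
    if confirms:
        return "right upper: confirmation only"
    if excludes:
        return "left lower: exclusion only"
    return "right lower: neither confirmation nor exclusion"


# Summary LRs for T1-T2 versus T3-T4: positive LR 8.9, negative LR 0.16.
quadrant = lr_quadrant(8.9, 0.16)
```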


Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1 ‐ T2 (rather than T3 ‐ T4) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).



EUS ability to discriminate between T1‐T2 and T3‐T4 gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).


These findings imply that in a hypothetical cohort of 1000 people with gastric carcinoma, EUS would correctly classify 880 of them, but would also over‐stage 70 patients by classifying them as T3 ‐ T4 instead of T1 ‐ T2, and under‐stage 50 patients by classifying them as T1 ‐ T2 instead of T3 ‐ T4 (see summary of findings Table).
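The cohort figures follow directly from the summary sensitivity, the summary specificity and the prevalence, as the sketch below shows. It assumes, as in the Fagan plots, that the 'positive' state is the less advanced category (here T1 ‐ T2), so that false negatives are over‐staged patients and false positives are under‐staged patients. The function name and dictionary keys are illustrative.

```python
def cohort_outcomes(sensitivity, specificity, prevalence, n=1000):
    """Expected classification counts in a hypothetical cohort of n patients."""
    positives = round(n * prevalence)   # patients truly in the 'positive' state
    negatives = n - positives
    tp = round(positives * sensitivity)
    fn = positives - tp                 # positive state missed: over-staged here
    tn = round(negatives * specificity)
    fp = negatives - tn                 # negative state missed: under-staged here
    return {"correct": tp + tn, "over_staged": fn, "under_staged": fp}


# T1-T2 versus T3-T4: sensitivity 0.86, specificity 0.90, prevalence 50%.
t12_vs_t34 = cohort_outcomes(0.86, 0.90, 0.50)
```

With these inputs the function reproduces the 880 correctly classified, 70 over‐staged and 50 under‐staged patients quoted above.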

Since the proportion of heterogeneity likely caused by the threshold effect was low (12%), we looked for other sources of heterogeneity. In this regard subgroup analysis (Table 1) demonstrated that publication year has a significant impact on EUS diagnostic performance, since studies published before the year 2000 reported on average significantly higher sensitivity (0.91 (95% CI 0.87 to 0.96) versus 0.81 (95% CI 0.75 to 0.88)) and specificity (0.94 (95% CI 0.91 to 0.96) versus 0.88 (95% CI 0.84 to 0.91)). The type of EUS array also appeared to be correlated with diagnostic performance, with radial array being more accurate than linear array (Table 1); however, only four studies used the latter type of array, which makes it unwise to draw any definitive conclusion on this topic. The other subgroup and sensitivity analyses were not informative.

Table 1. Subgroup and sensitivity analysis for T1 ‐ T2 versus T3 ‐ T4 gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 18 | 0.89 (0.83 to 0.95) | 0.90 (0.85 to 0.94) | 0.46 |
| | <100 | 32 | 0.83 (0.77 to 0.90) | 0.91 (0.87 to 0.94) | |
| Year of publication | 2000 or later | 32 | 0.81 (0.75 to 0.88) | 0.88 (0.84 to 0.91) | <0.01 |
| | before 2000 | 18 | 0.91 (0.87 to 0.96) | 0.94 (0.91 to 0.96) | |
| Country | Western | 27 | 0.82 (0.75 to 0.89) | 0.91 (0.87 to 0.94) | 0.27 |
| | Eastern | 23 | 0.89 (0.84 to 0.94) | 0.90 (0.86 to 0.94) | |
| EUS array | Radial | 39 | 0.88 (0.84 to 0.93) | 0.90 (0.87 to 0.93) | <0.01 |
| | Linear | 4 | 0.68 (0.40 to 0.96) | 0.86 (0.73 to 1.00) | |
| Tumor site | Cardia only | 6 | 0.70 (0.49 to 0.91) | 0.90 (0.82 to 0.98) | 0.12 |
| | Any site | 44 | 0.87 (0.83 to 0.91) | 0.90 (0.88 to 0.93) | |
| Quality | High | 45 | 0.86 (0.82 to 0.91) | 0.90 (0.88 to 0.93) | 0.59 |
| | Low | 5 | 0.78 (0.72 to 1.00) | 0.90 (0.81 to 0.99) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.

Regression testing for funnel plot asymmetry (Deeks 2005) showed no evidence of statistically significant small‐study effect bias (P = 0.48).

Exclusion of studies with a high risk of bias did not significantly change the above findings (Table 1).

For the diagnostic ability of EUS to distinguish T1 (early gastric cancer, EGC) from T2 (muscle‐infiltrating) tumors, the meta‐analysis of 46 studies (n = 2742; see forest plot and SROC curve in Data table 2 and Figure 7, respectively) showed that the summary sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were 0.85 (95% CI 0.78 to 0.91), 0.90 (95% CI 0.85 to 0.93), 8.5 (95% CI 5.9 to 12.3), 0.17 (95% CI 0.12 to 0.24), and 50 (95% CI 32 to 79), respectively.


Summary ROC Plot of 46 studies investigating the EUS ability to discriminate between T1 versus T2 gastric carcinomas.


Although both summary sensitivity and specificity values were relatively satisfactory, between‐study heterogeneity was substantial, as visually assessable through both the forest plot (Data table 2) and predictive ellipse (Figure 7).

The Fagan plot (Figure 8) illustrates that EUS may be clinically useful because it increases the pre‐test probability of a tumor being T1 from 70% (average prevalence of T1 cases) to 94% when positive, and it lowers the same probability to 26% when negative. However, the likelihood ratio (LR) scattergram (Figure 9) shows that the summary point of positive and negative LR is located in the lower right quadrant, suggesting that EUS accuracy, although close to ideal values, is not optimal either for disease depth confirmation or exclusion.


Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1 (rather than T2) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).



EUS ability to discriminate between T1 and T2 gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).


These findings imply that in a hypothetical cohort of 1000 people with gastric carcinoma, EUS would correctly classify 865 of them, but would also over‐stage 105 patients by classifying them as T2 instead of T1, and under‐stage 30 patients by classifying them as T1 instead of T2 (see summary of findings Table).

Since the proportion of heterogeneity likely caused by the threshold effect was moderate (30%), we looked for further sources of heterogeneity. Subgroup analysis suggested that sample size, country of origin and type of EUS array might have a (limited) impact on EUS diagnostic performance (Table 2). However, these results are to be interpreted cautiously because of the low number of studies in some comparison groups. The other subgroup and sensitivity analyses were not informative.

Table 2. Subgroup and sensitivity analysis for T1 versus T2 gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 5 | 0.97 (0.93 to 1.00) | 0.73 (0.53 to 0.93) | 0.01 |
| | <100 | 41 | 0.81 (0.74 to 0.87) | 0.91 (0.88 to 0.95) | |
| Year of publication | 2000 or later | 29 | 0.82 (0.74 to 0.90) | 0.91 (0.87 to 0.96) | 0.51 |
| | before 2000 | 17 | 0.88 (0.80 to 0.96) | 0.87 (0.80 to 0.95) | |
| Country | Western | 22 | 0.71 (0.59 to 0.82) | 0.94 (0.91 to 0.97) | <0.01 |
| | Eastern | 24 | 0.92 (0.88 to 0.96) | 0.84 (0.77 to 0.91) | |
| EUS array | Radial | 37 | 0.85 (0.79 to 0.91) | 0.90 (0.85 to 0.94) | <0.01 |
| | Linear | 3 | 0.92 (0.77 to 1.00) | 0.98 (0.93 to 1.00) | |
| Tumor site | Cardia only | 5 | 0.91 (0.80 to 1.00) | 0.89 (0.77 to 1.00) | 0.59 |
| | Any site | 41 | 0.84 (0.77 to 0.90) | 0.90 (0.86 to 0.94) | |
| Quality | High | 41 | 0.85 (0.79 to 0.91) | 0.89 (0.85 to 0.93) | 0.39 |
| | Low | 5 | 0.79 (0.57 to 1.00) | 0.96 (0.90 to 1.00) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.

Regression testing for funnel plot asymmetry showed no evidence of statistically significant small‐study effect bias (P = 0.58).

Exclusion of studies with a high risk of bias did not significantly change the above findings (Table 2).

We then focused on the ability of EUS to distinguish between T1a (mucosal) and T1b (submucosal) cancers: the meta‐analysis of 20 studies (n = 3321; see forest plot and SROC curve in Data table 3 and Figure 10, respectively) showed that the summary sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were 0.87 (95% CI 0.81 to 0.92), 0.75 (95% CI 0.62 to 0.84), 3.4 (95% CI 2.3 to 5.0), 0.17 (95% CI 0.12 to 0.24), and 20 (95% CI 12 to 33), respectively.


Summary ROC Plot of 20 studies investigating the diagnostic ability of EUS to discriminate between T1a versus T1b tumors.


Summary sensitivity (but not specificity) value was relatively high, but between‐study heterogeneity was substantial as visually assessable through both the forest plot (Data table 3) and predictive ellipse (Figure 10).

The Fagan plot (Figure 11) illustrates that EUS may be clinically useful because it increases the pre‐test probability of a tumor being T1a from 70% (average prevalence of T1a cases) to 88% when positive, and it lowers the same probability to 30% when negative. However, the likelihood ratio (LR) scattergram (Figure 12) shows that the summary point of positive and negative LR is located in the lower right quadrant, suggesting that EUS accuracy is not optimal either for disease depth confirmation or exclusion.


Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1a (rather than T1b) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).



EUS ability to discriminate between T1a and T1b gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).


These findings imply that in a hypothetical cohort of 1000 people with gastric carcinoma, EUS would correctly classify 834 of them, but would also over‐stage 91 patients by classifying them as T1b instead of T1a, and under‐stage 75 patients by classifying them as T1a instead of T1b (see summary of findings Table).

The proportion of heterogeneity likely caused by the threshold effect was 49%. Subgroup and sensitivity analyses suggested that none of the covariates we considered was associated with between‐study heterogeneity (Table 3).

Table 3. Subgroup and sensitivity analysis for T1a versus T1b gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 8 | 0.90 (0.84 to 0.96) | 0.67 (0.50 to 0.85) | 0.49 |
| | <100 | 12 | 0.85 (0.76 to 0.93) | 0.79 (0.67 to 0.91) | |
| Year of publication | 2000 or later | 11 | 0.90 (0.84 to 0.95) | 0.70 (0.55 to 0.85) | 0.59 |
| | before 2000 | 9 | 0.84 (0.75 to 0.94) | 0.79 (0.66 to 0.93) | |
| Country | Western | 1 | 1.00 (0.40 to 1.00) | 1.00 (0.69 to 1.00) | N/A |
| | Eastern | 19 | 0.88 (0.81 to 0.92) | 0.73 (0.61 to 0.82) | |
| EUS array | Radial | 18 | 0.89 (0.82 to 0.93) | 0.72 (0.59 to 0.82) | N/A |
| | Linear | 1 | 0.50 (0.07 to 0.93) | 0.88 (0.64 to 0.99) | |
| Tumor site | Cardia region | 1 | 0.50 (0.07 to 0.93) | 0.88 (0.64 to 0.99) | N/A |
| | Any site | 19 | 0.88 (0.82 to 0.92) | 0.74 (0.61 to 0.83) | |
| Quality | High | 18 | 0.84 (0.79 to 0.89) | 0.75 (0.64 to 0.86) | 0.05 |
| | Low | 2 | 0.98 (0.96 to 1.00) | 0.64 (0.26 to 1.00) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.

Regression testing for funnel plot asymmetry showed evidence of statistically significant small‐study effect bias (P = 0.04).

Exclusion of studies with a high risk of bias did not significantly change the above findings (Table 3).

Lymph node status (N‐stage)

We then carried out a meta‐analysis of the eligible studies reporting data on N‐stage (positive versus negative) to evaluate the diagnostic ability of EUS to assess the status of regional lymph nodes in people with gastric carcinoma. Forty‐four studies were available, with a total of 3573 patients (Data table 4).

The sensitivity and specificity of each single study are shown in Data table 4. The SROC curve along with the summary point and the 95% confidence and prediction regions are illustrated in Figure 13.


Summary ROC Plot of 44 studies addressing the issue of EUS ability to discriminate between lymph node negative (N0) and positive (N+) cases.


Summary sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were 0.83 (95% CI 0.79 to 0.87), 0.67 (95% CI 0.61 to 0.72), 2.5 (95% CI 2.1 to 2.9), 0.25 (95% CI 0.20 to 0.31), and 10 (95% CI 7 to 13), respectively.

Summary sensitivity (but not specificity) value was relatively high, but between‐study heterogeneity was substantial as visually assessable through both the forest plot (Data table 4) and predictive ellipse (Figure 13).

The Fagan plot (Figure 14) shows that EUS may be clinically informative because it increases the pre‐test probability of a patient being N+ from 50% (average prevalence of N+ cases) to 62% when positive, and it lowers the same probability to 14% when negative. However, the likelihood ratio (LR) scattergram (Figure 15) shows that the summary point of positive and negative LR is located in the lower right quadrant, suggesting that EUS accuracy is not optimal either for lymph node metastatic involvement confirmation or exclusion.


Fagan plot estimating how much the result of EUS changes the probability that a patient has a N+ (metastatic lymph nodes) (rather than a N0, disease free lymph nodes) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).



EUS ability to discriminate between N+ (metastatic lymph nodes) and N0 (disease free lymph nodes) gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).


These findings imply that in a hypothetical cohort of 1000 people with gastric carcinoma, EUS would correctly classify 750 of them, but would also over‐stage 165 patients by classifying them as N+ instead of N0, and under‐stage 85 patients by classifying them as N0 instead of N+ (see summary of findings Table).

The proportion of heterogeneity likely caused by the threshold effect was moderate (17%). Subgroup and sensitivity analyses (Table 4) suggested that none of the covariates we considered were associated with between‐study heterogeneity, although analysis by country of origin was of borderline significance.

Table 4. Subgroup and sensitivity analysis for N0 versus N+ gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 12 | 0.83 (0.77 to 0.89) | 0.65 (0.55 to 0.75) | 0.90 |
| | <100 | 32 | 0.84 (0.79 to 0.88) | 0.67 (0.61 to 0.74) | |
| Year of publication | 2000 or later | 28 | 0.83 (0.79 to 0.88) | 0.66 (0.59 to 0.73) | 0.95 |
| | before 2000 | 16 | 0.83 (0.76 to 0.89) | 0.68 (0.58 to 0.77) | |
| Country | Western | 24 | 0.80 (0.75 to 0.86) | 0.72 (0.66 to 0.79) | 0.04 |
| | Eastern | 20 | 0.86 (0.81 to 0.90) | 0.59 (0.51 to 0.68) | |
| EUS array | Radial | 35 | 0.84 (0.80 to 0.88) | 0.66 (0.60 to 0.73) | 0.11 |
| | Linear | 4 | 0.83 (0.69 to 0.98) | 0.66 (0.46 to 0.86) | |
| Tumor site | Cardia region | 6 | 0.86 (0.76 to 0.96) | 0.76 (0.63 to 0.88) | 0.27 |
| | Any site | 38 | 0.83 (0.79 to 0.87) | 0.65 (0.59 to 0.71) | |
| Quality | High | 41 | 0.83 (0.79 to 0.87) | 0.67 (0.62 to 0.73) | 0.57 |
| | Low | 3 | 0.88 (0.77 to 0.99) | 0.56 (0.32 to 0.80) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.

Regression testing for funnel plot asymmetry showed no evidence of statistically significant small‐study effect bias (P = 0.96).

Exclusion of studies with a high risk of bias did not significantly change the above findings (Table 4).

Discussion

In this systematic review of the diagnostic performance of endoscopic ultrasound (EUS) for the locoregional staging of gastric cancer, we collected data from the largest series of patients ever considered in the international medical literature (n = 7747). Using modern statistical methods specifically dedicated to diagnostic meta‐analysis, i.e., the hierarchical bivariate model, we quantitatively summarized the available evidence and found that overall EUS provides clinically useful information regarding gastric cancer locoregional spread. EUS summary sensitivities and specificities ranged from 0.83 to 0.87 and from 0.67 to 0.90 respectively, all significantly higher than the 0.50 'null' value. This means that EUS performs better than a prediction made with the flip of a coin: the 95% confidence intervals of those summary estimates do not cross 0.50, which is the probability of being 'diseased' (e.g. the probability of having metastatic lymph nodes) that a coin flip assigns to each patient.

This finding is strengthened by the results of the Bayesian analysis (see Fagan plots), which demonstrate that EUS also performs better than a 'smart observer', that is, one who knows the prevalence of the condition (e.g. percentage of T1 gastric cancers as a proportion of all gastric cancers) and assigns this probability value to patients, which increases the predictive accuracy compared to the more simplistic flip‐of‐a‐coin approach. Consider the T1 versus T2 setting as an example: if the proportion of T1 tumors is 0.70 (based on previous epidemiological studies), one could assign to each patient a probability of being T1 equal to the prevalence of the condition (0.70). This would lead to an accuracy of 70%, based on the fact that 70% of patients would be correctly identified by randomly classifying 70% of them as T1. This approach would yield a better diagnostic performance than the flip‐of‐a‐coin approach, which would assign a 0.50 probability to all patients and thus achieve 50% accuracy. Compared to these two approaches, EUS can better discriminate between T1 and T2 cases, since it raises the probability of being T1 from 70% to 94% when the test is positive and lowers it to 26% when the test is negative (overall accuracy: 93%). This could be very helpful for clinicians during the decision‐making process of patient therapeutic management.
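
The Fagan‐plot reasoning in the T1 versus T2 example can be approximated by hand. The following is a minimal sketch, assuming only the rounded summary sensitivity (0.85) and specificity (0.90) and the 0.70 pre‐test probability quoted in this review; the function name is illustrative and not part of the review's methods. Because it starts from rounded point estimates rather than from the fitted bivariate model, it gives values close to, but not identical with, the 94% and 26% read from the Fagan plot.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, positive_result):
    """Bayes' theorem in odds form, i.e. the calculation a Fagan nomogram draws."""
    if positive_result:
        # Positive likelihood ratio: how much a positive test multiplies the odds.
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        # Negative likelihood ratio: how much a negative test shrinks the odds.
        likelihood_ratio = (1.0 - sensitivity) / specificity
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# T1 versus T2: assumed pre-test probability of T1 = 0.70,
# rounded summary sensitivity = 0.85, specificity = 0.90.
p_after_positive = post_test_probability(0.70, 0.85, 0.90, positive_result=True)
p_after_negative = post_test_probability(0.70, 0.85, 0.90, positive_result=False)

print(f"Probability of T1 after a positive EUS: {p_after_positive:.3f}")
print(f"Probability of T1 after a negative EUS: {p_after_negative:.3f}")
```

The same function can be reused with any other pre‐test probability, which is the practical point of the Fagan plot: the post‐test probability depends on the prevalence in the clinician's own setting as well as on the test.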

Critical issues

Despite these favorable findings, some critical aspects must be emphasized to correctly appreciate the limitations of this diagnostic tool.

First, the remarkable heterogeneity of results we found across eligible studies, most of which (50/66, 76%) are retrospective in design, casts some doubt on the reliability and reproducibility of EUS in the locoregional staging of gastric carcinoma. Unfortunately, we did not identify any technical (e.g. EUS probe frequency) or tumor‐related (e.g. stomach site) feature that might explain this variability, so we cannot suggest any strategy that might improve the performance of EUS. Notably, we could not explore the experience of the endoscopist as a source of heterogeneity, as no such information is available in the literature; we also failed to detect an association between heterogeneity and sample size (a potential surrogate for the experience of the endoscopy center). The only suspected source of heterogeneity we could identify was the year of publication (better and more homogeneous results in earlier studies), which was especially evident for distinguishing T1 ‐ T2 from T3 ‐ T4 tumors; this finding might be due to a 'first study' effect, i.e. more enthusiasm surrounding the procedure during the first years after its implementation in the clinical setting. However, we cannot rule out the possibility that the diffusion of EUS in clinical practice following the initial encouraging results led to its use by less experienced endoscopists, with an increased probability of less accurate interpretations of the test. Due to the lack of data, we could not explore the effect of other potential sources of heterogeneity, such as primary tumor characteristics (e.g., diameter and morphology: flat versus ulcerated versus vegetant); these factors, together with the experience of the endoscopist, remain to be investigated to better define the limits of EUS in the locoregional staging of gastric carcinoma.

Second, not all parameters of diagnostic performance reached ideal values, i.e., values believed to be desirable for the implementation of a diagnostic tool in clinical practice. In particular, the EUS summary specificities for the diagnosis of mucosal (T1a) versus submucosal (T1b) tumors and for the diagnosis of lymph node metastasis (N0 versus N+) were 0.75 and 0.67 respectively, both below the desirable value of 0.8. Moreover, since the diagnostic ability of a test depends not only on its discriminatory value but also on the prevalence of the disease, we considered the likelihood ratios (LRs) associated with EUS performance, as illustrated in the LR matrices (see Figure 6; Figure 9; Figure 12; Figure 15). EUS showed an acceptable performance for the differentiation of T1 ‐ T2 from T3 ‐ T4 tumors and T1 from T2 tumors, but not for distinguishing T1a from T1b tumors or for the diagnosis of lymph node status (N0 versus N+).
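
As a rough illustration of how the LR matrices are read, the sketch below derives positive and negative likelihood ratios from the rounded summary sensitivities and specificities reported in this review and places each comparison in a quadrant using the thresholds quoted in the figure legends (positive LR > 10, negative LR < 0.1). The function names are hypothetical, and these back‐of‐the‐envelope ratios are not the model‐based estimates plotted in the figures, so they should be taken as indicative only.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a dichotomous test."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def lr_quadrant(lr_positive, lr_negative, confirm_threshold=10.0, exclude_threshold=0.1):
    """Classify a test by the thresholds used in the LR scattergram legends."""
    confirms = lr_positive > confirm_threshold
    excludes = lr_negative < exclude_threshold
    if confirms and excludes:
        return "confirmation and exclusion"
    if confirms:
        return "confirmation only"
    if excludes:
        return "exclusion only"
    return "neither confirmation nor exclusion"


# Rounded summary (sensitivity, specificity) pairs taken from this review.
summary_estimates = {
    "T1-T2 vs T3-T4": (0.86, 0.90),
    "T1 vs T2": (0.85, 0.90),
    "T1a vs T1b": (0.87, 0.75),
    "N0 vs N+": (0.83, 0.67),
}

for comparison, (sens, spec) in summary_estimates.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{comparison}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, "
          f"quadrant = {lr_quadrant(lr_pos, lr_neg)}")
```

Comparing the printed ratios across the four settings shows why the T1a versus T1b and N‐stage comparisons are singled out: their lower specificities pull the positive LR well away from the confirmation threshold.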

Comparison with existing literature

Two systematic reviews (without meta‐analysis) and three meta‐analyses were published on this topic between 2001 and 2011. The two reviews, one dedicated to primary tumor depth and the other to lymph node status, concluded that EUS is a reliable imaging modality for staging tumor depth but not for defining lymph node status (Kwee 2007; Kwee 2009). These conclusions are similar to those we present here, although our work provides formal evidence to sustain these hypotheses as well as a quantification of the average performance of this endoscopic tool. This information, along with the Bayesian nomograms, enables clinicians to get a precise sense of the risk of making errors, both in terms of false‐positive and false‐negative predictions, while using EUS. This can ultimately help them optimize the therapeutic management of patients based on statistically‐estimated diagnostic accuracy parameters and not on dichotomous personal opinions (i.e. 'works' versus 'does not work') on EUS performance, such as those deriving from qualitative reviews.

The three meta‐analyses addressed both primary tumor depth staging and regional lymph node staging of gastric cancer with EUS (Kelly 2001; Mocellin 2011; Puli 2008). The reliability of the first two articles (Kelly 2001; Puli 2008) is undermined by the use of a statistical method, based on the Moses‐Littenberg model, that is no longer considered scientifically sound for the meta‐analysis of diagnostic accuracy studies (Harbord 2008; Leeflang 2008; Macaskill 2010; Reitsma 2005; Rutter 2001). Furthermore, the number of studies they included (13 and 22 respectively) was much lower than the number we retrieved and analyzed (n = 66). The third meta‐analysis (Mocellin 2011), which was conducted with modern statistics on 54 studies, reported results slightly better than those described here: this difference might be due to the lower number of studies included in that meta‐analysis and is in line with the above‐mentioned trend towards better results in earlier series.

In conclusion, by analyzing the data from the largest series ever considered, we found that the diagnostic accuracy of EUS can be considered clinically useful, although not optimal, to guide physicians in the locoregional staging of patients with gastric carcinoma. However, the heterogeneity of the results warrants some caution, as well as further investigation for the identification of factors influencing the outcome of this diagnostic tool. Physicians should also be warned that EUS performance is slightly lower in diagnosing superficial tumors (T1a versus T1b) and lymph node status (positive versus negative).

Summary of main results

The main results of our review, summarized in the Summary of findings Table, are the following:

  • By analyzing the data from 66 articles published from 1988 through 2012, we collected data on 7747 people with gastric cancer who were staged with endoscopic ultrasonography (EUS): this represents the largest series ever reported on this topic.

  • The meta‐analysis of 50 studies (n = 4397) showed that the summary sensitivity and specificity of EUS in discriminating T1 ‐ T2 (superficial) versus T3 ‐ T4 (advanced) gastric carcinomas were 0.86 (95% CI 0.81 to 0.90) and 0.90 (95% CI 0.87 to 0.93), respectively.

  • For the diagnostic capacity of EUS to distinguish T1 (early gastric cancer, EGC) versus T2 (muscle‐infiltrating) tumors, the meta‐analysis of 46 studies (n = 2742) showed that the summary sensitivity and specificity were 0.85 (95% CI 0.78 to 0.91) and 0.90 (95% CI 0.85 to 0.93) respectively. When we addressed the capacity to distinguish between T1a (mucosal) versus T1b (submucosal) cancers the meta‐analysis of 20 studies (n = 3321) showed that the summary sensitivity and specificity were 0.87 (95% CI 0.81 to 0.92) and 0.75 (95% CI 0.62 to 0.84) respectively.

  • For the metastatic involvement of lymph nodes (N‐stage), the meta‐analysis of 44 studies (n = 3573) showed that the summary sensitivity and specificity were 0.83 (95% CI 0.79 to 0.87) and 0.67 (95% CI 0.61 to 0.72) respectively.

  • Overall, EUS accuracy can be considered clinically useful to guide physicians in the locoregional staging of patients with gastric cancer.

  • However, between‐study heterogeneity was not negligible: unfortunately, we could not identify any consistent source of the observed heterogeneity, and thus all the results presented here must be interpreted cautiously.

  • Moreover, the analysis of positive and negative likelihood ratios revealed that EUS diagnostic performance cannot be considered optimal either for disease confirmation or for exclusion, especially for distinguishing T1a (mucosal) from T1b (submucosal) cancers and positive from negative lymph node status.

Strengths and weaknesses of the review

The main strength of this review is the number of patients enrolled (n = 7747), which is the highest ever reported and guarantees a good representation of the results obtained with this diagnostic tool worldwide. Moreover, we provide not only conventional meta‐analysis results, such as summary estimates of diagnostic performance measures, but also findings from additional analyses such as Bayesian analysis, which add further information of clinical use, including Fagan plots and likelihood ratio matrices. The main limitation of this review is that, despite the high number of patients enrolled, heterogeneity is remarkably high, which may partially undermine the reliability and reproducibility of most reported results. Furthermore, the data available in the literature did not allow identification of possible sources of heterogeneity.

Applicability of findings to the review question

The number of studies identified (66) and the number of patients enrolled (7747) were sufficient to address the review question, i.e., quantification of EUS diagnostic performance in the locoregional staging of gastric carcinoma. Patients enrolled, technical features of both index test and reference standard, and clinical settings were homogeneously suitable for our analysis across all studies. As expected in diagnostic test accuracy meta‐analysis, heterogeneity was a problem: unfortunately, we could not identify consistent sources of heterogeneity, which did not allow us to suggest factors potentially influencing the performance of this diagnostic tool.

Figure 1. Study flow diagram.

Figure 2. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Figure 3. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figure 4. Summary ROC Plot of studies assessing the accuracy of EUS in discriminating T1 ‐ T2 versus T3 ‐ T4 gastric carcinomas. Each study sensitivity/specificity value is represented by an empty circle. The summary point for sensitivity/specificity is represented by a black filled circle. Dotted closed line: 95% confidence region of the summary point. Dashed closed line: 95% prediction region.

Figure 5. Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1 ‐ T2 (rather than T3 ‐ T4) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).

Figure 6. EUS ability to discriminate between T1‐T2 and T3‐T4 gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).

Figure 7. Summary ROC Plot of 46 studies investigating the EUS ability to discriminate between T1 versus T2 gastric carcinomas.

Figure 8. Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1 (rather than T2) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).

Figure 9. EUS ability to discriminate between T1 and T2 gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).

Figure 10. Summary ROC Plot of 20 studies investigating the diagnostic ability of EUS to discriminate between T1a versus T1b tumors.

Figure 11. Fagan plot estimating how much the result of EUS changes the probability that a patient has a T1a (rather than T1b) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).

Figure 12. EUS ability to discriminate between T1a and T1b gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).

Figure 13. Summary ROC Plot of 44 studies addressing the issue of EUS ability to discriminate between lymph node negative (N0) and positive (N+) cases.

Figure 14. Fagan plot estimating how much the result of EUS changes the probability that a patient has an N+ (metastatic lymph nodes) rather than an N0 (disease‐free lymph nodes) gastric cancer, considering a given pre‐test probability (here the mean pre‐test probability found in eligible studies is shown as an example).

Figure 15. EUS ability to discriminate between N+ (metastatic lymph nodes) and N0 (disease free lymph nodes) gastric carcinomas. Likelihood ratio (LR) scattergram defining quadrants of informativeness based on desirable thresholds (positive LR>10, negative LR<0.1): left upper quadrant (test suitable both for diagnosis exclusion and confirmation), right upper (confirmation only), left lower (exclusion only), right lower (neither confirmation nor exclusion).

Test 1. T12 vs T34.

Test 2. T1 vs T2.

Test 3. T1a vs T1b.

Test 4. N0 vs N+.

Summary of findings Table

General information

| Item | Description |
| --- | --- |
| General issue | What is the diagnostic performance of endoscopic ultrasound (EUS) in assessing disease stage in people with gastric carcinoma? |
| Specific questions | What is the diagnostic performance of EUS in assessing primary tumor depth? Superficial (T1 ‐ T2) versus advanced (T3 ‐ T4) tumors; early (T1) versus muscular (T2) tumors; mucosal (T1a) versus submucosal (T1b) tumors. What is the diagnostic performance of EUS in assessing regional lymph node status? Non‐metastatic (N0) versus metastatic (N+) lymph nodes. |
| Patients | Patients diagnosed with gastric carcinoma |
| Settings | Pre‐treatment evaluation of disease stage |
| Index tests | Endoscopic ultrasound (EUS) |
| Reference standard | Histology of surgical or endoscopic specimen |
| Importance | Choosing best treatment or treatment sequence of gastric carcinoma |
| Studies | 66 studies enrolling 7747 patients |

Quality concerns

| Domain | Judgement |
| --- | --- |
| Overall judgement | Good quality |
| Applicability concerns | None |
| Patient selection bias | None |
| Index test interpretation bias | High risk: 5 studies |
| Reference test interpretation bias | None |
| Flow and timing selection bias | High risk: 2 studies; unclear risk: 2 studies |

Summary results and consequences

| Comparison | Studies (patients enrolled) | Sensitivity (95% CI) | Specificity (95% CI) | Hypothetical cohort of 1000 patients | Correctly classified | Overstaged | Understaged |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 ‐ T2 versus T3 ‐ T4 tumors | 50 (4397) | 0.86 (0.81 to 0.90) | 0.90 (0.87 to 0.93) | T1 ‐ T2 prevalence: 50% | 880 | 70 | 50 |
| T1 versus T2 tumors | 46 (2742) | 0.85 (0.78 to 0.91) | 0.90 (0.85 to 0.93) | T1 prevalence: 70% | 865 | 105 | 30 |
| T1a versus T1b tumors | 20 (3321) | 0.87 (0.81 to 0.92) | 0.75 (0.62 to 0.84) | T1a prevalence: 70% | 834 | 91 | 75 |
| N0 versus N+ tumors | 44 (3573) | 0.83 (0.79 to 0.87) | 0.67 (0.61 to 0.72) | N+ prevalence: 50% | 750 | 85 | 165 |
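
The 'Consequences' counts for the three T‐stage comparisons in the Summary of findings Table can be re‐derived from the prevalence, sensitivity, and specificity shown in the same rows. The sketch below is a minimal illustration with hypothetical function and key names; it assumes that the less advanced category (T1 ‐ T2, T1, or T1a) is the target condition, so that false negatives correspond to overstaged patients and false positives to understaged patients.

```python
def cohort_outcomes(cohort_size, prevalence, sensitivity, specificity):
    """Expected classification counts in a hypothetical cohort."""
    with_target = cohort_size * prevalence
    without_target = cohort_size - with_target
    true_positives = with_target * sensitivity
    true_negatives = without_target * specificity
    return {
        "correctly_classified": round(true_positives + true_negatives),
        # Target condition missed (overstaged under the stated assumption).
        "false_negatives": round(with_target - true_positives),
        # Target condition wrongly assigned (understaged under the stated assumption).
        "false_positives": round(without_target - true_negatives),
    }


# (prevalence of target condition, sensitivity, specificity) from the table rows.
t_stage_rows = {
    "T1-T2 vs T3-T4": (0.50, 0.86, 0.90),
    "T1 vs T2": (0.70, 0.85, 0.90),
    "T1a vs T1b": (0.70, 0.87, 0.75),
}

results = {
    name: cohort_outcomes(1000, prev, sens, spec)
    for name, (prev, sens, spec) in t_stage_rows.items()
}

for name, counts in results.items():
    print(name, counts)
```

Changing the prevalence argument shows how the balance between overstaging and understaging would shift in a setting with a different case mix.
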
Table 1. Subgroup and sensitivity analysis for T1 ‐ T2 versus T3 ‐ T4 gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 18 | 0.89 (0.83 to 0.95) | 0.90 (0.85 to 0.94) | 0.46 |
| | <100 | 32 | 0.83 (0.77 to 0.90) | 0.91 (0.87 to 0.94) | |
| Year of publication | 2000 or later | 32 | 0.81 (0.75 to 0.88) | 0.88 (0.84 to 0.91) | <0.01 |
| | before 2000 | 18 | 0.91 (0.87 to 0.96) | 0.94 (0.91 to 0.96) | |
| Country | Western | 27 | 0.82 (0.75 to 0.89) | 0.91 (0.87 to 0.94) | 0.27 |
| | Eastern | 23 | 0.89 (0.84 to 0.94) | 0.90 (0.86 to 0.94) | |
| EUS array | Radial | 39 | 0.88 (0.84 to 0.93) | 0.90 (0.87 to 0.93) | <0.01 |
| | Linear | 4 | 0.68 (0.40 to 0.96) | 0.86 (0.73 to 1.00) | |
| Tumor site | Cardia only | 6 | 0.70 (0.49 to 0.91) | 0.90 (0.82 to 0.98) | 0.12 |
| | Any site | 44 | 0.87 (0.83 to 0.91) | 0.90 (0.88 to 0.93) | |
| Quality | High | 45 | 0.86 (0.82 to 0.91) | 0.90 (0.88 to 0.93) | 0.59 |
| | Low | 5 | 0.78 (0.72 to 1.00) | 0.90 (0.81 to 0.99) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.
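
The P values in the subgroup tables come from a likelihood ratio test comparing the model with and without the covariate. The following is a minimal sketch of that calculation, not the review's actual code: it assumes that adding a covariate to the bivariate model introduces two extra parameters (one for sensitivity, one for specificity), so the test statistic is referred to a chi‐square distribution with 2 degrees of freedom, and the log‐likelihood values used are purely hypothetical.

```python
import math


def likelihood_ratio_test_df2(loglik_without, loglik_with):
    """Likelihood ratio test for nested models differing by two parameters.

    Assumption: 2 degrees of freedom, for which the chi-square survival
    function has the closed form exp(-x / 2).
    """
    # Nested models: the larger model cannot fit worse, so clamp at zero.
    statistic = max(0.0, 2.0 * (loglik_with - loglik_without))
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only (not from the review).
statistic, p_value = likelihood_ratio_test_df2(loglik_without=-210.0, loglik_with=-205.0)
print(f"LR statistic = {statistic:.2f}, P = {p_value:.4f}")
```

A small P value indicates that sensitivity, specificity, or both differ across the categories of the covariate; it does not indicate which of the two measures drives the difference.
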
Table 2. Subgroup and sensitivity analysis for T1 versus T2 gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 5 | 0.97 (0.93 to 1.00) | 0.73 (0.53 to 0.93) | 0.01 |
| | <100 | 41 | 0.81 (0.74 to 0.87) | 0.91 (0.88 to 0.95) | |
| Year of publication | 2000 or later | 29 | 0.82 (0.74 to 0.90) | 0.91 (0.87 to 0.96) | 0.51 |
| | before 2000 | 17 | 0.88 (0.80 to 0.96) | 0.87 (0.80 to 0.95) | |
| Country | Western | 22 | 0.71 (0.59 to 0.82) | 0.94 (0.91 to 0.97) | <0.01 |
| | Eastern | 24 | 0.92 (0.88 to 0.96) | 0.84 (0.77 to 0.91) | |
| EUS array | Radial | 37 | 0.85 (0.79 to 0.91) | 0.90 (0.85 to 0.94) | <0.01 |
| | Linear | 3 | 0.92 (0.77 to 1.00) | 0.98 (0.93 to 1.00) | |
| Tumor site | Cardia only | 5 | 0.91 (0.80 to 1.00) | 0.89 (0.77 to 1.00) | 0.59 |
| | Any site | 41 | 0.84 (0.77 to 0.90) | 0.90 (0.86 to 0.94) | |
| Quality | High | 41 | 0.85 (0.79 to 0.91) | 0.89 (0.85 to 0.93) | 0.39 |
| | Low | 5 | 0.79 (0.57 to 1.00) | 0.96 (0.90 to 1.00) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.
Table 3. Subgroup and sensitivity analysis for T1a versus T1b gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 8 | 0.90 (0.84 to 0.96) | 0.67 (0.50 to 0.85) | 0.49 |
| | <100 | 12 | 0.85 (0.76 to 0.93) | 0.79 (0.67 to 0.91) | |
| Year of publication | 2000 or later | 11 | 0.90 (0.84 to 0.95) | 0.70 (0.55 to 0.85) | 0.59 |
| | before 2000 | 9 | 0.84 (0.75 to 0.94) | 0.79 (0.66 to 0.93) | |
| Country | Western | 1 | 1.00 (0.40 to 1.00) | 1.00 (0.69 to 1.00) | N/A |
| | Eastern | 19 | 0.88 (0.81 to 0.92) | 0.73 (0.61 to 0.82) | |
| EUS array | Radial | 18 | 0.89 (0.82 to 0.93) | 0.72 (0.59 to 0.82) | N/A |
| | Linear | 1 | 0.50 (0.07 to 0.93) | 0.88 (0.64 to 0.99) | |
| Tumor site | Cardia region | 1 | 0.50 (0.07 to 0.93) | 0.88 (0.64 to 0.99) | N/A |
| | Any site | 19 | 0.88 (0.82 to 0.92) | 0.74 (0.61 to 0.83) | |
| Quality | High | 18 | 0.84 (0.79 to 0.89) | 0.75 (0.64 to 0.86) | 0.05 |
| | Low | 2 | 0.98 (0.96 to 1.00) | 0.64 (0.26 to 1.00) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.
Table 4. Subgroup and sensitivity analysis for N0 versus N+ gastric tumors

| Variable | Category | Studies | Sensitivity (95% CI) | Specificity (95% CI) | P value* |
| --- | --- | --- | --- | --- | --- |
| Sample size | >100 | 12 | 0.83 (0.77 to 0.89) | 0.65 (0.55 to 0.75) | 0.90 |
| | <100 | 32 | 0.84 (0.79 to 0.88) | 0.67 (0.61 to 0.74) | |
| Year of publication | 2000 or later | 28 | 0.83 (0.79 to 0.88) | 0.66 (0.59 to 0.73) | 0.95 |
| | before 2000 | 16 | 0.83 (0.76 to 0.89) | 0.68 (0.58 to 0.77) | |
| Country | Western | 24 | 0.80 (0.75 to 0.86) | 0.72 (0.66 to 0.79) | 0.04 |
| | Eastern | 20 | 0.86 (0.81 to 0.90) | 0.59 (0.51 to 0.68) | |
| EUS array | Radial | 35 | 0.84 (0.80 to 0.88) | 0.66 (0.60 to 0.73) | 0.11 |
| | Linear | 4 | 0.83 (0.69 to 0.98) | 0.66 (0.46 to 0.86) | |
| Tumor site | Cardia region | 6 | 0.86 (0.76 to 0.96) | 0.76 (0.63 to 0.88) | 0.27 |
| | Any site | 38 | 0.83 (0.79 to 0.87) | 0.65 (0.59 to 0.71) | |
| Quality | High | 41 | 0.83 (0.79 to 0.87) | 0.67 (0.62 to 0.73) | 0.57 |
| | Low | 3 | 0.88 (0.77 to 0.99) | 0.56 (0.32 to 0.80) | |

CI: confidence interval
EUS: endoscopic ultrasonography
*P values are from likelihood ratio test for model with and without the covariate, to identify diagnostic performance differences across variable categories.
Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 T12 vs T34 | 50 | 4397 |
| 2 T1 vs T2 | 46 | 2742 |
| 3 T1a vs T1b | 20 | 3321 |
| 4 N0 vs N+ | 44 | 3573 |