Screening tests for active pulmonary tuberculosis in children

Bryan Vonasek; Tara Ness; Yemisi Takwoingi; Alexander W Kay; Susanna S van Wyk; Lara Ouellette; Ben J Marais; Karen R Steingart<sup>a</sup>; Anna M Mandalakas<sup>a</sup>

doi:10.1002/14651858.CD013693

Version published: 28 July 2020

Abstract

Objectives

This is a protocol for a Cochrane Review (diagnostic). The objectives are as follows:

To determine the sensitivity, specificity, and positive and negative predictive value of 1) the presence of one or more tuberculosis symptoms, or symptom combinations; 2) chest radiography; 3) Xpert MTB/RIF; 4) Xpert Ultra; and 5) combinations of the aforementioned tests as screening tests for detecting active pulmonary tuberculosis in children in the following groups.

Household contacts of a person with active tuberculosis;
School contacts of a person with active tuberculosis;
Other close contacts of a person with active tuberculosis;
Children living with HIV;
Children with pneumonia;
Other risk groups (e.g. children with a history of previous tuberculosis, malnourished children);
Children in the general population in high burden settings

Secondary objectives

To compare the accuracy of the different index tests, including different applications of tests (e.g. CXR with any abnormality versus, more specifically, CXR with abnormality suggestive of tuberculosis); we are interested in the accuracy of the index tests in any setting (i.e. community, outpatient, and inpatient).

To investigate potential sources of heterogeneity in accuracy estimates in relation to age group, HIV status, whether the study was conducted in a high tuberculosis burden country, and whether the child received a single screening or more than one screening.

Background

Tuberculosis continues to elude traditional control strategies. According to the WHO Global TB Report 2019, an estimated 10 million people fell ill with tuberculosis worldwide in 2018. Of these, over 25% were not diagnosed or reported to the World Health Organization (WHO). Children less than 15 years old represented approximately 11% of incident cases but 14% of the estimated 1.3 million deaths from tuberculosis in 2018. This relatively higher share of mortality in children highlights the urgent need for improved case detection and subsequent access to treatment in this age group (WHO Global TB Report 2019).

Case finding is a crucial step in the cascade of care for patients with tuberculosis; however, for most deaths from tuberculosis in children, the disease is never diagnosed (Jenkins 2017). In the "Roadmap towards ending TB in children and adolescents," the WHO identifies case finding for child tuberculosis as a key activity (WHO 2018). Major factors leading to underdiagnosis of tuberculosis in children include the following: 1) symptoms tend to be less specific in children and overlap with those of other common childhood diseases; 2) existing tests for children are invasive and have suboptimal sensitivity; ideally, tests need to be inexpensive, accessible, and usable at the point of care, allowing for actionable information for patient care; and 3) reliance on a clinical diagnosis of tuberculosis, without microbiological evidence of disease, requires expertise, which is often not available in areas where the burden of disease is greatest. Given these factors, national and international guidelines for child health generally lack systematic screening strategies for tuberculosis (WHO 2018).

For adult populations, systematic screening for tuberculosis in high‐risk groups and vulnerable populations is a more established strategy to improve case detection in high burden settings. In 2013, the WHO published “Systematic screening for active tuberculosis: principles and recommendations.” This document provided guidance for the development of screening approaches for adult populations (WHO 2013a). A Cochrane Protocol (van’t Hoog 2014) and an ensuing non‐Cochrane systematic review (van’t Hoog 2013) contributed to the WHO recommendations (WHO 2013a). Participants included in the systematic review were adults 15 years and older. The review excluded studies of children 0 to 5 years of age or studies of child tuberculosis only. Since 2013, estimation of the true burden of child tuberculosis has improved and several promising strategies for case finding are being either newly implemented or developed (Schumacher 2019; Stop TB Partnership 2019). With this, there is a new call to push forward systematic screening for active tuberculosis in children (Reuter 2019; WHO 2018). This review will address tuberculosis screening strategies in children less than 15 years of age.

Screening

Tuberculosis screening is a term that has been used differently in the literature depending on the context. We have adopted the definition of tuberculosis screening from the WHO as "the systematic identification of people with suspected active TB, in a predetermined target group, using tests, examinations or other procedures that can be applied rapidly" (WHO 2013a; WHO 2015). The WHO's more recent End‐TB strategy emphasizes early diagnosis of tuberculosis and systematic screening of contacts and high‐risk groups (WHO 2018), which is in line with the above definition of tuberculosis screening.

Target condition being diagnosed

Tuberculosis is a communicable disease caused by the bacterium Mycobacterium tuberculosis (M tuberculosis). A small fraction of those with tuberculosis infection initially develop active tuberculosis (tuberculosis disease). More commonly, initial infection leads to latent tuberculosis infection, which has the potential to become active tuberculosis throughout an individual’s lifetime, especially during states of immunosuppression such as HIV infection and malnutrition. M tuberculosis is transmitted from person to person through the air and, therefore, most commonly causes disease in the lungs, referred to as pulmonary tuberculosis. Tuberculosis can, however, occur in any organ or tissue outside of the lungs (termed extrapulmonary tuberculosis), with lymph node tuberculosis as the most common form and tuberculous meningitis as the most severe form of extrapulmonary disease. As the most common form of active tuberculosis is lung disease, most screening studies in adults and children evaluate tests and strategies for pulmonary tuberculosis and verify tuberculosis using respiratory specimens. In this review, the target condition will be pulmonary tuberculosis.

Signs and symptoms of pulmonary tuberculosis include fever, cough, night sweats, weight loss or poor weight gain, visible neck mass and lethargy or decreased activity/playfulness. However, pulmonary tuberculosis symptoms in children, especially those under five years of age, tend to be less specific because they often overlap with other common paediatric conditions such as pneumonia, HIV‐associated lung disease, and malnutrition (Jaganath 2012; Oliwa 2015). Compared to adults, children are much more likely to progress from latent tuberculosis infection to tuberculosis disease. Further, among those progressing to disease, younger children are more likely to experience severe manifestations (Marais 2004; Marais 2014).

Microbiological confirmation of pulmonary tuberculosis in children is complicated by two main factors. First, younger children are not able to voluntarily expectorate sputum, which is the standard specimen used for microbiological detection of pulmonary tuberculosis in adults. Therefore, specimens from young children are traditionally collected by more invasive methods such as gastric aspiration and sputum induction (Graham 2015). Second, lung cavities with high bacillary load, as seen in adult pulmonary tuberculosis, are uncommon in children, especially in young children (< 10 years of age). The number of bacilli causing disease in children tends to be low and the 'paucibacillary' nature of their disease compromises diagnostic yield (Dunn 2016).

Index test(s)

This review includes the following index tests used in screening for pulmonary tuberculosis in children: tuberculosis symptoms, chest radiography (CXR), Xpert MTB/RIF and Xpert Ultra, and various combinations of these tests.

With symptom screening, individuals or their caregivers are interviewed about symptoms suggestive of pulmonary tuberculosis such as cough of varying duration, fever of varying duration, weight loss or poor weight gain, night sweats, visible neck mass, and decreased activity. Though not a true symptom, contact with individuals with tuberculosis is another important factor when interviewing for tuberculosis risk (Graham 2015).

CXR can involve posterior‐anterior, anterior‐posterior, and/or lateral recording. Commonly used types of CXR include conventional CXR (producing 36 cm x 43 cm film), digital radiography, and computed radiography. The most common radiographic finding of pulmonary tuberculosis in children is hilar lymphadenopathy (Leung 1992), though CXR has limitations identifying this finding (Swingler 2005). Accurate interpretation of CXR findings for pulmonary tuberculosis in children is dependent on the ability of the individual interpreting the CXR, and wide interobserver variation has been reported (Du 2002; Kaguthi 2014). Computer‐aided interpretation of CXR for pulmonary tuberculosis is a promising new technology, especially for resource‐limited settings where expertise in CXR interpretation is limited (Qin 2019; Sodhi 2017), but this technology has currently not been assessed in children (Reuter 2019).

Xpert MTB/RIF and Xpert Ultra, the newest version, (Cepheid Inc, CA, USA) are nucleic acid amplification tests (NAAT) that can detect both M tuberculosis DNA and rifampicin resistance. We will not assess rifampicin resistance in this review. These two assays are completely automated and self‐contained once the sample is loaded into the cartridge. Specimen processing is similar for both Xpert MTB/RIF and Xpert Ultra using Xpert Sample Reagent and requires 15 minutes of incubation. Within two hours, results are available. Consistent supply of electricity, temperature control, and annual calibration of the cartridge modules are needed for Xpert MTB/RIF and Xpert Ultra (Global Laboratory Initiative 2019). Xpert Ultra has approximately 1‐log improvement in the lower limit of detection of bacterial load compared to previous generations of the test (Chakravorty 2017). Xpert Ultra also has a new result category, 'trace call,' that represents minimally detectable bacillary burden. According to the WHO, a 'trace call' result is adequate to prompt initiation of anti‐tuberculosis therapy in children or people living with HIV (WHO 2017b). The WHO recommends the use of Xpert MTB/RIF and Xpert Ultra as initial diagnostic tests for pulmonary tuberculosis in adults and children. Specifically in children, the guidelines recommend use of a variety of specimen types for diagnosis including gastric specimens, nasopharyngeal specimens, and stool specimens, in addition to sputum (WHO 2020). We will include Xpert MTB/RIF (all versions) and Xpert Ultra in this review.

Another WHO‐recommended NAAT for detection of tuberculosis is the TrueNat assay (Molbio Diagnostics/Bigtec Labs, Goa/Bengaluru, India) (WHO 2020). However, to our knowledge, there are currently no published studies assessing this test in children. We plan to include TrueNat in this review if data become available while we perform the review.

Clinical pathway

As shown in Figure 1, there are two complementary approaches to detection of tuberculosis disease. The first is the patient‐initiated pathway, also known as passive case finding. The second is the provider‐initiated screening or active case finding pathway (WHO 2015), which is the analytic framework for this review. One major challenge with either pathway is that 'high quality diagnosis' is elusive for child tuberculosis, especially for younger children and children in resource‐limited settings. This diagram also demonstrates the wide range of potential target populations for tuberculosis screening in children, ranging from contacts of those with tuberculosis ('exposed') to symptomatic patients accessing healthcare (e.g. children living with HIV, as described above). This review will include evidence from all of these systematic screening strategies.

Figure 1

There are two complementary approaches to detection of tuberculosis disease. The first is the patient‐initiated pathway, also known as passive case finding. The second is the provider‐initiated screening pathway (WHO 2015), which is the analytic framework for this review. One major challenge with either pathway is that 'high quality diagnosis' is elusive for child tuberculosis, especially for younger children and in resource‐limited settings. This diagram also demonstrates the wide range of potential target populations for tuberculosis screening, ranging from contacts of those with tuberculosis ('exposed') to symptomatic patients accessing healthcare, such as children living with HIV. Copyright © [2015] [World Health Organization]: reproduced with permission.

There is no standard screening approach for children less than 15 years old. For the subgroup of children living with HIV, since 2011 the WHO has recommended symptom screening for all children living with HIV presenting to healthcare facilities. Under this guideline, children living with HIV older than 12 months of age presenting with any cough, fever, weight loss or poor weight gain, or history of contact with someone with tuberculosis should be further investigated for tuberculosis. In the absence of any of these four symptoms, they "are unlikely to have active TB." Although this 'strong recommendation' was based upon 'low quality evidence' (WHO 2011), it exemplifies a standardized screening approach for tuberculosis that is otherwise lacking for the paediatric population.
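The four-item screen described above can be written as a simple decision rule. The following is a minimal sketch only; the class and field names are illustrative assumptions and do not come from the WHO guideline, and the handling of children aged 12 months or younger (raising an error) is a choice made for this example because the rule as quoted does not cover them.

```python
from dataclasses import dataclass


@dataclass
class SymptomScreen:
    """Caregiver-reported screen items (hypothetical field names)."""
    age_months: int
    cough: bool = False
    fever: bool = False
    poor_weight_gain: bool = False  # weight loss or poor weight gain
    tb_contact: bool = False        # history of contact with someone with tuberculosis


def needs_tb_investigation(screen: SymptomScreen) -> bool:
    """Return True when any of the four screen items is present.

    The rule as described applies to children living with HIV who are
    older than 12 months; younger children are outside its scope here.
    """
    if screen.age_months <= 12:
        raise ValueError("rule described only for children older than 12 months")
    return any((screen.cough, screen.fever,
                screen.poor_weight_gain, screen.tb_contact))


if __name__ == "__main__":
    child = SymptomScreen(age_months=36, fever=True)
    print(needs_tb_investigation(child))
```

A child with none of the four items returns False, which corresponds to the guideline's "unlikely to have active TB" category.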

Screening may use sequential or parallel strategies (Figure 2). With sequential strategies, only those with a positive result in the first step are screened in the second step. With parallel screening strategies, multiple different screens are done initially, and any positive screen or combinations of positive screens prompts further investigation (i.e. confirmatory test) for the target condition. Results from various screening strategies will be included in this review. We will consider individuals’ results to be ‘true screen positives’ if they were rightfully referred for confirmatory testing; in contrast, we will consider individuals’ results to be ‘false screen positives’ if the individuals were referred for confirmatory testing but not diagnosed with tuberculosis. Although individuals with negative screens should not undergo confirmatory testing during routine clinical practice, individuals with negative screens may complete confirmatory testing in a research context to establish true screen negatives and false screen negatives. As described in Types of studies, studies that only conduct confirmatory testing on those with positive screens will be analysed in this review. In the context of this review, the purpose of the index tests is considered to be 'screening', and their role is considered to be triage tests. With triage tests, the index test is used prior to an existing test or strategy, and only those with a specific result on the triage test continue along the clinical pathway (Bossuyt 2006).
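The difference between the two strategies can be illustrated with a short sketch. It assumes each screen yields a simple positive/negative result and, for the parallel case, uses the "any positive screen" rule; the function names are hypothetical and chosen for this example.

```python
from typing import Optional, Sequence


def parallel_screen(results: Sequence[bool]) -> bool:
    """All screens are applied up front; any positive prompts confirmatory testing."""
    return any(results)


def sequential_screen(first: bool, second: Optional[bool]) -> bool:
    """The second screen is applied only to those positive on the first.

    `second` may be None when the first screen was negative, because the
    second screen is then never performed.
    """
    if not first:
        return False
    if second is None:
        raise ValueError("second screen result required after a positive first screen")
    return second


if __name__ == "__main__":
    # Symptom screen negative, CXR positive (hypothetical child).
    print(parallel_screen([False, True]))   # referred under the parallel strategy
    print(sequential_screen(False, None))   # not referred under the sequential strategy
```

The same child can therefore be referred under one strategy and not the other, which is why accuracy is estimated per strategy rather than per test alone.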

Figure 2

Different screening and diagnostic algorithms

Alternative test(s)

Two types of immunologic tests not included in this review are the tuberculin skin test (TST) and the interferon gamma release assay (IGRA). Both of these methods are dependent on the cellular immune response to M tuberculosis antigens in individuals previously exposed to the organism, and neither can distinguish between latent tuberculosis infection and active tuberculosis disease (Pai 2014). Further, neither method is sensitive enough to serve as a rule out test for tuberculosis disease in children but rather may be used to support tuberculosis diagnosis. The TST has been in clinical use for over a century and involves intradermal injection of M tuberculosis purified protein derivative. Drawbacks to the TST include the need for a second clinical encounter 48 to 72 hours after placement for result interpretation, inter‐reader variability, a tendency for previous bacillus Calmette‐Guerin vaccination to result in false‐positive results, and a tendency for false‐negative results in immunosuppressed individuals or due to anergy in individuals with active disease (Pai 2014).

Commercially available IGRAs include QuantiFERON‐TB Gold In‐tube (QFT‐GIT; Qiagen, Germantown, MD), QuantiFERON‐TB Gold Plus (QFT‐Plus; Qiagen) and T‐SPOT.TB (Oxford Immunotec Ltd, Oxford, United Kingdom). To improve upon the TST, IGRAs were developed to measure release of interferon gamma from T cells stimulated by antigens specific to M tuberculosis. The QFT‐GIT assay stimulates interferon gamma release from CD4+ T cells, while the QFT‐Plus assay can stimulate both CD4+ and CD8+ T cell responses. CD8+ cytotoxic T cells have been shown to have higher responses in subjects with active pulmonary tuberculosis compared to latent tuberculosis infection (Day 2011; Rozot 2013). Individuals with low CD4+ T cell counts (e.g. those with advanced HIV) have been shown to maintain CD8+ T cell antigen responses to M tuberculosis (Sutherland 2010). For these reasons, it is theorized that the QFT‐Plus assay may be more sensitive for those with active tuberculosis and people living with HIV (Theel 2018). The T‐SPOT.TB is an enzyme‐linked immunoassay that involves incubation of separated and counted peripheral blood mononuclear cells with antigens specific to M tuberculosis. If the number of interferon gamma‐producing T cells (spot‐forming cells) exceeds a specific threshold relative to negative control wells, the result is positive. All IGRAs utilize positive and negative controls, and they can have indeterminate results if there is a low interferon gamma response in the positive control or if there is a high response in the negative control (Pai 2014).

Beyond the index tests described above, there are a number of alternative approaches and tests for screening (or diagnosis) of tuberculosis in children. Prior to the widespread roll out of Xpert MTB/RIF worldwide over the past decade, testing for tuberculosis mostly relied upon examination of sputum smears for acid‐fast bacilli under a light microscope using the classical Ziehl‐Neelsen staining technique, fluorescence microscopy, or light‐emitting diode (LED)‐based fluorescence microscopy. A recent review found that in children, the sensitivity of smear microscopy was about 22% in gastric specimens and about 29% in expectorated and induced sputum specimens (WHO 2013b).

A variety of newer assays detect lipoarabinomannan (LAM) antigen in urine of people with tuberculosis disease. LAM is a lipopolysaccharide present in the mycobacterial cell wall. Urine LAM assays have the advantages of being non‐invasive and rapid. Currently, the only rapid commercially available LAM assay is the Determine TB‐LAM (Alere, Waltham, MA, USA). Based on evidence from randomized trials and a Cochrane Review (Bjerrum 2019), the WHO recommends that AlereLAM should be used to assist in the diagnosis of active tuberculosis in HIV‐positive adults, adolescents, and children. The full recommendations, which differ for inpatients and outpatients, are described here: WHO Lateral flow LAM 2019. Another LAM assay expected to become commercially available in 2020 is the Fujifilm SILVAMP TB‐LAM (Fujifilm, Tokyo, Japan). Early evidence for this assay demonstrates superior sensitivity compared to AlereLAM for adults living with HIV (Bjerrum 2020, Broger 2019).

The development of novel tools for detection of tuberculosis disease is a very active field. Noteworthy tests with emerging evidence include C‐reactive protein (Albuquerque 2019), IP‐10 (Alsleben 2011; Holm 2014; Jenum 2016; Sudbury 2019; Tebruegge 2015), and C‐Tb (Statens Serum Institut, Copenhagen) (Aggerbeck 2019; Ruhwald 2017). Over the next decade, more efficient technologies are anticipated with the hope that these will advance screening strategies and reduce the burden of child tuberculosis worldwide (Schumacher 2019; Stop TB Partnership 2019; WHO 2017a).

Rationale

Effective screening for active tuberculosis in children supports timely and reliable diagnosis, which is essential for reducing tuberculosis‐attributable morbidity and mortality. Effective screening also supports disease rule out, thereby guiding treatment for latent tuberculosis infection and preventive treatment for exposed or other high‐risk groups such as people living with HIV. Historically, screening children for active tuberculosis has been limited by the lack of accurate screening and diagnostic tools. Therefore, systematic screening in children has only been performed within specific populations with increased risk of disease to limit the risk of false‐positive test errors and consequent over‐treatment of tuberculosis. Guidance from the WHO states that "only children who are close contacts of someone with tuberculosis and HIV‐positive children should be systematically screened for TB" (WHO 2015). Optimal screening strategies for these two high‐risk groups are lacking, particularly in resource‐limited settings (Szkwarko 2017; WHO 2011). Further, limiting systematic screening to child contacts and HIV‐positive children propagates missed opportunities as evidence has identified other high‐risk groups of children in certain settings and with health conditions, such as malnutrition or pneumonia, who are also at risk of tuberculosis (Arscott‐Mills 2014; Chisti 2014; LaCourse 2014; Munthali 2017; Oliwa 2015). Finally, increasing evidence demonstrates that children have considerable risk of tuberculosis exposure outside of their homes with up to 70% to 90% of children with tuberculosis having no known exposure (Martinez 2019).

This Cochrane Review will inform an upcoming WHO meeting to update guidelines for systematic screening for active tuberculosis. To our knowledge, this is the first systematic review on this topic in children. There have been several systematic reviews evaluating the diagnostic accuracy of the index tests described above for active tuberculosis, including an ongoing Cochrane Review evaluating Xpert MTB/RIF and Xpert Ultra in children (Kay 2019). The lack of knowledge regarding the performance of screening tests in children likely reflects the predominance of paediatric research which has assessed the performance of tuberculosis tests for diagnosis rather than screening. The current review will shed light on the potential of these tools for systematic screening for active pulmonary tuberculosis in children in specific high‐risk populations.

Objectives

Household contacts of a person with active tuberculosis;
School contacts of a person with active tuberculosis;
Other close contacts of a person with active tuberculosis;
Children living with HIV;
Children with pneumonia;
Other risk groups (e.g. children with a history of previous tuberculosis, malnourished children);
Children in the general population in high burden settings

Secondary objectives

Methods

Criteria for considering studies for this review

Types of studies

We will include cross‐sectional studies, cohort studies, and randomized controlled trials that assessed the accuracy of at least one of the defined index tests for pulmonary tuberculosis. We will only include studies that used a microbiologic reference standard (defined below). We will include studies from all settings and time periods. Randomized controlled trials will be included because these studies may report sensitivity and specificity in addition to patient health outcomes. For randomized studies that compare different screening strategies, we will evaluate each arm as a separate cohort. Data on the index test(s) must be available to be extracted as true positive, false positive, true negative, and false negative against the reference standard(s) so that we can construct two‐by‐two contingency tables.
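As an illustration of how the accuracy measures named in the objectives follow from a two‐by‐two table, the sketch below computes them from the four cell counts. The class name and the example counts are hypothetical and are not drawn from any included study; the sketch does not guard against empty rows or columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the reference standard."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


if __name__ == "__main__":
    table = TwoByTwo(tp=18, fp=40, fn=2, tn=340)  # made-up counts
    print(f"sensitivity={table.sensitivity():.3f}")
    print(f"specificity={table.specificity():.3f}")
    print(f"PPV={table.ppv():.3f}  NPV={table.npv():.3f}")
```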

Studies applying index tests multiple times to an individual within a short timeframe (e.g. within a single hospital admission) will be considered diagnostic rather than using a screening approach, and these studies will be excluded. Studies in which children with negative screening test results were not subjected to the reference standard will be included. As shown in Figure 2, this often occurs in tuberculosis prevalence studies when it is assumed that those with a negative screen (e.g. no CXR abnormalities) do not have active tuberculosis. However, this leads to poor enumeration of true negative and false negative test results. Therefore, we will only include such studies that partially verified tuberculosis status if they were conducted prospectively and enrolled a consecutive or random sample of eligible children. Our rationale for specifying these strict design criteria is to enable us to calculate positive predictive values from such studies and include them in a separate set of analyses. Due to the direct relationship between prevalence and predictive values, to reduce potential variation in prevalence between studies, we will only include studies in the analyses of positive predictive value if the studies were done in the same setting (e.g. community or accessing healthcare), target population, and for the same purpose (e.g. contact tracing).
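The dependence of predictive values on prevalence can be made concrete with Bayes' theorem. The sketch below holds sensitivity and specificity fixed and varies prevalence; the function name and the input values are illustrative assumptions, not estimates from the literature.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float,
                        prevalence: float) -> float:
    """Positive predictive value for a given prevalence (Bayes' theorem)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


if __name__ == "__main__":
    # Same hypothetical test (90% sensitive, 90% specific) in two settings.
    for prevalence in (0.01, 0.10):
        ppv = ppv_from_prevalence(0.9, 0.9, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

With these inputs the same test gives a markedly lower positive predictive value in the lower-prevalence setting, which is the reason for restricting pooled analyses of positive predictive value to studies from comparable settings and populations.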

We will include cohort studies with children with active tuberculosis identified after the time point that the screening test was applied. Especially with studies performed in settings of intended use, the collection of specimens and conduct of the reference standard may occur some time after the screening test was done. In low‐resource settings, this process may take weeks. However, a longer time between the index test and the reference standard would make us less sure that the target condition did not change between the two tests. We will address this issue in the QUADAS‐2 domain 4: Flow and timing and with a sensitivity analysis (see Sensitivity analyses).

We will exclude case reports and case‐control studies, the latter because of the high risk of bias in diagnostic accuracy studies (Rutjes 2006).

Participants

We will include studies enrolling HIV‐positive and HIV‐negative children younger than 15 years old not known to have active tuberculosis prior to screening. We will exclude studies if they include older children and we are unable to extract data for children younger than 15 years old from the publication. We will include children in the general population in high burden settings, children living in areas of high tuberculosis burden, and high‐risk groups, including children younger than five years old; children living with HIV; children with recent exposure to a person with active tuberculosis; and household, school, or other contacts of a person with active tuberculosis. We will include studies in which children are screened only once and studies that report longitudinal screening with repeated screening tests at predetermined intervals.

Index tests

For symptom screening, we will include studies that assess symptoms of tuberculosis or combinations of symptoms as described by the primary study authors. Symptoms of active tuberculosis in children may include cough, fever, decreased appetite, weight loss or failure to thrive, and fatigue or reduced playfulness. Older children may experience symptoms similar to those in adults, including persistent cough, haemoptysis, weight loss, fever, night sweats, and fatigue. The threshold will be presence or absence of symptoms.

For CXR screening, we will include studies that utilize conventional radiography, digital radiography, and computed radiography. We will include all classification systems for identification of CXR abnormalities, including automated interpretation of radiography using deep learning or artificial intelligence technology. We will use an author‐defined threshold for CXR results; essentially, this is an implicit threshold utilized by the CXR reader. We will categorize all CXR screening results as follows.

Normal.
Any CXR abnormality, i.e. abnormalities suggestive of tuberculosis and other abnormalities.
Abnormalities suggestive of tuberculosis.
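The categories above imply two possible positivity thresholds for CXR. The sketch below shows one way a reader-level CXR reading could be dichotomized under each; the three reading levels and the threshold labels are assumptions made for illustration, since primary studies may record readings differently.

```python
from enum import Enum


class CxrReading(Enum):
    """Reader-level CXR result (hypothetical coding)."""
    NORMAL = "normal"
    OTHER_ABNORMALITY = "abnormal, not suggestive of tuberculosis"
    SUGGESTIVE_OF_TB = "abnormality suggestive of tuberculosis"


def cxr_screen_positive(reading: CxrReading, threshold: str) -> bool:
    """Dichotomize a reading under one of two thresholds.

    'any_abnormality'  -> positive for any non-normal reading
    'suggestive_of_tb' -> positive only for readings suggestive of tuberculosis
    """
    if threshold == "any_abnormality":
        return reading is not CxrReading.NORMAL
    if threshold == "suggestive_of_tb":
        return reading is CxrReading.SUGGESTIVE_OF_TB
    raise ValueError(f"unknown threshold: {threshold!r}")


if __name__ == "__main__":
    reading = CxrReading.OTHER_ABNORMALITY
    print(cxr_screen_positive(reading, "any_abnormality"))
    print(cxr_screen_positive(reading, "suggestive_of_tb"))
```

The broader threshold would be expected to trade specificity for sensitivity, which is the comparison named in the secondary objectives.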

For Xpert MTB/RIF and Xpert Ultra, we will include studies in which the index tests are evaluated in expectorated or induced sputum, gastric aspirate specimens, nasopharyngeal aspirate specimens, and bronchoalveolar lavage specimens. Tuberculosis bacilli in sputum can be swallowed and detected in stool so we will also include studies assessing stool specimens. We will include studies assessing more than one type of respiratory specimen collected at the same time and extract 2 x 2 data separately for each specimen type.

Xpert MTB/RIF provides the following printed test results:

MTB (M tuberculosis) DETECTED; Rif (rifampicin) resistance DETECTED;
MTB DETECTED; Rif resistance NOT DETECTED;
MTB DETECTED; Rif resistance INDETERMINATE;
MTB NOT DETECTED;
INVALID (the presence or absence of MTB cannot be determined);
ERROR (the presence or absence of MTB cannot be determined);
NO RESULT (the presence or absence of MTB cannot be determined).

Xpert Ultra also gives the following semi‐quantitative classifications of M tuberculosis bacterial burden from the sample: trace, very low, low, moderate, and high. For this review, Xpert MTB/RIF and Xpert Ultra results will be categorized as:

Positive: 'MTB DETECTED,' including 'trace' results from Xpert Ultra
Negative: 'MTB NOT DETECTED'
Inconclusive: 'INVALID,' 'ERROR,' or 'NO RESULT'

We will not evaluate rifampicin resistance in this review.
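The mapping from printed results to the three review categories can be sketched as follows. This is an illustrative helper, not part of any Cepheid software; in particular, the exact wording of a trace result on the printout (written here as "MTB TRACE DETECTED") is an assumption, and the rifampicin part of the result is deliberately ignored.

```python
INCONCLUSIVE_RESULTS = {"INVALID", "ERROR", "NO RESULT"}


def categorize_xpert(printed_result: str) -> str:
    """Map a printed Xpert result to 'positive', 'negative', or 'inconclusive'."""
    # Keep only the MTB part; anything after ';' concerns rifampicin resistance.
    mtb_part = printed_result.split(";")[0].strip().upper()
    if mtb_part in INCONCLUSIVE_RESULTS:
        return "inconclusive"
    if mtb_part == "MTB NOT DETECTED":
        return "negative"
    if mtb_part.startswith("MTB") and mtb_part.endswith("DETECTED"):
        # Covers "MTB DETECTED" and the assumed trace wording "MTB TRACE DETECTED".
        return "positive"
    raise ValueError(f"unrecognized result: {printed_result!r}")


if __name__ == "__main__":
    for result in ("MTB DETECTED; Rif resistance NOT DETECTED",
                   "MTB NOT DETECTED",
                   "ERROR"):
        print(result, "->", categorize_xpert(result))
```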

As shown in Figure 2, with two parallel screening tests, a parallel strategy screen will be positive if any individual component of the strategy is positive, and negative if all individual components are negative. For studies assessing parallel screening tests, if data for the individual components of the parallel strategy against the reference standard are also available, we will also extract these data for analysis.

Target conditions

The target condition is active pulmonary tuberculosis.

We anticipate that some studies may evaluate the index tests for active tuberculosis and not explicitly state 'pulmonary tuberculosis', the target condition in this review. We will include these studies because the most common type of active tuberculosis in children is lung disease; hence, most screening studies in children evaluate tests for pulmonary tuberculosis and diagnose tuberculosis using respiratory specimens. If data are sufficient, we will perform a sensitivity analysis limiting inclusion to those studies that explicitly evaluated the index tests for pulmonary tuberculosis.

Reference standards

We will utilize two reference standards, a microbiological and a composite reference standard.

Microbiological reference standard

Confirmed pulmonary tuberculosis will be defined as a positive culture (on solid or liquid medium) or a positive Xpert MTB/RIF or Xpert Ultra test from a respiratory specimen. When either the Xpert MTB/RIF or Xpert Ultra test is the index test, we will not include the test as a reference standard to avoid incorporation bias. We will not include studies where sputum smear microscopy is the reference standard.
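One possible reading of this definition is sketched below: a child is reference-standard positive if any accepted test is positive, with the index test itself removed from the set of accepted tests. The test names, the dictionary layout, and the decision to return None when no accepted test was performed are all assumptions made for this example.

```python
from typing import Dict, Optional

# Tests accepted for microbiological confirmation; smear microscopy is not among them.
ACCEPTED_TESTS = {"culture", "xpert_mtb_rif", "xpert_ultra"}


def microbiological_reference(results: Dict[str, Optional[bool]],
                              index_test: str) -> Optional[bool]:
    """Classify a child against the microbiological reference standard.

    `results` maps a test name to True/False, or None if the test was not done.
    The index test is excluded so that it cannot confirm itself
    (avoiding incorporation bias). Returns None if nothing usable remains.
    """
    usable = {
        name: value
        for name, value in results.items()
        if name in ACCEPTED_TESTS and name != index_test and value is not None
    }
    if not usable:
        return None
    return any(usable.values())


if __name__ == "__main__":
    child = {"culture": False, "xpert_ultra": True}
    print(microbiological_reference(child, index_test="xpert_ultra"))
    print(microbiological_reference(child, index_test="symptoms"))
```

In the example, the same child is reference-standard negative when Xpert Ultra is the index test and positive when symptom screening is the index test.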

Collection of multiple respiratory specimens may improve the diagnostic yield of testing for tuberculosis in children (Cruz 2012; Zar 2012). With respect to the microbiological reference standard, we will include studies that involve multiple specimens collected over time. In these studies, we will utilize the classification of the reference standard as defined by the primary study authors (most commonly at least one positive result representing a positive reference test).

Composite reference standard

We will define the composite reference standard as microbiological confirmation (as above) or author‐defined clinical pulmonary tuberculosis. Clinical pulmonary tuberculosis must include a component of follow‐up to verify the diagnosis of active tuberculosis. The consensus research definition of clinical tuberculosis in children (Graham 2015) is likely too restrictive for the purpose of this review. Two of our index tests, symptoms and CXR, are typically components of case definitions used to support the clinical diagnosis of tuberculosis (i.e. not microbiologically confirmed). This raises the potential for incorporation bias with the composite reference standard, i.e. where the result of the index test is used to help determine the reference standard diagnosis. Therefore, we will assess the composite reference standard for incorporation bias using QUADAS‐2, which will be enhanced with the following additional signalling question: "Was incorporation bias avoided (inclusion of index test as part of the reference standard)?" In addition, we will discuss incorporation bias as a limitation of the review.

We will define 'not tuberculosis' as negative microbiological test results and establishment of alternative diagnosis during the evaluation for tuberculosis, resolution of symptoms without tuberculosis treatment, or no progression of symptoms for at least one month without tuberculosis treatment.

Search methods for identification of studies

We will attempt to identify all relevant published studies regardless of language. We will describe unpublished studies in the Ongoing studies section of the review. Although they will not be assessed as index tests in this review, we will include immunologic tests in the search strategy. This will allow for archiving of relevant studies for a future systematic review assessing immunologic tests as index tests.

Electronic searches

We will search the following databases without language restriction, using the search terms and strategy described in Appendix 1.

MEDLINE and MEDLINE in Process (OVID), from 1946.
Embase (OVID), from 1947.
Cochrane Central Register of Controlled Trials (CENTRAL), published in the Cochrane Library.
Scopus (Elsevier), from 1970.

We will also search ClinicalTrials.gov, the WHO International Clinical Trials Registry Platform (ICTRP; www.who.int/trialsearch), and the International Standard Randomized Controlled Trials Number (ISRCTN) registry (www.isrctn.com/) for trials in progress.

Searching other resources

To identify any relevant published data not identified with our electronic search, we will contact experts in the field, and check the references of relevant reviews from the past ten years. With the studies selected for inclusion in this review, we will perform forward and backward reference checking to identify any additional eligible studies.

Data collection and analysis

Selection of studies

We will use Covidence to manage the selection of studies (Covidence 2017). Two review authors (BV and TN) will independently screen all titles and abstracts from the electronic searches to identify potentially eligible studies. We will obtain full‐text articles of potentially eligible studies, and the two review authors (BV and TN) will independently assess the full‐text articles for study eligibility using the predefined inclusion and exclusion criteria. We will resolve any disagreements by discussion or with a third review author (AMM). As needed, we will contact study authors to clarify the study methods and other information. Studies excluded during the full‐text review will be listed in the 'Characteristics of excluded studies' table with a summary of reasons for exclusion. We will illustrate the study selection process in a PRISMA flow diagram (Moher 2009).

Data extraction and management

We will design a data extraction form and pilot it on at least two included studies. After reviewing the piloted forms with the other review authors, we will finalize the form. Two review authors will use the data extraction form to independently extract data from the included studies. We will discuss any inconsistencies with a third review author. We will enter the extracted data into an Excel database (Excel 2013), which will be kept on password‐protected computers and secured in Dropbox cloud storage for future review updates.

We will extract the following information for each included study.

Study details

First author, title, year of publication, journal, language.
Study design, sampling method, prospective/retrospective, and inclusion criteria for presumptive tuberculosis (if any).
Number of participants after screening for exclusion and inclusion criteria.
Number of children included in the systematic review analysis.
Single or initial screening versus more than one screening in the population.
Any sequential or parallel screening strategies.

Patient characteristics and setting

Description of study population.
Age: median, mean, range, and disaggregation into categories (0 to 4, 5 to 14).
Gender.
HIV status.
Proportion with severe wasting or severe acute malnutrition.
Screening location: community, outpatient facility, or inpatient facility.
Children with prior tuberculosis included, yes/no? If yes, what proportion?
Country/countries where study was conducted.
Country WHO classification for high tuberculosis burden country (WHO Global TB Report 2019).
Years of data collection.

Index test

Definition of positive symptom screen.
List symptoms assessed.
Details of timing of contact history (i.e. current, within past year, beyond one year).
Types of CXR used.
Description of radiographic findings classification.
Type of CXR reader: radiologist, pulmonologist, general medical officer, clinical officer, nurse, other.
Types of respiratory specimens used.
Types of NAATs used.
For each index test, number of results that are true positive, false positive, true negative, false negative, inconclusive, and missing.

Reference standard

Microbiological reference standard used: solid culture, liquid culture, Xpert MTB/RIF, or Xpert Ultra.
Criteria used for composite reference standard.
Reference standard applied to all children or only those with a positive screening test result?
Number of microbiological tests used to exclude tuberculosis.
Number of contaminated cultures and total number of cultures performed.
Time between the index test and the reference standard.

Assessment of methodological quality

Two review authors will independently assess the methodological quality of the included studies using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS‐2) instrument, which we will adapt for this review (Whiting 2011). The preliminary tool with signalling questions tailored to this review is in Appendix 2. As recommended, we will assess each of the four domains (patient selection, index tests, reference standards, and flow and timing) for risk of bias and the first three domains for concerns regarding applicability.

We will judge each item as 'yes' (adequately addressed), 'no' (inadequately addressed), or 'unclear' when there is insufficient information reported to make an assessment. One review author will pilot the tool on two included studies. We will then make any necessary revisions to finalize the tool. We will resolve disagreements between the two review authors' independent assessments through discussion or additional input from a third review author. We will present results of the quality assessment in text, tables, and graphs.

Statistical analysis and data synthesis

We will perform descriptive analyses of the included studies and present their key characteristics in the 'Characteristics of included studies' table and a summary table. We will present individual study estimates of sensitivity and specificity graphically on forest plots and in receiver operating characteristic (ROC) space using Review Manager 5 (RevMan 2020).
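For illustration, the individual study estimates follow directly from the extracted 2×2 counts. The sketch below uses hypothetical counts and a hypothetical function name; it is not the Review Manager implementation.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator > 0 else None


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Compute per-study accuracy estimates from 2x2 counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # positives among children with tuberculosis
        "specificity": _ratio(tn, tn + fp),  # negatives among children without tuberculosis
        "ppv": _ratio(tp, tp + fp),          # tuberculosis among screen positives
        "npv": _ratio(tn, tn + fn),          # no tuberculosis among screen negatives
    }


# Hypothetical study: 24 children with tuberculosis, 376 without.
example = accuracy_measures(tp=18, fp=40, fn=6, tn=336)
```

With these hypothetical counts, sensitivity is 18/24 and specificity is 336/376. A measure is returned as `None` when its denominator is zero.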

We will consider one index test result per child per time point. However, for studies assessing serial screening over time for individuals, separate screens may be assessed if they are also compared against serial confirmatory tests over time (i.e. multiple screens for one individual). Within each group listed in Objectives, we will perform analyses by index tests and reference standards. For symptom screening as the index test, we plan to perform analyses by single symptoms and multiple symptoms (such as the WHO four‐symptom screen), where data are available.

We will consider combining categories depending on the number of studies and screening definitions found in each category. We will also stratify the analyses by the type of reference standard used, microbiological or composite. Separate analyses will be performed for studies that verify participants regardless of their index test results (i.e. complete verification) and those that only verify test positives (i.e. partial verification).

When there are sufficient data, we will perform meta‐analyses to estimate summary values of sensitivity and specificity using a bivariate model (Chu 2006; Reitsma 2005). We chose the bivariate model because we anticipate dealing with binary test results or studies that used the same threshold because they applied the threshold recommended by the test manufacturer. Also, we note that the bivariate model is appropriate to use for index tests such as Xpert MTB/RIF and Xpert Ultra, which apply a common positivity criterion (Macaskill 2010). When we are unable to fit the models due to sparse data or few studies, we will simplify the models to univariate random‐effects logistic regression models to pool sensitivity and specificity separately (Takwoingi 2015). For studies that verify only test positives, we will pool positive predictive values using a univariate random‐effects logistic regression model. We will perform meta‐analyses using the meqrlogit command in Stata version 16 (Stata 16).
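For reference, the bivariate model can be written as below. This follows the general formulation with a binomial within-study likelihood; the notation is ours and is not taken from the protocol. For study i, n_{1i} and n_{0i} denote the numbers of children with and without the target condition.

```latex
% Within-study level: exact binomial likelihood for study i
\[
TP_i \sim \mathrm{Binomial}(n_{1i},\, Se_i), \qquad
TN_i \sim \mathrm{Binomial}(n_{0i},\, Sp_i)
\]

% Between-study level: correlated random effects on the logit scale
\[
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
\begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}
\right)
\]

% Summary estimates are obtained by back-transforming the mean logits
\[
\widehat{Se} = \frac{1}{1 + e^{-\hat{\mu}_A}}, \qquad
\widehat{Sp} = \frac{1}{1 + e^{-\hat{\mu}_B}}
\]
```

The univariate simplification corresponds to fitting the sensitivity and specificity parts separately, which amounts to dropping the covariance term \sigma_{AB}.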

For test comparisons, we will exclude studies where one index test was used as the reference standard for another index test in the comparison (e.g. if Xpert MTB/RIF was used as the reference standard for CXR). We will perform comparative meta‐analyses by first including all studies with relevant data in indirect comparisons to make use of all available data. We will then perform additional analyses restricted to comparative studies that made direct comparisons between the index tests within the same study population. Comparative meta‐analyses will be performed using bivariate meta‐regression by adding test‐type as a covariate to bivariate models. We will assess model fit using likelihood ratio tests to compare models with and without the covariate terms. We will calculate absolute differences in sensitivity and specificity using the model parameters. We will obtain 95% confidence intervals and P values for the absolute differences using the delta method and Wald tests, respectively.
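The delta-method step can be illustrated as follows. The sketch assumes a parameterization in which `mu` is the mean logit sensitivity of the first test and `delta` is the covariate effect for the second test, with their variances and covariance taken from the fitted model. The function name and all numeric inputs are hypothetical, and applying the Wald test to the absolute difference is one possible reading of the method.

```python
import math
from statistics import NormalDist


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def sensitivity_difference(mu, delta, var_mu, var_delta, cov_mu_delta, level=0.95):
    """Absolute difference in sensitivity (test B minus test A).

    Returns (difference, (lower, upper), p_value) using the delta method
    for the standard error and a Wald test on the difference.
    """
    sens_a = expit(mu)
    sens_b = expit(mu + delta)
    diff = sens_b - sens_a

    # Gradient of diff with respect to (mu, delta).
    g_mu = sens_b * (1 - sens_b) - sens_a * (1 - sens_a)
    g_delta = sens_b * (1 - sens_b)

    # Delta method: Var(diff) is approximately g' V g.
    variance = (
        g_mu ** 2 * var_mu
        + g_delta ** 2 * var_delta
        + 2 * g_mu * g_delta * cov_mu_delta
    )
    se = math.sqrt(variance)
    if se == 0:
        raise ValueError("standard error is zero; cannot form a Wald statistic")

    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, (diff - z_crit * se, diff + z_crit * se), p_value


# Hypothetical model output on the logit scale.
diff, (lower, upper), p = sensitivity_difference(
    mu=0.5, delta=0.8, var_mu=0.04, var_delta=0.09, cov_mu_delta=-0.03
)
```

The same calculation applies to specificity, using the corresponding specificity parameters from the model.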

For subgroups or screening definitions that do not have sufficient data for a meta‐analysis, we will summarize findings using descriptive methods.

Approach to inconclusive index test results

As described above, the NAAT assays assessed in this review as index tests may have inconclusive results. We will report the proportion of inconclusive index test results. Depending on the available data, we will reclassify these results as positive or negative and perform additional analyses to determine the impact of including these test results on test accuracy.
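One way to carry out this reclassification is sketched below, assuming the number of inconclusive index test results is known separately for reference-positive and reference-negative children. The function name and counts are hypothetical.

```python
def reclassify_inconclusive(tp, fp, fn, tn, inc_ref_pos, inc_ref_neg, treat_as):
    """Return (sensitivity, specificity) after handling inconclusive results.

    treat_as: "excluded", "positive", or "negative".
    inc_ref_pos / inc_ref_neg: inconclusive index test results among
    reference-positive / reference-negative children.
    """
    if treat_as == "positive":
        # Inconclusive results counted as index-test positive.
        tp, fp = tp + inc_ref_pos, fp + inc_ref_neg
    elif treat_as == "negative":
        # Inconclusive results counted as index-test negative.
        fn, tn = fn + inc_ref_pos, tn + inc_ref_neg
    elif treat_as != "excluded":
        raise ValueError("treat_as must be 'excluded', 'positive', or 'negative'")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for a single study.
counts = dict(tp=20, fp=10, fn=5, tn=160, inc_ref_pos=5, inc_ref_neg=30)
scenarios = {
    option: reclassify_inconclusive(**counts, treat_as=option)
    for option in ("excluded", "positive", "negative")
}
```

Comparing the three scenarios shows how sensitive the accuracy estimates are to the handling of inconclusive results.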

Investigations of heterogeneity

We will visually inspect forest plots and summary ROC (SROC) plots for heterogeneity. When data allow, we will evaluate potential sources of heterogeneity using subgroup analyses and bivariate meta‐regression.

For subgroup analyses, we will assess the following subgroups: children aged 0 to 4 years, children aged 5 to 14 years, HIV positive children, and HIV negative children.

For meta‐regression analyses, we will include high tuberculosis burden country (yes or no) and screening frequency (single or initial screening versus more than one screening), and we will consider each source of heterogeneity as a single covariate in a bivariate model.

Sensitivity analyses

When data allow, we will perform sensitivity analyses and explore the effect of risk of bias and study characteristics on the accuracy of index test results by limiting inclusion in the meta‐analyses to the following.

Studies that only used consecutive or random selection of participants.
Studies with an appropriate interval between the index test and the reference standard.
Studies that avoid incorporation bias (inclusion of index test as part of the reference standard).
Studies that explicitly evaluated the index tests for pulmonary tuberculosis.

Assessment of reporting bias

We will not formally assess reporting bias using funnel plots or regression tests as these have not been reported as helpful for diagnostic test accuracy studies (Macaskill 2010).

Assessment of certainty of the evidence

We will assess the certainty of evidence using the GRADE approach for diagnostic studies (Balshem 2011; Schünemann 2008). As recommended, we will rate the certainty of evidence as either high (not downgraded), moderate (downgraded by one level), low (downgraded by two levels), or very low (downgraded by more than two levels) based on five domains: risk of bias, indirectness, inconsistency, imprecision, and publication bias. For each outcome, the certainty of evidence starts as high when there are high‐quality observational studies (cross‐sectional or cohort studies) that enrolled participants with diagnostic uncertainty. If we find a reason for downgrading, we will use our judgement to classify the reason as either serious (downgraded by one level) or very serious (downgraded by two levels).
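The mapping from downgrades to an overall rating can be expressed as a short sketch. The function name and data layout are hypothetical; the judgement of whether a concern is serious or very serious remains with the review authors.

```python
from typing import Dict

RATINGS = ("high", "moderate", "low", "very low")
DOMAINS = (
    "risk of bias",
    "indirectness",
    "inconsistency",
    "imprecision",
    "publication bias",
)


def certainty_rating(downgrades: Dict[str, int]) -> str:
    """Map per-domain downgrades to an overall certainty rating.

    Each domain is downgraded by 0 (no concern), 1 (serious), or
    2 (very serious) levels; domains not listed count as 0.
    """
    total = 0
    for domain, levels in downgrades.items():
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        if levels not in (0, 1, 2):
            raise ValueError(f"downgrade for {domain} must be 0, 1, or 2")
        total += levels
    # More than two levels of downgrading is rated 'very low'.
    return RATINGS[min(total, 3)]
```

For example, one serious concern for risk of bias and one for imprecision would give a rating of low.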

Three review authors will discuss judgements and apply GRADE in the following way (Schünemann 2020a; Schünemann 2020b).

Assessment of risk of bias

We will use QUADAS‐2 to assess risk of bias.

Indirectness

We will assess indirectness in relation to the population (including disease spectrum), setting, interventions, and outcomes (accuracy measures). We will also use prevalence as a guide to whether there was indirectness in the population.

Inconsistency

GRADE recommends downgrading for unexplained inconsistency in sensitivity and specificity estimates. We will carry out prespecified analyses to investigate potential sources of heterogeneity and downgrade when we cannot explain inconsistency in the accuracy estimates.

Imprecision

We will consider a precise estimate to be one that would allow a clinically meaningful decision. We will consider the width of the CI, and ask, “Would we make a different decision if the lower or upper boundary of the CI represented the truth?” In addition, we will work out projected ranges for TP, FN, TN, and FP for a given prevalence of tuberculosis and make judgements on imprecision from these calculations.
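The projected counts can be worked out as in the sketch below, shown here for a cohort of 1000 children. The function name, the cohort size, and all accuracy and prevalence values are hypothetical and chosen only to illustrate the arithmetic.

```python
def projected_counts(sensitivity, specificity, prevalence, cohort=1000):
    """Project TP, FN, TN, and FP for a cohort at a given prevalence."""
    with_tb = cohort * prevalence
    without_tb = cohort - with_tb
    tp = sensitivity * with_tb
    tn = specificity * without_tb
    return {"TP": tp, "FN": with_tb - tp, "TN": tn, "FP": without_tb - tn}


# Hypothetical summary sensitivity of 0.80 (95% CI 0.60 to 0.90),
# specificity of 0.90, and tuberculosis prevalence of 5%.
point = projected_counts(0.80, 0.90, 0.05)
lower = projected_counts(0.60, 0.90, 0.05)  # lower CI bound for sensitivity
upper = projected_counts(0.90, 0.90, 0.05)  # upper CI bound for sensitivity
```

With these hypothetical inputs, the number of children with tuberculosis missed by screening (FN) ranges from about 5 to about 20 per 1000 screened, which is the kind of range that would inform a judgement on imprecision.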

Publication bias

We will rate publication bias as undetected (not serious) for several reasons, including the comprehensiveness of the literature search and extensive outreach to tuberculosis researchers to identify studies.

Figure 1


Figure 2

Different screening and diagnostic algorithms
