The performance of digital technologies for measuring tuberculosis medication adherence: a systematic review

Abstract

Introduction Digital adherence technologies (DATs), such as phone-based technologies and digital pillboxes, can provide more person-centric approaches to support tuberculosis (TB) treatment. However, there are varying estimates of their performance for measuring medication adherence.

Methods We conducted a systematic review (PROSPERO CRD42022313526), which identified relevant published literature and preprints from January 2000 to April 2023 in five databases. Studies reporting quantitative data on the performance of DATs for measuring TB medication adherence against a reference standard, with at least 20 participants, were included. Study characteristics and performance outcomes (eg, sensitivity, specificity and predictive values) were extracted. Sensitivity was the proportion correctly classified as adherent by the DAT, among persons deemed adherent by a reference standard. Specificity was the proportion correctly classified as non-adherent by the DAT, among those deemed non-adherent by a reference standard.

Results Of 5692 studies identified by our systematic search, 13 met inclusion criteria. These studies investigated medication sleeves with phone calls (branded as ‘99DOTS’; N=4), digital pillboxes (N=5), ingestible sensors (N=2), artificial intelligence-based video-observed therapy (N=1) and multifunctional mobile applications (N=1). All but one involved persons with TB disease. For medication sleeves with phone calls, compared with urine testing, reported sensitivity and specificity were 70%–94% and 0%–61%, respectively. For digital pillboxes, compared with pill counts, reported sensitivity and specificity were 25%–99% and 69%–100%, respectively. For ingestible sensors, the sensitivity of dose detection was ≥95% compared with direct observation. Participant selection was the most frequent potential source of bias.

Conclusion The limited number of studies available suggests suboptimal and variable performance of DATs for dose monitoring, with significant evidence gaps, notably in real-world programmatic settings. Future research should aim to improve understanding of the relationships of specific technologies, settings and user engagement with DAT performance and should measure and report performance in a more standardised manner.

• SMS daily reminders (1- or 2-way) to patient to take treatment.
• SMS reminders (1- or 2-way) to patient to attend dispensing visit/routine follow-up appointment.
• Automated phone calls to remind patient for visit/daily dose.
• Smart pillbox daily reminders to patient to take treatment.
• Smart pillbox reminders to patient to attend dispensing visit/routine follow-up appointment.
• Chatbot/telemedicine accessed by TB patients that provides information about "treatment adherence".
• Automated feedback (such as electronic health record) to alert HCW to a missed visit (e.g., dispensing, routine follow-up appointment for patients on TB/TBI treatment) by a patient with an intended action of "promote treatment adherence and/or reducing LTFU".
• Digital calendar generated (from electronic health record/smart pill box) showing missed visits/doses by patient at consultation with HCW.
• Telehealth: video call initiated by HCW.
• Specialist healthcare provider apps (not automated) used by patient and HCW.
• Electronic pillbox (or suchlike) used to measure adherence only (i.e., not used to promote adherence/improving outcomes).

These need to be used in conjunction with the intention to measure or promote treatment adherence and/or reducing missed visits and/or reducing LTFU (and thereby improving successful treatment outcomes).

Examples to exclude (not limited to)
• Electronic health record to document (monitor/record/summarise) visit attendance only, with no intended action of "measure or promote treatment adherence and/or reducing LTFU".
• Mobile technology to collect data (TB register, etc.) with no feedback loop, except for specific examples of technology used to measure.
• Non-automated "routine telephone calls" to patient: phone calls, SMS or WhatsApp messages sent by a human, not automated and with no other digital component ("older-style").
Table S3: PICO Definitions and Inclusion/Exclusion Criteria pre-specified in the PROSPERO systematic review registration. (1)

Participants/population
• Individuals diagnosed and treated for TB infection.
• Individuals treated for active TB disease, including those at risk of unfavorable outcomes (e.g., those with drug-resistant TB, persons living with HIV, children).

Intervention/Exposure
• DATs which support monitoring TB medication adherence, including reminding patients to take their medications, facilitating digital observation of pill-taking, compiling patient dosing histories and categorizing patients based on their level of adherence, to facilitate patient-centric approaches for monitoring TB medication adherence. See Figure S1.

Comparator/Control
• Studies included will compare the dose reporting by DAT against a reference standard. This reference standard could be measurement of drug/metabolite levels or their presence in serum or urine, and/or pill count, or directly observed therapy.

Inclusion Criteria
Studies will be included if:
• They address tuberculosis in humans.
• They address at least one digital adherence technology.
• They assess accuracy quantitatively.
• There will be no language restriction.
• Relevant grey literature (such as ministry reports, technical papers, preprints) will be accepted if they meet the eligibility criteria.

Exclusion Criteria
Studies will be excluded if:
• They are not themed on the use of digital health interventions for tuberculosis treatment support.
• If the same study with the same outcomes has been reported by different papers, the most recent published paper will be included.
• For the purposes of our analysis, we will also exclude reports/papers where only the abstract but not the full text is available. However, during the search process, we will keep a record of abstracts for which the final study has not been published in a peer-reviewed journal to provide insights into potential publication bias. Publication bias occurs when smaller studies or studies with negative findings do not get published, which can contribute to bias in the findings of meta-analyses. If, for example, we identify a large number of abstracts with negative findings that never got published in peer-reviewed journals, this could be indicative of publication bias.
• They are letters, editorials, and position papers.
• They are review articles.
• They are case control studies.
• They report purely qualitative information.
• They report fewer than 20 persons receiving treatment support with a DAT.

Main outcomes
Accuracy of dose reporting will be assessed by comparison of dose recording by DAT against a reference standard. Primary outcomes will be:
• Sensitivity of dose reporting by DAT, against the reference standard
• Specificity, assessed in the same manner

Additional outcomes
Secondary effect measures of accuracy of DAT dose reporting can include:
• Area under the curve

Medication sleeves with phone calls
Fixed-dose combination medication blister packs are placed within a custom cardstock that has a series of toll-free phone numbers that are revealed once the person removes the prescribed dose. After ingestion, the person is expected to call the toll-free number to automatically log their adherence for the day. Healthcare providers (HCPs) visualize this adherence record remotely to identify nonadherent people. Branded as '99DOTS' in all studies.
Patient-reported doses (default): Person with TB calling to report their dose.
Patient/Provider-reported doses (only assessed in Subbaraman et al. (2) and Thomas et al. (3)): Person with TB calling to report their dose; if the person with TB does not call on a given day or days, the provider is supposed to call the person to verify whether doses were taken or not and then report these data in the digital dosing history.
Refer to Table 2 (main text) for the reporting windows for each study. Previous adherence was assessed at the time of random urinalysis.
Efo et al. (4): SMS reminders automatically sent at specific times of the day prompted people to take their medication.
Thomas et al. (3): People whose record reported doses taken <6 hours and 48-72 hours prior were excluded.

Digital pillbox
Electronic medication bottle, pill cap, or pillbox that contains a microelectronic chip that registers every bottle opening or box opening, respectively. Some devices contain a subscriber identity module (SIM) card and use a cellular network to provide real-time measurements of pillbox openings. Branded as 'MEMS' or 'Wisepill'.
Doses recorded as taken (adherence) or not taken (nonadherence) by the device were assessed per participant or across participants. Records were assessed at provider visits during treatment or at the end of treatment.

Ingestible sensors
Consists of a 1.0 mm x 1.0 mm ingestible sensor placed on or taken with medication and an on-body wearable sensor. The ingestible sensors are activated by gastric fluids, independent of the acidity level, and communicate unique identifying signatures to the body surface. The on-body sensor counts the number of times each unique signature is received. Data from the patch are transmitted wirelessly (can use Bluetooth) to a secured device and uploaded to a secured, centralized data storage location. Data could be shared with the participant in near-real time and viewed on their mobile device. Also termed "wirelessly observed therapy (WOT)" and "ingestion sensors".
A detected ingested marker (adherence) or a non-detected ingested marker (nonadherence), as found by the networked system.

VOT (AI result)
Deep Convolutional Neural Networks (DCNNs) used to extract visual features from medication intake videos, and support vector machines (SVMs) adopted as a classifier to generate prediction scores for videos, pretrained with external datasets. Known as 3D ResNet (6).
The generated prediction score is a decimal number between 0 and 1, which can be interpreted as the probability that the video represents a participant correctly ingesting their medication. A threshold of 0.6 was used to identify adherence.

Mobile application (reader software)
Includes a patient- and treatment supporter-facing mobile app and a direct drug metabolite test. The app allows patients to report self-administration of their TB medication, track potential medication side-effects, and upload a photo of their isoniazid (INH) urine test to verify their adherence. A reader software analyzes submitted urine test images for the presence of isoniazid metabolite. Branded as 'TB-TSTs' (7). The app also includes access to TB education, a calendar view of treatment progress, and the ability to communicate with a Treatment Supporter or other people with TB anonymously.
Reader software extracts the results area from the images and analyzes the level of color in the urine test and control strips to report detected (adherence) or not detected (nonadherence). If the drug metabolite is present the test turns a blue-purple color within 20 minutes. Refer to INH Urine Test for more details.
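The thresholding step described for the VOT (AI result) can be illustrated with a minimal sketch. The function name `classify_video` is hypothetical, and whether a score exactly equal to 0.6 counts as adherence is not stated in the source; the sketch assumes it does.

```python
def classify_video(score: float, threshold: float = 0.6) -> str:
    """Map a prediction score in [0, 1] to an adherence label.

    Assumption: scores at or above the threshold count as adherence;
    the source only states that a threshold of 0.6 was used.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("prediction score must lie between 0 and 1")
    return "adherence" if score >= threshold else "nonadherence"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for s in (0.15, 0.60, 0.92):
        print(s, classify_video(s))
```

Lowering the threshold would label more videos as adherent, and raising it would label fewer, so any reported sensitivity and specificity for this DAT are tied to the 0.6 cut-off.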

Test, Details, Adherence and nonadherence classification

INH Urine Test
The isoniazid (INH) urine test mixes reagents with a participant's urine sample to determine the presence of drug metabolites. Branded as 'IsoScreen'.
Colour change indicates the time of last medication ingestion: Purple/blue (<24 hours ago), Green (24-48 hours ago), Yellow (>48 to 72 hours ago). Refer to Table 2 for relevant time windows in each study.
van den Boogaard et al. (8): Participants with at least one negative test were regarded as non-adherent. Tests were conducted at routine clinic visits.
Efo et al. (4): The setting in which the urine test was conducted was unclear.
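As an illustration of the INH urine test row above, the colour-to-time-window reading and the participant-level rule used by van den Boogaard et al. (8) can be sketched as follows. The names `COLOUR_TO_WINDOW`, `last_dose_window` and `participant_adherent` are hypothetical, and which colours count as a positive test depends on the time window each study used (Table 2), so the caller is assumed to supply per-test positive/negative results.

```python
from typing import Sequence

# Colour reading as described for the INH urine test row.
COLOUR_TO_WINDOW = {
    "purple": "<24 hours",
    "blue": "<24 hours",
    "green": "24-48 hours",
    "yellow": ">48 to 72 hours",
}


def last_dose_window(colour: str) -> str:
    """Return the time window since last ingestion implied by the colour."""
    key = colour.strip().lower()
    if key not in COLOUR_TO_WINDOW:
        raise ValueError(f"unrecognised test colour: {colour!r}")
    return COLOUR_TO_WINDOW[key]


def participant_adherent(test_positive: Sequence[bool]) -> bool:
    """Participant-level rule: at least one negative test means non-adherent."""
    if not test_positive:
        raise ValueError("at least one test result is required")
    return all(test_positive)
```

For example, a participant with test results `[True, True, False]` would be classed as non-adherent under this rule, regardless of how many tests were positive.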

RIF Urine Colour Test
Either the raw sample is checked for an orange colour, indicative of the presence of rifampicin (RIF) in the body, or alternatively, rifampicin is soluble in chloroform and trichloromethane so urine appears red when these are added.

Orange or red urine (adherence). Yellow urine (nonadherence).
van den Boogaard et al. (8): Participants with at least one negative test were regarded as non-adherent. Tests were conducted at routine clinic visits.
Huan et al. (10): Test was conducted at an unannounced home visit.

Pill Count
The participant's remaining pills were counted by a healthcare provider.
A correct (adherence) or incorrect (nonadherence) number of pills remaining. Refer to Table 2 (main text) for relevant pill counts in each study.
Records were assessed at provider visits during treatment.
Scott et al. (5): Pill count was assessed at the end of TB infection treatment.

DOT
Participants received their TB medication during in-person clinic visits or home visits by a healthcare provider.
All doses were ingested via DOT (adherence) to evaluate the performance of the ingestible sensor technology.
Belknap et al. (11): After two clinic visits, participants and investigators could choose to complete the remaining visits as field-DOT in the participant's home.

VOT (provider reported)
Manual review of the video by 3 reviewers, who included a trained student annotator, a senior computer scientist and a physician with expertise in medication adherence. The protocol was summarized into three basic rules that guided labelling videos as positive (actual medication ingestion activity), negative (no medication intake activities) or ambiguous (no pills were seen but there was a blurry image of a face).
3 reviewers agreeing that the participant did (adherence) or did not (nonadherence) take the medication in the video. Sekandi et al. (6): Ambiguous videos were not retained for further analysis.

Mobile application (provider reported)
Includes a participant- and treatment supporter-facing mobile app and a direct drug metabolite test. Treatment supporter or research staff analyzes submitted INH urine test images for the presence of isoniazid metabolite.
If the drug metabolite is present the test turns a blue-purple color within 20 minutes. Treatment supporter or research staff analyzes the urine test photo for a blue-purple colour (adherence) or otherwise (non-adherence). Refer to INH Urine Test for more details.

AI artificial intelligence, DOT directly observed therapy, MEMS medication event monitoring system, TB-TSTs tuberculosis treatment support tool, VOT video observed therapy

Table S5: Reported and calculated performance outcomes of interest in the included studies.
† Arithmetic mean and standard deviation were calculated based on 5-fold cross validation using 497 videos (51 participants). 364 videos (157 ambiguous or uncertain videos; 152 poor quality videos, 55 damaged videos) were excluded as there was not complete agreement of the classification across the three reviewers.

Table S7: QUADAS-2 checklist assessment criteria for the performance of DATs for measuring TB medication adherence.

Patient Selection
• Any non-random sample was assessed to be a high risk of bias.
• Any inappropriate exclusions were assessed to be a high risk of bias.
Is there concern that the included patients do not match the review question?
• Participants with available demographic information were assessed to be of low concern.

Index Test
Could the conduct or interpretation of the index test have introduced bias?
Could the conduct or interpretation of the DAT intervention have introduced bias?
• If other medicine was administered at the same time as DAT use, or behaviour changed as a result of anticipating adherence evaluations, there was a high risk of bias.
• If there were DAT accessibility issues for clients, there was a high risk of bias.
• If it was not clear how/when the DAT reports were assessed relative to the reference standard, there was an unclear risk of bias.
Is there concern that the index test, its conduct, or interpretation differ from the review question?
Is there concern that the DAT intervention, its conduct, or interpretation differ from our review question?
• If performance was assessed in an inpatient cohort, there was a high applicability concern.
• If the DAT was not initially assessed as the index test, there were unclear or high applicability concerns based on the provided performance results.

Reference Standard
Could the reference standard, its conduct, or its interpretation have introduced bias?
• If it was not clear how/when the DAT reports were assessed relative to the reference standard, there was an unclear risk of bias.
• If it was not clear how/when the DAT reports were assessed relative to the reference standard AND there was missing information on details of the reference standard, there was a high risk of bias.
• If the reference standard had the ability to misclassify participants to a certain degree, there was an unclear risk of bias.
Is there concern that the target condition as defined by the reference standard does not match the review question?
• If information on the performance of the reference standard was provided and misclassification was discussed, there was an unclear applicability concern.
• If there was not enough detail provided on the reference standards to assess this question, there was an unclear applicability concern.
• If the reference standard was conducted by individuals that would not typically assess adherence for those with TB, there was a high applicability concern.

Flow and Timing
Could the patient flow have introduced bias?

• If performance was assessed at only one point during the course of treatment to depict adherence, there was a high risk of bias.
• If a large number of participants were removed from the analysis/performance assessment, there was a high risk of bias.
• If it was not described when participants underwent each test, there was an unclear risk of bias.

Explanations
a. Specificity could not be assessed because the ingestible sensor technology was evaluated against directly observed therapy, in which all doses were known and taken.
b. Publication bias could not be assessed because only sensitivity was estimated, so Deeks' test for diagnostic odds ratios could not be applied. Therefore, test performance should be interpreted with caution.

Explanations
a. One article was an abstract, wherein the risk of bias could not be properly assessed due to a lack of information.
b. Sensitivity was variable across articles, with non-overlapping 95% confidence intervals.
c. Specificity had variable estimates but overlapping 95% confidence intervals. However, due to the strong variability in estimates, it was deemed serious.
d. There are very wide 95% confidence intervals for specificity, with one study having a 95% CI range from 0% to 84%.
Table S10: The quality of evidence for the performance of digital pillboxes for measuring TB medication adherence.
Assessed using criteria of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach for diagnostic tests and strategies. Bionghi et al. (13) was not included because the article reported performance for the total doses of the sample rather than total participants and it used an inpatient cohort. Huan et al. (10) was not included because it uses a different reference standard (RIF urine colour test). Scott et al. (5) was not included because it assessed the performance of digital pillboxes in those with TB infection. The result presented in the forest plots of digital pillboxes (Figure 2c, main text) was used to assess the quality of evidence.

Explanations
a. Two reports originally assessed the digital pillbox as the reference standard, and we back-calculated the sensitivity and specificity for the digital pillbox as the index test.
b. Sensitivity was highly variable across studies, with non-overlapping confidence intervals.
c. Due to the back-calculation of the performance of digital pillboxes, there are applicability concerns when answering the research question.
d. There were very wide 95% confidence intervals for two of the articles, indicating serious imprecision.
e. Specificity had similar estimates across all studies and overlapping confidence intervals across two of the studies.
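The back-calculation mentioned in explanation (a) can be illustrated with a 2x2 table in which the roles of index test and reference standard are swapped. This is a sketch under the assumption that the four cell counts can be recovered from the published values; the class name `TwoByTwo` and the counts in the example are hypothetical, and the exact procedure used in the review is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts for an index test judged against a reference standard."""
    tp: int  # index positive, reference positive
    fp: int  # index positive, reference negative
    fn: int  # index negative, reference positive
    tn: int  # index negative, reference negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def swap_roles(self) -> "TwoByTwo":
        """Treat the former reference standard as the index test.

        Discordant cells trade places: an old false negative (old index
        negative, old reference positive) becomes a new false positive.
        """
        return TwoByTwo(tp=self.tp, fp=self.fn, fn=self.fp, tn=self.tn)


if __name__ == "__main__":
    # Hypothetical counts: pill count as index test, pillbox as reference.
    published = TwoByTwo(tp=40, fp=10, fn=5, tn=45)
    pillbox_as_index = published.swap_roles()
    print(pillbox_as_index.sensitivity(), published.ppv())  # equal
    print(pillbox_as_index.specificity(), published.npv())  # equal
```

After the swap, the sensitivity of the new index test equals the PPV of the published analysis, and its specificity equals the published NPV, which is why published predictive values are enough to recover them.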

• DATs include and are not limited to smartphone-based technologies such as phone-based dosing records, SMS or video-supported treatment, digital pillboxes, and ingestible sensors. See Figure S1.
Comparator/Control

Legend: reported; calculated based on provided outcomes; DAT was used as the reference standard of the published study. Performance outcomes of the pillbox were calculated from originally published values on sensitivity, specificity, PPV, and NPV.
AI artificial intelligence, AUC area under receiver operating characteristic curve, VOT video observed therapy
§ Inpatient cohort
§§ Same cohort as Thomas et al. 2020
† Abstract from the World Conference on Lung Health of the International Union Against Tuberculosis and Lung Disease (The Union)

Columns: NPV; Accuracy; AUC; Numbers of TP, FN, TN, or FP

Browne et al. 2019 (12): Ingestible sensors
Belknap et al. 2013 (11): Ingestible sensors
Scott et al. 2023 (5): Digital pillboxes
Bionghi et al. 2018 § (13): Digital pillboxes
Huan et al. 2012 (10): Digital pillboxes
van den Boogaard et al. 2011 (8): Digital pillboxes
Ruslami et al. 2008 (14): Digital pillboxes
Subbaraman et al. 2021 §§ (2): Medication sleeves with phone calls
Efo et al. 2021 (4): Medication sleeves with phone calls
Thomas et al. 2020 (3): Medication sleeves with phone calls
Alacapa et al. 2020 † (9): Medication sleeves with phone calls
Sekandi et al. 2023 (6): VOT (AI result)
Goodwin et al. 2022 † (15): Mobile application (Software result)

Figure S1: Risk of bias (A) and applicability concerns (B) of included studies using the QUADAS-2 tool for diagnostic accuracy studies. Risk of bias and applicability concerns were assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool for primary diagnostic accuracy studies. Alacapa et al. 2020 (9) and Goodwin et al. 2022 (15) are abstracts of the World Conference on Lung Health of the International Union Against Tuberculosis and Lung Disease (The Union) and could not be assessed for quality.

Figure S2: Funnel plot of included studies for the qualitative assessment of publication bias. Publication bias was assessed using Deeks' test for diagnostic odds ratios (DOR). The DOR is the odds of a positive dose record by the DAT in those who were adherent (based on the reference standard) relative to the odds of a positive dose record in those who were not adherent, while the effective sample size (ESS) is a function of the number of adherent (n1) and non-adherent (n2) participants: ESS = (4 × n1 × n2)/(n1 + n2). (16) The results shown in the forest plots for each DAT, or the primary result in the data table, were used to assess publication bias. Subbaraman et al. (2) used the same primary cohort as Thomas et al. (3) and was not included in the publication bias assessment. Belknap et al. (11) and Browne et al. (12) only assessed sensitivity of ingestible sensors, so no diagnostic odds ratio could be calculated and they are not included in this analysis. Goodwin et al. (15) did not provide enough information to assess publication bias. Deeks' test uses a linear regression of lnDOR against 1/ESS^(1/2) (the inverse square root of the effective sample size), weighted by the effective sample size.
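The quantities in this caption can be sketched in a few lines: the effective sample size, the log diagnostic odds ratio, and the weighted regression of lnDOR against 1/ESS^(1/2). The function names are hypothetical, the 0.5 continuity correction for zero cells is an assumption not stated in the source, and the significance test on the slope (the part that flags funnel plot asymmetry) is omitted.

```python
import math


def effective_sample_size(n1: int, n2: int) -> float:
    """ESS = 4*n1*n2 / (n1 + n2); n1 adherent, n2 non-adherent participants."""
    return 4 * n1 * n2 / (n1 + n2)


def ln_dor(tp: float, fp: float, fn: float, tn: float,
           correction: float = 0.5) -> float:
    """Natural log of the diagnostic odds ratio (TP*TN)/(FP*FN).

    Assumption: if any cell is zero, add `correction` to every cell.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + correction for c in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def deeks_regression(studies):
    """Weighted least squares of lnDOR on 1/sqrt(ESS), weights = ESS.

    `studies` is a sequence of (tp, fp, fn, tn) tuples.
    Returns (intercept, slope). Studies with n1 = 0 or n2 = 0 have ESS = 0
    and cannot be used (division by zero).
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, tn + fp)
        xs.append(1 / math.sqrt(ess))
        ys.append(ln_dor(tp, fp, fn, tn))
        ws.append(ess)
    if len(xs) < 2:
        raise ValueError("at least two studies are required")
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    if sxx == 0:
        raise ValueError("studies must differ in effective sample size")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

A slope far from zero suggests that smaller studies report systematically different diagnostic odds ratios than larger ones. With only a handful of studies per DAT, as here, such a regression has little power, which is consistent with the caption describing the assessment as qualitative.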

Should digital pillboxes be used to report on medication adherence in those with TB disease?
Patient or population: those with TB disease
Setting: Outpatient TB treatment
New test: Digital pillboxes (e.g., Wisepill, MEMS) | Cut-off value: 100% of doses (adherence)
Reference test: Pill count | Threshold: 100% of doses (adherence)
Range of sensitivities: 0.25 to 0.75 | Range of specificities: 0.69 to 0.80

Figure S3: Publication bias using Deeks' test of articles reporting on each DAT. (A) Publication bias for medication sleeves with phone calls and (B) publication bias for digital pillboxes as analyzed for the GRADE assessment. (A) Subbaraman et al. (2) uses the same primary cohort as Thomas et al. (3) and was not included in the GRADE assessment. (B) Bionghi et al. (13) was not included because the article reported on performance for the total doses of the sample rather than total participants and it used an inpatient cohort. Huan et al. (10) was not included because it uses a different reference standard (RIF urine colour test). Scott et al. (5) was not included because it assessed the performance of digital pillboxes in those with TB infection. For ingestible sensors, Belknap et al. (11) and Browne et al. (12) only assessed sensitivity, so no diagnostic odds ratio could be calculated and therefore no publication bias could be assessed. Deeks' test uses a linear regression of lnDOR against 1/ESS^(1/2) (the inverse square root of the effective sample size), weighted by the effective sample size.

Table S1: Search strategy for performance of DATs for measuring TB medication adherence systematic review.

agents/ or directly observed therapy/ or medication adherence/ or patient compliance/ or "treatment adherence and compliance"/) and technology/) or mobile applications/ or internet/ or cell phone/ or smartphone/ or text messaging/ or computer, handheld/ or telemedicine/ or therapy, computer-assisted/ or medical informatics applications/ 148201

5 (((digital* or electronic* or mobile or wireless* or virtual*) adj2 (adherence or medication monitor* or medication package* or observ*)) or digital technolog* or technology-based or Digital health or eHealth or e health or mhealth or m health or SMS or reminder* or short messag* service* or text messag* or MMS or multimedia messag* or MEMS or (monitor* adj2 (electronic* or sensor* or device*)) or Webcam* or web cam* or smartphone* or smart phone* or web based or health IT or health ICT or ((cell* or mobile) adj1 (device* or health or phone* or technolog*)) or video* or cellphone* or feature phone* or vdot or vmalt or tele* or dat or wot or 99dots or 99 dots or monitoring system* or ingestible sensor* or merm or artificial intelligence or ai or vot or ((digital or smart) adj (pill box* or pillbox*))

Sensitiv* or Specific* or likelihood or Precis* or Valid* or Positive predictive value* or negative predictive value* or Dosage or dosing or dose? or Report or reporting or Area under the curve or AUC or Receiver operator curve* or ROC or Agreement or reliab* or reproduc*).mp.

agent/ or directly observed therapy/ or medication compliance/ or patient compliance/) and technology/) or communication technology/ or exp mobile application/ or internet/ or web-based intervention/ or exp mobile phone/ or text messaging/ or personal digital assistant/ or telemedicine/ or exp teleconsultation/ or telemonitoring/ or video consultation/ or computer-assisted therapy/ or computer-assisted drug therapy/ or medical informatics/ 261570

6 (((digital* or electronic* or mobile or wireless* or virtual*) adj2 (adherence or medication monitor* or medication package* or observ*)) or digital technolog* or technology-based or Digital health or eHealth or e health or mhealth or m health or SMS or reminder* or short messag* service* or text messag* or MMS or multimedia messag* or MEMS or (monitor* adj2 (electronic* or sensor* or device*)) or Webcam* or web cam* or smartphone* or smart phone* or web based or health IT or health ICT or ((cell* or mobile) adj1 (device* or health or phone* or technolog*)) or video* or cellphone* or feature phone* or vdot or vmalt or tele* or dat or wot or 99dots or 99 dots or monitoring system* or ingestible sensor* or merm or artificial intelligence or ai or vot or ((digital or smart) adj (pill box* or pillbox*))

predictive value* or Dosage or dosing or dose? or Report or reporting or Area under the curve or AUC or Receiver operator curve* or ROC or Agreement or reliab* or reproduc*).ti,ab,kf. 11272549

10 exp "sensitivity and specificity"/ or measurement accuracy/ or reproducibility/ or predictive value/ or predictive validity/ or exp *validity/ 884855

11 (Accura* or Sensitiv* or Specific* or likelihood or Precis* or Valid* or Positive predictive value* or negative

S11 (Accura* OR Sensitiv* OR Specific* OR likelihood OR Precis* OR Valid* OR "Positive predictive value*" OR "negative predictive value*" OR Dosage OR dosing OR dose# OR Report OR reporting OR "Area under the curve" OR AUC OR "Receiver operator curve*" OR ROC OR Agreement OR reliab* OR reproduc*)

S6 (((digital* OR electronic* OR mobile OR wireless* OR virtual*) N2 (adherence OR "medication monitor*" OR "medication package*" OR observ*)) OR "digital technolog*" OR technology-based OR "Digital health" OR eHealth OR "e health" OR mhealth OR "m health" OR SMS OR reminder* OR "short messag* service*" OR "text messag*" OR MMS OR "multimedia messag*" OR MEMS OR (monitor* N2 (electronic* OR sensor* OR device*)) OR Webcam* OR "web cam*" OR smartphone* OR "smart phone*" OR "web based" OR "health IT" OR "health ICT" OR ((cell* OR mobile) N1 (device* OR health OR phone* OR technolog*)) OR video* OR cellphone* OR "feature phone*" OR vdot OR vmalt OR tele* OR dat OR

WOS.SCI, WOS.ISTP, WOS.ESCI (Web of Science Core Collection)
# Web of Science Search Strategy (v0.1)

Search: TS=(Tuberculosis OR "Kochs disease" OR Phthisis OR TB OR MTB OR MDRTB OR XDRTB OR DRTB OR LTBI OR "directly observed treatment short course" ) Editions: WOS.SCI,WOS.ISTP,WOS.ESCI

OR electronic* OR mobile OR wireless* OR virtual* ) NEAR/2 (adherence OR "medication monitor*" OR "medication package*" OR observ* )) OR "digital technolog*" OR technology-based OR "Digital health" OR eHealth OR "e health" OR mhealth OR "m health" OR SMS OR reminder* OR "short messag* service*" OR "text messag*" OR MMS OR "multimedia messag*" OR MEMS OR (monitor* NEAR/2 (electronic* OR sensor* OR device* )) OR Webcam* OR "web cam*" OR smartphone* OR "smart phone*" OR "web based" OR "health IT" OR "health ICT" OR ((cell* OR mobile ) NEAR/1 (device* OR health OR phone* OR technolog* )) OR video* OR cellphone* OR "feature phone*" OR vdot OR vmalt OR tele* OR dat OR wot OR 99dots OR "99 dots" OR "monitoring system*" OR "ingestible sensor*" OR merm OR "artificial intelligence" OR ai OR vot OR ((digital OR smart ) NEAR/0 ("pill box*" OR pillbox* )) OR VST ) Editions: WOS.SCI,WOS.ISTP,WOS.ESCI Date Run: Fri Apr 28 2023 17:50:06 GMT-0400 (Eastern Daylight Time) Results: 1632660

Search: TS=(Accura* OR Sensitiv* OR Specific* OR likelihood OR Precis* OR Valid* OR "Positive predictive value*" OR "negative predictive value*" OR Dosage OR dosing OR dose$ OR Report OR reporting OR "Area under the curve" OR AUC OR "Receiver operator curve*" OR ROC OR Agreement OR reliab* OR reproduc*) Editions: WOS.SCI,WOS.ISTP,WOS.ESCI Date Run: Fri Apr 28 2023 17:54:01 GMT-0400 (Eastern Daylight Time) Results: 18948839

Search: #1 AND #2 AND #3 Editions: WOS.SCI,WOS.ISTP,WOS.ESCI Date Run: Fri Apr 28 2023 17:54:14 GMT-0400 (Eastern Daylight Time) Results: 1676

# Database: Web of Science Core Collection
# Entitlements: -WOS.IC: 1993 to 2023 -WOS.CCR: 1985 to 2023 -WOS.SCI: 1900 to 2023 -WOS.AHCI: 1975 to 2023 -WOS.BHCI: 2005 to 2023 -WOS.BSCI: 2005 to 2023

(TITLE:Tuberculosis OR TITLE:"Kochs disease" OR TITLE:Phthisis OR TITLE:TB OR TITLE:MTB OR TITLE:MDRTB OR TITLE:XDRTB OR TITLE:DRTB OR TITLE:LTBI OR TITLE:"directly observed treatment short course" OR TITLE:"antituberculosis" OR TITLE:"antituberculous") AND (Adher* OR "directly observed" OR Digital OR electronic OR internet OR mobile OR wireless OR virtual OR TITLE:technology OR "technology based" OR tele* OR ehealth OR "e health" OR mhealth OR "m health" OR SMS OR reminder* OR messaging OR message* OR MEMS OR web OR webcam* OR smartphone* OR "health IT" OR "health ICT" OR video* OR cellphone* OR phone* OR vdot OR vmalt OR dat OR wot OR 99dots OR "99 dots" OR "monitoring system" OR "monitoring systems" OR "ingestible sensor" OR "ingestible sensors" OR merm OR "artificial intelligence" OR AI OR VOT OR "smart pillbox" OR "smart pill box" OR VST OR "computer assisted") AND (SRC:PPR)

Table S2: Definition of a digital adherence technology, with examples for inclusion and exclusion.
• A digital component (which could be part of a multi-component intervention)
• with the intention to measure or promote treatment adherence and/or reducing missed visits and/or reducing LTFU (and thereby improving successful treatment outcomes)
The intervention can be patient-facing only, provider-facing only, or patient- and provider-facing.

Table S4: (A) DAT and (B) Reference Standard Characteristics of the included studies.
(A) DATs

Table S6: Performance of DATs assessed under controlled conditions against a reference standard.
For ingestible sensors, sensitivity was calculated according to positive detection accuracy (# of ingestible sensors detected/# of ingestible sensors ingested). Additionally, specificity, PPV, NPV, and accuracy were not estimable in the absence of false positives and true negatives.
AI artificial intelligence, CI confidence interval, DAT digital adherence technology, DOT directly observed therapy, FN false negatives, FP false positives, N number of participants, NPV negative predictive value, NR not reported and incalculable, PPV positive predictive value, TN true negatives, TP true positives, VOT video observed therapy
§ The number of videos or images assessed for performance across the participants are shown in parentheses.
§§ Confidence intervals account for clustering.
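The note on positive detection accuracy can be made concrete with a short sketch. When every dose is taken under direct observation, only true positives (sensors detected) and false negatives (sensors ingested but not detected) exist, so sensitivity is the only estimable measure. The function name `performance` and the counts in the example are hypothetical; reporting `None` for the other measures mirrors the table's treatment of them as not estimable.

```python
from typing import Dict, Optional


def performance(tp: int, fn: int, tn: int = 0, fp: int = 0) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV, NPV and accuracy from 2x2 counts.

    Measures are returned as None when they cannot be estimated. If no
    reference-negative observations exist (tn + fp == 0), specificity,
    PPV, NPV and accuracy are all treated as not estimable, because the
    mix of taken and untaken doses was fixed by the study design.
    """
    result: Dict[str, Optional[float]] = {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": None,
        "ppv": None,
        "npv": None,
        "accuracy": None,
    }
    if tn + fp == 0:
        return result
    result["specificity"] = tn / (tn + fp)
    result["ppv"] = tp / (tp + fp) if (tp + fp) else None
    result["npv"] = tn / (tn + fn) if (tn + fn) else None
    result["accuracy"] = (tp + tn) / (tp + fp + fn + tn)
    return result


if __name__ == "__main__":
    # Hypothetical: 95 of 100 ingested sensors detected under DOT.
    print(performance(tp=95, fn=5))
```

With these illustrative counts, positive detection accuracy is 95/100, and the remaining measures stay empty rather than defaulting to misleading values such as a PPV of 1.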

Table S8: The quality of evidence for the performance of ingestible sensors for measuring TB medication adherence.
Assessed using criteria of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach for diagnostic tests and strategies. The quality of evidence for true negatives and false positives could not be assessed because each study provided only true positives and false negatives, reported as positive detection accuracy (# of ingestible sensors detected/# of ingestible sensors ingested).

Table S9: The quality of evidence for the performance of medication sleeves for measuring TB medication adherence.
Assessed using criteria of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach for diagnostic tests and strategies. The quality of evidence for Subbaraman et al. (2) was not assessed as it has the same cohort as Thomas et al. (3). The result presented in the forest plots of medication sleeves with phone calls (Figure 2b, main text) was used to assess the quality of evidence.