Do sugar-sweetened beverages cause adverse health outcomes in adults? A systematic review protocol

Background: Chronic diseases, such as cardiovascular disease and type 2 diabetes, impose a significant burden on public health. Most chronic diseases are associated with underlying preventable risk factors, such as elevated blood pressure, blood glucose, and lipids, physical inactivity, excessive sedentary behaviours, overweight and obesity, and tobacco usage. Sugar-sweetened beverages are known to be significant sources of additional caloric intake, and given recent attention to their contribution to the development of chronic diseases, a systematic review is warranted. We will assess whether the consumption of sugar-sweetened beverages in adults is associated with adverse health outcomes and what the potential moderating factors are.
Methods/Design: Of interest are studies addressing sugar-sweetened beverage consumption, taking a broad perspective. Both direct consumption studies and those evaluating interventions that influence consumption (e.g. school policy, educational) will be relevant. Non-specific or multi-faceted behavioural, educational, or policy interventions may also be included subject to the level of evidence that exists for the other interventions/exposures. Comparisons of interest and endpoints of interest are pre-specified. We will include randomized controlled trials, controlled clinical trials, interrupted time series studies, controlled before-after studies, prospective and retrospective comparative cohort studies, case-control studies, and nested case-control designs. The MEDLINE®, Embase, The Cochrane Library, CINAHL, ERIC, and PsycINFO® databases and grey literature sources will be searched. The processes for selecting studies, abstracting data, and resolving conflicts are described. We will assess risk of bias using design-specific tools. To determine sets of confounding variables that should be adjusted for, we have developed causal directed acyclic graphs and will use those to inform our risk of bias assessments. Meta-analysis will be conducted where appropriate; parameters for exploring statistical heterogeneity and effect modifiers are pre-specified. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach will be used for determining the quality of evidence for outcomes.
Systematic review registration: PROSPERO CRD42014009638


Background
Chronic diseases, such as cardiovascular disease and type 2 diabetes, cause significant morbidity and mortality worldwide. To address the burden of chronic diseases on the health of populations, effective interventions and public health policies are required. Most chronic diseases are associated with underlying preventable risk factors, such as elevated blood pressure, high blood glucose or glucose intolerance, hyperlipidemia, physical inactivity, excessive sedentary behaviours, overweight and obesity, and tobacco usage. The development of chronic diseases may be prevented if these risk factors are addressed before they progress to overt disease. A simple, unidirectional schematic depicts the hypothesized pathways by which sugar-sweetened beverage (SSB) consumption may lead to the development of chronic cardiovascular/cerebrovascular and metabolic diseases, chronic kidney disease, cancer, and gout (Additional file 1). These mechanisms have not been conclusively established by research studies, and several conflicting theories have been put forward [1][2][3][4][5][6][7][8]. However, as some of the SSB constituents, notably sugar but also caffeine and by-products of caramel colouring, where added, are postulated to be involved in the mechanisms of disease development, they are included in our depiction of the disease pathways.
Recent attention has focused on the contribution of SSBs to the chronic disease epidemic. In an attempt to curb rising rates of obesity, New York City has proposed a ban on the sale of large-sized SSB products greater than 16 oz, including sodas, sweetened teas and coffees, energy drinks, and fruit drinks in restaurants, delis, sports arenas, movie theatres, and food carts [9]. The proposed ban has been twice rejected but is being reviewed by the United States (US) Court of Appeals. In March 2014, the World Health Organization (WHO) released draft guidelines with recommendations on limiting sugar consumption (through food and beverage) to reduce public health problems like obesity and dental caries. The draft recommends lowering the limit on daily sugar intake from 10% of total energy intake (recommended since 2002) to 5% [10].
Some researchers hypothesize that SSBs are significant sources of caloric intake. With the adjusted prevalence of SSB consumption at 73% for young adults (20-34 years) and 50% among adults (35 years or older) according to the 2007-2008 US National Health and Nutrition Examination Survey (NHANES) data, the mean adjusted SSB energy intake was 338 and 236 kcal/day, respectively [11]. Heavy consumption (≥500 kcal/day) occurred among 20% of young adult and 12% of adult consumers [11]. Soda was the most heavily consumed SSB [11].
Several reviews have presented evidence syntheses on associations between SSB consumption and metabolic syndrome/type 2 diabetes [12][13][14], weight [12][14][15][16][17], and cardiovascular disease [12]. In a meta-analysis of 11 prospective cohort studies, the highest quantile of SSB consumption (one to two servings/day) was associated with a statistically significant increased risk of developing type 2 diabetes (risk ratio (RR) = 1.26, 95% confidence interval (CI): 1.12-1.41) and metabolic syndrome (RR = 1.20, 95% CI: 1.02-1.42) in adults [13]. Vartanian et al. found associations between SSB consumption and increased energy intake and body weight, lower intake of calcium and other nutrients, and increased risk of medical problems such as type 2 diabetes, hypocalcemia, dental caries, and elevated blood pressure [15]. A World Health Organization report concluded that the evidence implicating high intake of sugar-sweetened drinks in weight gain is moderately strong [18]. In addition, in a systematic review conducted as part of the Dietary Guidelines Advisory Committee deliberations, the following conclusion statements were provided [19]: 'Limited evidence shows that intake of sugar-sweetened beverages is linked to higher energy intake in adults'. 'A moderate body of epidemiologic evidence suggests that greater consumption of sugar-sweetened beverages is associated with increased body weight in adults'. 'A moderate body of evidence suggests that under isocaloric controlled conditions, added sugars, including sugar-sweetened beverages, are no more likely to cause weight gain than any other source of energy'.
Other reviews, however, have concluded that the association between SSB consumption and weight is close to zero [16,20]. Some reviews report a dose-response relationship between SSB and weight status but no corresponding weight loss when SSB consumption was reduced [21]. A systematic review of epidemiological research analysed beverages by category (water, milk, soft drinks, sugary drinks, non-carbonated, fruit juices, carbonated beverages, hot beverages, and alcoholic beverages), and the authors concluded that the results were inconsistent and did not establish an association between beverage intake and subsequent weight gain [22]. Within their beverage categories, however, the authors did not consistently separate SSBs from other drinks; for example, sweetened hot beverages were not differentiated from non-sweetened. Others have reviewed the biological plausibility of SSBs to uniquely affect the physiological energy balance regulatory systems (e.g. satiety and post-prandial regulatory systems) and concluded that known biological mechanisms did not support the concept that SSBs were somehow different from other sources of energy [23].
It is unclear what role SSBs play in the context of total caloric intake. Although total caloric intake and caloric intake from other dietary sources are factors that may be thought of as confounders, some evidence suggests that they may be clustered with SSB consumption and general unhealthy eating habits [12,17,24]. Conflicting evidence exists as to whether SSB intake is associated with increased energy intake [15,19] and may lower the intake of milk, calcium, and other nutrients [15]. Furthermore, given that SSB consumption may alter taste preferences and quality of diet, caloric intake may in fact mediate the effect of SSB consumption on health outcomes (i.e. it may lie in the causal pathway).
The available systematic review evidence is not only conflicting but fraught with several methodological issues, and it is difficult, therefore, to draw any firm conclusions about the health effects of SSBs [25]. The definition of what constitutes an SSB, for example, may vary and may not be explicitly described [15][16][17][26,27]. Reviews may not have accounted for all variables that can confound associations between SSB consumption and health outcomes. Most reviews have addressed singular outcomes of interest to public health. Further, new primary evidence [28][29][30] on the topic has been rapidly accumulating since the last systematic review was published in 2011. We will attempt to overcome some of the methodological challenges encountered in other reviews by carrying out a rigorous assessment of the internal validity of individual studies, which will include an evaluation of confounding and biasing factors using causal directed acyclic graphs (DAGs) [31].

Objectives
The objective of the systematic review is to answer the following research questions: In adults, does the consumption of SSBs cause adverse health outcomes? If so, what potential moderating factors affect the causal association between SSB consumption and outcomes?

Methods/Design
Eligibility criteria
Studies will be selected according to the criteria outlined below.

Study designs
We will include randomized controlled trials (RCTs), including cluster RCTs, controlled (non-randomized) clinical trials (CCTs) or cluster trials, interrupted time series (ITS) studies with at least three data points before and after the intervention [32], controlled before-after (CBA) studies, prospective and retrospective comparative cohort studies, and case-control or nested case-control studies.
Cluster randomized, cluster non-randomized, or CBA studies will be included only if there are at least two intervention sites and two control sites [32]. We will exclude cross-sectional studies, case series, and case reports.

Participants
We will include studies examining the general adult human population or healthy adult humans (18 years or older). We will also include studies on people who are overweight or obese, but will otherwise exclude studies of populations restricted to specific diseases, conditions, or metabolic disorders. We will include studies addressing both adults and children if data provided for adults are reported separately.

Interventions
Of interest are interventions addressing SSB consumption, taking a broad perspective. In addition to direct consumption studies, we would consider interventions that influence consumption, such as those addressing the level of access to SSBs (e.g. university/college policy) and educational interventions addressing consumption as relevant. Non-specific or multi-faceted behavioural, educational, or policy interventions may also be included subject to the level of evidence that exists for the aforementioned interventions/exposures. We will also consider other types of interventions on a case-by-case basis, subject to what exists in the literature.
In terms of defining an SSB, we view SSBs as akin to a complex intervention because they are composed of several parts. For example, in addition to sugar, some beverages contain caffeine and the by-products of caramel colouring (2-methylimidazole, 4-methylimidazole), which may contribute independently to adverse health outcomes. The scope of the review, therefore, warrants an examination of SSB consumption as a whole, rather than the specific constituents as exposure variables. Evaluating the constituents individually would have required the inclusion of studies addressing those constituents in foods and drinks other than SSBs as well.
We will use the Centers for Disease Control and Prevention (CDC) definition of SSB to determine which drinks should be included. According to the CDC, SSBs contain added caloric sweeteners [33], which would include natural sweeteners such as honey and concentrated fruit juice. We have developed a classification scheme based on the CDC definition for use during the review (see classification scheme for SSBs below). For beverages such as coffee, tea, and homemade lemonade, studies will be included in the review if they explicitly state that sugar was added. We will exclude artificially sweetened (e.g. with aspartame or sucralose) beverages, alcoholic beverages, and 100% fruit or vegetable juices as exposures/interventions. We will classify SSBs described in studies according to the following broad categories:
- Sodas, caffeinated/non-caffeinated (soft drinks, soda, pop, soda pop)
- Other non-carbonated sweetened beverages, caffeinated/non-caffeinated (fruitades, fruit drinks, fruit punches, [iced] teas, coffees, non-dairy fruit smoothies)
- Fortified sweetened beverages, caffeinated/non-caffeinated and containing vitamins, amino acids, herbal stimulants, or other ingredients (energy drinks, fortified waters, sports drinks)
- Flavored/sweetened milk or milk alternative beverages, caffeinated/non-caffeinated (dairy, soy, almond, milkshakes, dairy-based fruit smoothies)

Comparators
Given the broad perspective for interventions of interest, several comparisons will be relevant to include. Some may be more likely to come from observational designs and others from experimental studies.
Direct consumption studies: For comparator groups 2 and 3, we anticipate that volume will be the most feasible to analyse; however, we will extract all measures in which consumption is reported (e.g. volume, caloric intake from sugar) in studies to see what analysis is possible.
For feasibility, category 6 comparisons (non-specific, multi-faceted interventions) will be coded at title/abstract screening and not put through to full-text screening. If sparse evidence exists in the other potential comparison types, we will revisit eligibility for comparison 6.

Outcomes
Endpoints important for decision-making are of primary interest. If reported on, these will be analysed and graded. If a given clinical endpoint is not reported on, we will analyse and grade its relevant surrogate outcome(s).
Endpoints important for decision-making:
- Adverse cardiovascular (including cerebrovascular) events
- Cancer (excluding basal cell and squamous cell carcinoma)
- Chronic kidney disease
- Mortality
- Overweight/obesity
- Type 2 diabetes
- Dental caries
- Quality of life (generic, validated tools only, such as those in Additional file 2)
- Gout
Surrogate outcomes:
- Pre-diabetes
- Metabolic syndrome
- Change in cardiovascular disease (CVD) risk
- Progression of obesity
- Dyslipidemia
- Hypertension
As some outcomes may be reported as a composite measure, we will extract all composite and individual outcomes as reported in the studies.
Outcomes will be collected as reported, with the exception of quality of life, which will be collected only if assessed with generic (not disease-specific), validated tools. Due to possible variation in disease definitions over time, we will extract definitions of outcomes as reported in individual studies. We will extract outcomes in all data forms (e.g. dichotomous, continuous) as reported in the included studies.

Timing
Studies will be selected for inclusion based on the length of follow-up of outcomes. The following will be used as a guide for all study designs: For all decision-making endpoint outcomes, studies should have a follow-up time of at least 1 year. For all surrogate outcomes, studies should be at least 6 months duration for follow-up. For cancer, studies should be at least 1 year duration for follow-up. Some types of cancer may need longer than a 1-year follow-up, but this will be evaluated on a case-by-case basis.

Setting
There will be no restrictions by type of setting.

Language
We will include articles reported in the English and French languages. A list of possibly relevant titles in other languages will be provided as an appendix.

Electronic searches
A comprehensive literature search using high-recall subject searches will be conducted in MEDLINE®, Embase, CINAHL, PsycINFO®, ERIC, and The Cochrane Library. Electronic search strategies (Additional file 3), guided by an experienced information specialist, have been developed and peer-reviewed according to the peer review of electronic search strategies (PRESS) guidelines [34]. The search will not be restricted for time period or the language of publication. We will exclude comments, letters, and editorials.

Other sources
Grey literature sources, such as websites listed within the Canadian Agency for Drugs and Technologies in Health's (CADTH) Grey Matters checklist, the Federation of American Societies for Experimental Biology (FASEB) meeting abstracts, the Obesity Society abstracts, the Food and Nutrition Conference and Expo (FNCE) abstracts, the International Diabetes Federation website, the Centers for Disease Control and Prevention website, the Rudd Center for Food Policy and Obesity website, the Obesity and Energetics Offerings website, the American Heart Association, the European Union, the Center for Science in the Public Interest, the 20th International Congress of Nutrition, the American Beverage Association, and Refreshments Canada will be searched. We will consult within the research team for relevant studies that may not be in the published literature. We will scan the references of included studies and relevant reviews for additional articles and perform a forward search on key articles. We will search ClinicalTrials.gov, the WHO International Clinical Trials Registry Platform, and the International Association for the Study of Obesity (IASO) for completed and ongoing studies.

Study selection process
Literature search results will be de-duplicated in Reference Manager [35] before uploading to Distiller Systematic Review Software® (Distiller SR), an online program that facilitates screening and data extraction [36]. Screening questions will be developed and pilot-tested with a subset of records before implementation. All titles and abstracts of records will be screened by one person; those deemed not relevant will be verified by a second person for exclusion. Full-text reports for all potentially relevant records and those without an available abstract will be screened by two independent reviewers. Discrepancies will be resolved by consensus or a third person. The study selection process will be reported using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram [37], including reasons for excluding full-text articles.

Data collection and analysis

Data extraction and management
Feedback will be solicited from the research team on the draft list of data variables for extraction. Data extraction forms will be developed and pilot-tested in Distiller SR. One person will extract all information. A second person will verify 20% of studies for general characteristics information and 100% of studies regarding outcomes data. Disagreements will be resolved by consensus or by a third team member, if needed. Information on the descriptive and quantitative characteristics of studies will include the following:
- Publication details (e.g. year of publication, language, publication status)
- Characteristics of study (e.g. study design, methods, country, setting, sample size, number of centres [if applicable], duration of follow-up, source of funding)
- Characteristics of population (e.g. age, gender, ethnicity, co-interventions, information regarding respondent bias/representativeness of the included population)
- Details about the exposure/intervention (e.g. type and category of SSB [see classification scheme for SSBs], brand name, amount of sugar/serving, type of added sugar, amount of caffeine, other specific ingredients and their quantification, frequency of use, amount consumed in millilitres, SSB caloric content, percent of total calories obtained from SSB consumption, method of assessing SSB consumption; type of educational or other interventions and description, type of professional delivering intervention)
- Details about comparator group (e.g. for beverages: identity, frequency of use, amount used, brand name, specific ingredients)
- Outcomes of interest for the longest duration of follow-up (definitions, measurement methods, data, adjusted and unadjusted effect estimates)
- Confounding factors that were taken into consideration
- Risk of bias items

Assessing the risk of bias
The risk of bias for each included study will be assessed by one member of the research team and verified by a second member. Disagreements will be resolved by consensus or by a third team member, if needed. Assessment tool questions were reorganized (as needed) to ensure that domains relating to selection, confounding (where applicable), performance, attrition, detection, reporting, and 'other' biases were addressed in all tools. A modified version of the Cochrane Risk of Bias tool will be used to evaluate RCTs (Additional file 4) [38]. In addition to the standard domains of bias, we will assess cluster trials for the possibility of recruitment bias [39]. A modified version of the Academy of Nutrition and Dietetics' Evidence Analysis Library (EAL) Quality Criteria Checklist will be used to evaluate observational studies and CCTs (Additional file 4) [40]. We removed questions pertaining solely to reporting characteristics, and we added a few other relevant questions. To evaluate ITS and CBA studies, a modified Cochrane Effective Practice and Organisation of Care (EPOC) tool will be used (Additional file 4) [41]. Study sponsorship will be assessed for all studies. Each domain within a tool will be judged as unclear, low, or high risk of bias, with supporting information provided from the report or reviewer interpretation to rationalize the judgement of bias [38]. Some domains are outcome-specific and will be assessed at the outcome level. The risk of bias for outcomes will be factored into grading the quality of evidence. The overall risk of bias for the body of evidence will involve a judgement of the relative importance of domains, guided by known empirical evidence of bias, the likely direction of bias, and the likely magnitude of bias [38]. We will follow the Grading of Recommendations Assessment, Development and Evaluation (GRADE) guidance for determining the extent of the risk of bias for the body of evidence [42].
Step 1: Assessment of risk of bias for individual studies for a given outcome
- Low risk of bias: when all key domains are at a low risk of bias
- Moderate risk of bias: crucial limitation for one domain or some limitations for multiple domains sufficient to lower confidence in the effect estimate
- High risk of bias: crucial limitation for one or more domains sufficient to substantially lower confidence in the effect estimate
Step 2: Assessment of risk of bias for the body of evidence across studies for a given outcome (incorporated into GRADE assessments as one of the required domains)
- No serious limitation, do not downgrade: body of evidence mostly from studies at a low risk of bias
- Serious limitations, downgrade one level: body of evidence mostly from studies at moderate risk of bias
- Very serious limitations, downgrade two levels: body of evidence mostly from studies at high risk of bias
Regarding confounding bias, causal DAGs will be used [31]. Causal DAGs are graphical models used in epidemiology to determine sets of confounders that should be adjusted for to obtain unbiased effect estimates. They also identify biasing paths associated with selection bias. By adjusting for only those confounders that are derived from a causal DAG, the potential for over-adjustment or the creation of selection bias by conditioning on colliders (i.e. variables that are common effects of both the exposure and outcome) is reduced, provided that the causal DAG is correct [43,44]. We will use the minimal sufficient adjustment sets generated from the causal DAGs as a guide to determine if studies have adequately accounted for confounding variables or created selection bias by over-adjustment of imbalances between exposure and control groups. For a given outcome, if studies differentially account for variables from the minimal sufficient adjustment sets, we may not pool those studies in a meta-analysis. Draft causal DAGs (Additional file 5) have been developed using the DAGitty program [45]. The causal DAGs may be modified to include important and justifiable variables we encounter when reviewing included studies.
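As a rough illustration of how a causal DAG separates variables that should be adjusted for from those that should not, the sketch below classifies nodes in a toy graph. The graph, the variable names (`age`, `caloric_intake`, `hospital_visit`), and the function names are hypothetical and are not taken from the draft DAGs in Additional file 5. The rules are a deliberate simplification and do not reproduce the algorithm DAGitty uses to derive minimal sufficient adjustment sets.

```python
from collections import defaultdict


def reachable(graph, start, blocked=frozenset()):
    """Return every node reachable from `start` along the edges in `graph`,
    never stepping onto a node listed in `blocked`."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify_nodes(edges, exposure, outcome):
    """Classify nodes of a DAG given as (cause, effect) pairs.

    Simplified rules (an assumption of this sketch):
    - confounder: causes the exposure and also causes the outcome
      along a path that does not pass through the exposure
    - mediator: lies on a directed path from exposure to outcome
    - common effect: caused by both exposure and outcome (a collider)
    """
    children = defaultdict(set)
    parents = defaultdict(set)
    for cause, effect in edges:
        children[cause].add(effect)
        parents[effect].add(cause)

    desc_exposure = reachable(children, exposure)
    desc_outcome = reachable(children, outcome)
    anc_exposure = reachable(parents, exposure)
    anc_outcome = reachable(parents, outcome)
    anc_outcome_not_via_exposure = reachable(parents, outcome, blocked={exposure})

    return {
        "confounders": anc_exposure & anc_outcome_not_via_exposure,
        "mediators": desc_exposure & anc_outcome,
        "common_effects": desc_exposure & desc_outcome,
    }


# Hypothetical toy DAG, for illustration only.
toy_edges = [
    ("age", "ssb"),
    ("age", "type2_diabetes"),
    ("ssb", "caloric_intake"),
    ("caloric_intake", "type2_diabetes"),
    ("ssb", "type2_diabetes"),
    ("ssb", "hospital_visit"),
    ("type2_diabetes", "hospital_visit"),
]
roles = classify_nodes(toy_edges, exposure="ssb", outcome="type2_diabetes")
```

In this toy graph, adjusting for `age` would address confounding, whereas adjusting for `caloric_intake` (a mediator) or `hospital_visit` (a common effect) would risk over-adjustment or selection bias.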

Dealing with missing data
If information or data are missing or incomplete, we will attempt to contact the study authors twice over 2 weeks by email. If feasible, we will incorporate loss-to-follow-up data. We will not impute effect estimates, but will impute missing standard deviations or standard errors using data from other similar studies in the review, using an approach suggested in the literature [46].

Data analyses
For dichotomous outcomes, the risk ratio or odds ratio and 95% confidence intervals will be used for pooling. For continuous outcomes, mean differences and 95% confidence intervals will be used for pooling for outcomes reported on the same scales or measured in the same units. Standardized mean differences will be used for pooling where continuous outcomes are reported using different scales or measures. Transformation of data to allow analyses with mean differences will be made, wherever possible. All formats of continuous outcome data will be extracted whether reported as post-intervention or change from baseline. We will consider using the r value, a correlation coefficient, when some studies analyse an outcome as dichotomous and others analyse it as continuous. Statistics from individual studies will be converted to an r value before meta-analysing [47,48]. The r value can be roughly interpreted as a small (r = 0.1), medium (r = 0.3), or large (r = 0.5) effect size. We plan to transform the pooled r to another statistic, such as an odds ratio, to aid in interpretation [47]. For time-to-event data, the hazard ratio, which is usually estimated from a Cox proportional hazards model, will be pooled using the generic inverse variance method [49]. Comprehensive Meta-Analysis will be used for meta-analyses [50]. SAS software will be used for meta-regression analyses [51].
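The conversions to and from r can be sketched as follows. This assumes the logistic-distribution approximation that links a log odds ratio to a standardized mean difference d, and the equal-group-size form of the d-to-r conversion (a = 4). The protocol cites [47,48] for its approach but does not spell out the formulas, so these choices, the function names, and the example odds ratio of 1.5 are illustrative assumptions.

```python
import math


def log_odds_ratio_to_d(log_or):
    """Convert a log odds ratio to a standardized mean difference d."""
    return log_or * math.sqrt(3) / math.pi


def d_to_r(d, a=4.0):
    """Convert d to a correlation r; a = 4 assumes equal group sizes."""
    return d / math.sqrt(d * d + a)


def r_to_d(r):
    """Convert a correlation r back to d (equal group sizes assumed)."""
    return 2 * r / math.sqrt(1 - r * r)


def d_to_log_odds_ratio(d):
    """Convert d back to a log odds ratio."""
    return d * math.pi / math.sqrt(3)


# Hypothetical example: an odds ratio of 1.5 expressed as r, then converted back.
r_value = d_to_r(log_odds_ratio_to_d(math.log(1.5)))
odds_ratio_again = math.exp(d_to_log_odds_ratio(r_to_d(r_value)))
```

With equal group sizes the round trip is exact, so any pooled r can be reported as an odds ratio without loss beyond the approximation already made in the first conversion.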

Other statistical considerations
Sparse binary data and studies with zero events: When studies report rare events, narrative synthesis will suffice. When event rates are less than 1%, the Peto odds ratio method will be used. However, when control groups are of unequal sizes, when large magnitude of effect is observed, or when events become more frequent (5%-10%), the Mantel-Haenszel method without correction factor will be employed for quantitative synthesis [49].
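For orientation, a minimal sketch of the Peto one-step odds ratio is given below. It uses the usual observed-minus-expected formulation with hypergeometric variances; the function name and the input layout are assumptions made for illustration, and the actual analyses are planned in Comprehensive Meta-Analysis.

```python
import math


def peto_pooled_odds_ratio(studies):
    """Pool 2x2 tables with the Peto one-step method.

    `studies` is an iterable of tuples
    (events_exposed, n_exposed, events_comparison, n_comparison).
    Returns (odds_ratio, ci_lower, ci_upper) with a 95% interval.
    """
    sum_o_minus_e = 0.0
    sum_variance = 0.0
    for events_exp, n_exp, events_cmp, n_cmp in studies:
        total_n = n_exp + n_cmp
        total_events = events_exp + events_cmp
        # Expected events in the exposed group under no effect.
        expected = n_exp * total_events / total_n
        # Hypergeometric variance; zero when a study has no events at all,
        # so such studies contribute nothing to the pooled estimate.
        variance = (n_exp * n_cmp * total_events * (total_n - total_events)
                    / (total_n ** 2 * (total_n - 1)))
        sum_o_minus_e += events_exp - expected
        sum_variance += variance
    if sum_variance == 0:
        raise ValueError("no informative studies; Peto odds ratio is undefined")
    log_or = sum_o_minus_e / sum_variance
    se = 1 / math.sqrt(sum_variance)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))
```

A study with zero events in both groups has zero variance and drops out of the sums, which is why no continuity correction is needed with this method.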
Data conversions: Where needed, we will convert data (e.g. standard error to standard deviation) for use in analyses and to facilitate consistent presentation of results across studies.
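Typical conversions of this kind can be sketched as below. The helpers assume a normal approximation for 95% confidence intervals (small samples may call for a t-based multiplier instead), and their names are illustrative. The example reuses the risk ratio for type 2 diabetes quoted in the Background (RR = 1.26, 95% CI: 1.12-1.41 [13]).

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def sd_from_se(se, n):
    """Standard deviation of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of a mean (or mean difference) from its confidence limits."""
    return (upper - lower) / (2 * z)


def log_ratio_and_se(estimate, lower, upper, z=Z_95):
    """Log of a ratio measure (RR, OR, HR) and its standard error,
    recovered from the reported confidence limits."""
    log_estimate = math.log(estimate)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_estimate, se


# Risk ratio for type 2 diabetes quoted in the Background [13].
log_rr, se_log_rr = log_ratio_and_se(1.26, 1.12, 1.41)
```

The log estimate and its standard error are the inputs that generic inverse variance pooling needs for ratio measures.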
Interrupted time series designs: If interrupted time series studies are included, we will re-analyse data, where needed and if feasible, for change in level and slope according to time series regression analyses [52].
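A change in level and slope can be illustrated with a simple least-squares sketch. Fitting separate straight lines to the pre- and post-intervention segments gives the same point estimates as a single ordinary least-squares segmented regression with level-change and slope-change terms. The sketch ignores autocorrelation and seasonality and yields no standard errors, so it is not a substitute for the time series regression methods cited in [52]. The function names and the rule for where the level change is evaluated (at the first post-intervention time point) are assumptions.

```python
def fit_line(times, values):
    """Ordinary least-squares intercept and slope for one segment."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


def level_and_slope_change(times, values, first_post_time):
    """Estimate change in level and slope at an intervention.

    Observations with time < first_post_time form the pre segment.
    Level change is the gap between the post-segment line and the
    extrapolated pre-segment line, evaluated at first_post_time.
    """
    pre = [(t, y) for t, y in zip(times, values) if t < first_post_time]
    post = [(t, y) for t, y in zip(times, values) if t >= first_post_time]
    # Eligibility rule from the protocol: at least three points each side.
    if len(pre) < 3 or len(post) < 3:
        raise ValueError("need at least three data points before and after")
    pre_intercept, pre_slope = fit_line(*zip(*pre))
    post_intercept, post_slope = fit_line(*zip(*post))
    level_change = ((post_intercept + post_slope * first_post_time)
                    - (pre_intercept + pre_slope * first_post_time))
    return level_change, post_slope - pre_slope
```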

Evidence synthesis
Study characteristics will be summarized narratively in the text and shown in summary tables in the report. Before meta-analyses are performed, studies will be assessed for heterogeneity on clinical and methodological characteristics; we plan to review these decisions with the research team before conducting analyses. For outcomes that can be measured on various scales (e.g. quality of life), heterogeneity of outcome measurements will also be assessed before pooling. With sufficient homogeneity and quantity of data, we will pool studies using standard random effects meta-analytic methods [53,54]. Any meta-analyses will be done separately for observational studies and experimental ones. Narrative synthesis of data will be conducted when quantitative pooling is considered inappropriate (team decision based on the aforementioned issues, such as disparate clinical characteristics of included patients/participants). When important clinical or methodological heterogeneity precludes pooling, we may still present forest plots without a pooled summary estimate to show individual study effects. Effect estimates from observational studies at a high risk of bias may be excluded from the evidence synthesis when their findings are inconsistent with studies at a lower risk of bias. If only observational studies at a high risk of bias exist for a given outcome, we will not synthesize those studies because they are unlikely to inform about causality.
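As a sketch of what random effects pooling involves, the function below implements the DerSimonian-Laird moment estimator on effect estimates and standard errors (for ratio measures, on the log scale). The protocol refers only to standard random effects methods [53,54], so the choice of this particular estimator is an assumption, as are the function and variable names.

```python
import math


def random_effects_pool(effects, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled_effect, pooled_standard_error, tau_squared).
    """
    weights = [1 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1 / sum_re), tau2
```

When the studies agree closely, the between-study variance is estimated as zero and the result coincides with fixed-effect inverse variance pooling.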

Statistical heterogeneity and effect modifiers
Statistical heterogeneity will be assessed using Cochran's Q (considered statistically significant at p < 0.10) and I-squared statistics. For the interpretation of I-squared, a rough guide of low (0%-25%), moderate (25%-50%), substantial (50%-75%), and considerable (75%-100%) heterogeneity will be used [55,56]. When a body of evidence is determined to be statistically heterogeneous, we plan to explore the impact of moderating factors using a combination of subgroup analysis and meta-regression techniques, where the optimal approach for each variable will be determined once we see how data are reported in studies. Some variables (*) will be investigated as potential effect modifiers regardless of heterogeneity tests. The distribution of several aspects of patient demographics, exposure, and other characteristics will be of interest and include the following:
- Participant age*
- Participants who are post-menopausal women
- Sex*
- Ethnicity*
- Region (to account for culture and lifestyle)
- Overweight/obesity*
- CVD risk group*
- SSB category*
- Amount of SSB consumption*
- Type of comparator*
- Caffeine-containing drinks*
- Outcome definition
- Study design
- Duration of study
- Single versus multi-centre studies
- Risk of bias assessments
- Covariate adjustment
- Funding
- Adjustment for total caloric intake*
We will follow previously published guidance for meta-regression [49]. Meta-regression will be based on the random effects model to allow for residual unexplained heterogeneity. A p value <0.10 will characterize statistical significance. When the sizes of the included studies are moderate or large, there should be at least 10 studies for a continuous study-level variable. For a categorical subgroup variable, each subgroup should have a minimum of four studies. These numbers serve as the lower bounds for considering meta-regression [49]. When included studies are mostly small in size, univariable meta-regression will be used when an insufficient number of studies are available to conduct multivariable analyses.
Regarding caloric intake, if highly correlated with SSB consumption, it will be difficult to decipher the independent effect of SSB consumption. If caloric intake lies in the causal pathway, then adjusting for it would eliminate any true associations between SSB consumption and outcomes. However, a counter-argument could be made that failure to adjust for caloric intake would produce spurious positive associations. Due to this uncertainty, we reasoned that assessing effect estimates for this variable in a subgroup analysis was the most appropriate; studies will not be penalized for adjusting or not adjusting in assessments of confounding bias.
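The heterogeneity statistics named at the start of this section can be computed as in the sketch below. The Q statistic and I-squared follow their usual definitions. The labelling function applies the rough guide from the protocol, and because the stated bands share their end points, assigning a boundary value such as 25% to the higher band is an assumption of this sketch. The p value for Q (compared against a chi-squared distribution with k - 1 degrees of freedom) is left out because it needs a statistics library.

```python
def heterogeneity(effects, std_errors):
    """Return (Q, degrees_of_freedom, I_squared_percent)."""
    weights = [1 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond what chance would explain.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


def describe_i_squared(i2):
    """Label I-squared using the rough guide in the protocol
    (boundary values are assigned to the higher band here)."""
    if i2 < 25:
        return "low"
    if i2 < 50:
        return "moderate"
    if i2 < 75:
        return "substantial"
    return "considerable"
```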

Sensitivity analysis
Sensitivity analyses may be used to restrict analyses to studies at low risk of bias and to examine the effect of any decisions made regarding data handling.

Assessing for small study effects
We will investigate small study effects by the performance of cumulative meta-analysis (studies ordered and synthesized from the most to the least precise) and/or other graphical or statistical techniques if the following criteria are met: there are at least 10 studies available, studies are of unequal sizes, there are no substantial clinical and methodological differences between smaller and larger studies, and quantitative results are accompanied with measures of dispersion [57][58][59][60].
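A cumulative meta-analysis of this kind can be sketched as follows: studies are sorted from the most to the least precise (smallest standard error first) and re-pooled each time a study is added. The sketch uses fixed-effect inverse variance pooling for simplicity, which is an assumption because the protocol does not state the model. It also leaves out the eligibility checks listed above, such as the minimum of 10 studies. The function name and the tuple layout are illustrative.

```python
import math


def cumulative_meta_analysis(studies):
    """Cumulative inverse-variance pooling, most precise study first.

    `studies` is a list of (label, effect, std_error) tuples.
    Returns a list of (label_added, pooled_effect, pooled_std_error),
    one entry per step.
    """
    ordered = sorted(studies, key=lambda study: study[2])
    results = []
    sum_w = 0.0
    sum_wy = 0.0
    for label, effect, se in ordered:
        weight = 1 / se ** 2
        sum_w += weight
        sum_wy += weight * effect
        results.append((label, sum_wy / sum_w, math.sqrt(1 / sum_w)))
    return results
```

If the pooled estimate drifts steadily in one direction as the less precise studies enter, that pattern is consistent with a small study effect.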

Grading the quality of evidence and interpretation
We will use the GRADE approach to evaluate the quality of evidence for outcomes [42]. Quality of evidence is the level of confidence for a causal inference that authors place in the estimate of effect for an outcome (i.e. their judgement that the evidence reflects the true effect). As stated previously, if surrogate measures are analysed in lieu of decision-making endpoints, these will be graded.
When grading the evidence, reviewers will evaluate the domains of study limitations (risk of bias), inconsistency, indirectness, imprecision, and publication bias and downgrade where important limitations exist. Studies may also be upgraded based on a strong magnitude of effect that is not due to known biases, dose-response gradient, and residual confounding that would have reduced the observed effect. The overall quality of evidence grade will be designated as high (confident the true effect lies close to that of the estimate), moderate (moderately confident in the effect estimate, but the true effect may be substantially different), low (confidence in the effect estimate is limited), or very low (very little confidence in the effect estimate) [42]. The results will be discussed in light of the strength of findings as well as their implications for research and public health.

Quality assurance
We used the PRISMA for Protocols (PRISMA-P) checklist for reporting this protocol [61]. This review will be reported according to the PRISMA statement [37] and using a Measurement Tool to Assess the Methodological Quality of Systematic Reviews (AMSTAR) tool for additional quality control [62]. This protocol does not update any previously conducted systematic review. Any amendments made to this protocol when conducting the review will be outlined in the review's manuscript.