Low-calorie sweeteners and health outcomes: an evaluation of rapid versus traditional evidence mapping

Scientific evidence related to environmental exposures continues to mount. Tools such as evidence mapping support decision making, but can be resource- and time-intensive. We explored “rapid evidence mapping” to efficiently map scientific evidence using rigorous and transparent methodologies. We undertook a proof-of-concept case study on the topic of low-calorie sweeteners. Our intent was to conduct a traditional evidence map based on the same evidence base from a prior rapid evidence map case study to compare approaches, findings, and conclusions. We searched the literature, screened full text of studies, manually tagged and categorized articles, and created visualizations to map the evidence. We conducted full-text screening of studies from the prior rapid evidence map and identified 255 relevant studies. Our findings corroborated those of the rapid evidence map, identifying most studies as short-term conducted in healthy individuals studying outcomes of appetite, energy sensing and body weight. We identified gaps in research areas related to outcomes of appetite and dietary intake, particularly in study populations with diabetes. Our findings illustrate the promise of rapid evidence mapping as a rigorous approach that can summarize scientific evidence, identify knowledge gaps, and identify areas for a future systematic review in a time-efficient manner.


Introduction
A recurring challenge in the field of environmental health is for stakeholders to efficiently summarize the everexpanding scientific evidence and come to defendable scientific consensus to support decision-making [1][2][3]. Tools such as "evidence mapping" are frequently being used to identify the key areas of study along with important gaps in the literature [4]. These are gaining traction for its role informing risk management and policy decisions, and as a problem formulation tool to refine the focus of questions addressed in full systematic review [5].
We previously developed "rapid Evidence Mapping" (rEM), a knowledge synthesis approach where the evidence mapping review process is simplified in a resourceefficient manner [6]. We applied this to a case study on the topic of human dietary low-calorie sweeteners (LCS) exposures and select health outcomes [6]. However, in contrast to a more "traditional" evidence map approach, the rEM was based solely on title-and-abstract screening of references and relying on semi-automated machine learning to categorize and tag references. Here, our main goal is to compare the rEM to traditional evidence mapping, by creating a traditional evidence map on the same evidence base used to generate the rEM and comparing the results as well as the time and resource commitments required for each.

Methods
To create the traditional evidence map, we started with the results from an rEM of human exposures to LCS published by Lam et al. [6] and applied the traditional evidence mapping approach (Additional file 1: Fig. S1). Our approach was outlined beforehand in our publicly available protocol [1].

Lam et al. [6] rEM approach
Lam et al. [6] searched PubMed using the keywords and MeSH terms provided in the Additional file 5, with date limit January 1, 1946-May 1, 2014. We utilized the same evidence base for direct comparison and did not update the search to identify newer studies. Lam et al. [6] defined the following inclusion/exclusion criteria (Additional file 2: Table S1): (1) randomized or non-randomized, controlled, clinical trials or prospective cohort study designs; (2) orally administered, FDA-approved or generally recognized as safe (GRAS) LCS; (3) at least one health outcome within five specified categories (Additional file 3: Table S2); (4) published in English; and (5) human subjects.
Lam et al. [6] imported all included titles and abstracts into SWIFT-Review (Sciome Workbench of Interactive computer-Facilitated Text-mining), a freely available text mining and machine learning software application (https:// www. sciome. com/ swift-review/). Software functionalities were used to automatically classify outcome, baseline health, and comparison categories [11] from reference abstracts, whereas manual review was conducted to extract study length and sample size categories.

Traditional evidence map approach
We started with all references included from Lam et al. [6] following title and abstract screening and conducted full text screening, applying the same inclusion/exclusion criteria. Screening was conducted in duplicate by two independent reviewers (JL, RE, LA, or DB) in SWIFT-Active Screener.
The following data were manually extracted from the full text of included studies: • Study design characteristics (study design, study duration) • Study population characteristics (baseline health status, age, sample size, anthropometrics) • Study interventions/exposures and comparisons (type of LCS, comparison or control group) • Number of people analyzed, form of administration • Outcome information (all outcomes or endpoints of interest) • Aim or hypothesis and funding source of the study Data was extracted by a single reviewer (JL, RE, LA, or DB), with an independent second reviewer reviewing and verifying the data. A pilot test of 10 studies was conducted to calibrate screeners and resolve any confusion interpreting directions. Extracted data were limited to information available in the published manuscripts; additional details were not sought from study authors.
The same frequency tables and bubble plot visualizations generated by Lam et al. [6] were ultimately created based on the full-text extracted data (Additional file 1: Fig. S1).

Results
Lam et al. 's [6] original search retrieved 8122 unique records and ultimately included 301 references (Fig. 1). In our update, full-text screening of the 301 studies resulted in the ultimate inclusion of 255 references (Fig. 1). The following sections summarize the descriptive analyses of the data and compare the results of the rEM versus traditional evidence map.

Study characteristics and design
Lam et al. [6] reported that the majority of the 301 included studies were: (a) conducted in subjects where health status was healthy, mixed or other (83%); (b) interventions comparing LCS versus sugar (50%); and c) acute in duration lasting < 1 day (58%) ( Table 1). Our conclusions based on the 255 full text screened and data extracted references were identical, although the proportion of studies falling in each category was slightly different-90%, 72%, and 31% respectively (Table 1).
The majority of our included studies (61%) reported sample size ranging between 11 and 50 human participants. This aligned with findings from Lam et al. [6], although the proportion was smaller (39%). Few studies were reported to have larger sample sizes, for instance between 101 and 200 participants (we reported 5% and Lam et al. [6] 4%). One major difference was that Lam et al. [6] identified a large number of studies (36%) whose sample size was categorized as "unclear" whereas we only used this categorization for 3% of studies.

An evidence map to identify research gaps
We generated an evidence map consisting of frequency tables and bubble plots displaying the categorical features of included studies. Our results illustrate a consistent pattern with greater number of studies investigating comparisons of LCS versus sugars across all outcome categories. This difference was consistent, with the greatest difference in the body weight/composition outcome with 83% of studies comparing LCS vs. Sugars. The glycemic outcome demonstrated the lowest difference, with 70% of studies comparing LCS vs. Sugars. Lam et al. [6] reported a similar pattern, with the single exception of glycemic outcomes where the majority (56%) of studies categorized as LCS versus Others instead of LCS versus Sugars (44%).
We also categorized studies by study and outcome group (Additional file 4: Table S3). Our results illustrate that across four of the five outcome categories, the majority of studies were acute duration (either < 30 days). Very few studies were long-term (> 6 months). This was consistent with findings reported in Lam et al. [6].
The third component of the evidence map is a bubble plot (Fig. 2), where data points are grouped and plotted according to study population baseline health status and outcome category, stratified by intervention. Each data point represents a single study, and is randomly scattered in each grid (i.e., bubble position is meaningless). The size of each bubble indicates the sample size of the corresponding study, with larger bubbles representing larger study sample size (Lam et al. [6]). Figure 2 illustrates that the majority of included studies utilize either generally healthy or mixed/other study populations compared to overweight and diabetic population types across all outcome categories. Most studies with diabetic participants report outcomes of either glycemic or body weight/composition. In contrast, there is a general lack of studies assessing appetite outcomes in people with diabetes, studies using LCS versus Sugars interventions to assess energy sensing and glycemic outcomes in overweight people, and studies comparing interventions LCS versus Others to assess energy sensing outcomes in both overweight and diabetic participants. These findings mirrored those reported by Lam et al. [6].

Discussion
This case study was intended to explore the question of whether a rapid evidence map would produce similar conclusions based on a more traditional evidence map, while simultaneously providing an estimate of differences in time and resources required. To our knowledge, this is the first formal evaluation of rEM approaches compared to traditional evidence mapping. Our results indicate that rEM could be of great utility, particularly in the context of designing or implementing a systematic review. As evidence mapping gains visibility fields for their role in informing decision-making and risk management, our results outline promise for rEM approaches to play a significant role in increasing the efficiency of the policymaking process.
Our traditional evidence map initiated with the full text screening of references. An additional 46 studies (15%) were excluded at this step. This was not surprising-reviewing the full text provides additional details for reviewers to determine relevance. This illustrates  one limitation of the rEM approach that an evidence map generated based on solely titles and abstracts of records will almost certainly include irrelevant references. Our overall conclusions from the traditional evidence map corroborated those reported by Lam et al. [6]. Although there were minor discrepancies comparing our results with Lam et al. [6], the conclusions regarding the overarching question of this work as to where current research exists and where it is lacking was remarkably consistent. In general, most discrepancies were likely due to the greater precision from manually extracting data from the full text versus only the title and abstract. Furthermore, the utilization of the SWIFT-Review automated functionality to summarize and visualize data from the included abstracts by Lam et al. [6] further exacerbated the imprecision of data extracted from studies, combining issues with incomplete information included in the study abstracts versus full text along with the limitations of the automated software as compared to manual review and data extraction.
We estimated that the traditional evidence map required an additional 65 person-hours over the rEM approach. Although this additional required time may seem minimal, taken in the context of the relative size of the evidence base (i.e., n = 255) indicates that for a larger evidence base the potential time savings from implementing an rEM over a traditional evidence map could be significant.

Conclusions
Alternative approaches to evidence mapping such as rEM are a promising approach to identify, collect, and evaluate scientific evidence that reduces the required time, effort, and resources. From our direct comparison to a traditional evidence map, we identified strong concordance with the ultimate conclusions when utilizing rEM. Since the main goal for evidence mapping is to quickly identify research gaps as well as opportunities for systematic review, an rEM may be able to do so in a more efficient manner that requires less time and resources.

Limitations
Our traditional evidence map has several limitations, most of which could be readily addressed with increased time and resources. First, we only utilized references retrieved from searching one database (PubMed). By not including additional databases or the grey literature, we risk missing relevant studies. A broader literature search could have potentially retrieved a higher number of relevant studies and altered our final evidence map and conclusions.
We also utilized single-screening for records beyond the first 500 records (i.e., only one screener reviewing each reference as opposed to utilizing two independent reviewers). This reduced the time and resources required for screening, but introduces the possibility that a single screener could have erroneously excluded a relevant study. This limitation could be addressed by employing duplicate screening of references; however, this would have doubled the required time and resources.
Lastly, evidence mapping does not include the evaluation of study quality or risk of bias (i.e., internal validity). Thus, it is unknown whether the included studies are of high or poor quality. For instance, in categories where there was a high volume of included studies, it is possible that these studies are of variable quality. This also hinders limits our ability to state conclusions regarding the health effects of low calorie sweeteners; as expected, with an evidence map the end result is a visual depiction of where evidence exists and where it is lacking, not what conclusions one might ultimately draw regarding the body of scientific evidence. This could potentially be addressed by evaluating each included study using one of several available validated quality assessment tool; however, this would again require additional time and resources to complete this step.