A Real-World Prospective Study of the Safety and Effectiveness of the Loop Open Source Automated Insulin Delivery System

Objective: To evaluate the safety and effectiveness of the Loop Do-It-Yourself automated insulin delivery system. Research Design and Methods: A prospective real-world observational study was conducted, which included 558 adults and children (age range 1–71 years, mean HbA1c 6.8% ± 1.0%) who initiated Loop either on their own or with community-developed resources and provided data for 6 months. Results: Mean time-in-range 70–180 mg/dL (TIR) increased from 67% ± 16% at baseline (before starting Loop) to 73% ± 13% during the 6 months (mean change from baseline 6.6%, 95% confidence interval [CI] 5.9%–7.4%; P < 0.001). TIR increased in both adults and children, across the full range of baseline HbA1c, and in participants with both high- and moderate-income levels. Median time <54 mg/dL was 0.40% at baseline and changed by −0.05% (95% CI −0.09% to −0.03%, P < 0.001). Mean HbA1c was 6.8% ± 1.0% at baseline and decreased to 6.5% ± 0.8% after 6 months (mean difference = −0.33%, 95% CI −0.40% to −0.26%, P < 0.001). The incidence rate of reported severe hypoglycemia events was 18.7 per 100 person-years, a reduction from the incidence rate of 181 per 100 person-years during the 3 months before the study. Among the 481 users providing Loop data at 6 months, median continuous glucose monitoring use was 96% (interquartile range [IQR] 91%–98%) and median time Loop modulating basal insulin was at least 83% (IQR 73%–88%). Conclusions: The Loop open source system can be initiated with community-developed resources and used safely and effectively by adults and children with type 1 diabetes.


Introduction
Before closed-loop systems became commercially available, "Do-It-Yourself" closed-loop systems were developed by individuals with a personal interest in automating insulin delivery for type 1 diabetes (T1D). One such system, called "Loop," was developed by Nate Racklyeft, Pete Schwamb, and others in the open-source diabetes software community in 2015.
Loop is an open-source iOS app that runs on an iPhone. The software is a hybrid closed-loop controller utilizing model prediction that anticipates future glucose based on the effects of delivered insulin, user-entered carbohydrates, and two forms of short-term adaptation dubbed "glucose momentum" and "retrospective correction." The effects of carbohydrates are based on the user-specified insulin-to-carbohydrate ratio and insulin sensitivity factor (ISF), whereas the insulin effect is determined by the ISF. Loop then alters insulin delivery to attempt to drive the blood glucose toward a user-specified glucose target. In the most common implementation, Loop alters insulin delivery by instructing the insulin pump to temporarily increase or decrease basal insulin delivery.
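As a rough illustration of how these two settings enter a glucose prediction, the sketch below computes only the eventual (fully absorbed) glucose level. It assumes that each gram of carbohydrate raises glucose by ISF divided by the insulin-to-carbohydrate ratio and that each unit of insulin lowers it by the ISF. It ignores absorption timing, glucose momentum, and retrospective correction, and it is not Loop's actual code; the function name, parameter names, and example settings are hypothetical.

```python
def eventual_glucose(current_mg_dl, carbs_g, insulin_units, isf, icr):
    """Simplified eventual glucose estimate in mg/dL (illustrative only).

    isf: insulin sensitivity factor, mg/dL drop per unit of insulin
    icr: insulin-to-carbohydrate ratio, grams of carbohydrate per unit
    """
    carb_effect = carbs_g * isf / icr        # rise per gram is ISF / ICR
    insulin_effect = -insulin_units * isf    # each unit lowers glucose by ISF
    return current_mg_dl + carb_effect + insulin_effect


# Hypothetical settings: ISF 50 mg/dL per unit, ICR 10 g per unit
predicted = eventual_glucose(120, carbs_g=30, insulin_units=2, isf=50, icr=10)
print(predicted)  # 120 + 150 - 100 = 170.0
```

In this toy case the prediction sits inside the 70–180 mg/dL range, so a controller of this kind would have little reason to change basal delivery; a prediction above the user-specified target would argue for a temporary basal increase, and one below it for a decrease.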
Originally, Loop supported versions of Medtronic insulin pumps that could receive unsecure remote commands. The Insulet Omnipod pump became another option in April 2019 after months of internal testing by the development community. A RileyLink serves as a bridge between the iPhone's Bluetooth and the sub-gigahertz radio frequency used by these pumps. The system is compatible with Dexcom and Medtronic continuous glucose monitors (CGMs). An Apple Watch optionally may be used with the system for additional user interaction. Additionally, users can enter meal absorption times, adjust glucose targets and insulin needs during exercise, and track their insulin delivery from the Loop app or through logging software such as Nightscout or Tidepool.
Loop is being used worldwide by up to 9000 individuals based on the RileyLink order history (personal communication, Jeremy Lucas, founder of GetRileyLink.org, February 2020), despite limited data on the system's safety and efficacy. To provide this crucial information, we conducted an observational, longitudinal study of existing and new Loop users. Herein, we report on the data provided by new Loop users from the time they initiated Loop for up to 6 months.

Methods
The study was conducted in a real-world setting, outside of the clinic, with all data provided directly by study participants. The protocol was approved by the Institutional Review Board of the JAEB Center for Health Research. The protocol is available at https://public.jaeb.org/datasets and summarized on clinicaltrials.gov (NCT03838900).
The study included adults and children with T1D who were U.S. residents. The existence of the study was publicized on websites and at the time of ordering a RileyLink (required for Loop operation). Interested individuals were directed to the study website for information about the study, where electronic informed consent was obtained from participants ≥18 years of age and from the legally authorized representative for participants <18 years of age, who provided assent. The analysis herein included only participants who were initiating Loop, to understand Loop's impact from a baseline without prior use of Loop. Enrollment was open between January 2019 and August 2019. The data collection ended in April 2020 such that all participants had the opportunity to complete 6 months of follow-up, which comprises the dataset reported herein.
Participants provided demographic and socioeconomic information, information about their medical history and medications, diabetes history and management, and height and weight. Weekly, participants received a text and/or email prompt to report any device issues or serious adverse events, including diabetic ketoacidosis (DKA), severe hypoglycemia, and hospitalizations. All DKA and severe hypoglycemia events reported after enrollment were reviewed. Confirmation of DKA required hospitalization for at least one night. Confirmation of severe hypoglycemia required a description that was consistent with the participant being impaired cognitively to the point that he/she was unable to treat himself/herself; being unable to verbalize his/her needs; being incoherent, disoriented, and/or combative; or having experienced seizure or loss of consciousness. At baseline, participants were asked to report the number of such episodes that had occurred in the prior 3 months.
A fingerstick blood sample for HbA1c measurement was obtained at baseline and after 3, 6, and 12 months, using a collection kit mailed directly to the study participant, and was mailed to a central laboratory (University of Minnesota Advanced Research and Diagnostic Laboratory). Of the 1412 samples sent to the laboratory, 215 (15%) were deemed not analyzable when received by the laboratory, typically due to suspected temperature-related loss of sample integrity or because too much time had passed since sample collection. Results from 5 (0.4%) of the 1197 analyzed samples were considered unreliable and excluded. Samples were considered unreliable if the HbA1c was less than 5.0% (31 mmol/mol) and the difference with the glucose management indicator (GMI) estimate of HbA1c [1] was inconsistent with HbA1c-GMI differences at other time points, or if the HbA1c-GMI difference was >2% and not consistent with differences at other time points. The quality of life/psychosocial and treatment satisfaction surveys were completed at baseline and after 3, 6, and 12 months. Focus group sessions, which will be described in a separate report, were held within the first 3 months of initiating Loop and at the end of the study; additionally, interviews were held with 20 participants who discontinued Loop.
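The HbA1c reliability screen described above can be sketched as a first-pass flag. The GMI formula and the %-to-mmol/mol conversion below are the standard published ones; the flagging function is a hypothetical simplification that only marks candidates, treats the >2% difference as an absolute difference (an assumption), and leaves the comparison with other time points to manual review.

```python
def gmi_percent(mean_glucose_mg_dl):
    """Glucose management indicator (%) from mean CGM glucose in mg/dL."""
    return 3.31 + 0.02392 * mean_glucose_mg_dl


def hba1c_percent_to_mmol_mol(hba1c_percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return 10.929 * (hba1c_percent - 2.15)


def flag_for_review(hba1c_percent, mean_glucose_mg_dl, max_gap=2.0):
    """Flag a sample whose HbA1c is <5.0% or differs from GMI by more than max_gap.

    Hypothetical first-pass screen; consistency with the participant's
    other time points is not modeled here.
    """
    gap = abs(hba1c_percent - gmi_percent(mean_glucose_mg_dl))
    return hba1c_percent < 5.0 or gap > max_gap


print(round(hba1c_percent_to_mmol_mol(6.8)))   # 51
print(round(gmi_percent(150), 2))              # 6.9
print(flag_for_review(9.5, 150))               # True (gap of about 2.6)
```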
At study entry, prior pump and CGM data were obtained when available. During the study, Loop system data were written to Apple Health and then continuously streamed to Tidepool, where the data were aggregated. There was no standardization of how the Loop system was to be used. Participants were asked to export Loop's Issue Report monthly, which provided data on pump settings, Loop version, and Loop settings, including therapy settings.

Statistical methods
The study sample was a convenience sample and was not based on statistical principles. To be included in the cohort for analysis, participants had to (1) not have started Loop, or have used it for <7 days, at the time of enrollment; (2) have provided at least 50 records of Loop basal insulin data (representing 4-8 h of Loop use) or at least 1 Loop device issue report after starting Loop; and (3) have provided at least 336 h (14 days) of CGM data in the first 182 days after starting Loop.
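The three inclusion criteria can be expressed as a simple filter. This is a minimal sketch, not the study's SAS code; the record layout and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    days_on_loop_at_enrollment: int    # 0 if Loop had not been started
    loop_basal_records: int            # Loop basal insulin records provided
    loop_issue_reports: int            # Loop device issue reports provided
    cgm_hours_first_182_days: float    # CGM hours in the 182 days after starting


def is_eligible(p):
    """Apply the three cohort inclusion criteria to one participant."""
    new_user = p.days_on_loop_at_enrollment < 7
    has_loop_data = p.loop_basal_records >= 50 or p.loop_issue_reports >= 1
    has_cgm_data = p.cgm_hours_first_182_days >= 336
    return new_user and has_loop_data and has_cgm_data


print(is_eligible(Participant(0, 1200, 3, 4000.0)))  # True
print(is_eligible(Participant(0, 10, 0, 4000.0)))    # False: too little Loop data
```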
All eligible participants were included in the safety analysis. CGM data provided before the date of Loop initiation (minimum of 168 h) were used to calculate metrics at baseline, whereas all data provided from the date of Loop initiation to 182 days after Loop initiation were used to calculate outcomes at 6 months of follow-up. For normally distributed outcomes, a paired t-test was used to evaluate differences between baseline and 6 months follow-up. If outcomes were skewed, a Wilcoxon signed rank test was used instead. For binary outcomes, McNemar's test was used to evaluate differences between baseline and follow-up.
The prespecified primary efficacy outcomes were percent time in range 70-180 mg/dL, time >180 mg/dL, mean glucose, time <70 mg/dL, time <54 mg/dL, and HbA1c. For comparisons of these outcomes between baseline and follow-up, the family-wise type I error rate was controlled at a two-sided alpha of 0.05 using a hierarchical approach. Secondary efficacy outcomes tested were the glucose standard deviation, glucose coefficient of variation, time in range 70-140 mg/dL, time >250 mg/dL, high blood glucose index, low blood glucose index, area under the curve above 180 mg/dL (area under the curve [AUC] >180 mg/dL), area over the curve below 70 mg/dL (area over the curve [AOC] <70 mg/dL), the rate of hypoglycemia events below 54 mg/dL per week, and the percentage of participants meeting international glucose consensus targets for T1D (nonpregnant). [2] For comparisons of secondary outcomes, the false discovery rate was controlled using the Benjamini-Hochberg procedure with <0.05 as the threshold for statistical significance.
Additional analyses assessed change from baseline in the first 3 months and second 3 months after starting Loop. Analyses of CGM outcomes were conducted separately for daytime (6:00 AM-11:59 PM) and nighttime (12:00 AM-5:59 AM).
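For readers unfamiliar with the Benjamini-Hochberg procedure, the following is a minimal sketch of the standard step-up rule. It is not the study's SAS code, and the P-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per hypothesis: True if rejected at false discovery rate q."""
    m = len(p_values)
    # Indices of the hypotheses ordered from smallest to largest P-value
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical P-values for four secondary outcomes
print(benjamini_hochberg([0.2, 0.005, 0.035, 0.03]))  # [False, True, True, True]
```

Note that 0.03 exceeds its own threshold (2/4 × 0.05 = 0.025) but is still rejected, because the next-ranked value 0.035 meets its threshold (3/4 × 0.05 = 0.0375). This is what distinguishes the step-up rule from a per-test cutoff.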
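The daytime and nighttime split of a CGM metric such as time in range can be sketched as follows. The sketch assumes evenly spaced CGM readings, so that the percentage of readings in range approximates the percentage of time in range; the function names and the sample readings are hypothetical.

```python
from datetime import datetime


def percent_in_range(readings, low=70, high=180):
    """Percent of (timestamp, glucose mg/dL) readings with low <= glucose <= high."""
    if not readings:
        return None
    in_range = sum(1 for _, glucose in readings if low <= glucose <= high)
    return 100.0 * in_range / len(readings)


def split_day_night(readings):
    """Daytime is 6:00 AM-11:59 PM; nighttime is 12:00 AM-5:59 AM."""
    day = [r for r in readings if r[0].hour >= 6]
    night = [r for r in readings if r[0].hour < 6]
    return day, night


# Hypothetical readings for one day
readings = [
    (datetime(2019, 5, 1, 2, 0), 65),
    (datetime(2019, 5, 1, 3, 0), 110),
    (datetime(2019, 5, 1, 8, 0), 150),
    (datetime(2019, 5, 1, 12, 0), 190),
    (datetime(2019, 5, 1, 18, 0), 175),
    (datetime(2019, 5, 1, 23, 55), 140),
]
day, night = split_day_night(readings)
print(percent_in_range(day))    # 75.0
print(percent_in_range(night))  # 50.0
```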
All P-values and confidence intervals (CIs) reported are two-sided. All analyses were conducted using SAS software, version 9.4 (SAS Institute, Inc.).

Results
Of the 799 new Loop users who were enrolled and provided electronic consent, 241 did not meet the inclusion criteria (Supplementary Fig. S1). Most (93%) of the exclusions were due to no data or insufficient data being provided. The age range of the 558 eligible participants was 1-71 years. Baseline HbA1c averaged 6.8% ± 1.0% (51 ± 10.9 mmol/mol) (including 84 participants with HbA1c ≥7.5% [58 mmol/mol]). Thirty-six (6%) used only a Medtronic pump during the study, whereas 502 (90%) used only the Omnipod pump and 20 (4%) used both at some point during the study follow-up (Table 1). At least 168 h (1 week) of baseline CGM data (before starting Loop) were available for 447 (80%) of the 558 participants. There were several variants of Loop utilized in this study (Supplementary Table S1 footnote).
The beneficial effect on CGM metrics was seen throughout the age range of participants (Supplementary Table S4). As seen in Supplementary Table S1, improvement in TIR was seen across the range of baseline HbA1c, with greater amount of improvement occurring in those with higher baseline HbA1c (and lower baseline TIR) and higher TIR levels achieved in those with higher baseline TIR. Medtronic pump users were older and more likely to have prior automated insulin delivery (AID) use compared with Omnipod pump users (Supplementary Table S5), but glycemic results were similar by pump manufacturer (Supplementary Table S6).

Glucose monitoring and closed-loop system use
Seventy-seven (14%) participants stopped providing Loop data during the first 6 months, of whom 15 (3%) provided information indicating that they had discontinued Loop. Reasons for discontinuing are indicated in Supplementary Table S7. For the other 62, it is not known whether they stopped using Loop or just stopped providing data. The most common reported issues with use of Loop were problems with connectivity and communication (27% of total issues reported) and hardware damage/failure (10% of issues reported) (Supplementary Table S8).
Among the 481 participants who provided Loop data through 6 months, median CGM use during 6 months was 96% (interquartile range [IQR] 91%-98%) and median time that Loop was modulating the basal rate was at least 83% (IQR 73%-88%) (Supplementary Table S9). The 83% number represents a lower bound for percent time in closed-loop, with device data not permitting differentiation between closed-loop, open-loop, or other system status 17% of the time. Among all subjects included, the mean total daily insulin over 6 months was 0.70 ± 0.40 U/kg, with modulated basal insulin representing 54% of insulin delivered (Supplementary Table S10).

Safety outcomes
Supplementary Table S11 shows the safety outcomes according to age groups. For the 14,755 weekly surveys that could possibly be completed, median percent completion per participant was 89% (IQR 67%-93%). There were no cases of confirmed DKA. During the 6 months of the study, 35 (6%) participants experienced a total of 51 confirmed severe hypoglycemia events (incidence rate 18.7 per 100 person-years), with 28 (5%) participants experiencing the event in       Five of the 51 events involved seizure or loss of consciousness (incidence rate 1.8 per 100 person-years). One of the 51 events was attributed to the use of Loop; however, this event was not associated with a seizure or loss of consciousness. For comparison, 97 (18%) participants reported at least one severe hypoglycemia event in the 3 months before entering the study (incidence rate 181 per 100 person-years). The frequency of severe hypoglycemia events during the study was substantially higher in participants who had experienced an event in the 3 months before the study (Supplementary Table S12).
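Incidence rates of this kind are events divided by total follow-up time, scaled to 100 person-years. The total follow-up time is not stated in the text above, so the sketch below back-calculates the person-years implied by the reported event count and rate; the function names are hypothetical.

```python
def incidence_per_100_person_years(events, person_years):
    """Events per 100 person-years of follow-up."""
    return 100.0 * events / person_years


def implied_person_years(events, rate_per_100_py):
    """Follow-up time implied by an event count and a reported rate."""
    return 100.0 * events / rate_per_100_py


# 51 confirmed events at a reported 18.7 per 100 person-years
follow_up = implied_person_years(51, 18.7)
print(round(follow_up, 1))                                     # 272.7
print(round(incidence_per_100_person_years(5, follow_up), 1))  # 1.8
```

The implied follow-up of roughly 273 person-years is consistent with 558 participants followed for up to 6 months (at most 279 person-years), and it reproduces the reported rate of 1.8 per 100 person-years for the five events involving seizure or loss of consciousness.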

Discussion
Open-source AID systems are used by many adults and children to automate insulin delivery for management of T1D, with up to 9000 having used Loop. The study data reflect real-world use of Loop in that there was no guidance provided by the study as to how Loop was to be used and no formal customer support for troubleshooting. The study participants initiated Loop with community-developed resources. TIR, which on average was already at a high level before starting Loop, increased further, and time <54 mg/dL decreased. Improvement in TIR occurred immediately after starting Loop and was sustained on average over 6 months. The benefits of Loop were seen in both adults and children, across the full range of baseline HbA1c, and with both high- and moderate-income levels.
The improvement in TIR during this observational study was similar in magnitude to that reported in randomized controlled trials of other closed-loop systems after accounting for differences among study cohorts in baseline HbA1c levels. [3][4][5][6] For HbA1c levels above 7.0% (53 mmol/mol) with baseline TIR of about 60%, the improvement in TIR observed with different systems has been remarkably consistent, about 10% overall, with greater improvement seen overnight than during the day. For higher baseline TIR, the magnitude of improvement tends to be less and for lower baseline TIR, it tends to be more.
[Figure: (B) Time <54 mg/dL over a 24 h period. Blue: baseline; red: months 1-6. Solid line: median; solid dot: mean; shaded band: quartiles. Panel (A) uses the same encoding.]
The incidence rate of severe hypoglycemia of 18.7 events per 100 person-years is higher than what has been reported in the afore-referenced studies of other closed-loop systems, some (but not all) of which excluded individuals with recent severe hypoglycemia. There are several possible explanations: (1) this could reflect our frequent ascertainment of severe hypoglycemia through weekly text prompts; (2) differences in study design: the Loop study was real-world and virtual compared with other studies that had structured protocols with close clinical oversight of closed-loop system use by study staff; or (3) higher prestudy risk of this cohort for severe hypoglycemia, possibly due to a more hyperglycemia-avoidant approach to diabetes management. Indeed, prior severe hypoglycemia has been shown to be the strongest predictor of future severe hypoglycemia, similar to the finding in this study. [7] The study data suggest that the use of Loop substantially reduced the risk of severe hypoglycemia in this cohort, since the rate of severe hypoglycemia was much lower during the first 3 months of Loop use than what was reported for the 3 months before starting Loop (27.3 vs. 181.3 per 100 person-years); however, this must be interpreted in the context of different data collection methods used to capture the prestudy and on-study reports of severe hypoglycemia events.
The strengths of the study include the large sample size, the real-world approach to the protocol, the prospective data collection, the inclusion in the cohort of individuals who were new Loop users, the availability of pre-Loop CGM data for most participants to establish a baseline for comparison with glycemic metrics while using Loop, and the wide age range of study participants from infants to older adults. The main limitations are the lack of a concurrent control group and self-selection bias in participants starting on Loop. Indeed, most of the cohort had HbA1c levels <7.0% (<53 mmol/mol) before starting Loop and most were of high socioeconomic status. This limits the generalizability of the results. Despite the skewness of these factors relative to the population of individuals with T1D, the sample size was sufficiently large that the number of participants with high HbA1c (40 with baseline HbA1c ≥8.0% [≥64 mmol/mol]) and moderate family income (33 with <$50,000) was not inconsequential. In these subgroups, the benefit of Loop appeared comparable to those with lower HbA1c and higher income. We have uncertainty as to the percentage of participants who started Loop and discontinued use of Loop before 6 months, since for 62 participants we are unable to determine if Loop was discontinued or if the participant just stopped providing data. The percentage could be as low as 3% or as high as 14%. We also have uncertainty about the percentage of time that Loop was in closed-loop mode during the 6 months, automatically modulating the basal rate. The percentage is no lower than 83% and is almost certainly higher, but we could not differentiate between possible system states in the remaining 17%, including the user actively turning off closed-loop, communication errors between components or other component failures causing reversion to open-loop, or the system delivering an unaltered open-loop scheduled basal rate as designed (e.g., when glucose is within the user-set "correction range"). We believe that the completeness of the dataset is quite robust for a real-world observational study, but the amount of missing data, nevertheless, reflects a limitation of this type of study.
In summary, this real-world study has demonstrated that the Loop open-source hybrid closed-loop system can be safely self-initiated and used by adults and children with T1D and increased time in range without increasing hypoglycemia. Tidepool is developing a commercial version of Loop ("Tidepool Loop"), which will rely on the data generated in this study to support FDA clearance.