Update on the General Practice Optimising Structured Monitoring to Improve Clinical Outcomes in Type 2 Diabetes (GP-OSMOTIC) trial: statistical analysis plan for a multi-centre randomised controlled trial

Background: General Practice Optimising Structured Monitoring to Improve Clinical Outcomes in Type 2 Diabetes (GP-OSMOTIC) is a multicentre, individually randomised controlled trial aiming to compare the use of intermittent retrospective continuous glucose monitoring (r-CGM) to usual care in patients with type 2 diabetes attending general practice. The study protocol was published in the British Medical Journal Open and described the principal features of the statistical methods that will be used to analyse the trial data. This paper provides greater detail on the statistical analysis plan, including background and justification for the statistical methods chosen, in accordance with SPIRIT guidelines.
Objective: To describe in detail the data management process and statistical methods that will be used to analyse the trial data.
Methods: An overview of the trial design and the primary and secondary research questions is provided. Sample size assumptions and calculations are explained, and randomisation and data management processes are described in detail. The planned statistical analyses for primary and secondary outcomes and sub-group analyses are specified along with the intended table layouts for presentation of the results.
Conclusion: In accordance with best practice, all analyses outlined in the document are based on the aims of the study and have been pre-specified prior to the completion of data collection and outcome analyses.
Trial registration: Australian New Zealand Clinical Trials Registry, ACTRN12616001372471. Registered on 3 August 2016.


Background
The prevalence of type 2 diabetes (T2D) is rapidly increasing and is expected to reach close to 600 million worldwide by 2030 [1]. Close to 1.3 million Australians have been diagnosed with diabetes, with over 85% having T2D [2].
Early management and maintenance of glycaemic (blood glucose) levels through lifestyle modification and pharmacological treatments can reduce the likelihood of diabetes-related complications [3]. Glycated haemoglobin (HbA1c) is an index of average blood glucose level over the preceding 12 weeks and can be measured in mmol/mol or % [4]. HbA1c can be converted from one unit to the other using the relationship HbA1c (mmol/mol) = 10.93 × HbA1c (%) − 23.5 [5]. Current guidelines base treatment intensification recommendations on HbA1c levels [6,7]. The general HbA1c target in Australia is 53 mmol/mol (7%) [8]; however, the Australian Diabetes Society recommends that targets should also take into consideration factors such as age, duration of diabetes, and risk of hypoglycaemia [9]. Clinical care in general practice can help people with T2D achieve HbA1c targets [10] through adopting an evidence-based "treat-to-target" approach (step-wise treatment intensification through changes to lifestyle, medication doses, and/or prescription of additional medications). However, the majority of people with T2D have an HbA1c above their target level, and treatment intensification is commonly delayed beyond clinical need [11]. One contributor to this may be that general practitioners (GPs) and people with T2D lack an acceptable, feasible, simple, reliable, and effective method for identifying detailed day-to-day blood glucose patterns (glucose profiles) to guide decisions about treatment intensification.
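As an illustration of the unit conversion quoted above, the following minimal Python sketch applies the stated linear relationship in both directions. The function names are ours and purely illustrative; only the coefficients 10.93 and 23.5 come from the text.

```python
def percent_to_mmol_per_mol(hba1c_percent: float) -> float:
    """Convert HbA1c from % to mmol/mol using the relationship quoted in the text."""
    return 10.93 * hba1c_percent - 23.5


def mmol_per_mol_to_percent(hba1c_mmol_per_mol: float) -> float:
    """Inverse of the conversion above (mmol/mol back to %)."""
    return (hba1c_mmol_per_mol + 23.5) / 10.93


if __name__ == "__main__":
    # The two targets used in the trial, expressed in %.
    for target in (7.0, 8.0):
        print(target, "% ->", round(percent_to_mmol_per_mol(target)), "mmol/mol")
```

Rounded to whole numbers, 7% and 8% map to the 53 and 64 mmol/mol targets used elsewhere in this plan.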
Continuous glucose monitoring (CGM) is one method of identifying such glucose profiles, with glucose reported in mmol/L. Retrospective CGM (r-CGM) involves the patient wearing a CGM sensor for a period of up to 2 weeks and then, usually in collaboration with their health professional, downloading the glucose data to identify day-to-day glucose profiles to guide treatment decisions. For many people with T2D, glucose profiles tend to be stable over time and, therefore, intermittent r-CGM measurements may be sufficient to guide clinical management. r-CGM can also provide detail about hypoglycaemia, hyperglycaemia, glycaemic variability (GV; i.e. the extent to which glucose fluctuates throughout the day), and time spent in the day-to-day glucose target range, all of which may be important to clinical and psychosocial outcomes for people with T2D [12,13]. r-CGM thus offers the prospect of an advance in appropriate and personalised care for people with T2D [14].
General Practice Optimising Structured Monitoring to Improve Clinical Outcomes in Type 2 Diabetes (GP-OSMOTIC) is a stratified (by GP clinic) individually randomised controlled trial in general practice comparing the use of r-CGM (intervention) to usual care (control) in those with T2D whose HbA1c is above their individualised target level. Within each clinic, participants will be randomly allocated to either the intervention or control group. Full details of the trial method are described elsewhere [15], but are briefly outlined below before presenting the detailed description of the planned statistical methods.

Primary objective
The primary objective is to assess whether the judicious use of intermittent retrospective continuous glucose monitoring (r-CGM) in people with T2D in primary care improves glycaemic control at 12 months as measured by HbA1c.

Secondary objectives
Compared with the control arm, does the judicious use of intermittent retrospective continuous glucose monitoring (r-CGM) in people with T2D in primary care:
1. Improve the percentage of time spent in the target glucose range at 12 months?
2. Decrease diabetes-specific distress at 12 months?
3. Result in cost-effective care?
4. Decrease HbA1c at 6 months?

Primary outcome
The primary outcome measure is the difference in mean HbA1c at 12 months between the intervention and control groups.

Secondary outcomes
The secondary outcome measures are:
1. Difference in mean percent time in target (4-10 mmol/L) range at 12 months between the study groups (from data downloaded from the r-CGM device).
2. Difference in mean diabetes-specific distress at 12 months between the study groups as measured by the Problem Areas in Diabetes (PAID) scale [16].
3. Incremental cost per quality-adjusted life year (QALY) for the intervention relative to control for the trial period, as measured by the EuroQol 5 dimension 3 levels (EQ-5D-3L) [17].
4. Difference in mean HbA1c (%) at 6 months between the intervention and control groups.

Inclusion criteria
Eligible participants will be aged 18-80 years, be active patients of the practice (defined as three or more visits to the practice in the last 2 years), and have had T2D for at least 1 year, with their most recent HbA1c (in the previous 1 month) ≥ 7 mmol/mol (0.5%) above their individualised target (see below) while on at least two non-insulin hypoglycaemic therapies and/or insulin (therapy stable for the last 4 months). Our general glycaemic target is set at 53 mmol/mol (7%), while patients with a history of severe hypoglycaemia (requiring assistance from a third person) or who report impaired awareness of hypoglycaemia (i.e. are unable or have reduced capacity to recognise the early signs and symptoms of hypoglycaemia, which may impede timely self-treatment) will have a target of 64 mmol/mol (8%). In the setting of this pragmatic trial we will allow GPs to indicate a personalised target for a participant if they feel that it should differ from the two pre-specified targets set out above.
Patient exclusion criteria will include: any debilitating medical condition (e.g. unstable cardiovascular disease (CVD), severe mental illness, end-stage cancer); an estimated glomerular filtration rate (eGFR) < 30 mL/min/1.73 m²; proliferative retinopathy; pregnancy, lactation, or planned pregnancy; inability to speak English or give informed consent; unwillingness to use r-CGM or follow the study protocol; allergy to adhesive tape; diagnosis of T2D within the past 12 months; and any condition that makes monitoring diabetes using HbA1c unreliable (e.g. haemoglobinopathy, iron deficiency anaemia).

Randomisation
Participants will be stratified by clinic and randomised to either the intervention or control group using randomly permuted block sizes of 4 and 6. The randomisation process will be through REDCap© electronic data capture tools hosted at the University of Melbourne [18], using the application programming interface (API). This allows project information to be exported to a separate statistical computing package which generates allocation sequence tables allowing for random block sizes. These will then be imported back into REDCap© for use through the randomisation graphical user interface (GUI).
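To make the allocation scheme concrete, the sketch below generates a permuted-block allocation table per clinic with block sizes chosen at random from 4 and 6. It is an illustrative stand-in only: the trial generates its sequences in a separate statistical package and imports them into REDCap©, and the function names, clinic labels, and seed here are hypothetical.

```python
import random

ARMS = ("intervention", "control")


def permuted_block_sequence(n_participants, rng, block_sizes=(4, 6)):
    """Build one allocation sequence from randomly sized, randomly permuted blocks."""
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)            # pick a block size at random
        block = list(ARMS) * (size // len(ARMS))  # equal numbers of each arm
        rng.shuffle(block)                        # permute within the block
        sequence.extend(block)
    return sequence[:n_participants]


def allocation_tables(clinic_ids, n_per_clinic, seed):
    """One independent sequence per clinic (stratification by clinic)."""
    rng = random.Random(seed)
    return {clinic: permuted_block_sequence(n_per_clinic, rng) for clinic in clinic_ids}


if __name__ == "__main__":
    tables = allocation_tables(["clinic_A", "clinic_B"], n_per_clinic=12, seed=2016)
    for clinic, seq in tables.items():
        print(clinic, seq)
```

Because every block is balanced, the difference in group sizes within a clinic can never exceed half the largest block size (three participants), even if recruitment stops mid-block.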

Intervention
In brief, intervention group participants will be asked to wear the r-CGM device for a period of 2 weeks every 3 months, i.e. at baseline, 3, 6, 9, and 12 months, as well as having an HbA1c test at those times, and to attend a consultation with their GP (clinic assessment visit (CAV)) to discuss the r-CGM reports. This 3-monthly interval is in keeping with clinical practice guidelines [19]. Intervention participants will also attend a 60-min education session with the study registered nurse credentialed diabetes educator (RN-CDE) which will include instruction on how to wear the r-CGM device and how to interpret the glucose reports from the device to better understand their blood glucose and how this relates to their diabetes self-management and treatment options. The r-CGM device being used in the study is the Abbott FreeStyle Libre Pro® Flash Glucose Monitoring System.
Control group participants will wear the r-CGM device at baseline (blinded) and thereafter will be managed according to usual clinical care. The GP and patient will be prompted to undertake 3-monthly diabetes reviews in keeping with clinical practice guidelines about step-wise regular consideration of treatment intensification. Patients randomised to the control group will also attend an education session with a local CDE, funded by the study if required to ensure financial barriers do not exist. Control group participants will have an r-CGM sensing at 12 months, which will be used in collaboration with their GP in their management of diabetes after the final HbA1c blood measurement and all other trial outcomes have been collected.

Outcome measures
The primary outcome, HbA1c, will be measured by venous blood test in an accredited laboratory. Time in the target range will be calculated as the percentage of time blood glucose levels remain between 4 and 10 mmol/L as measured by the r-CGM device. Diabetes-specific distress will be measured using the PAID scale [16]. This scale consists of 20 questions relating to negative emotions associated with diabetes, with five possible responses to each question: 0 = no problem, 1 = minor problem, 2 = moderate problem, 3 = somewhat serious problem, and 4 = serious problem. The 20 items are summed, and the total is multiplied by 1.25 so that total score ranges from 0 to 100. Higher scores indicate greater levels of diabetes-specific distress; a score of ≥ 40 indicates severe diabetes distress [20]. The PAID measure has high internal reliability and validity [16].
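The scoring rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the trial's data-processing code: the function names are hypothetical, the time-in-range calculation assumes equally spaced sensor readings, and treating the 4 and 10 mmol/L bounds as inclusive is our assumption.

```python
def paid_score(responses):
    """Total PAID score: 20 items scored 0-4, summed and multiplied by 1.25 (range 0-100)."""
    if len(responses) != 20:
        raise ValueError("PAID requires exactly 20 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each PAID item must be scored 0, 1, 2, 3 or 4")
    return sum(responses) * 1.25


def is_severe_distress(score):
    """A score of 40 or more indicates severe diabetes distress."""
    return score >= 40


def percent_time_in_range(glucose_mmol_l, low=4.0, high=10.0):
    """Percentage of (equally spaced) readings between low and high, bounds inclusive."""
    if not glucose_mmol_l:
        raise ValueError("no glucose readings supplied")
    in_range = sum(low <= g <= high for g in glucose_mmol_l)
    return 100 * in_range / len(glucose_mmol_l)


if __name__ == "__main__":
    print(paid_score([2] * 20))                                   # moderate problem on every item
    print(percent_time_in_range([3.5, 5.0, 9.9, 10.0, 12.1]))     # hypothetical readings
```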
Results from the EQ-5D-3L assessment at each measurement will be transformed into utility scores using Australian preference weights [21]. An average utility curve, which measures the mean quality of life trajectory for patients, will be derived by interpolating between baseline and the follow-up measurement points [22]. QALYs will then be estimated for both the intervention and the control group using the 'area under the curve' method [23]. As the economic evaluation will be performed within a 12-month period, discounting will not be applied.
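The 'area under the curve' calculation can be sketched as follows, assuming linear interpolation between measurement points (the trapezoidal rule). The function name, the measurement times, and the utility values are hypothetical and serve only to show the arithmetic.

```python
def qaly_area_under_curve(times_months, utilities):
    """QALYs as the area under a piecewise-linear utility curve, with time in years."""
    if len(times_months) != len(utilities) or len(times_months) < 2:
        raise ValueError("need matching lists with at least two measurement points")
    points = list(zip(times_months, utilities))
    total = 0.0
    for (t0, u0), (t1, u1) in zip(points, points[1:]):
        if t1 <= t0:
            raise ValueError("measurement times must be strictly increasing")
        # Trapezoid: mean utility over the interval times interval length in years.
        total += (u0 + u1) / 2 * (t1 - t0) / 12
    return total


if __name__ == "__main__":
    # Hypothetical utilities at baseline, 6 and 12 months.
    print(qaly_area_under_curve([0, 6, 12], [0.70, 0.80, 0.80]))
```

With these illustrative values the participant accrues 0.375 QALYs in the first half-year and 0.4 in the second, giving 0.775 over the 12-month trial period; no discounting is applied, consistent with the plan.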

Statistical analysis
Sample size
The sample size is based on an individually randomised controlled trial without accounting for stratification by clinic. Clinical significance was considered to be a difference of at least 0.5% (7 mmol/mol) in mean HbA1c between the groups and is based on current guidelines which recommend intensification of therapy when HbA1c levels remain 0.5% (7 mmol/mol) above target [19]. The sample size was calculated using HbA1c in %. Using a significance level of 0.05, power of 0.8, clinically significant difference of 0.5%, and standard deviation of 1.3% for HbA1c [24], the required number of participants in each group is 108, a total of 216. This is equivalent to a difference in the mean HbA1c of 7 mmol/mol between the groups with a standard deviation of 14 mmol/mol [24]. Assuming a 20% attrition rate, the required sample size inflates to 270 (135 in each group). Allowing for 10% clinic attrition and assuming six participants per clinic, we require 50 clinics with six participants per clinic (150 in each group). Figure 1 shows the minimum number of clinics and participants per clinic required for 20% participant attrition and 10% clinic attrition. The figure shows that it is possible to recruit 300 participants in a variety of ways; for example, 25 clinics with 12 participants per clinic, 30 clinics with 10 participants per clinic, 50 clinics with six participants per clinic, and 75 clinics with four participants per clinic. Four participants per clinic was the minimum recommended to allow for estimation of the correlation in outcome measure between participants in the same group and clinic. From prior knowledge of recruitment patterns from the Stepping Up Study [24] it was decided to recruit 50 clinics with six participants per clinic.
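The arithmetic behind these figures can be checked with a short Python sketch. It uses the standard normal-approximation formula for comparing two means, which is our assumption; the plan does not state the exact method or software used for the calculation. Function names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_normal_approx(delta, sd, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided comparison of two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_attrition(n, attrition_percent):
    """Smallest integer N such that N * (1 - attrition) >= n, using integer arithmetic."""
    return -(-n * 100 // (100 - attrition_percent))


if __name__ == "__main__":
    print(n_per_group_normal_approx(delta=0.5, sd=1.3))   # close to the 108 per group in the plan
    after_participant_attrition = inflate_for_attrition(216, 20)
    after_clinic_attrition = inflate_for_attrition(after_participant_attrition, 10)
    print(after_participant_attrition, after_clinic_attrition, after_clinic_attrition // 6)
```

The normal approximation gives 107 per group, one fewer than the 108 stated in the plan; a small difference of this kind is what a t-distribution-based calculation would typically produce, although that is an inference on our part. Starting from the planned total of 216, allowing for 20% participant attrition gives 270, and allowing for 10% clinic attrition gives 300, i.e. 50 clinics with six participants each.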

Data collection and preparation
An in-house, web-based, purpose-built recruitment database will be used to document all practices approached to participate in the study. Once a practice has consented to the study, REDCap© will be used to store all clinic, GP, and practice nurse (PN) characteristics. All clinic, staff, and participant data will be collected at baseline and 12 months and entered into the database by research assistants using either a desktop computer or tablet. Data from CAVs and any technical issues or adverse events associated with the r-CGM device will be logged by research assistants in REDCap©.
HbA1c data will be collected 6-monthly from the same pathology laboratory for each patient and collated in a Microsoft Excel 2016 file. Participants will be encouraged to have their HbA1c levels collected at 3 and 9 months, but this will not be compulsory. The pathology data will be merged with the clinical patient data in STATA version 15.1 [25].
An in-house, web-based, purpose-built participant tracking database will be used to track changes in patient medication and the progress of patients throughout the study.

Trial profile
A study flow diagram (Fig. 2) will be used to summarise the progress of participants throughout the trial.

Descriptive statistics
STATA version 15.1 (StataCorp, College Station, Texas) will be used for all analyses. Practice, GP, PN, and participant characteristics at baseline will be summarised (Tables 1 and 2). Continuous measures will be summarised using means and standard deviations or medians and interquartile ranges for skewed distributions. Categorical variables will be summarised using frequencies and percentages. Where applicable, the number of missing values will be specified and percentages for categorical variables will be based on the available data only.

Statistical modelling
Primary and secondary outcomes
Whilst our primary outcome is HbA1c at 12 months post-intervention, we will estimate the between-group difference in mean HbA1c at 6 and 12 months with the same linear mixed-effects model using restricted maximum likelihood estimation. As the data are longitudinal, HbA1c measured at baseline, 6 months, and 12 months will be included in the model as the dependent variable, and study group (intervention and control) and time of the pathology result (baseline, 6, and 12 months) will be included as fixed effects. A two-way interaction term between study group and time will be included in the model to estimate the between-group difference in mean HbA1c at 6 and 12 months, but we will constrain the estimated baseline means to be equal. The model will include random intercepts for clinic (as individuals will be clustered within clinics) and individuals (as patient measures are repeated within individuals). An unstructured variance-covariance structure will be assumed for the random effects, as correlations between measurements within individuals and correlations between measurements in participants from the same clinic are expected to be unique.
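One way to write down the model just described is given below. The notation is ours and is offered as a sketch of the stated structure rather than the authors' exact parameterisation: $Y_{ijt}$ is the HbA1c of participant $i$ in clinic $j$ at time $t \in \{0, 6, 12\}$ months, $G_{ij}$ equals 1 for the intervention group and 0 for control, $u_j$ is the clinic random intercept, and $v_{ij}$ is the participant random intercept. Omitting a main effect for $G_{ij}$ is what constrains the baseline means to be equal.

```latex
\begin{aligned}
Y_{ijt} &= \beta_0
  + \beta_1\,\mathbb{1}[t = 6]
  + \beta_2\,\mathbb{1}[t = 12]
  + \beta_3\,G_{ij}\,\mathbb{1}[t = 6]
  + \beta_4\,G_{ij}\,\mathbb{1}[t = 12]
  + u_j + v_{ij} + \varepsilon_{ijt}, \\
u_j &\sim N(0, \sigma_u^2), \qquad
v_{ij} \sim N(0, \sigma_v^2), \qquad
\varepsilon_{ijt} \sim N(0, \sigma_\varepsilon^2).
\end{aligned}
```

Under this parameterisation, $\beta_3$ is the between-group difference in mean HbA1c at 6 months (a secondary outcome) and $\beta_4$ is the difference at 12 months (the primary outcome).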
Age, index of relative socio-economic disadvantage (IRSD), and a history of severe hypoglycaemia are known to be at least moderately associated with HbA1c [9,27]. In a secondary analysis, the outcome measure will be adjusted for these potential confounders. These measures will be included as fixed effects in the model.
An intention-to-treat (ITT) approach will be used where participants will be analysed according to the study group they were assigned, and all participants will be included in the analysis, consistent with mixed model analysis [28]. The estimated mean HbA1c levels at baseline, 6 months, and 12 months will be plotted for each study group with 95% confidence intervals.
The same statistical modelling approach described for HbA1c will be used for the secondary outcomes, percentage time in target and diabetes-specific distress at 12 months. Transformations for skewed outcome measures will be considered.
Economic evaluation
A within-trial economic evaluation using participants' Medicare costs, pharmaceutical benefit schedule (PBS) costs, hospitalisation costs, self-reported costs, diabetic outcomes (proportion with controlled diabetes, HbA1c ≤ 53 mmol/mol (7%)), and quality of life data will be performed using a decision analytic framework [29]. The economic model will construct costs and quality of life associated with the health states 'controlled diabetes', 'uncontrolled diabetes', and 'death'. It will be constructed in STATA statistical software [25] based on the original trial data and will utilise linear and generalised linear modelling techniques to determine a cost per QALY gained. The analysis will be conducted from a health system and societal perspective. Costs and benefits will be bootstrapped. The distribution of costs and benefits will be simulated using a probabilistic analysis. The results of the economic modelling will be presented as the mean and 95% confidence interval (CI) of the incremental cost per QALY gained at trial conclusion for the r-CGM study group relative to the control group. Simulated cost-effectiveness will be presented for r-CGM relative to the control via a cost-effectiveness plane and a cost-effectiveness acceptability curve. Univariate and probabilistic sensitivity analyses will be performed to assess uncertainty. Projected implementation costs across Australia will also be estimated.
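The following Python sketch shows, in a much simplified form, how an incremental cost per QALY and the points of a cost-effectiveness acceptability curve can be derived by bootstrapping participant-level costs and QALYs. It is not the decision analytic model described above: the data are invented, the function names are hypothetical, and a simple non-parametric bootstrap within each study group is assumed.

```python
import random
from statistics import mean


def icer(intervention, control):
    """Incremental cost per QALY gained; each input is a list of (cost, qaly) pairs."""
    d_cost = mean(c for c, _ in intervention) - mean(c for c, _ in control)
    d_qaly = mean(q for _, q in intervention) - mean(q for _, q in control)
    return d_cost / d_qaly


def bootstrap_increments(intervention, control, n_boot=1000, seed=0):
    """Resample participants within each group; return (delta cost, delta QALY) pairs."""
    rng = random.Random(seed)
    increments = []
    for _ in range(n_boot):
        boot_int = rng.choices(intervention, k=len(intervention))
        boot_ctl = rng.choices(control, k=len(control))
        d_cost = mean(c for c, _ in boot_int) - mean(c for c, _ in boot_ctl)
        d_qaly = mean(q for _, q in boot_int) - mean(q for _, q in boot_ctl)
        increments.append((d_cost, d_qaly))
    return increments


def prob_cost_effective(increments, willingness_to_pay):
    """Share of replicates with positive net monetary benefit at a given threshold."""
    favourable = sum(willingness_to_pay * dq - dc > 0 for dc, dq in increments)
    return favourable / len(increments)


if __name__ == "__main__":
    # Invented (cost in dollars, QALYs) pairs, for illustration only.
    intervention = [(1200, 0.80), (1500, 0.78), (1100, 0.82), (1400, 0.76)]
    control = [(1000, 0.75), (900, 0.74), (1100, 0.77), (1000, 0.74)]
    print(icer(intervention, control))
    reps = bootstrap_increments(intervention, control)
    for threshold in (5000, 10000, 50000):
        print(threshold, prob_cost_effective(reps, threshold))
```

The bootstrap replicates are the points that would be plotted on a cost-effectiveness plane, and evaluating `prob_cost_effective` over a range of willingness-to-pay thresholds traces out an acceptability curve.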

Explanatory analysis
We will conduct two planned subgroup analyses for HbA1c at 6 and 12 months. In the first analysis, a two-way interaction term between history of severe hypoglycaemia (yes/no) and study group will be included in the primary analysis model to examine if there is a different intervention effect between those with a history of severe hypoglycaemia compared to those without. For the second subgroup analysis, a two-way interaction term between study group and type of HbA1c target (personalised vs general) will be added to the primary analysis model, to examine whether the intervention effect varies according to whether participants have a personalised HbA1c target that is different from the general target of 7% or not.
Results from the primary, secondary, and sub-analyses will be presented as shown in Tables 3, 4, 5, and 6. Estimates of the between-group difference for mean outcomes will be reported with their respective 95% confidence intervals and p values.

Complier average causal effect (CACE) analysis
A blinded review of compliance will be conducted by study investigators and the data management team prior to data analysis to determine whether a CACE analysis is required. If appropriate, CACE analysis will be performed on HbA1c at 12 months (primary outcome) to assess the size of the benefit of the intervention in those who comply with the intervention. Unlike a per-protocol analysis (PP), CACE analysis preserves randomisation when estimating the intervention effect [30]. This is achieved by comparing the mean HbA1c of 'compliers' in the intervention group (defined in Table 7) with a similar group of control participants who would have complied if they were offered the intervention. The outcome of the analysis is the CACE effect which represents the difference in mean HbA1c between compliers in the intervention group and their counterpart compliers in the control group.
The method assumes the same proportion of participants in the control group would have complied with the intervention if it was offered to them as those who complied in the intervention group (Table 8) [30]. Another important assumption is that mean HbA1c at 12 months is the same for non-compliers in both the intervention and control groups (x in Table 8) [30]. It is this assumption that allows the mean HbA1c of the (expected) compliers in the control group to be calculated (using the observed mean HbA1c in the control group). The CACE effect is then calculated as the difference in mean HbA1c between actual compliers in the intervention group and expected compliers in the control group. This will be reported with 95% confidence intervals.
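The calculation implied by these two assumptions can be sketched as below. The function name and the numbers are hypothetical; the sketch returns a point estimate only and does not compute the confidence interval.

```python
def cace_estimate(mean_int_compliers, mean_int_noncompliers,
                  n_int_compliers, n_int_noncompliers, mean_control):
    """Point estimate of the complier average causal effect on mean HbA1c.

    Assumes (1) the control group contains the same proportion of would-be
    compliers as the intervention group, and (2) non-compliers have the same
    mean HbA1c in both groups.
    """
    p = n_int_compliers / (n_int_compliers + n_int_noncompliers)
    if p == 0:
        raise ValueError("no compliers in the intervention group")
    # Control mean = p * (expected complier mean) + (1 - p) * (non-complier mean),
    # solved for the expected complier mean in the control group.
    expected_control_compliers = (mean_control - (1 - p) * mean_int_noncompliers) / p
    return mean_int_compliers - expected_control_compliers


if __name__ == "__main__":
    # Hypothetical 12-month HbA1c means (%), for illustration only.
    print(cace_estimate(7.5, 8.5, 80, 20, 8.2))
```

With these invented values, 80% of the intervention group comply, the expected complier mean in the control group is 8.125%, and the CACE is about −0.625%. This equals the intention-to-treat difference (7.7% − 8.2% = −0.5%) divided by the proportion of compliers (0.8), which is the usual way the estimate is expressed.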

Sensitivity analysis
The missing data patterns will be described and the drop-out rates between the two study groups will be compared. A sensitivity analysis will be performed on the primary analysis for HbA1c at 12 months to test the robustness of the missing data assumption using a pattern-mixture model. Under the mixed-effects model, missing data are assumed to be missing at random [28]. Under this assumption, the difference between the mean of the missing data and the mean of the observed data, δ, is assumed to be zero. In a pattern-mixture model, a range of plausible values for δ other than 0 will be considered, where positive values of δ would indicate that, on average, participants who have missing data have higher (worse) HbA1c than observed participants, and negative values of δ assume participants with missing data have lower (better) mean HbA1c than observed participants. Results for plausible values of δ will be examined to determine whether study conclusions change for departures from the missing at random assumption in the primary analysis.

GP general practitioner, HbA1c glycated haemoglobin, IQR interquartile range, IRSD index of relative socio-economic disadvantage (calculated using patient postcode [33]), PAID problem area in diabetes, PN practice nurse, SD standard deviation, TAFE technical and further education
a Hypoglycaemia requiring third party assistance
*Medicare is managed by the Department of Human Services and is Australia's publicly funded healthcare system funding primary health care for Australian citizens and permanent residents
*The PBS is managed by the Department of Human Services and is a list of medicines available to be dispensed to patients at a government-subsidised price. The scheme is available for all Australian residents
*Public hospitals are funded by the state, territory and Australian governments, and managed by state and territory governments. The Victorian Admitted Episodes Dataset (VAED) and the Victorian Emergency Minimum Dataset (VEMD) provide hospital costings for Victorian patients
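The role of δ in the sensitivity analysis described above can be illustrated with a deliberately simplified Python sketch. It works with group-level means rather than the mixed-effects model, takes the mean of the missing data under missing at random to equal the observed mean (as in the description of δ above), and uses invented numbers and hypothetical function names.

```python
def delta_adjusted_mean(mean_observed, n_observed, n_missing, delta):
    """Group mean when missing participants are assumed to differ from observed ones by delta."""
    n_total = n_observed + n_missing
    return (n_observed * mean_observed + n_missing * (mean_observed + delta)) / n_total


def delta_adjusted_difference(intervention, control, delta):
    """Between-group difference in means with the same delta applied in both groups.

    Each group is a (mean_observed, n_observed, n_missing) tuple.
    """
    return (delta_adjusted_mean(*intervention, delta)
            - delta_adjusted_mean(*control, delta))


if __name__ == "__main__":
    # Invented 12-month HbA1c summaries (%): (observed mean, n observed, n missing).
    intervention = (7.7, 120, 30)
    control = (8.2, 130, 20)
    for delta in (-0.5, 0.0, 0.5, 1.0):
        print(delta, round(delta_adjusted_difference(intervention, control, delta), 3))
```

With δ = 0 the sketch reproduces the observed difference; as δ moves away from zero, the estimate shifts in proportion to the difference in missingness between the groups, which is the kind of departure the sensitivity analysis is designed to examine.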

Discussion
The design effect is a multiplier applied to sample size calculations for an individually randomised trial to account for the sampling method, such as stratified or cluster randomisation. In this study, participants will be randomly allocated to study groups stratified by the clinic they attend. For stratified randomised trials the design effect is (1 − ICC), where the intraclass correlation coefficient (ICC) quantifies the correlation of outcomes within clinics. Applying this design effect to the sample size calculations will reduce the number of individuals required for the same power as an individually randomised controlled trial with no stratification when the ICC is greater than zero [31]. For this study, we chose the more conservative sample size that did not adjust for stratification by clinic; that is, the ICC was assumed to be zero, to avoid challenges associated with estimating the ICC. Randomly permuted block sizes of 4 and 6 were chosen to minimise differences in the number of participants in each study group should recruitment stop abruptly in a clinic and to ensure adequate participants in each study group for estimation of clinic effects. Random effects were chosen to model the clinic effects as we assumed clinics involved were a random sample across Victoria. Furthermore, random-effects models can perform better than fixed-effects models in terms of power and efficiency when there are a small number of participants per clinic and there are treatment assignment imbalances within clinics [32]. Lastly, the mixed-effects model includes all data observed on the subjects and satisfies the intention-to-treat principle in the presence of missing outcome data, provided the missing at random assumption holds.
This analysis plan was written prior to completion of the trial data collection phase. Analyses are pre-specified, consistent with the study objectives, and not driven by the data. An outcomes paper based on this analysis plan will be available upon completion of data collection, which is anticipated in late 2018.

Funding
Funding for this study has been provided by the National Health and Medical Research Council Project Grant (ID APP1104241). Additional funding has been provided by Sanofi Australia. Libre Pro reader devices, sensors, and software will be provided by Abbott Diabetes Care as in-kind support. Study sponsor and funders have had no role in the development of this analysis plan.

Availability of data and materials
Not applicable.

Authors' contributions
ST, JF, J-AM-N, MC, KD, and PC drafted and finalised the statistical analysis plan with critical input from the study investigators (JS, EH-T, RA, JC, IB, DO, KK, and JB). PC assisted with development of analysis methodology. All authors read and approved the manuscript for publication.

Ethics approval and consent to participate
The trial was approved by the University of Melbourne Health Sciences Human Ethics Sub-Committee on 21 June 2016 (ethics ID 164715.1). Informed consent will be obtained from all participants in the study.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Table 7 Definition of a complier for the complier average causal effect (CACE) analysis
The following four requirements must be met for a participant to be considered a complier:
1. Participant attended the educational session at baseline with the study credentialed diabetes educator (CDE)
2. General practitioner attended a face-to-face group education session or an education session with the study CDE or completed online training
3. Participant wore a continuous glucose monitoring (CGM) sensor at baseline, 3 months, 6 months, and 9 months
4. Participant attended clinic assessment visit (CAV) and discussed sensor trace at baseline, 3 months, 6 months, and 9 months