Addendum to: Can Internet-Based Sexual Health Services Increase Diagnoses of Sexually Transmitted Infections (STI)? Protocol for a Randomized Evaluation of an Internet-Based STI Testing and Results Service

[This corrects the article DOI: .].


Primary Analysis
For the primary analysis we will use multivariate imputation using chained equations (MICE), which uses the observed predictors of outcome and the predictors of loss to follow-up to impute missing outcome data. This attempts to correct for potential bias caused by missing data under the assumption that data are 'missing at random' (MAR).
Missing data will occur if:
1. Participants do not complete a 6-week follow-up questionnaire (or submit an incomplete questionnaire) and attend a different health service (ie, not a clinic in Lambeth and Southwark or SH:24).
2. Participants report testing for an STI but we are unable to access their patient records (because they did not test in a clinic in Lambeth or Southwark or at SH:24 and did not tell us where they tested, so we were unable to obtain data from the clinic where they were tested).
3. Participants are diagnosed with an STI but there is no record of them attending any clinic in Lambeth and Southwark for treatment, and they did not tell us where they obtained treatment, so we were unable to obtain data on whether or not they were treated.
We will impute each of the incomplete outcome variables using multivariate imputation using chained equations. Sexuality is also incomplete but is a baseline variable, so a missing category will be used. The propensity score for randomised allocation will be estimated for all participants using a logistic regression model with randomised group as the response, and gender, age (years), number of sexual partners in the last 12 months, sexual orientation and ethnicity as covariates. The imputation model will then contain randomised group as a covariate and will be weighted by the inverse of the estimated propensity score. The imputation model for any incomplete variable will then condition on other incomplete variables. In particular, the conditional model to impute testing according to clinic data will include self-reported testing; and the model to impute treated STI will include diagnosis of STI. One hundred imputed datasets will be generated. Multiple imputation inference will then proceed via Rubin's rules [1].
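The pooling step under Rubin's rules can be sketched as follows. This is a minimal illustration, not the trial's analysis code: the function name pool_rubin and the example numbers are hypothetical, and it assumes that each imputed dataset has already produced one point estimate (for example a risk difference) and its estimated variance.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates using Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical values from three imputed datasets (the plan specifies 100).
pooled, total = pool_rubin([0.10, 0.12, 0.14], [0.0004, 0.0004, 0.0004])
print(pooled, total ** 0.5)  # pooled estimate and its standard error
```

The total variance adds the between-imputation component, inflated by (1 + 1/m), to the average within-imputation variance, so the uncertainty due to the missing data is carried into the confidence interval.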
We will account for baseline factors (gender, age, number of sexual partners in the last 12 months, sexuality and ethnicity) by weighting on the inverse of the propensity score, which we will estimate by logistic regression. This will allow us to obtain more precise estimates and confidence intervals with the correct coverage.

Sensitivity to Missing Outcome Data
We will perform a sensitivity analysis to explore departures from the missing-at-random (MAR) assumption. We will multiply impute missing outcome data, using inverse probability weighting on the estimated propensity score and with allocated group and whether or not participants report having been tested as covariates. The odds of STI diagnosis and the odds of a completed STI test for participants with missing outcomes will be set to ¼, ½, 2 and 4 times the odds implied by the MAR analyses.
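To make the odds multipliers concrete, the sketch below converts a probability imputed under MAR into the probability implied when its odds are multiplied by a given factor. The function name shift_probability and the example value of 0.20 are hypothetical; in a full analysis the adjustment would be applied within the imputation model for participants with missing outcomes rather than to a single number.

```python
def shift_probability(p_mar, odds_multiplier):
    """Return the probability whose odds equal odds_multiplier times the odds of p_mar."""
    if not 0.0 < p_mar < 1.0:
        raise ValueError("p_mar must lie strictly between 0 and 1")
    if odds_multiplier <= 0:
        raise ValueError("odds_multiplier must be positive")
    odds = p_mar / (1.0 - p_mar)            # probability -> odds
    shifted = odds * odds_multiplier        # departure from MAR
    return shifted / (1.0 + shifted)        # odds -> probability


# Hypothetical MAR probability of 0.20, varied over the four multipliers in the plan.
for k in (0.25, 0.5, 2, 4):
    print(k, round(shift_probability(0.20, k), 3))
```

A multiplier of 1 returns the MAR probability unchanged, so the four scenarios bracket the primary analysis on both sides.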
The risk difference and risk ratios weighted by inverse propensity score will be reported alongside proportions.
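As an illustration of how a weighted risk difference and risk ratio might be computed in one completed dataset, the sketch below uses normalised inverse propensity weights. It is an assumption-laden example rather than the trial's code: the function name weighted_risks, the coding of arms as 1 and 0, and the toy data are all hypothetical, and the propensity scores are taken as already estimated.

```python
def weighted_risks(outcomes, arms, propensities):
    """Inverse-propensity-weighted risk difference and risk ratio.

    outcomes: 1 if the outcome occurred, else 0
    arms: 1 or 0 for the two randomised groups (hypothetical coding)
    propensities: estimated probability of allocation to arm 1
    """
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for y, a, e in zip(outcomes, arms, propensities):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Weight by the inverse probability of the arm actually received.
        w = 1.0 / e if a == 1 else 1.0 / (1.0 - e)
        num[a] += w * y
        den[a] += w
    risk1 = num[1] / den[1]
    risk0 = num[0] / den[0]
    # The ratio is undefined if no weighted events occur in arm 0.
    return risk1 - risk0, risk1 / risk0


# Toy data: six participants, equal allocation probability.
rd, rr = weighted_risks([1, 1, 0, 1, 0, 0], [1, 1, 1, 0, 0, 0], [0.5] * 6)
print(rd, rr)
```

With multiply imputed data, these quantities would be computed in each imputed dataset and then combined across imputations.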

Subgroup Analyses
To explore heterogeneity of the intervention effect on our primary outcomes, we will test for interaction at a 5% level of significance to assess whether effectiveness varies by:
We will test for linear interaction for deprivation (centiles of overall UK Indices of Multiple Deprivation ranks) using a log binomial model. These analyses will be conducted in the complete cases under a missing-at-random assumption. As with the primary analyses, they will be weighted by the inverse of the estimated propensity score. When a subgroup variable is one that appears in the propensity score (defined above), we will re-estimate the propensity score omitting the subgroup variable.
Intervention effect estimates by subgroups will be presented in a forest-type plot. Given that the study is not powered to test for interactions, these analyses will be treated as exploratory and the statistical significance of the interactions will be interpreted with caution.

Secondary Outcomes
The primary analysis of the following secondary outcome will follow the same principles as the analysis of our co-primary outcomes described above:
• The proportion of participants who are prescribed treatment in each arm
For our time-to-event secondary outcomes we will conduct the following analyses: We will use survival analysis to estimate time from randomisation to (1) test completion and (2) treatment. For each measure we will estimate the restricted mean survival time (RMST), setting the restriction time t* = 6 weeks (42 days) for time to test and t* = 3 months (84 days) for time to treatment.
The RMST will be estimated from a "3df/1df" Royston-Parmar model, and the difference in RMST between arms will be estimated.
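For readers unfamiliar with the RMST, it is the area under the survival curve from randomisation up to t*. The sketch below computes that area from a Kaplan-Meier curve as a simple nonparametric stand-in. It is not the Royston-Parmar flexible parametric model specified in this plan, and the function name km_rmst and the toy data are hypothetical.

```python
def km_rmst(times, events, t_star):
    """Area under the Kaplan-Meier curve from 0 to t_star.

    times: days from randomisation to event or censoring
    events: 1 if the event (eg, test completion) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0      # current Kaplan-Meier survival estimate
    area = 0.0      # accumulated area under the step function
    last_t = 0.0
    i = 0
    while i < len(data) and data[i][0] <= t_star:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Handle all participants sharing the same time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        area += surv * (t - last_t)
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
        at_risk -= leaving
        last_t = t
    area += surv * (t_star - last_t)  # final rectangle up to t_star
    return area


# Toy data for one arm: three events and one censored observation, t* = 42 days.
print(km_rmst([10, 20, 30, 50], [1, 1, 0, 1], 42))
```

Here the RMST is the mean number of days, out of the first 42, spent without having completed a test, so a lower value in one arm indicates faster testing. The between-arm difference is the quantity of interest.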

Process Outcomes
For the following process outcomes we will summarise:
• The proportion of participants who are diagnosed with an STI, by service type
These estimates will be summarised by arm but there will be no comparison of groups.
Acceptability will be constructed as a binary variable, derived from 4 questions. A score of 8 will be coded as 1 (acceptable); a score <8 will be coded as 0 (not acceptable).
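The coding rule for acceptability can be written as a short function. This sketch assumes that the score is the sum of the answers to the 4 questions and that 8 is the maximum attainable total. Neither assumption is stated in this addendum, and the function name code_acceptability is hypothetical.

```python
def code_acceptability(item_scores, threshold=8):
    """Return 1 (acceptable) if the total score reaches the threshold, else 0.

    item_scores: the four question scores (assumed to be summed into a total)
    threshold: total required to be coded acceptable (8 in this plan)
    """
    if len(item_scores) != 4:
        raise ValueError("acceptability is derived from exactly 4 questions")
    total = sum(item_scores)
    # Assumes 8 is the maximum total, so ">=" and "== 8" coincide.
    return 1 if total >= threshold else 0


print(code_acceptability([2, 2, 2, 2]))  # total of 8
print(code_acceptability([2, 2, 2, 1]))  # total of 7
```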