Foot thermometry with mHealth-based supplementation to prevent diabetic foot ulcers: A randomized controlled trial [version 2; peer review: 3 approved with reservations]

Background: Novel approaches to reduce diabetic foot ulcers (DFU) in low- and middle-income countries are needed. Our objective was to compare the incidence of DFUs in the thermometry plus mobile health (mHealth) reminders (intervention) vs. thermometry-only (control) arms.
Methods: We conducted a randomized trial enrolling adults with type 2 diabetes mellitus at risk of foot ulcers (risk groups 2 or 3) but without foot ulcers at the time of recruitment. Participants were allocated to control (instruction to use a liquid crystal-based foot thermometer daily) or intervention (same instruction supplemented with text and voice messages with reminders to use the device and messages to promote foot care) groups, and followed for 18 months. The primary outcome was time to occurrence of DFU. A process evaluation was also conducted.
Results: A total of 172 patients (63% women, mean age 61 years) were enrolled; 86 to each study group. More patients enrolled in the intervention arm had a history of previous DFU (66% vs. 48%). Follow-up for the primary endpoint was complete for 158 of 172 participants (92%). Adherence to ≥80% of daily temperature measurements was


Background
The prevalence of type 2 diabetes mellitus in the adult population worldwide has doubled from 4.7% in 1980 to 8.5% in 2014 1 . Low- and middle-income countries (LMICs) are disproportionately affected by diabetes, since diabetes-related complications, such as diabetic foot ulcer (DFU), are more frequent in these contexts 1,2 . In the US, 60%-70% of people with diabetes will develop peripheral neuropathy 3 . This is important since one in four patients with peripheral neuropathy will develop a DFU, which significantly increases the risk of foot amputation 4 .
Thermometry is a tool that can identify early signs of foot inflammation, thus providing early signals to enact management and reduce the incidence of DFU and amputation 5 . Three previous clinical trials [6][7][8] and one systematic review have found that the use of thermometry reduced DFU incidence four- to ten-fold among individuals with diabetes at high risk of developing a DFU 9 . Additionally, one study found that adding counselling to promote self-monitoring of skin temperature to standard care is feasible 10 . However, the benefits of thermometry depend on patient adherence to self-assessment, and foot temperature should be evaluated on at least half of the days to effectively reduce the risk of foot ulceration 7 . Yet adherence could be challenging, especially in LMIC settings. Therefore, novel approaches to improve adherence to self-managed thermometry are needed. In this context, interventions using short message service (SMS) for diabetes management have been found to be useful to improve self-efficacy, social support 11 , and clinical diabetes-related outcomes 12 .
Other approaches that could prevent foot ulcers include patient's foot self-care behaviour, annual foot evaluations, knowledge about diabetic foot in health care workers, and therapeutic footwear 13 . Also, in order to prevent recurrent ulcers, it is important to consider the integration or combination of these approaches 14 .
We propose to evaluate the efficacy of a combination of foot thermometry plus mobile health (mHealth)-delivered reminders, using SMS and voice messaging, in reducing DFU in Peru. Our objective was to compare incidence of DFU in the thermometry plus mHealth reminders intervention arm vs. thermometry-only control arm.

Trial design
This was a physician- and evaluator-blinded, 18-month, randomized clinical trial with two parallel arms and a 1:1 allocation. Details of the intervention and the study protocol have been published elsewhere 15 . We followed the extension of the CONSORT 2010 statement for reporting pragmatic trials 16 .
Although we initially planned to follow participants for 12 months, we decided to extend the follow-up period to 18 months to accrue enough DFU events, as we noticed that the frequency of DFU at six months was lower than expected. Thus, only the length of the trial follow-up was changed, without affecting randomization or assessment rates. There were no other deviations from the original trial protocol.

Participants
Participants were recruited at the outpatient clinics of two third-level public hospitals in Lima, Peru: Hospital Nacional Cayetano Heredia and Hospital Nacional Arzobispo Loayza. In some cases, physicians referred the patient to the study fieldworkers for a foot evaluation; in other cases, fieldworkers actively searched for potential participants in the waiting room of the Endocrinology clinic.
Patients were eligible if they: had a diagnosis of type 2 diabetes mellitus; were between 18 and 80 years of age; were in risk group 2 or 3 using the diabetic foot risk classification system as specified by the International Working Group on the Diabetic Foot ([IWGDF], neuropathy and deformity = category 2, history of ulcer and/or amputation = category 3) 17-19 ; had a palpable dorsalis pedis pulse in both feet; had an operating cell phone or a caregiver with an operating cell phone; and had the ability to provide informed consent. Patients were considered not eligible if they had current foot ulcers, active Charcot osteoarthropathy, severe peripheral arterial disease, or foot infection.
Our eligibility criteria used IWGDF categories and included people with diabetes in risk groups 2 and 3. Rather than focusing only on those at the highest risk of ulceration (IWGDF group 3), we pursued a pragmatic approach to the prevention of DFU among people with diabetes and therefore also included participants in IWGDF group 2. Previous studies included mostly participants from IWGDF group 3, and only one clinical trial included group 2 patients.

Amendments from Version 1
We have edited the manuscript, adding or providing more detailed information on:
Introduction, second paragraph. More previous evidence related to thermometry.
Introduction, third paragraph. Other approaches that could prevent diabetic foot ulcers.
Methods, participants. Clarification about the inclusion criteria.
Methods, development and validation of mHealth messages. A specific section for this text was generated, for clarity purposes, as this was carried out prior to the intervention.
Methods, intervention. Detailed information about videos content and previous evidence about TempStat™.
Methods, randomization. More information about the randomization process.
Results, primary outcome. Information about the percentages of people that reported alarm signs in their logbooks.
Discussion, comparison to previous studies. Discussion about low rate of DFU incidence in the study population and more evidence about mHealth "negative" trials.
Discussion, limitations. Expanded in the manuscript.
Conclusion. Refinement of the conclusion.

Development and validation of mHealth messages
The content of the mHealth messages was developed and validated with 19 people with type 2 diabetes mellitus. Messages were tested using short open surveys to evaluate their clarity and appropriateness. The messages were constructed based on a literature review about the characteristics of health education messages, paired with advice from a specialist in health communication, taking into consideration the reading level of our population and the use of short messages focused on a single idea. We also asked colleagues with previous experience in the use of SMS and mHealth to review the messages before testing them with patients, and changes were introduced after their review.
We printed all the messages on a single page, which was given to each participant to read by themselves. Afterwards, we evaluated each message using the following six questions: (1) Is the message clear?, (2) Could you tell me how you would explain the content of the message to another person?, (3) Are there any words that are difficult to understand?, (4) Is there something that you do not like about the message?, (5) Do you have any suggestions to improve the message?, and (6) Would you prefer to be addressed in a formal way "usted" or an informal way "tu"? (see Extended data 20 ).

Interventions
At the initiation visit, all participants received education about foot care, i.e. etiology and risk factors for the development of neuropathy and ulcers, as well as recommendations for foot care practices and early signs of ulceration; and instructions for the use of the TempStat™ device (see Extended data). This foot care education was delivered through three videos that were validated by physicians and patients with type 2 diabetes mellitus. The first two videos lasted 8 and 6 minutes and covered foot care, whereas the third video lasted 6 minutes and presented the instructions for using the TempStat™ device. The three videos were in Spanish and were shown once at the initiation visit, as detailed elsewhere 15 . The device uses liquid crystal technology to provide a visual image of the temperatures (e.g. a yellow image represents a higher temperature than a blue image) (Figure 1). Frykberg et al. 21 showed that TempStat™ can detect alarm signs, represented by a yellow color change, and that the results correlate positively with temperature findings from an infrared thermometer, the gold standard of thermometry devices. Another study found that the device identified 74% of serious foot problems 22 .
One week after enrollment, the TempStat™ was provided to each participant. Fieldworkers instructed the participants to use the device daily and to contact them by phone or SMS if one of the alarm signs appeared in the pads of the TempStat™: two different colors in the contralateral areas of the feet or a yellow spot in any area for two consecutive days. In these cases, the nurse asked about any lesions in the feet as well as the participant's activity in the last two weeks and provided recommendations on how to decrease activity until foot temperature normalized. Also, in cases where the alarm sign persisted more than one week, an in-person evaluation was performed to assess the patient for infection and/or a masked injury. Additionally, participants were trained to contact the study nurse in cases of dermal lesion of the foot and they were asked to be evaluated promptly by a nurse who was blind to the intervention. When a DFU was confirmed, the study nurse referred the patients to follow the standard protocol.
In the intervention arm, in addition to the TempStat™, participants received the mHealth component (reminder messages and foot-care promotion messages) for the 18-month study period via both SMS and voice messaging.
The developed and validated messages 23 were sent at approximately 8 am. For the first two weeks of the intervention, reminders to use the TempStat™ were sent daily (Monday to Friday). Thereafter, for the remaining 76 weeks, patients received only two messages per week at the same time of day, with the content alternating between reminders to use the TempStat™ and promotion of foot care (one SMS and one voice message). Messages were delivered to the participant's or caregiver's cell phone through an automated software system developed by the study team (see Software availability 24 ). Every week the study coordinator checked that the system was functioning.
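The schedule described above can be tallied with a short sketch. The function name and its parameters are hypothetical; the sketch assumes five weekday reminders in each of the first two weeks and two messages per week for the remaining 76 weeks, and it counts planned messages only, not messages actually delivered.

```python
def planned_messages(intro_weeks=2, intro_per_week=5,
                     later_weeks=76, later_per_week=2):
    """Count messages planned per participant under the stated schedule."""
    intro = intro_weeks * intro_per_week   # Monday-to-Friday reminders
    later = later_weeks * later_per_week   # one SMS and one voice message per week
    return intro, later, intro + later

intro, later, total = planned_messages()
print(intro, later, total)  # 10 152 162 under this reading of the schedule
```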

Study procedures
At baseline, enrolled participants provided information to the fieldworker through questionnaires on lifestyle, history of cardiovascular disease and diabetes, current diabetes treatment, use of insoles, use of orthopedic shoes and mobile phone literacy and underwent a demographic evaluation (age, gender, educational level), socioeconomic evaluation (working status), depression assessment (Patient Health Questionnaire-9), anthropometric evaluation (weight, height and body mass index) and blood pressure measurements (see Extended data 25 ).
Periodic assessments of the participants, involving a general checkup and lower extremity evaluation, were conducted every two months by the nurse evaluator. Additionally, the nurse collected data about diabetes treatment, caregiver presence, and use of insoles and/or orthopedic shoes, and measured the participants' weight and blood pressure (Extended data 25 ). When participants could not attend the hospital for the checkup, we completed the visit by phone or by home visit. At the last visit, at 18 months, participants were asked to return their logbook of temperature measurements. In general, participants were encouraged to maintain regular visits with their treating physician in the outpatient clinic.
Glycated hemoglobin (HbA1c) was measured at baseline and at six, 12 and 18 months. Measurements at baseline and 18 months were used for the study, and measurements at six and 12 months were for standard of care. HbA1c was measured using high-performance liquid chromatography (D10, BioRad, Munich, Germany). The blood sample was collected in the endocrinology clinic by the nurse evaluator during the periodic assessment at the time points specified above. All samples were transported to a single facility for analysis, were checked with regular external standards and internal duplicate assays, and were monitored by BioRad for quality control.

Outcomes
The primary outcome was DFU. The definition was based on the American Diabetes Association criteria 26,27 and, for this study, the outcome was the presence of a DFU occurring at any point during the 18-month study period after randomization. The evaluator was a trained nurse blinded to the intervention allocation. A DFU could be identified in three ways: during the clinical nurse evaluations held every two months; when an alarm sign had been noted and prompted the participant to seek clinical evaluation; or when the participant identified a dermal lesion and sought clinical evaluation.
The following were pre-defined as secondary outcomes: adherence to daily temperature measurement, defined as the participants having recorded their temperature measurements in the logbook on ≥80% of days, and ≥1% reduction in HbA1c when comparing the 18-month with baseline values. Another outcome was alarm signs registered in the logbook.
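As a minimal sketch of how the adherence outcome defined above could be derived from a logbook, assume each logbook is reduced to the number of days with a recorded measurement and the number of days of follow-up. The function name and the example numbers are hypothetical and are not study data.

```python
def is_adherent(days_recorded, days_followed, threshold_pct=80):
    """True if measurements were recorded on at least threshold_pct% of days."""
    if days_followed <= 0:
        raise ValueError("days_followed must be positive")
    # Integer comparison avoids floating-point rounding at the cut-off
    return 100 * days_recorded >= threshold_pct * days_followed

# Hypothetical participants followed for 540 days (about 18 months)
print(is_adherent(432, 540))  # True: exactly 80% of days
print(is_adherent(431, 540))  # False: just under 80%
```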
Our protocol 15 considered one additional pre-defined secondary outcome: frequency of alarm signs reported to the study nurse. This was not analyzed because of its low frequency. The dose-response analysis of SMS and voice messaging, pre-specified as a secondary outcome in the protocol, was included as part of the process evaluation.

Sub-group analyses
Our a priori sub-group analyses were i) previous foot ulceration and ii) caregiving status, considering assistance provided to the patient with basic activities of daily living, or in the identification, prevention, or treatment of diabetes or any disability. Also, within the intervention-arm only, the type of recipient of the messaging (patient vs. caregivers) was considered for sub-group analyses. In our protocol 15 , we also considered sub-group analyses of participants that use insoles and/or orthopedic shoes, but these were not analyzed due to low frequency.

Sample size
The sample size was estimated using data from previous randomized trials in study populations similar to ours 7,8 . We expected an absolute difference of 21 percentage points between the intervention arm and the control arm (9% vs. 30%); with a power of 0.9 and an alpha of 0.05, we required a sample size of 78 participants per arm. We planned to enroll 86 participants in each study arm, anticipating a 10% dropout rate.
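The arithmetic behind a calculation of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: it uses the simple normal approximation for two proportions (two-sided test, unpooled variance, no continuity correction), and the function names are hypothetical. This simple formula gives a smaller number than the 78 reported, which suggests that a different formula or software option was used; the source does not say which. The dropout allowance assumes the required number was multiplied by 1.10.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control, p_intervention, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm for two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    delta = p_control - p_intervention
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

def inflate_for_dropout(n, dropout=0.10):
    """Add a proportional allowance for dropout, rounded to the nearest integer."""
    return round(n * (1 + dropout))

print(n_per_arm(0.30, 0.09))    # 70 under this approximation
print(inflate_for_dropout(78))  # 86, matching the planned enrollment per arm
```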

Randomization
Randomization was stratified by hospital site, and the random allocation sequence was generated in blocks of six. Sealed envelopes containing the randomization codes were used. An independent researcher prepared the envelopes, and the study nurses assigned the codes to each of the enrolled participants. Separately, the study coordinator was responsible for opening the envelopes and informing participants of their allocation to the intervention or control arm as per the random list. The nurse/independent evaluators were not aware of the patient's group allocation.
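A permuted-block sequence of the kind described above can be sketched in a few lines. This is illustrative only: the source does not describe the software used to generate the list, and the function name, site labels, seeds and number of blocks are placeholders.

```python
import random

def block_sequence(n_blocks, block_size=6,
                   arms=("control", "intervention"), seed=None):
    """Permuted-block allocation sequence with equal arms within each block."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence

# One independent sequence per hospital site (stratum)
allocation = {site: block_sequence(n_blocks=15, seed=i)
              for i, site in enumerate(["site_A", "site_B"])}
print(allocation["site_A"][:6])
```

Because every block of six contains three allocations to each arm, the arms stay balanced within each site even if recruitment stops partway through the list.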

Blinding
The participants were instructed not to discuss their treatment assignment with the blinded evaluator. Physicians providing care to study participants, nurses and the field coordinators were blind to treatment allocation.

Process evaluation
Additionally, we performed a process evaluation during the 18-month follow-up visit with a random group of participants from the two study sites. We obtained information through a set of questions and direct observation of the use of the TempStat™ with 102 participants. In addition, with 39 participants, we asked closed and open questions about the messages received in the week prior to the 18-month follow-up visit. As part of this process evaluation, we aimed to know: i) if participants knew how to use the TempStat™; ii) how many SMS and voice messages were sent by the automated system to study participants; iii) how many SMS and voice messages were received by study participants according to the automated system; iv) if participants understood the messages (only if participants reported that they had received a message in the previous two weeks); and v) the participants' preferences regarding SMS vs. voice messages.
The process evaluation was performed by two fieldworkers different to those who delivered the intervention and data collection was conducted through observation (participants were asked to show how they used the TempStat™), questionnaire (about nursing consultation, report of communication with study nurses, reasons for communication, alarm sign detection) and open questions (related to SMS or voice messaging preferences, use of TempStat™, suggestions about how to improve the intervention) 25 .

Statistical methods
To compare the rates of DFU between study arms we performed a time-to-event analysis using Cox regression, with time to DFU within 18 months as the outcome. Hazard ratios (HR) and their respective 95% confidence intervals (95% CI) were estimated for the primary outcome of DFU and for the a priori defined sub-group analyses. These analyses included all retained participants, regardless of the number of visits attended, following the intention-to-treat principle. The model was adjusted for site and history of previous ulcer. Secondary outcomes of interest were evaluated using logistic regression analysis to calculate odds ratios (OR) and 95% CI. Data analysis was conducted in Stata version 14.0 (StataCorp, College Station, TX, USA).
For the process evaluation, frequencies and percentages are presented. Also, answers to open-ended questions were transcribed, a codebook was created, and themes were derived from the data. Coding was performed manually and patterns of answers are described.
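To make the time-to-event comparison concrete, the sketch below fits a Cox model with a single binary covariate (study arm) by Newton-Raphson on the partial likelihood, using Breslow handling of ties. It is a simplified illustration, not the authors' Stata analysis: it is unadjusted (no site or previous-ulcer terms), the function name is hypothetical, and the follow-up data are invented for the example.

```python
from math import exp, sqrt

def cox_single_covariate(times, events, x, n_iter=25):
    """Return (hazard ratio, 95% Wald CI low, high) for one covariate."""
    beta = 0.0
    for _ in range(n_iter):
        grad, info = 0.0, 0.0
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue  # censored records enter only through risk sets
            # Risk set: everyone still under observation at time t_i
            risk = [xj for tj, xj in zip(times, x) if tj >= t_i]
            s0 = sum(exp(beta * xj) for xj in risk)
            s1 = sum(xj * exp(beta * xj) for xj in risk)
            s2 = sum(xj * xj * exp(beta * xj) for xj in risk)
            grad += x_i - s1 / s0                  # score contribution
            info += s2 / s0 - (s1 / s0) ** 2       # information contribution
        step = grad / info
        beta += step
        if abs(step) < 1e-10:
            break
    se = 1 / sqrt(info)
    return exp(beta), exp(beta - 1.96 * se), exp(beta + 1.96 * se)

# Invented data: months to DFU or censoring, event flag, arm (1 = intervention)
times = [2, 4, 5, 7, 9, 12, 14, 18, 18, 18, 18, 18]
events = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0]
arm = [1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0]

hr, low, high = cox_single_covariate(times, events, arm)
print(f"HR {hr:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With so few events the interval is very wide, which mirrors the power concern raised for the trial itself. An adjusted analysis would need a multivariable fit in a statistics package.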

Ethics
The study protocol, informed consent templates, and questionnaires were reviewed and approved by the Institutional Review Board (IRB) at Universidad Peruana Cayetano Heredia (UPCH) in Lima, Peru (SIDISI 61482). In addition, participating hospitals (Hospital Cayetano Heredia and Hospital Nacional Arzobispo Loayza) in the study received the protocol and consent form for approval 16 . The extension in the follow-up period was also approved by the IRB at UPCH and the participants re-consented. The fieldworker explained the study procedures, then the potential participant read the informed consent form and asked questions. After that, if they accepted, they signed the informed consent form. The trial was registered at ClinicalTrials.gov with the identifier NCT02373592 (27/02/2015).

Results
The recruitment was conducted between October 2015 and March 2016 and the follow-up period lasted until October 2017.
In total, 416 participants were screened and 214 were eligible for the study. Of these, 192 gave informed consent and 172 attended the initiation visit and were allocated to the control (n=86) or intervention (n=86) arms (Figure 2). In each arm, 79/86 (91.9%) participants completed the 18-month follow-up. Reasons for loss to follow-up included migration back to the participant's place of origin, wrong or incomplete addresses, or the participant not answering contact phone calls.

Baseline characteristics
The baseline characteristics were similar between the intervention and control arms, with few exceptions (Table 1). A history of previous foot ulcers was reported more frequently in the intervention arm: 65.9% vs. 48.2% in the control arm (p-value 0.02). Mean HbA1c was 8.9% in the intervention arm and 8.2% among the controls (p-value 0.03). In terms of mHealth literacy, there were no major differences between study arms, except that participants in the intervention arm more frequently reported never having problems with cellphone coverage (89.5% vs. 74.4% in the control arm, p-value 0.01).

Primary outcome
The cumulative incidence of DFU in the entire sample was 17.7% (28/158), and it was higher among participants with a history of previous ulceration (27.8%, 25/90) 28 .
The incidence of DFU was 11.4% (95% CI 5.2%-21.6%) in the control arm and 24.1% (95% CI 14.5%-37.6%) in the intervention arm. Compared to the thermometry-only control arm, the adjusted hazard ratio (aHR) of DFU in the thermometry + mHealth intervention arm was 2.12 (95% CI 0.96-4.68) when adjusted by site, and 1.44 (95% CI 0.65-3.22) when adjusted by site and previous foot ulceration (Table 2). The incidence of DFU in participants with previous foot ulceration was 23.7% (9/38) in the control arm and 30.8% (16/52) in the intervention arm, whereas in participants without previous foot ulceration, incidence was 0% (0/38) in the control arm and 7.7% (2/26) in the intervention arm. Four participants did not have information on their previous foot ulceration status (three from the control arm and one from the intervention arm).

Secondary outcomes
The frequency of ≥80% of adherence to daily temperature measurement was 87.2% (103/118) among the study participants that returned the logbook. There was no evidence of a difference between study arms in the secondary outcomes of adherence to daily temperature measurements or reduction of HbA1c (Table 2). Also, we found that 41% of the participants recorded an alarm sign in their logbooks. Additionally, 67% of the participants that presented an ulcer also reported an alarm sign in their logbook.

Sub-group analyses in intervention vs. control arms
No effects of the intervention were found in the a priori defined sub-groups. Among participants who did not have a caregiver (n=96), the aHR of developing a DFU was 3.34 (95% CI 0.94-11.92), adjusted by site and previous ulcer. Other results for sub-group analyses are shown in Table 3.

Sub-group analysis within the intervention arm
Participants were grouped according to the recipient of the mHealth reminders: the participants themselves (45/86) or the caregiver (41/86). We found no evidence of a difference in DFU incidence between these two groups in crude (HR 1.09, 95% CI 0.44-2.70) or adjusted analyses (aHR 1.72, 95% CI 0.65-4.54, adjusted by site and previous ulcer).

Process evaluation indicators
Some process evaluation indicators for TempStat™ use and understanding of the messages are shown in Table 4 and Table 5. This data was obtained at the 18-month follow-up visit 29,30 .

Dose of the mHealth component.
The total number of messages to be sent to the patients in the intervention group during

[...] preference because they had difficulty reading text messages on the cell phone screen. Other participants with this preference mentioned that they have quicker access to the information with a voice message. Those who preferred SMS for reminders cited the fact that SMS can be read at their convenience. Some mentioned that they prefer SMS because they don't want to have to listen for phone calls and/or pay attention to their phone at certain times.
Some participants commented that regardless of the reminder system (SMS or voice messaging), it was necessary to receive help from other people to read or listen to the messages. Their children were most commonly cited as the people to whom the participants would turn for help.

Use of TempStat™.
Some participants mentioned that they had some periods during which they did not use the device. Among the reasons provided were that the device had technical problems or because they did not have the logbook to record their measurements.

Suggestions.
Among the suggestions to improve the device and its use, technical comments were the most common. Participants mentioned that they preferred a smaller size and lighter weight device. Furthermore, of the 8% of participants that had to replace the TempStat™ because of technical problems, some mentioned that an improved design could increase the lifetime of the device. Additionally, participants found the reinforcement of the logbook and device utilization by the nurses to be very important, and some commented that more frequent communication with the nurse could improve compliance with device use.

Main findings
This study was designed to compare the 18-month incidence of DFU between those receiving thermometry + mHealth reminders versus thermometry-only. The uptake of thermometry was high in this study: nearly 90% of the participants who returned the logbook achieved ≥80% of the daily foot temperature measurements. At baseline, we unexpectedly found a higher prevalence of previous foot ulceration in the intervention arm, and the incidence of DFU was higher in this arm. In our study, conducted in a low-income setting, the addition of mHealth was not effective in reducing foot ulceration or increasing adherence to thermometry after 18 months of follow-up. However, these results need to be interpreted with caution, as the expected incidence rates of DFU used in our sample size calculations were not met and there was a higher rate of previous DFU in the intervention group.

Comparison to previous studies
In our cohort, according to the process evaluation results, adherence to temperature measurement was good, knowledge of the procedures for using the TempStat™ was fair (some steps had fewer than 50% correct answers) and correct alarm sign detection was good (81%). One previous study using thermometry found that 80% of participants who developed an ulcer did not comply with 50% of the temperature assessments, in contrast with the group that did not develop an ulcer, where 92% of participants recorded their foot temperatures at least half the time 7 .
Also, in our results, 41% (44/108) of the study participants recorded alarm signs for two consecutive days in their logbooks, but only 9/44 (20%) had a record of reporting an alarm sign to the study nurse. These figures do not account for those with alarm signs who did not seek nurse support or those who did report to the nurse but whose report was not recorded.
The low rate of ulceration in our study could potentially be explained by two factors. First, participants may have followed the instructions to reduce physical activity when observing alarm signs, even when the signs did not last two consecutive days or when they did not seek or receive feedback from the study nurse. This is plausible because the recommendations about reducing foot pressure and physical activity were given at the beginning of the study (videos) and were also printed in the logbooks. Second, the frequent assessment of participants by the study nurse, every two months, may have played a role among all study participants, including the control group. These two factors could have contributed to the lack of effect of the mHealth component in reducing foot ulceration.
Health interventions using SMS for diabetes management have been found to be useful for improving self-efficacy and social support 11 , as well as clinical diabetes-related outcomes 12 . However, most mHealth studies were conducted in high-income countries, with a young population and with outcomes related to HbA1c measurements or questionnaires, without evaluating patient-important outcomes like mortality, complications or quality of life. Despite the perceived benefit of mHealth in the elderly population 31 , very few studies with this population have been conducted in LMICs. Our automated system delivered >75% of the messages to only two-thirds of the participants and it did not have a human support component, factors that may have affected effective engagement with the mHealth intervention 32,33 . For example, a previous study using tailored motivational phone calls followed by SMS in people with pre-hypertension found a larger effect on bodyweight and waist circumference reduction in participants who received ≥75% of the calls 34 . Additionally, our system was automated and did not allow direct two-way communication. In a previous qualitative study from Canada, conducted to explore the views of patients on using mHealth to monitor and prevent DFU 35 , patients expressed interest in a two-way communication system to facilitate sharing medical data, scheduling appointments and using alerts to get access to medical attention. Also, a recent publication evaluating 17 systematic reviews of mHealth intervention studies in diabetes and obesity 36 showed that fewer than half of the studies included in 2 reviews (out of 7 systematic reviews that covered the topic) improved diabetes management practices or medication adherence 37,38 , and recommended the use of valid outcome measures and rigorous study designs to improve study quality.
Finally, compared to previous mHealth studies where the focus has been on laboratory parameters or questionnaires 39,40 , we measured the impact of mHealth on DFU, an outcome of patient importance.

Limitations and strengths
Our study has some limitations. At baseline, participants assigned to the intervention arm were at higher risk of DFU, and the ulceration rate observed in the study was lower than expected. Together these limited the accrual of sufficient DFU events despite extending the study from 12 to 18 months. Also, we did not collect information about the time since the most recent wound healed. Recent research suggests ~10% of wounds recur within a month and 40% within a year of entering diabetic foot remission. In addition, adherence to foot temperature measurements was self-reported, and adherence to the recommendations to reduce physical activity was not recorded, so we were unable to characterise certain behaviours of direct relevance to our DFU outcome. Our sample size calculations assumed an absolute difference in DFU of 21 percentage points between the intervention and the control arm (9% vs. 30%), using data derived from studies in high-income countries, and these assumptions differed from the incidence of DFU observed in our trial. Hence, it is possible that our study was underpowered to detect the expected effects. Finally, it is possible that those who did not return the logbook (~30%), where alarm signs were to be recorded, were less conscientious about foot temperature measurements and thus had lower rates of adherence to thermometry.
The study also has some strengths; namely, it is a practical and pragmatic trial, well protected from bias, measuring an outcome of importance to patients and inclusive of low-income patients over 60 years old attending public hospitals in a middle-income country.

Relevance to public health
The experience of introducing a device to engage with self-care behaviors for the prevention of DFU in a LMIC setting showed good adherence rates in both study arms, nearly reaching 90%, signaling that mHealth had little room to further exert an impact. Future studies could pre-select participants with low adherence and explore whether mHealth is a useful supplement to prevent DFU.
Such DFU prevention efforts may be difficult to sustain in routine clinical settings, yet this study demonstrates that adequate promotion of foot care can be achieved.

Conclusions
In this randomized trial, conducted in a LMIC setting, the uptake of the foot thermometry for the prevention of foot ulcers was 87% in the intervention and control groups, and the addition of mHealth was not effective in reducing foot ulceration or increasing adherence to thermometry after 18 months of follow-up. However, these results need to be interpreted with caution as the expected rates of DFU used in our sample size calculations were not met and there was a higher rate of previous DFU in the intervention group.

Acknowledgments
Our acknowledgments go to the fieldworkers Carmen Cisneros, Yvonne Huaylinos, Edith Rojas and Angela Roncal for their work and support in the implementation of the intervention. Also, we would like to thank the health professionals, including physicians, nurses and technicians, from the Endocrinology Services of the Hospital Cayetano Heredia and Hospital Nacional Arzobispo Loayza. This study would not have been possible without the involvement of the participants, and we appreciate their time and commitment to the study.
We are also grateful to Sol Abarca, Jorge Chachaima Mar, Gianpier Rojas and Bridgette Zarzosa Mezzich for their collaboration in data quality control and the revision of the participants' logbooks. Finally, we would like to thank Jill Portocarrero for her support in the process evaluation of the study, Miguel Moscoso Porras for his initial support in the design of the messaging system, and the engineers Jorge Estrada and Oscar Giraldo for the development and monitoring of the messaging system.

Open Peer Review Current Peer Review Status: Version 1
Reviewer Report 15 June 2020 https://doi.org/10.21956/wellcomeopenres.17001.r38648 © 2020 Mertens P. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Peter Mertens
Clinic of Nephrology, Hypertension, Diabetes and Endocrinology, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
The manuscript submitted by Lazo-Porras et al. entitled "Foot thermometry with mHealth-based supplementation to prevent diabetic foot ulcers: A randomized controlled trial" is a well performed clinical study with published study protocol and enormous effort to ensure proper sampling of data.
The main finding of the study is reported as a negative outcome finding: mHealth does not add to the adherence of the patients at risk for diabetic foot ulceration to perform daily thermometry. There may be numerous interpretations and reasons for this negative result, which at first sight surprises and is counterintuitive to the general field.
As referee I ask myself whether this is unexpected or whether there is an outlying explanation. The study protocol states: "Periodic assessments of the participants involving a general checkup and lower extremity evaluation was conducted every two months by the nurse evaluator." Thus every participant was seen at least every second month by the nurse practitioner and the conversations within this assessment may be more important than the messages (either voice or text messages). What was the duration of the consultations? Was it standardized? It is my impression that these contacts may have skewed the results markedly and these effects have not been tested, as far as I understand the study.
Another important aspect may be that the alarms have not been analyzed in their consequences. Have there been additional contacts to the study centre? Have these been recorded? How may a bias be excluded?
In addition there is an unfortunate bias due to more DFU in the intervention group which makes it difficult to draw solid conclusions.
The statements regarding the effectiveness of mHealth should be weakened markedly given the limitations and shortcomings of the study. These should therefore be stated in the respective chapters (abstract, discussion, conclusions).

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: Cofounder of medixmind, a firm developing support assistance devices for patients with diabetes.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 01 Jul 2020
Maria Lazo-Porras, Universidad Peruana Cayetano Heredia, Lima, Peru
1. The manuscript submitted by Lazo-Porras et al. entitled "Foot thermometry with mHealth-based supplementation to prevent diabetic foot ulcers: A randomized controlled trial" is a well performed clinical study with published study protocol and enormous effort to ensure proper sampling of data.

Response: We very much appreciate your comment. Thank you.
2. The main finding of the study is reported as a negative outcome finding: mHealth does not add to the adherence of the patients at risk for diabetic foot ulceration to perform daily thermometry. There may be numerous interpretations and reasons for this negative result, which at first sight surprises and is counterintuitive to the general field. As referee I ask myself whether this is unexpected or whether there is an outlying explanation. The study protocol states: "Periodic assessments of the participants involving a general checkup and lower extremity evaluation was conducted every two months by the nurse evaluator." Thus every participant was seen at least every second month by the nurse practitioner and the conversations within this assessment may be more important than the messages (either voice or text messages). What was the duration of the consultations? Was it standardized? It is my impression that these contacts may have skewed the results markedly and these effects have not been tested, as far as I understand the study.
Response: The reviewer raises an important point related to the frequency of contacts between the study participants and the health system, which may help in understanding the negative effects of our intervention. Prior to the study, we did not have any information about the participants' frequency of contact with the health system. In principle, as part of the existing usual care, patients may be asked to visit their doctor every three months, and whether or not they do so may be affected by a variety of factors, e.g. distance, transportation, availability of an appointment, costs, etc. So, yes, we concur with the observation that our study promoted more frequent visits to the health system than usual care, and this may have had a role in the prevention of foot ulceration. Also, we did not collect the duration of the follow-up visit, but the assessment included a questionnaire about medication, use of shoes and/or insoles, presence of a caregiver, blood pressure measurement and foot evaluation. These interactions could have impacted the study findings and skewed the results markedly. We added the following in the manuscript:

"The low rate of ulceration occurrence in our study could be potentially explained by two factors. First, that the participants did follow the instructions to reduce physical activity when observing alarm signs, even when they were not for two consecutive days or if they did not seek or receive the feedback of the study nurse. This is because the recommendations about reducing foot pressure and physical activity was given at the beginning of the study (videos) and they were also printed in their logbooks. Secondly, it is possible that the frequent assessment of the participant by the study nurse, every two months, may have played a role among study participants, including the control group. These two could have contributed to the negative effect of the mHealth component to reduce foot ulceration."
3. Another important aspect may be that the alarms have not been analyzed in their consequences. Have there been additional contacts to the study centre? Have these been recorded? How may a bias be excluded?
Response: Reporting of alarms to the study nurse was low. However, we do not have reports about additional contacts with the study sites. What we know is that some participants maintained their usual care appointments with their endocrinologists.
4. In addition there is an unfortunate bias due to more DFU in the intervention group which makes it difficult to draw solid conclusions.
Response: This is true, and we concur. For that reason, we already discuss this as a limitation of the study.
5. The statements regarding the effectiveness of mHealth should be weakened markedly given the limitations and shortcomings of the study. These should therefore be stated in the respective chapters (abstract, discussion, conclusions).
Response: Yes. This is correct. We have refined our conclusions (see Reviewer 1, response #2).

Javier Ena
Hospital Marina Baixa, Alicante, Spain
This is a well-conducted randomized clinical trial including 172 patients at risk of DFU. Randomization technique was well carried out, but unfortunately, there was a greater risk for DFU in the intervention group due to a greater proportion of previous DFU. The intervention was well explained. However, the authors did not assess patients' adherence to recommendations to reduce daily exercise. Finally, the clinical trial did not show the advantage of using foot thermometry and mHealth supplementation to reduce DFU. A big caveat of the study is the sample size. According to my estimates, the number of patients included in the clinical trial was too small to show a possible benefit of the intervention. Taking into account a risk of DFU in the control group of 24%, a reduction of 50% in the risk of DFU in the intervention group with a power of 80% and a two-sided alpha error of 5%, the sample size called for 159 patients per arm (Fleiss estimation). In summary, it is not clear whether foot thermometry with mHealth-based supplementation is useful or not to prevent DFU.
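The sample size estimate quoted above can be approximately reproduced with the standard normal-approximation formula for comparing two independent proportions. The sketch below is illustrative only: the function name `n_per_arm` is hypothetical, equal allocation between arms is assumed, and no continuity correction is applied, since the report does not state which variant of the Fleiss formula was used.

```python
from math import sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_intervention, alpha=0.05, power=0.80):
    """Participants needed per arm to compare two proportions.

    Uncorrected normal approximation with equal allocation
    (an assumption; the report only says "Fleiss estimation").
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2
    pooled = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    unpooled = z_beta * sqrt(
        p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    )
    return (pooled + unpooled) ** 2 / (p_control - p_intervention) ** 2


# Reviewer's inputs: 24% risk in the control arm, halved to 12% in the intervention arm.
reviewer_n = n_per_arm(0.24, 0.12)
print(f"Required per arm: about {reviewer_n:.1f}")
```

With these inputs the formula gives a value close to 160 per arm, in line with the 159 quoted by the reviewer and well above the 86 participants enrolled in each arm of the trial. Small differences can arise from rounding conventions or from the choice of correction.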

Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others?

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Diabetes complications
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 01 Jul 2020
Maria Lazo-Porras, Universidad Peruana Cayetano Heredia, Lima, Peru
1. This is a well-conducted randomized clinical trial including 172 patients at risk of DFU. Randomization technique was well carried out, but unfortunately, there was a greater risk for DFU in the intervention group due to a greater proportion of previous DFU. The intervention was well explained. However, the authors did not assess patients' adherence to recommendations to reduce daily exercise. Finally, the clinical trial did not show the advantage of using foot thermometry and mHealth supplementation to reduce DFU.
Response: Thank you for your feedback. Yes, we did not measure adherence to the recommendations of the reduction of daily physical activity and for that reason we are adding that as a limitation: "Also, adherence to measurements was self-reported, which potentially introduced bias. Additionally,
2. A big caveat of the study is the sample size. According to my estimates, the number of patients included in the clinical trial was too small to show a possible benefit of the intervention. Taking into account a risk of DFU in the control group of 24%, a reduction of 50% in the risk of DFU in the intervention group with a power of 80% and a two-sided alpha error of 5%, the sample size called for 159 patients per arm (Fleiss estimation).
Response: Thank you for your comment. We estimated the sample size with an absolute change of 21% between the intervention arm and the control arm (9% vs 30%). However, these data came from previous results in high-income countries and differ from the incidence of diabetic foot ulcers observed in our trial. We added the following information in the manuscript.

"With regards to our sample size calculations, which were made with an absolute change of
various other papers, such as the IWGDF systematic review on preventative interventions and the review on modifiable interventions (Van Netten et al. 2020a and 2020b) 3,4 , or papers on adherence in relation to diabetic foot disease (e.g. Price 2016) 5 . The introduction would improve in depth if importance of adhering to these aspects of diabetic foot self-care would also be included, rather than focusing only on thermometry.
Methods: It is stated that the previous trials only included IWGDF3, but that is not correct. At least one temperature trial (Lavery et al.) 6 included also IWGDF2 patients.
How was education at baseline delivered? Verbal, written, pictures, videos?
The TempStat is used, but no information is provided in the methods about the validity and reliability of this instrument.
Was there any way to assess if alerts appeared, other than the logbook? Did authors test this instrument in advance?
From the sentence: "In the intervention arm, additional to the TempStat™ participants received the mHealth component (two reminder messages and six foot-care promotion messages) during the 18-month study period via both SMS and voice messaging." it reads as if these messages were only sent 8 times throughout the entire period. However, in a subsequent paragraph it's clear they were sent weekly. Please rephrase this sentence to make sure it reflects what happened.
It is stated that alarms were not analysed "because of their low frequency." However, with 28 ulcers developed, one would expect a fair amount of warnings (if the skin indeed heats up before it breaks down). Can the authors explain why alarms were of such low frequency?
Is it correct that the blinded evaluator was evaluating the ulcers in real-life, not via photos? i.e.: patients could tell which arm they were in, albeit inadvertently?
Randomization via envelopes is not considered independent anymore, since these can be (easily) manipulated. Please describe what was done to prevent this.
Results: Flowchart: add "n=" in the last box (e.g. n=59), to avoid that readers may think that this is an outcome (because HbA1c of 50 is a potential finding).
Flowchart: are there any reasons for lost to follow-up?
Baseline: any information on comorbidities, such as cardiac disease?
Baseline: can authors provide some additional info on foot deformities? What sort of deformities were found, how were these assessed? I miss information on footwear and other preventative self-care of the patients throughout the trial.
Why were no multivariate analyses done?

Discussion
Main findings: the actual main finding is somewhat obscured in this paragraph. It is described as "At baseline, we unexpectedly found a higher prevalence of previous foot ulceration in the intervention arm, and the incidence of DFU was higher in this arm.". However, these are two separate findings. The current presentation suggests causality, but that's not the case. Because it is clear in the results that even a subanalysis in patients with a previous DFU shows higher incidence during the study in the intervention group (23% vs 30%). Authors should be crystal clear in their main finding here, see also my comment concerning the abstract. The finding is simple: mHealth does not help.
In the "comparison to previous studies", the actual findings are not discussed. Part of this section is now a simple repetition of the introduction, the other parts fail to acknowledge the negative findings of the current study. Why does mHealth appear to be beneficial in other studies, but not in this study? That is what should be reflected on.
I miss any discussion about the comment that alarms were not evaluated because of their low frequency. That implies that thermometry is not useful in ulcer prevention. This should be discussed in depth.
Ulcer incidence in patients with a previous ulcer is 27.8%, which is somewhat lower than other studies. Can this be attributed to the thermometer? And if so, how, if the thermometer never gives an alarm? Or is this because usual care is really good? I miss discussion of the reliability and validity of the instrument, or the lack of knowledge thereof.
In the limitation section, I miss acknowledging limitations in randomization (done with envelopes) and blinding (outcome assessor could be unblinded by patients). See comments in method section.

Conclusions:
Authors have a more clear conclusion in this section (first sentence). That is what should be used in the abstract and discussion.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes
temperature trial (Lavery et al.) 6 included also IWGDF2 patients.
Response: Yes, that is true. We have changed the information accordingly.

"Our eligibility criteria used IWGDF categories and included people with diabetes at risk of ulceration group 2 and 3. In so doing, rather than focusing only on those at the highest risk for ulceration (IWGDF group 3) we wanted to pursue a pragmatic approach for the prevention of DFU among people with diabetes, thus including also those participants from the IWGDF group 2 category. All previous studies included mostly participants from IWGDF group 3, and only one clinical trial included group 2 patients."
6. How was education at baseline delivered? Verbal, written, pictures, videos?
Response: We used videos to educate participants about foot care. We have clarified this information in the article.

"At the initiation visit, all participants received education about foot care, i.e. etiology and risk factors for the development of neuropathy and ulcers, as well as recommendations for foot care practices and early signs of ulceration; and instructions for the use of the TempStat™ device (see Extended data). This foot care education was done through three videos that were validated by physicians and patients with type 2 diabetes mellitus. The first two videos lasted 8 and 6 minutes and they were related to foot care, whereas the third video lasted 6 minutes and presented the instructions on the use of the TempStat™ device. The three videos were in Spanish and were showed once at the initiation visit."
7. The TempStat is used, but no information is provided in the methods about the validity and reliability of this instrument.

Response: We included some of this information in our protocol, already published 5 . But we also think that it is a good suggestion and we decided to include this in the article.

"Frykberg et al. 6 showed that TempStat™ can detect alarm signs, represented by a yellow color change, and the results positively correlate to temperature findings of infrared thermometer, the gold standard of thermometry devices. Another study found that the device identified 74% of serious foot problems 7 "
8. Was there any way to assess if alerts appeared, other than the logbook? Did authors test this instrument in advance?
Response: We did not have another method to assess this during the study, other than using the logbook. Before embarking on this strategy, we piloted the TempStat™ and logbook with 10 patients, for two weeks, to find out whether the participants understood the video instructions on how to use the TempStat™ and whether they recorded their information adequately in the logbook. We verified that they had understood the instructions and completed the logbook correctly. We know that this is not the best method because the information is self-reported but, given the pragmatic nature of the study, it was the most feasible approach in the setting where our study was carried out. While the study was being conducted, between 2015 and 2016, many patients did not yet have a smartphone or access to other technologies that would have allowed us to ask them to send an objective image of the TempStat™. We include this information as a limitation in the discussion.
"Also, adherence to measurements was self-reported, which potentially introduced bias." "Finally, it is possible that those who did not return the logbook (~30%), where alarm signs were to be recorded, may be less conscientious about foot temperature measurements and thus may have had lower rates of adherence to the thermometry." 9. From the sentence: "In the intervention arm, additional to the TempStat™ participants received the mHealth component (two reminder messages and six foot-care promotion messages) during the 18-month study period via both SMS and voice messaging." it reads as if these messages were only sent 8 times throughout the entire period. However, in a subsequent paragraph it's clear they were sent weekly. Please rephrase this sentence to make sure it reflects what happened.
Response: Thank you for your comment. We have clarified this section because the messages were sent weekly.

"In the intervention arm, in addition to the TempStat™, participants received the mHealth component weekly (two reminder messages and six foot-care promotion messages each week) during the 18-month study period via both SMS and voice messaging."
10. It is stated that alarms were not analysed "because of their low frequency." However, with 28 ulcers developed, one would expect a fair amount of warnings (if the skin indeed heats up before it breaks down). Can the authors explain why alarms were of such low frequency?
Response: Reporting of alarms to the nurses was low, but records from the logbooks showed that 41% of the participants reported an alarm sign (we only considered an alarm sign if it was reported during 2 or more consecutive days). Additionally, 67% of the participants that presented an ulcer reported an alarm sign in the logbook. We have now included the following information in the manuscript: "Also, we found that 41% of the participants recorded an alarm sign in their logbooks. Additionally, 67% of the participants that presented an ulcer also reported an alarm sign in their logbook."
11. Is it correct that the blinded evaluator was evaluating the ulcers in real-life, not via photos? i.e.: patients could tell which arm they were in, albeit inadvertently?
Response: The blinded evaluator saw the patient every two months, and if an ulcer developed in this period the patient could also attend an additional visit to the hospital to receive care and to be evaluated by the blinded evaluator. Participants were advised not to mention their study arm (thermometry + mHealth vs. thermometry only). We did not receive any report from the blinded nurses related to a protocol deviation.
12. Randomization via envelopes is not considered independent anymore, since these can be (easily) manipulated. Please describe what was done to prevent this.
Response: The procedure was as follows: each patient received a code at the hospital when recruited by the study nurse, and the study coordinator then opened the envelope containing the details of the random allocation of the participant to the intervention or the control group. So, it was not possible for the study nurse to know the allocation of the participant, and it was not possible for the study coordinator to evaluate the participant and give the code to him/her. We are providing more details in our manuscript:

"We conducted stratification using the hospital site as a single stratum and blocks of 6 to generate a random allocation sequence. Sealed envelopes with codes to randomize participants were used. An independent researcher prepared the envelopes, and the study nurses assigned the codes to each of the enrolled participants. Separately, the study coordinator was responsible for opening the envelopes and informing participants about their intervention or control allocation as per the random list. The nurse/independent evaluators were not aware of the patient's group allocation."
Results
13. Flowchart: add "n=" in the last box (e.g. n=59), to avoid that readers may think that this is an outcome (because HbA1c of 50 is a potential finding).
Response: Thank you for your suggestion, we added the "n=".
14. Flowchart: are there any reasons for lost to follow-up?
Response: In our study, participants lost to follow-up were those whom we could not re-contact: some of them migrated back to their place of origin, and in other cases it was not possible to find the addresses provided, or the participant did not answer the phone calls. We have added these details in the manuscript:

"Reasons for lost to follow-up included migration back to the participant's place of origin, wrong/incomplete addresses provided, or the participant did not answer the contact phone calls."
15. Baseline: any information on comorbidities, such as cardiac disease?
Response: Yes, that information is already provided in Table 1.
16. Baseline: can authors provide some additional info on foot deformities? What sort of deformities were found, how were these assessed?
Response: The study nurses evaluated the deformities during the screening assessment, and we only considered 4 types of deformities: claw foot (40.5%), prominent metatarsal head (33.3%), Charcot foot (9.5%), and hammer toe (19.1%). This information is provided in the baseline characteristics section.
17. I miss information on footwear and other preventative self-care of the patients throughout the trial.
Response: Less than 10% of the participants reported using orthopedic shoes or insoles. This information is available in Table 1.

18. Why were no multivariate analyses done?
Response: We did conduct a multivariate analysis, adjusting for site and history of previous ulcer.

Discussion
19. Main findings: the actual main finding is somewhat obscured in this paragraph. It is described as "At baseline, we unexpectedly found a higher prevalence of previous foot ulceration in the intervention arm, and the incidence of DFU was higher in this arm." However, these are two separate findings. The current presentation suggests causality, but that's not the case. Because it is clear in the results that even a subanalysis in patients with a previous DFU shows higher incidence during the study in the intervention group (23% vs 30%). Authors should be crystal clear in their main finding here, see also my comment concerning the abstract. The finding is simple: mHealth does not help.
Response: Thank you for your comment. Although our study found that mHealth does not help, some study limitations could explain this finding. For that reason, we changed our conclusion. We have edited our abstract conclusions (see response #2) and we have done the same here.

"In our study, conducted in a low-income setting, the addition of mHealth was not effective in reducing foot ulceration or increasing adherence to thermometry after 18 months of follow-up. However, these results need to be interpreted with caution as the expected incidence rates of DFU used in our sample size calculations were not met and there was a higher rate of previous DFU in the intervention group."
20. In the "comparison to previous studies", the actual findings are not discussed. Part of this section is now a simple repetition of the introduction, the other parts fail to acknowledge the negative findings of the current study. Why does mHealth appear to be beneficial in other studies, but not in this study? That is what should be reflected on.
Response: Thank you for your comment, some of the reasons that we cover in our discussion include:
○ Most previous studies in mHealth were conducted in high-income countries with mostly young populations. In contrast, our study was implemented in a low-resource setting of a middle-income country with an elderly population.
○ Outcomes in previous studies were glycated hemoglobin measurements or questionnaires and scales of self-management. We used ulceration, a patient-centered and clinically relevant outcome.
○ The automated software system used in our study to send messages did not have bilateral communication, it was one-way only, and some studies suggest that bilateral communication may be a preferred route among people with diabetes.
○ Finally, we are now including evidence from a recent systematic review of systematic reviews showing not so promising evidence from existing mHealth studies.
"A recent publication, evaluating 17 systematic reviews of mHealth intervention studies in diabetes and obesity 8 , showed that fewer than half of the studies included in 2 reviews (out of 7 systematic reviews that covered the topic) improved diabetes management practices or medication adherence 9, 10 , and recommend the use of valid measures for outcomes and rigorous study designs to improve their quality".
21. I miss any discussion about the comment that alarms were not evaluated because of their low frequency. That implies that thermometry is not useful in ulcer prevention. This should be discussed in depth.
Response: As we clarified in response #10, the reporting of alarm signs to the study nurse was low, but information registered in the logbooks showed reports of alarm signs of up to 41%. We have now added this information in the results section (see response #10). Also, in the discussion we now have added the following statement: "Also, in our results, 41% (44/108) of the study participants recorded alarm signs for two consecutive days in their logbooks, and we only have data from 9/44 (20%) that had a record of reporting an alarm sign to the study nurse. These figures do not consider those with alarm signs that did not seek nurse support or those who did report to the nurse but their report was not recorded."

"The low rate of ulceration occurrence in our study could be potentially explained by two factors. First, that the participants did follow the instructions to reduce physical activity when observing alarm signs, even when they were not for two consecutive days or if they did not seek or receive the feedback of the study nurse. This is because the recommendations about reducing foot pressure and physical activity was given at the beginning of the study (videos) and they were also printed in their logbooks. Secondly, …"
22. Ulcer incidence in patients with a previous ulcer is 27.8%, which is somewhat lower than other studies. Can this be attributed to the thermometer? And if so, how, if the thermometer never gives an alarm? Or is this because usual care is really good?
Response: We acknowledge that ulcer incidence was somewhat low in our study and that the thermometry, provided to all study participants, may have been a contributing factor to the low incidence of ulcers observed. In our study, patients were shown how to use the TempStat™, and one of the recommendations provided was to reduce physical activity if they found an alarm sign. Another practical feature was that the thermometer had a mirror in the middle of the two pads, allowing the observation of the foot soles. So, it is possible that the thermometer alone was an effective intervention, in our population, for the prevention of ulcers. Foot care among people with type 2 diabetes in Peru is very low, which also indicates a need to enhance current standards of care in terms of prevention of foot ulcers.
23. I miss discussion of the reliability and validity of the instrument, or the lack of knowledge thereof.
Response: As mentioned before in response #7, we have now added information about the reliability and validity of the TempStat™.
24. In the limitation section, I miss acknowledging limitations in randomization (done with envelopes) and blinding (outcome assessor could be unblinded by patients). See comments in method section.
Response: We have already explained our randomization procedures (see response #12). Also, according to CONSORT guidelines, sealed envelopes are an acceptable method for randomization, and their explanation is as follows: "Enclosing assignments in sequentially numbered, opaque, sealed envelopes can be a good allocation concealment mechanism if it is developed and monitored diligently. This method can be corrupted, however, particularly if it is poorly executed. Investigators should ensure that the envelopes are opaque when held to the light, and opened sequentially and only after the participant's name and other details are written on the appropriate envelope" 11,12 .
Conclusion
25. Authors have a more clear conclusion in this section (first sentence). That is what should be used in the abstract and discussion.
Response: Yes, noted. We have edited our conclusion in the abstract (see response #2), and also edited our conclusion in the discussion based on the three reviewers' suggestions, which now reads:
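For readers unfamiliar with the allocation scheme summarized in response #12 (stratification by hospital site, blocks of 6), a permuted-block sequence can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to prepare the envelopes: the function name `block_sequence`, the site labels, the number of blocks, the seeds and the 1:1 ratio within each block are all hypothetical.

```python
import random


def block_sequence(n_blocks, block_size=6, arms=("intervention", "control"), seed=None):
    """Return a permuted-block allocation sequence.

    Each block contains an equal number of each arm (assumed 1:1),
    shuffled independently, so the arms stay balanced after every block.
    """
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    for _ in range(n_blocks):
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


# One independent sequence per hospital site (labels and sizes are placeholders).
allocation = {
    site: block_sequence(n_blocks=15, seed=index)
    for index, site in enumerate(["site_A", "site_B"])
}
```

Each entry of a sequence would correspond to one sequentially numbered envelope for that site. Balance within every block is what keeps the arms close to equal in size even if recruitment stops early at one site.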