Do baseline participant characteristics impact the effectiveness of a mobile health intervention for depressive symptoms? A post-hoc subgroup analysis of the CONEMO trials

Objective: To ascertain whether sociodemographic and health-related characteristics known from previous research to have a substantive impact on recovery from depression modified the effect of a digital intervention designed to improve depressive symptoms (CONEMO). Methods: The CONEMO study consisted of two randomized controlled trials, one conducted in Lima, Peru, and one in São Paulo, Brazil. As a secondary trial plan analysis, mixed logistic regression was used to explore interactions between the treatment arm and subgroups of interest defined by characteristics measured before randomization – suicidal ideation, race/color, age, gender, income, type of mobile phone, alcohol misuse, tobacco use, and diabetes/hypertension – in both trials. For the secondary outcomes, we estimated interaction effects between the treatment arm and these subgroup factors using linear mixed regression models. Results: Increased effects of the CONEMO intervention on the primary outcome (reduction of at least 50% in depressive symptom scores at 3-month follow-up) were observed among older and wealthier participants in the Lima trial (p = 0.030 and p = 0.001, respectively). Conclusion: There was no evidence of such differential effects in São Paulo, and no evidence of differential effects on any of the secondary outcomes in either trial. Clinical trial registration: NCT02846662 (São Paulo, Brazil – SP), NCT03026426 (Lima, Peru – LI). Funded by the U.S. National Institute of Mental Health (grant U19MH098780).


Introduction
In low- and middle-income countries (LMICs), only 4.7% of people needing mental health care receive even "minimally adequate" services. This treatment gap needs to be addressed as a priority. 1,2 [8-15] There is much less evidence regarding subgroups for which digital interventions may be differentially effective. Formal subgroup analyses can be used to ascertain whether subgroups defined by demographic variables modify the effects of such interventions. To be adequately powered, these analyses require large sample sizes 15,16 and, hence, whether pre-specified or otherwise, are essentially exploratory. This applies to the trials analyzed herein, even though they are, to our knowledge, the most extensive trials targeting depression comorbid with chronic diseases in LMICs.
CONEMO (Emotional Control in English, Controle Emocional in Portuguese, or Control Emocional in Spanish) is a low-intensity, mobile application-based intervention designed to reduce depressive symptoms. The CONEMO study comprises two trials, one conducted in São Paulo (SP), Brazil, and one conducted in Lima (LI), Peru. As part of a secondary trial plan analysis, we examined different subgroups defined by baseline values in a sample of people who self-reported being treated for diabetes, hypertension, or both and who also had depressive symptoms (Patient Health Questionnaire [PHQ-9] scores ≥ 10) to investigate whether there is heterogeneity in the study outcomes. These analyses add to the two prespecified subgroup analyses published in the main trial paper, covering the effects of educational attainment and baseline PHQ-9 on the main hypothesis test. 17 The secondary data analysis reported herein assesses the impact of baseline variables on the primary outcome, a reduction of at least 50% in PHQ-9 score after 3 months of follow-up compared to the first wave of data collection.
Participant flow is described in the main paper. 17 Briefly, we approached 7,597 candidates in LI. Of these, 5,785 (76.1%) agreed to be pre-screened, 787 (10.4%) fulfilled screening criteria, and 432 (5.7%) agreed to enter the study and were individually randomized either to CONEMO or enhanced usual care (EUC). In SP, we approached 11,604 candidates. Of these, 10,688 (92.1%) agreed to be pre-screened, 1,180 (10.1%) fulfilled screening criteria, and 880 (7.5%) agreed to enter the study.

Methods
The CONEMO study comprises two trials: a multicenter randomized controlled trial in LI with individual randomization and a cluster randomized controlled trial in SP with family health units as the unit of randomization. In SP, randomization was stratified by services with residency programs, while in LI, it was stratified by health service and by a dichotomous baseline PHQ-9 severity variable (PHQ-9 < 15 or PHQ-9 ≥ 15). In both trials, the CONEMO intervention targeted depressive symptoms in individuals with hypertension, diabetes, or both. 17

Ethics statement
For the SP trial, the research protocol received institutional approval on May

Hypothesis
We hypothesized that, in both the LI and SP trials, there would be heterogeneity across subgroups of participants in the effectiveness of CONEMO in reducing PHQ-9 depressive symptom scores by at least 50% after 3 months of follow-up compared to the first wave of data collection. The variables we tested for subgroup effects were suicidal symptoms at baseline, race/color, age, gender, income, type of mobile phone, alcohol misuse, tobacco use, and whether the participant had diabetes, hypertension, or other chronic diseases.

Participants
We recruited participants between September 2016 and September 2017. In SP, we approached 11,604 candidates to reach the target of 880 participants who scored at least 10 on the PHQ-9 scale, our main inclusion criterion. We only included people aged 21 or older, under treatment for diabetes, hypertension, or both (except if gestational), who were able to read a brief text on the research assistant's tablet. 17

EUC group
Participants from the control group (EUC) received physical and mental health care management in family health units (in SP) and management of diabetes or hypertension (or both) in LI health services. For ethical reasons, every participant from either group who had clinically significant depressive symptoms (PHQ-9 scores of 10 or higher) was referred back to the facility at which they were already receiving care for diabetes, hypertension, or both. Also, EUC participants were assessed for depressive symptoms throughout the study (at least four times) and referred to mental health care services when considered high-risk (PHQ-9 score ≥ 20) or at risk of suicide, according to the safety protocol.

CONEMO plus EUC group
CONEMO is delivered by a smartphone application supported by a nurse or nurse assistant. 18 The application consists of 18 automated sessions, delivered over 6 weeks at a rate of three sessions per week. Information on app use was captured and sent to a server, where data on participants' access and progress were collected for monitoring. The CONEMO group participants were also referred to treatment within their original health systems, as in the EUC group.

Assessments
The PHQ-9 was administered in person by trained research assistants (RAs) at screening and follow-ups.
We also collected data on other self-reported scales and sociodemographic parameters at baseline and after 3 months. The Suicide Risk Assessment Protocol (S-RAP) was used to measure the suicide risk of potentially eligible participants and monitor each participant's risk longitudinally. 19 At baseline and follow-ups, RAs assessed the participants using the PHQ-9 for depressive symptoms, 20 the European Quality of Life, 5 Dimensions, Three Levels (EQ-5D-3L) for quality of life, 21 the Behavioral Activation for Depression Scale-Short Form (BADS-SF) for behavioral activation changes, 22 and the World Health Organization Disability Assessment Schedule-II (WHODAS-II) for levels of disability. 23 [21-24]

Outcomes
The primary outcome is the dichotomous variable of whether the participant's treatment was successful (a reduction in PHQ-9 score of at least 50% from baseline to 3-month follow-up). We also investigated the impact of CONEMO on secondary outcomes (EQ-5D-3L, BADS-SF, WHODAS-II).
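The primary outcome definition can be made concrete with a minimal sketch. The function name `treatment_success` and the example scores are hypothetical and are not taken from the trial data or the authors' analysis code; only the 50% threshold comes from the definition above.

```python
def treatment_success(baseline_phq9: int, followup_phq9: int) -> bool:
    """Return True if the PHQ-9 score fell by at least 50% from baseline.

    Integer arithmetic is used so that a reduction of exactly 50% is
    classified as success without floating-point rounding concerns.
    """
    if baseline_phq9 <= 0:
        raise ValueError("baseline PHQ-9 score must be positive")
    return 2 * followup_phq9 <= baseline_phq9


# Hypothetical scores, for illustration only (not trial data).
print(treatment_success(16, 8))  # reduction of exactly 50%
print(treatment_success(16, 9))  # reduction of about 44%
```

Because trial participants had baseline scores of at least 10, the guard against a non-positive baseline should never trigger in practice; it is included only to keep the sketch safe on arbitrary input.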
RAs collected all data, including self-reported data on sociodemographics, diagnosis and treatment of chronic conditions (hypertension, diabetes, or both), and health-services utilization (number of outpatient consultations and visits, hospitalizations reported at baseline and follow-ups, and so on), as well as validated scales to measure specific outcomes.

Statistical analysis
We performed all statistical analyses in Stata 15. 25 All models have a mixed-effects structure with health unit services as random intercepts; all other variables are treated as fixed effects. The significance level was set to alpha = 0.05 for all statistical tests. We report the interaction between the treatment arm and the relevant subgroup variable, together with likelihood-ratio tests for the overall interaction effects.
We used mixed logistic regression analyses with the dichotomous variable of success (achieved at least 50% PHQ-9 reduction at 3-month follow-up vs. did not) as our primary outcome, and subgroup variables as explanatory variables interacting with treatment arm. The models were adjusted for the stratification variables used in the randomization of treatment arms.
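The overall interaction test compares a model without the treatment-by-subgroup terms against a model that includes them. The following is a minimal sketch of that comparison, not the authors' Stata code: the function name `lr_test` and the log-likelihood values are hypothetical, and the chi-square reference distribution is an assumption (the table footnotes describe the test as a likelihood ratio F-test).

```python
import math


def lr_test(loglik_without: float, loglik_with: float, df: int) -> tuple:
    """Likelihood-ratio statistic and chi-square p-value for nested models.

    Closed-form chi-square tail probabilities are used, so only df = 1
    (two-level subgroup) and df = 2 (three-level subgroup) are supported.
    """
    stat = 2.0 * (loglik_with - loglik_without)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lr_test(loglik_without=-250.0, loglik_with=-247.5, df=1)
print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

The degrees of freedom equal the number of interaction terms added: one for a two-level characteristic such as gender, two for a three-level characteristic such as type of phone.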
The estimated mixed logistic regression model is specified in terms of the following quantities: $y_{ij}$ is the probability of treatment success, $X_i$ corresponds to the randomized treatment, and $S_i$ is the subgroup.
We fitted linear mixed regression models for the secondary continuous BADS-SF, EQ-5D-3L, and WHODAS-II variables with health unit services as random intercepts, using the baseline score as a covariate, and all other variables considered as fixed.
In the estimated linear mixed regression model, $y_{ij}$ is the expected score of the secondary variable, $X_i$ corresponds to the randomized treatment, $S_i$ to the subgroup, and $Y_i$ is the baseline score as covariate.
For both models, $V$ is the vector of covariates (the stratification variables), $e_{ij}$ is the error term, and $a_j$ is the random intercept by health facility; the subscript $i$ indicates the individual and $j$ the health facility.
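As a minimal sketch consistent with the symbol definitions above, and not necessarily the authors' exact specification, the two models could be written as follows. The coefficient names $\beta_0,\dots,\beta_4$ and $\boldsymbol{\gamma}$ are assumptions introduced here for illustration. The text lists $e_{ij}$ for both models; in this sketch it appears explicitly only in the linear model, since a logistic model on the logit scale is conventionally written without a residual term.

```latex
\begin{align}
\operatorname{logit}(y_{ij}) &= \beta_0 + \beta_1 X_i + \beta_2 S_i + \beta_3 X_i S_i
  + \boldsymbol{\gamma}^{\top} V + a_j \\
y_{ij} &= \beta_0 + \beta_1 X_i + \beta_2 S_i + \beta_3 X_i S_i + \beta_4 Y_i
  + \boldsymbol{\gamma}^{\top} V + a_j + e_{ij}
\end{align}
```

In this notation, $\beta_3$ is the treatment-by-subgroup interaction that the likelihood-ratio test examines. For a subgroup variable with three levels, $S_i$ would stand for a pair of indicator variables, with $\beta_2$ and $\beta_3$ becoming vectors.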

Results
Tables 1 to 4 present the overall (likelihood ratio) test of the interaction effects for the various outcomes and subgroup characteristics, as well as descriptive statistics plus regression coefficients of treatment effects and their 95%CIs for each subgroup.

Subgroups
We investigated differential effects across the following subgroups: suicidal symptoms (never had/had in the past 2 weeks/had before past 2 weeks), ethnicity (white/nonwhite, SP only), age (up to 60 years old/60 or older), gender (female/male), household income (up to two times the minimum wage/more than two times the minimum wage, in local currency: soles in LI and reais in SP), type of phone owned by the participant (their own device, not the phone lent by the study: smartphone, non-smart mobile phone, or neither), alcohol misuse (AUDIT-C ≥ 2), tobacco use (yes/no), and chronic condition (diabetes, hypertension, or both). We could not analyze the effect of race in LI, since the population in Peru is distinct in many cultural aspects from that of other Latin American countries. 26

Participant characteristics
The control and treatment arms were balanced on several characteristics, such as gender, age, educational level, income, marital status, chronic diseases, and depression severity at baseline, 17 as expected in randomized controlled trials such as the present studies (Table 1). Most participants were female, had a partner, and earned low incomes. Participants in LI were older and had higher educational attainment than the SP sample. Among LI participants, 185 were treated for diabetes, while 471 in SP were treated for hypertension. The severity of participants' depressive symptoms was categorized as moderate (337 [42.39%] [...] [10.19%] in LI), with a higher proportion of severe cases in SP. 17 In the SP trial, 334 of the 440 intervention participants (75.9%) borrowed mobile phones for the duration of the study, as did 209 of the 216 intervention participants (96.3%) in LI. The remaining participants from the intervention group either never received the research phone because they did not participate (73 [16.6%] [...]

Subgroup analyses
The main CONEMO trials analysis 17 showed that the intervention affected the primary outcome, i.e., the dichotomous success variable of whether the participant reached a reduction of at least 50% in PHQ-9 score at 3-month follow-up when compared to baseline values.
While depressive symptoms decreased in both trial arms (CONEMO and EUC), there were between-group differences in the primary outcome. Specifically, in the CONEMO group compared with EUC, the odds ratio (OR) for successful treatment was 1.6 in SP and 2.1 in LI.
There was no evidence of differential effects on the primary outcome according to educational level or baseline severity of depressive symptoms. 17 The subgroup analyses for treatment success, reported in the present paper, found no evidence of any subgroup effects in SP (Table 1).
In LI, the effects of the intervention were stronger for older participants (interaction p = 0.030) and those in the higher income category (interaction p = 0.001). For age, the OR of treatment success was larger in participants over 60 (OR = 3.4, 95%CI 1.9-6.1) than in those under 60 (OR = 1.4, 95%CI 0.8-2.5); for income, an even greater differential effect was observed, with a much larger OR in the higher income group (OR = 8.1, 95%CI 3.4-19.3) compared with the lower income group (OR = 1.4, 95%CI 0.9-2.3) (Table 1).
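The size of these differential effects can be approximated from the reported subgroup ORs alone. The sketch below recovers each log-OR standard error from its 95%CI and compares the two subgroups with a z-test on the difference in log-ORs. This is a rough cross-check under stated assumptions, not the likelihood-ratio test used in the paper: it treats the two subgroup estimates as independent, relies on rounded published values, and uses hypothetical function names.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return math.log(odds_ratio), se


def ratio_of_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Compare two subgroup ORs; returns (ratio, z, two-sided p)."""
    log_a, se_a = log_or_and_se(or_a, *ci_a)
    log_b, se_b = log_or_and_se(or_b, *ci_b)
    diff = log_a - log_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # assumes independence
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return math.exp(diff), z, p


# Subgroup ORs and 95% CIs as reported for the LI trial.
age = ratio_of_odds_ratios(3.4, (1.9, 6.1), 1.4, (0.8, 2.5))
income = ratio_of_odds_ratios(8.1, (3.4, 19.3), 1.4, (0.9, 2.3))
print("age:    ratio = %.2f, z = %.2f, p = %.3f" % age)
print("income: ratio = %.2f, z = %.2f, p = %.4f" % income)
```

With these rounded inputs, the approximation gives p-values broadly in line with the reported interaction p-values, although exact agreement is not expected because the published tests come from the full mixed model.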
Analyses of three secondary outcomes (Tables 2, 3, and 4) yielded no evidence of interaction effects for these measures, apart from an isolated finding for technology in LI (interaction p = 0.015); given the multiple tests conducted for the secondary outcomes, this observation could well be a chance finding.
Table 2 footnotes: † BADS-SF scores range from 0 to 54; higher scores represent a higher level of activation. ‡ Estimated difference in means for intervention for each subgroup with respective 95%CI, from the relevant random-effects linear regression model adjusting for stratification and (for São Paulo) clustering by primary care unit as the unit of randomization. § p-values derived for the interaction terms (intervention effect by baseline characteristic) by the likelihood ratio F-test in the random-effects regression model.
Braz J Psychiatry. 2024;46:e20233172 CONEMO trials subgroup analysis
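The caution about multiple tests on the secondary outcomes can be illustrated with a short calculation of the chance of at least one nominally significant result when every null hypothesis is true. This is a simplified sketch: it assumes independent tests, which interaction tests on correlated outcomes are not, and the count of 24 tests (8 baseline characteristics by 3 secondary outcomes in LI) is an assumption made here for illustration.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more p < alpha among n independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed count for illustration: 8 baseline characteristics x 3 secondary
# outcomes in the LI trial (race/color was not analyzed there).
n_tests_li = 8 * 3
print(round(prob_at_least_one_false_positive(n_tests_li), 2))
```

Under these assumptions, a single interaction reaching p < 0.05 among the secondary-outcome tests is more likely than not to arise by chance, which is consistent with treating the isolated technology finding as exploratory.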

Discussion
This study examined the potential impact of nine baseline participant characteristics on the effectiveness of CONEMO, a technological intervention for depressive symptoms trialed in São Paulo, Brazil, and Lima, Peru. For the primary outcome (depressive symptoms), we found that CONEMO had subgroup effects for age and income in one of the trials (LI), with older age and higher income associated with greater intervention success. There is virtually no evidence of any interaction effects for the three secondary outcomes of interest in either of the trial sites.
To our knowledge, these are the first trials to examine the association of such recruitment variables with the effectiveness of a technological intervention for depression, quality of life, disability, behavioral activation, and service utilization.
It is essential to highlight that the interaction tests are underpowered, which means that the results of all such exploratory analyses require further investigation before being seen as robust, especially since neither of the two prespecified subgroup analyses yielded evidence of differential effects.
We found the effects of CONEMO to be stronger in LI for older participants (OR = 3.4, 95%CI 1.9-6.1) than for those under 60 (OR = 1.4, 95%CI 0.8-2.5; interaction p = 0.030), and for those in the higher income group (OR = 8.1, 95%CI 3.4-19.3) compared to those in the lower income group (OR = 1.4, 95%CI 0.9-2.3; interaction p = 0.001). Previous studies using digital gamification interventions focused on reducing depressive symptoms and improving well-being also found greater effectiveness among older adults than among younger people or the general population in subgroup analyses of a systematic review and meta-analysis. 27 This result contradicts the common-sense hypothesis that older people do not benefit from digitally facilitated treatment. The literature also provides evidence that older adults are often satisfied and report well-being and perceived changes in several outcomes related to mental health after interacting with digital interventions. 28 Regarding the income interaction, several baseline features, including income, have been reported to be associated with higher depression remission rates in other studies. 3][34] Indeed, we found a lower effect in participants with severe symptoms in LI and for participants with a lifetime history of suicidal ideation in LI, as also observed in other studies. 35,36 These results suggest that those in LI with severe symptoms, both in terms of depression scores and history of suicidal ideation, respond less to treatment than participants with less severe symptoms.
As observed in other studies, white people in SP, 37 people without smartphones in SP or with no mobile phone at all in LI, 38 smokers, 39 people at risk of harmful alcohol consumption in SP, 40 people diagnosed with both hypertension and diabetes in SP, and those diagnosed with only diabetes in LI 40,41 appeared not to benefit from the intervention. However, with the sample sizes of the trials presented herein, we could not detect interaction effects corresponding to these patterns. Further studies are needed to confirm these null results. When subgroup analyses are not well justified by the literature and established expectations, they may yield misleading conclusions. This study proposed interaction analyses with variables well established in the literature as potentially influencing the effects of interventions designed to relieve depressive symptoms. However, these results are limited to the CONEMO intervention as trialed in SP and LI.
Table 4 footnotes: † EQ-5D-3L scores range from 0 to 1; higher scores represent better quality of life. ‡ Estimated difference in means for intervention for each subgroup with respective 95%CIs from the relevant random-effects linear regression model, adjusting for stratification and (for São Paulo) clustering by primary care unit as the unit of randomization. § p-values derived for the interaction terms (intervention effect by baseline characteristic) by the likelihood ratio F-test in the random-effects regression model.
Our initial hypothesis of heterogeneous effects regarding the effectiveness of CONEMO in improving depressive symptoms in participants among the various subgroups was largely unconfirmed.
Taken together, these findings allow us to conclude that the CONEMO intervention can be effective for most subgroups studied.However, additional replications with participants who use tobacco or other drugs or belong to different subgroups would be desirable, as would studies designed to investigate differential and interaction effects.
In trials of technological interventions such as this, promoting digital access and technological literacy is vital to optimize potential benefits.Technological interventions can increase access to treatment, and studies that help provide a deeper understanding of their effects should be encouraged.
Development and improvement of technological interventions need evidence-based information.To obtain such information, more clinical trials, systematic reviews, and meta-analyses of the effectiveness of these interventions in the general population -including interaction analyses -are needed; their results can help tailor technological interventions for different groups.
Although our overall sample size was large compared to that of similar studies, some subgroups (such as tobacco users) were small. Thus, our results should be considered carefully, and other studies examining the interaction of tobacco, alcohol, and drug use with the outcome of technological interventions are needed. Indeed, the power to detect interactions in studies designed to detect overall intervention effects is likely low. 16 This may explain the general absence of evidence of differential effects in this study; the few interactions we observed must be treated with caution, as in any exploratory analysis.
Additionally, we could not analyze the effect of race in LI, since the population of Peru is distinct in many cultural aspects from those of other Latin American countries. 26 Potential effects should be analyzed carefully, considering cultural and historical elements not addressed in this paper. 26

--
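Data presented as n (%), unless otherwise specified. EUC = enhanced usual care; MW = minimum wage; OR = odds ratio; PHQ-9 = Patient Health Questionnaire. † Descriptive statistics for each subgroup by trial intervention (digital intervention and EUC). ‡ Estimated OR of the primary outcome for each subgroup with respective 95%CI from the relevant random-effects logistic regression model, adjusting for stratification and (for São Paulo) clustering by primary care unit as the unit of randomization. § p-values derived for the interaction terms (intervention effect by baseline characteristic) by the likelihood ratio F-test in the random-effects regression model.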

Table 1
Descriptive statistics, subgroup-specific (adjusted) ORs with 95%CIs, and interaction likelihood-ratio tests for differential effects of various baseline characteristics on the primary outcome (reduction of at least 50% in PHQ-9 score at 3 months after inclusion) for each of the two trials

Table 2
Subgroup-specific (adjusted) differences in means with 95%CIs and interaction tests for differential effects of various baseline characteristics on the secondary outcome continuous BADS-SF scores at 3 months after inclusion, for each of the two trials. BADS-SF = Behavioral Activation for Depression Scale-Short Form; MW = minimum wage.

Table 3
Subgroup-specific (adjusted) differences in means with 95%CIs and interaction tests for differential effects of various baseline characteristics on the secondary outcome continuous WHODAS-II scores at 3 months after inclusion, for each of the two trials.
† WHODAS-II scores range from 0 to 100; higher scores represent more severe disability. ‡ Estimated difference in means for intervention for each subgroup with respective 95%CI from the relevant random-effects linear regression model, adjusting for stratification and (for São Paulo) clustering by primary care unit as the unit of randomization. § p-values derived for the interaction terms (intervention effect by baseline characteristic) by the likelihood ratio F-test in the random-effects regression model. || Estimated by a fixed-effects linear model.

Table 4
Subgroup-specific (adjusted) differences in means with 95%CIs and interaction tests for differential effects of various baseline characteristics on the secondary outcome continuous EQ-5D-3L scores at 3 months after inclusion, for each of the two trials