Current State and Future Directions of Technology-Based Ecological Momentary Assessment and Intervention for Major Depressive Disorder: A Systematic Review

Ecological momentary assessment (EMA) and ecological momentary intervention (EMI) are alternative approaches to retrospective self-reports and face-to-face treatments, and they make it possible to repeatedly assess patients in naturalistic settings and extend psychological support into real life. The increase in smartphone applications and the availability of low-cost wearable biosensors have further improved the potential of EMA and EMI, which, however, have not yet been applied in clinical practice. Here, we conducted a systematic review, using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, to explore the state of the art of technology-based EMA and EMI for major depressive disorder (MDD). A total of 33 articles were included (EMA = 26; EMI = 7). First, we provide a detailed analysis of the included studies from technical (sampling methods, duration, prompts), clinical (fields of application, adherence rates, dropouts, intervention effectiveness), and technological (adopted devices) perspectives. Then, we identify the advantages of using information and communications technologies (ICTs) to extend the potential of these approaches to the understanding, assessment, and intervention in depression. Furthermore, we point out the relevant issues that still need to be addressed within this field, and we discuss how EMA and EMI could benefit from the use of sensors and biosensors, along with recent advances in machine learning for affective modelling.


Introduction
Major depressive disorder (MDD) is a common debilitating psychiatric disease characterized by mood disturbances, loss of interest and pleasure in daily activities, disturbed appetite and sleep, loss of energy, and psychomotor retardation or agitation. According to the World Health Organization, depression is one of the leading causes of disease and disability in the world, annually affecting 4.4% of the general adult population [1]. In addition to producing high costs for the public health system, depression seriously impairs patients' functioning, leading to increased mortality, high suicide rates, exacerbated medical conditions, and high consumption of alcohol and illegal drugs [2,3,4,5].
As a result of the increased availability of smartphones and portable and wearable devices, a growing body of research has begun to explore new digital technologies as potential tools to foster assessments and interventions in clinical practice. More specifically, technology-based ecological momentary assessment (EMA) and ecological momentary intervention (EMI) have been proposed as alternative strategies to assess patients ecologically in naturalistic settings and deliver psychological support in daily life.

Ecological Momentary Assessment
Traditional clinical assessments are based on retrospective self-reports in which patients are asked to summarize their symptoms and affective experiences over the past few weeks. Nevertheless, increasing evidence shows that these tools are not able to capture MDD dynamics, such as symptom fluctuations or mood shifts over time [6,7]. Moreover, self-reports are affected by recall bias. Specifically, depressed patients have been found to alter the content of past experiences when asked to retrieve them retrospectively [8,9], judging symptoms as more severe [10] or increasing the elaboration of negative information [11].
EMA emerged as an alternative assessment strategy to better grasp affective and behavioural dynamics in daily life [12,13,14]. Not surprisingly, a growing body of research has applied this approach to exploring mood disorders [15,16]. The term "ecological" refers to the environment where the data are collected. Behaviours, thoughts, and affect are repeatedly recorded in real-world contexts. The term "momentary" refers to the focus of the assessment, i.e., close in time to the experience. The first studies to use this approach adopted paper-and-pencil daily diaries, but discomfort, low compliance, and poor experimental control over backfilling limited their usefulness [14]. The exponential progress of information and communication technologies (ICTs) and the increasing availability of smartphones offered novel opportunities to ecologically assess patients. On the one hand, mobile technologies allow the shortcomings of traditional diaries to be overcome by eliminating the need for manual data entry and by increasing control over backfilling, thus obtaining more accurate data. On the other hand, all the necessary processes can be integrated in one tool, for instance, a smartphone, thus decreasing intrusiveness, increasing users' comfort, and providing a more engaging and dynamic experience. During the day, patients are automatically prompted by the device to fill in self-reports that are subsequently stored and securely sent to clinicians and/or researchers. More recently, the potential of EMA has been extended through the integration of self-reports with data gathered from embedded sensors and wearable biosensors, hence allowing for a multimodal approach.
Unobtrusive wearable biosensors can continuously monitor physiological parameters throughout the day with high precision [17], whereas smartphone embedded sensors make it possible to indirectly collect data about patients' behaviours and habits, such as their social media use, physical activity, or social interactions [18,19]. Overall, the integration of these tools has the potential to revolutionize traditional assessments, leading to the exploration of new facets of MDD obtained in daily life contexts that are often difficult to capture in laboratory settings.

Ecological Momentary Intervention
According to statistics, 70% of people suffering from mental disorders do not receive adequate psychological treatment or reach complete clinical remission [20]. Affordances of technological developments, as Kazdin and Blase suggested, may facilitate new solutions for disseminating evidence-based psychotherapy [21].
The same "ecological" and "momentary" principles have been applied to the development of innovative interventions (EMI) [22] that go beyond traditional clinical settings and extend the delivery of psychological support into real life [23]. EMI has the advantage of providing psychological support directly on hand-held mobile technologies during the flow of daily experiences, in real-time settings, and at specific time points in the day, without the need for face-to-face meetings with a clinician [24]. EMIs can be delivered either as stand-alone treatments or in combination with other treatments. Moreover, similarly to EMAs, the use of data gathered from biosensors and embedded sensors along with machine learning techniques can increase the customization of the proposed interventions [16,25].

Objectives
Recent studies have confirmed the feasibility of mobile health (mHealth) applications and patients' interest in and adherence to these technologies, suggesting the great potential of this approach in the clinical field [23,26]. Nevertheless, no systematic review has explored technology-based EMA and EMI for MDD to date. Although two reviews focused on EMAs for mood disorders [15,16], most of the included studies were based on paper-and-pencil daily diaries, and the target population included adults and adolescents with bipolar disorder (BD) and borderline personality disorder (BPD).
Accordingly, the aim of this systematic review is to provide an overview of the state of the art of technology-based EMA and EMI for MDD from both a clinical and technological point of view. Our final objective is to show how and why clinical practice could benefit from the use of these approaches. In doing so, we will describe the potential of new technologies in this field, and we will discuss how EMAs and EMIs could be performed with sensors and biosensors along with recent advances in machine learning for affective modelling.

Methods
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria [27] were followed. For the systematic review protocol, see [28].
Three independent researchers (D.C., J.F.-Á., and M.S.) performed the search for publications in the English language. More details are provided in Table 1 and in the flow diagram (Figure 1), in order to make this search replicable in the future.
The search produced a total of 4993 articles. After eliminating duplicate papers, we made a first selection by reading titles and abstracts, and 401 articles were retrieved. We finally selected publications by applying the selection criteria described in the following section, obtaining 40 papers.

Selection Criteria
We included all studies involving a sample of adults with a primary (either current or past) diagnosis of MDD, using recognised diagnostic criteria (Diagnostic and Statistical Manual of Mental Disorders-DSM; International Classification of Disease-ICD). We excluded non-English papers and studies that did not meet the inclusion criteria. We also excluded articles that did not have full-text available, and the following types of manuscripts: Conference papers, reviews and systematic reviews, meta-analyses, meeting abstracts, notes, case reports, letters to the editor, editor's notes, extended abstracts, proceedings, patents, editorials, and other editorial materials. We tried to contact the corresponding authors, when necessary, to obtain missing or supplementary data.
Ecological momentary assessment: We included studies that adopted an ecological momentary assessment by means of hand-held technologies (such as smartphones, personal digital assistants, or hand-held computers) for the collection of daily self-reports, thus excluding studies that used paper-and-pencil diaries. Additionally, we included studies that integrated daily self-reports with data supplied by sensors and biosensors.
Ecological momentary interventions: We included EMIs that were provided to patients through hand-held technologies. We selected studies in which the proposed EMI was either a stand-alone intervention or combined with other types of treatment. We also included EMI that collected data from wearable biosensors or device-embedded sensors. Because providing continuous feedback to patients has been shown to be a valuable therapeutic procedure [29], we also included studies that adopted EMA-based feedback as a therapeutic tool for clinically depressed patients.

Quality Assessment and Data Abstraction
To control for the risk of bias, PRISMA recommendations for systematic literature analysis were followed. Studies were independently selected by three different authors (D.C., M.S., and J.F.-Á.), who first analysed titles and abstracts and subsequently selected the full papers that met the inclusion criteria, resolving disagreements through consensus. Regarding the included EMA studies, the main aim was to provide a perspective on the clinical, technical, and technological issues related to this approach. In other words, we were interested in EMA as a clinical and experimental tool to be used in the psychological field, regardless of the study design or outcome variables. No risk of bias assessment was therefore performed for these studies. By contrast, the risk of bias of EMI studies was assessed by two independent reviewers (D.C. and J.F.-Á.). As both randomized and non-randomized controlled trials were included, quality was assessed with the Downs and Black quality index [30].
The data extracted from each study were as follows: Author(s), sample(s), variable(s), device(s), sensor(s), duration, prompt(s) per day, sampling schema, primary outcome(s) for the selected studies on EMA (Table 2); and author(s), name of the intervention, sample(s), content of the intervention, duration, device(s), sensor(s), and primary outcome(s) for the studies proposing an EMI (Table 3).

Ecological Momentary Assessment in MDD
After applying the inclusion criteria, 32 studies were retrieved that investigated and assessed MDD through a technology-based EMA.
A synthesis of the results is provided in Table 2.

Electronic Devices and Use of Sensors
Most of the selected studies administered daily self-reports either through a personal digital assistant (PDA) or a smartphone. Only three studies adopted different technological solutions that allowed them to collect both self-reports and data gathered from sensors and biosensors. Conrad et al. [31] used the LifeShirt System (Vivometrics, Inc., Ventura, CA, USA), a comfortable garment with integrated biosensors that can continuously monitor various cardiopulmonary parameters, including heart rate (HR), respiration, and posture. With an embedded hand-held computer, patients can also complete self-reports following daily beep signals. In another study, Kim and colleagues adopted ECOLOG [32], a watch-type computer characterized by an 8-direction joystick and an integrated actimetry sensor. Via a beep signal, the wristwatch prompts patients to complete momentary assessments directly on the watch screen. Similarly, a compact wrist-worn electronic diary was used by Littlewood et al. to collect both self-reports and sleep/wake cycles with an embedded actimetry sensor [33].
Although a growing number of studies analyse data from embedded sensors and biosensors in research on mental health disorders [34], their use in association with EMA has been limited in the field of MDD. Among our selected studies, only seven of the 32 studies collected physiological measures in addition to self-reports. Conrad and colleagues collected cardiac and respiratory measures as indices of vagal activity, along with physical activity measured through an embedded actimetry sensor [31], whereas Ottaviani and colleagues collected ambulatory HR [35]. The remaining five articles investigated the association of depressive symptoms with sleep/wake cycles [32,33,36] and physical activity [37,38] using actimetry sensors. Even if strongly correlated, the PHQ-9 scores collected through the mobile application are significantly higher than those obtained through the retrospective paper-and-pencil PHQ-9.

Sampling Methods
Currently, different EMA designs can be used to define prompt scheduling, depending on the main purpose of the study. It is possible to prompt participants using fixed time periods or randomized/semi-randomized samplings (time-based sampling). Alternatively, participants can be asked to personally fill in the assessment after the occurrence of a specific behaviour or event (event-based sampling). Whereas time-based samplings depend on a signal emitted by the device (signal-contingent), event-based samplings are not preceded by a prompt (event-contingent). Signal-contingent schemas are useful when repeated measures are needed to obtain a representative value of a variable or when the objective is to capture dynamic variables (e.g., mood), whereas event-contingent schemas are more likely to be adopted when the main focus is on a specific behaviour that occurs randomly or less frequently during the day (e.g., smoking a cigarette). Regarding our selected studies, none of them adopted event-based sampling. Most of the studies collected data using randomized or semi-randomized schemas, whereas nine studies prompted participants to note information at fixed time points during the day. This latter approach was adopted especially by the studies that investigated the association between cortisol or melatonin and depression, i.e., when the assessed variable required greater temporal precision and accuracy.
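The three time-based (signal-contingent) schemas described above can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed studies: the function names, the 09:00 to 21:00 waking window, and the choice of equal-length blocks for the semi-randomized schema are assumptions made purely for illustration.

```python
import random
from datetime import datetime, timedelta


def block_bounds(day_start, day_end, n_prompts):
    """Split the sampling window into n_prompts equal blocks (assumed design)."""
    block = (day_end - day_start) / n_prompts
    return [(day_start + i * block, day_start + (i + 1) * block)
            for i in range(n_prompts)]


def fixed_schedule(day_start, day_end, n_prompts):
    """Fixed sampling: one prompt at the midpoint of each block, same every day."""
    return [lo + (hi - lo) / 2 for lo, hi in block_bounds(day_start, day_end, n_prompts)]


def random_schedule(day_start, day_end, n_prompts, rng):
    """Fully randomized sampling: prompt times drawn uniformly over the whole window."""
    span = (day_end - day_start).total_seconds()
    return sorted(day_start + timedelta(seconds=rng.uniform(0, span))
                  for _ in range(n_prompts))


def semi_random_schedule(day_start, day_end, n_prompts, rng):
    """Semi-randomized sampling: one random prompt inside each block."""
    return [lo + timedelta(seconds=rng.uniform(0, (hi - lo).total_seconds()))
            for lo, hi in block_bounds(day_start, day_end, n_prompts)]


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)   # hypothetical waking window
    end = datetime(2024, 1, 1, 21, 0)
    rng = random.Random(42)              # seeded only to make the demo repeatable
    for name, schedule in [
        ("fixed", fixed_schedule(start, end, 4)),
        ("random", random_schedule(start, end, 4, rng)),
        ("semi-random", semi_random_schedule(start, end, 4, rng)),
    ]:
        print(name, [t.strftime("%H:%M") for t in schedule])
```

In this sketch, the semi-randomized schema keeps prompts spread across the day while remaining unpredictable within each block, whereas the fixed schema offers the temporal precision that the cortisol and melatonin studies required.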
The duration of the data collection showed great variability. Some studies collected self-reports for a brief time period (less than 3 days); this choice was especially observed in the field of cortisol and sleep pattern research. Other studies required longer periods of assessment, where participants were involved for one or two months. This was especially true for studies investigating physical activity and its association with depressive symptoms. The same high variability was observed in the number of prompts, which varied from 1 to 20 prompts per day.

Compliance and Dropout Rates
With the term "compliance", we refer to the percentage of answered prompts. A few studies did not report this information [35,40,41,42,51,57,60]. However, the majority clearly addressed this issue. Sixteen studies reported compliance rates higher than 85%, five studies showed rates between 84% and 70%, and four studies collected 65% of the total possible answers. Patient dropout was related to diagnosis change, subjective burden, technical problems, incomplete data, retrospective completion of the electronic diary, missed prompts, worsening of symptoms, or non-attendance at follow-up sessions.
To prevent backfilling, different solutions were adopted. In most of the studies, participants could complete self-reports for a fixed time period after the prompt, ranging from a few minutes to a maximum of one hour. To increase compliance, two studies also gave participants the possibility of postponing prompts.
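As a worked illustration of the definition above, the following Python sketch computes compliance as the percentage of prompts answered within a fixed response window, so that late (backfilled) entries do not count. The function name, the data layout, and the 60-minute default window are assumptions for illustration; the reviewed studies used windows ranging from a few minutes to one hour.

```python
from datetime import datetime, timedelta


def compliance_rate(prompts, responses, window=timedelta(minutes=60)):
    """Percentage of prompts answered within `window` of the prompt time.

    prompts:   list of prompt datetimes
    responses: dict mapping a prompt datetime to its response datetime;
               prompts without an entry are treated as missed
    """
    if not prompts:
        raise ValueError("at least one prompt is required")
    answered = 0
    for prompt_time in prompts:
        response_time = responses.get(prompt_time)
        if response_time is None:
            continue  # missed prompt
        delay = response_time - prompt_time
        # Responses given before the prompt or after the window are rejected,
        # which is one simple way to guard against backfilling.
        if timedelta(0) <= delay <= window:
            answered += 1
    return 100.0 * answered / len(prompts)


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    prompts = [day.replace(hour=h) for h in (9, 13, 17, 21)]
    responses = {
        prompts[0]: prompts[0] + timedelta(minutes=5),   # valid
        prompts[1]: prompts[1] + timedelta(minutes=90),  # too late, rejected
        prompts[3]: prompts[3] + timedelta(minutes=60),  # valid, at the limit
    }
    print(compliance_rate(prompts, responses))
```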

Contribution of EMA to the Study of MDD
As Table 4 shows, so far EMA has been applied to seven different fields. In the following sections, we will provide an overview of EMA's contribution to the understanding and assessment of MDD.

Recall Bias
Increasing evidence shows that memories often have inaccurate and imprecise content due to recall bias. In the case of EMAs, two studies were carried out to investigate this bias, comparing EMA daily data to retrospective assessments. Ben-Zeev and colleagues compared positive (PA) and negative (NA) affect collected through an EMA to scores obtained by means of traditional paper-and-pencil retrospective questionnaires [8]. When retrospectively recalled, both PA and NA were overestimated, regardless of the diagnosis. Interestingly, the control group was more likely to exaggerate the retrieval of PA rather than NA, but this trend was not observed in depressed patients. By contrast, Torous and colleagues developed a smartphone application to administer randomized subsets of items taken from the Patient Health Questionnaire (PHQ-9) [69], compared to the traditional paper-based PHQ-9. Symptoms were evaluated as more severe in daily EMA evaluations, compared to the retrospective PHQ-9 assessment. According to the authors, this discrepancy could be due to different factors, such as recall bias or stigma.

Symptom Monitoring
Unexpectedly, we could only retrieve three studies within this research field, i.e., studies that actually applied EMA to monitor clinically depressed patients. Husky and colleagues investigated the acceptability of three-day computerized ambulatory monitoring in MDD and BD patients, showing encouraging compliance and acceptance rates among both samples. Practice effects were observed (faster response time over the course of the study), thus suggesting the importance of considering the potential effects of EMA duration on self-reports [39]. Schaffer et al. developed a system called "Mental Health Telemetry" to monitor symptoms of patients receiving pharmacological treatment [40]. According to the results, a reduction in depressive symptoms was already observable one day after beginning the treatment, and symptoms on day 7 were predictive of treatment outcome. Similarly, iHOPE is a smartphone application for the daily monitoring of depressive symptoms and sleep patterns [41]. EMA assessments of depression, sleep quality, and anxiety were highly associated with the Hamilton Depression Rating Scale (HAM-D), administered at baseline. Nevertheless, application use decreased significantly over the weeks, from 3.4 days per week to 0.4 days per week after 8 weeks, highlighting the important issue of compliance in EMA assessments.

Cortisol Secretion
Stetler and colleagues investigated the associations between cortisol and sleep patterns, social interactions [43], and daily activities [42]. Not only were cortisol levels after awakening different in depressed and healthy participants, but the impact of psychosocial variables on cortisol secretion was also dissimilar. Specifically, the hypothalamic-pituitary-adrenal (HPA) axis of depressed patients was no longer able to respond to the timing of the sleep-wake cycle, daily routines, and external social experiences. One study explored the impact of cortisol on affect, showing a bidirectional association between PA and NA and daily cortisol levels [45]. Nevertheless, high variability was observed among participants regarding the timing, direction, and sign of this association. For instance, NA was positively associated with cortisol 50% of the time, while the association between cortisol and PA was almost always negative. Booij et al. identified higher cortisol and α-amylase levels among depressed individuals [36]. However, when applying individual correction for lifestyle factors, the association of depression with cortisol and with the ratio of α-amylase to cortisol was no longer significant, suggesting that generalization from groups does not always reflect the single individual. By contrast, Conrad and colleagues could not find cortisol differences between depressed and non-depressed participants. Interestingly, a negative correlation between NA and heart rate variability (HRV) was observed only in the control group, suggesting that constant NA may alter the normal interaction between affectivity and the autonomic nervous system [31].
Finally, interesting outcomes were also observed among remitted MDD patients [44]. Despite remission, patients showed reduced cortisol levels throughout the day and a different interaction between affect and cortisol, thus suggesting a reduction in the HPA axis' responsiveness as a potential marker of recurrent depression.

Sleep Patterns
According to our search, six studies adopted an EMA to explore sleep disturbances in depression. Through the daily administration of morning self-reports about sleep patterns, O'Leary et al. found that depression was associated with lower perceived sleep quality, which in turn affected negative emotional reactivity to both neutral and unpleasant events during the day [47]. However, in healthy participants, sleep disturbances only affected emotional reactivity to unpleasant events. In other words, depression could be a factor affecting the relationship between sleep quality and emotional reactivity. Similarly, two studies analysed the influence of sleep quality on daily affect [46,48]. As expected, higher sleep quality was associated with higher PA in both healthy and depressed participants. Surprisingly, there was no evidence of a moderating role of depression in the association between sleep and affect. Sleep quality nevertheless affected daily mood, but not vice versa: Higher sleep quality was associated with increased PA and decreased NA the following day. This association did not differ between depressed and healthy participants. Similarly, sleep duration was found to affect next-day physical activity, but again, no difference between depressed and non-depressed individuals was observed [38]. Finally, an EMA was adopted to investigate the association between sleep patterns and suicidal ideation in a sample of depressed patients [33]. Poor sleep quality, at both subjective and objective levels, was associated with increased suicidal ideation the following day. However, suicidal thoughts did not predict sleep patterns the following night.
Bouwmans and colleagues also collected repeated saliva samples to analyse the association of depression with melatonin, an important hormone related to sleep onset [49]. A bidirectional relationship between melatonin and both affect and fatigue was identified: Melatonin is associated with changes in affect and fatigue; however, affect and fatigue are also predictors of melatonin levels. Participants who did not show this association were likely to report higher rates of depression, worse sleep quality, and lower energy expenditure.

Physical Activity
In order to analyse the effect of self-initiated physical activity on mood, clinically depressed patients were asked to report their daily physical activity [50]. Both healthy and depressed participants showed higher levels of PA following physical activity, but no decrease in NA. Notably, the increase in PA after physical exercise was greater in depressed patients, which is consistent with the ample evidence supporting behavioural activation in general, and physical activity in particular, for the treatment of depression. Confirming these results, another study found that physical activity was associated with subsequent increased PA, regardless of the diagnosis [37]. However, the analysis also revealed high inter-individual variability in the association between physical activity and mood in terms of strength, direction, and temporal aspects. Finally, Kim and colleagues developed a cross-validated statistical model in which higher intermittency of locomotor activity was significantly associated with worse mood ratings [32], suggesting the possibility of predicting patients' moods through the analysis of momentary locomotor patterns.

Rumination
Ruscio and colleagues investigated the relationship between stressful events and rumination in patients with MDD and generalized anxiety disorder (GAD) [52]. Both clinical samples showed higher levels of rumination in response to stressful situations, which were further worsened by symptom severity and extensive comorbidity. In addition, rumination significantly mediated the impact of stress on symptoms and affect; that is, higher rumination after a stressful event predicted greater NA and more maladaptive behaviours. Putman and colleagues investigated rumination and self-esteem through the assessment of resting baseline prefrontal cortex (PFC) alpha activity, along with the momentary assessment of affect and depressive symptoms, in a sample of clinically depressed individuals [51]. Rumination was found to be associated with an increased alpha signal in the bilateral PFC (i.e., decreased neural activation), whereas an increased alpha signal in the right PFC was positively correlated with higher self-esteem ratings. One study investigated perseverative thoughts (i.e., depressive rumination, worry, and reactive rumination) in relation to mind wandering [35]. Participants were instructed to complete a smartphone diary every 30 min for one day, and these self-reports were integrated with continuous HR monitoring. Confirming the hypothesis that mind wandering is not a maladaptive behaviour per se, only perseverative cognition was associated with health risk factors, such as lower HRV, worse mood, and higher interference in daily functioning. Finally, one study examined the dynamics of worry and rumination in daily life [53]. Contrary to the hypothesis, levels of worry were not significantly associated with the occurrence of significant events, whereas rumination was significantly higher in response to these circumstances. Compared to the control group, clinically depressed individuals showed decreased PA and increased NA as a consequence of high rumination levels.

Affect and Emotional Reactivity
Thompson and colleagues investigated emotional reactivity, emotional inertia, and emotional instability in depressed patients [50]. Compared to healthy participants, clinically depressed patients showed higher NA instability, whereas no differences in PA instability were observed. Both samples reported increased NA after a negative event; however, depressed patients showed a greater decrease in NA and increase in PA after a positive event. These results were confirmed by another study that showed a greater reduction in NA following positive events in depressed individuals [55]. When considering BPD comorbidity, depressed patients were found to be less emotionally influenced by events, and to perceive themselves as less emotionally reactive [56]. Other factors that affect emotional reactivity are gender and past depression [54]. In one study, women and remitted patients evaluated daily events as more negative than men, and they showed worse mood and higher emotional reactivity in response to daily stressors. Finally, a smartphone application was developed to assess visual mental imagery and its impact on mood and affective reactivity in healthy people and remitted MDD patients [57]. Participants were asked to focus on their mental representations, i.e., what they had in mind, eight times per day. Imagery-based processing was associated with better mood, regardless of the valence of the mental representation. This pattern was similar in healthy and depressed participants. However, no association between mental imagery and affective reactivity was observed.
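To make the notions of emotional instability and emotional inertia more concrete, the sketch below computes two indices that are commonly used for them in the EMA literature: the mean squared successive difference (MSSD) for instability and the lag-1 autocorrelation for inertia. This review does not state which indices the cited studies used, so both the choice of metrics and the example ratings are assumptions for illustration only.

```python
def mssd(ratings):
    """Mean squared successive difference: larger values = more instability."""
    if len(ratings) < 2:
        raise ValueError("at least two ratings are required")
    squared_diffs = [(b - a) ** 2 for a, b in zip(ratings, ratings[1:])]
    return sum(squared_diffs) / len(squared_diffs)


def lag1_autocorrelation(ratings):
    """Lag-1 autocorrelation: values near 1 = affect carries over (high inertia)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two ratings are required")
    mean = sum(ratings) / n
    variance_sum = sum((v - mean) ** 2 for v in ratings)
    if variance_sum == 0:
        raise ValueError("ratings have no variance")
    covariance_sum = sum((a - mean) * (b - mean)
                         for a, b in zip(ratings, ratings[1:]))
    return covariance_sum / variance_sum


if __name__ == "__main__":
    # Hypothetical momentary NA ratings on a 1-6 scale.
    fluctuating = [1, 5, 1, 5, 1, 5]
    drifting = [1, 2, 3, 4, 5, 6]
    print(mssd(fluctuating), lag1_autocorrelation(fluctuating))
    print(mssd(drifting), lag1_autocorrelation(drifting))
```

Note that the two indices capture different properties: the fluctuating series has a high MSSD and a negative autocorrelation, whereas the slowly drifting series has a low MSSD and a positive autocorrelation.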
Regarding daily affect, one study explored the impact of gambling desire on mood in a sample of depressed individuals [59]. Higher levels of sadness and arousal were associated with higher rates of gambling desire. Consistently, depressed participants were also likely to perform gambling behaviours to increase their current PA levels. However, momentary affect did not predict actual gambling behaviours. An EMA was also used to investigate the influence of social rejection and disagreement on daily affect in MDD and BPD patients [58]. As expected, momentary and daily negative interpersonal events triggered higher NA (fear, hostility, and sadness) in both groups. High levels of hostility predicted rejection and disagreements, whereas sadness was only a predictor of social rejection. The aforementioned relationships were stronger in BPD patients than in depressed participants.
Finally, one study investigated the topology and temporal dynamics of depression and anxiety symptoms using contemporaneous and temporal network models [60]. Positive (positive, content, enthusiastic, energetic) and negative (down) mood were the most representative variables of patients' core symptoms. While "worried" and "down" did not show temporal influence, "positive mood", "hopelessness", "anger", and "irritability" were the strongest drivers of moment-to-moment symptomatology.
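The instability and inertia constructs discussed in this section are often quantified from EMA time series with simple indices. The Python sketch below assumes two common operationalizations: the mean squared successive difference (MSSD) for instability and the lag-1 autocorrelation for inertia. The function names and ratings are hypothetical, and the included studies may have used different indices.

```python
from statistics import mean

def mssd(series):
    """Mean squared successive difference: larger values suggest more instability."""
    if len(series) < 2:
        raise ValueError("at least two ratings are needed")
    return mean((b - a) ** 2 for a, b in zip(series, series[1:]))

def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: larger values suggest more inertia (carry-over)."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den if den else 0.0

# Hypothetical NA ratings (1-7 scale) from consecutive prompts, not study data
steady_na = [2, 2, 3, 2, 3, 3, 2, 2]
fluctuating_na = [2, 6, 1, 5, 2, 7, 1, 6]

print(mssd(steady_na), mssd(fluctuating_na))
print(lag1_autocorrelation(steady_na), lag1_autocorrelation(fluctuating_na))
```

In practice, such indices are usually computed per participant and then compared across groups, taking the unequal time intervals between prompts into account.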

Ecological Momentary Intervention in MDD
The selection process resulted in eight studies that administered an EMI to clinically depressed patients. In all, four different interventions were identified: Psymate, Mobylize!, Help4Mood, and Medlink.

General Overview of the Interventions
Psymate is a PDA-based EMA for symptom monitoring that aims to increase awareness about depression and the dynamics that characterize this disorder [62][63][64][67][68]. Psymate allows patients to record daily symptoms and affect. Based on these daily assessments, patients meet a clinician weekly and receive graphical feedback on the association between PA levels and daily life activities, events, or social interactions, as well as on the association between PA changes and the number of depressive complaints. In this way, patients have the chance to reflect on their affective state and the relationship between symptoms and contextual variables with a professional. According to Heron's definition, "the key feature of all EMIs is that the treatment is provided to people during their everyday lives (i.e., in real time) and settings (i.e., real world)" [22]. Therefore, Psymate does not meet all the criteria for an EMI, as EMA-derived feedback is provided during weekly face-to-face sessions. However, we decided to include this intervention because we think it provides important insights about the potential of self-monitoring EMA as a therapeutic tool.
Likewise, Mobylize! constitutes an ecological intervention composed of a mobile application, an interactive website, and a system for email/telephone support [61]. The most innovative aspect of this application is the integration of self-reports with data from smartphone sensors. Mobylize! is provided with a context-aware system. Thanks to a machine learning algorithm, the application can predict the state of the patient (mood, emotions, cognitive/motivational states, activities, environmental context, and social context). Specifically, the system works in three different phases: (1) Data collection, during which 38 sensors collect sensor information; (2) learners, during which prompted self-reports are matched and paired with simultaneously labelled state data to develop predictive models; and (3) action components, a continuous process that analyses sensor data in order to update previous predictive models without the direct input of the user. Mobylize! is designed to prompt patients to assess mood, intensity of emotions, fatigue, pleasure, accomplishment, concentration, engagement, perceived control, location, and interactions five or more times a day. To accommodate new data, every new self-report is subsequently associated with the generation and modification of previous models. Thanks to this complex system, the mobile application sends tailored feedback to participants. Through the website, users can graphically visualize self-report patterns, read theoretical lessons, and use interactive tools, such as tailored plans and calendars, for monitoring daily activities. Lastly, a trained clinician contacts users periodically by phone or email to provide technical support, reinforce adherence, and enhance motivation.
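The learning loop described above can be illustrated with a deliberately simplified Python sketch. It is not the algorithm used by Mobylize!, which is not specified here; it assumes a toy nearest-centroid learner, two hypothetical sensor features (movement and ambient noise, both scaled to 0-1), and hypothetical state labels. Each prompted self-report supplies a label for the sensor features recorded at the same moment, and the model can then estimate the state from sensor data alone.

```python
import math

class StateLearner:
    """Toy nearest-centroid model, updated with every new self-report."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of labelled examples

    def update(self, features, label):
        """Pair sensor features with the state reported at the same prompt."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Estimate the current state from sensor features only."""
        if not self.sums:
            raise ValueError("no labelled self-reports yet")

        def distance(label):
            centroid = [s / self.counts[label] for s in self.sums[label]]
            return math.dist(features, centroid)

        return min(self.sums, key=distance)

# Hypothetical (movement, ambient noise) features with self-reported labels
learner = StateLearner()
learner.update((0.9, 0.8), "with others")
learner.update((0.8, 0.7), "with others")
learner.update((0.1, 0.1), "alone")
learner.update((0.2, 0.2), "alone")

print(learner.predict((0.85, 0.9)))
print(learner.predict((0.0, 0.3)))
```

Because the centroids are running means, every additional self-report modifies the existing model rather than requiring retraining from scratch, which mirrors the incremental updating described in the text.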
Help4Mood is a web-platform to self-monitor daily symptoms, mood, activities, and thoughts [66]. Based on a Cognitive Behavioural Therapy (CBT) approach, Help4Mood helps patients to reflect on the emotional and cognitive patterns related to depression. In addition to collecting daily self-reports, the application receives data from an actimetry sensor and acoustic analysis of speech. The innovative aspect of Help4Mood is the use of a virtual agent, completely customizable in terms of voice, clothing style, sex, and language, that communicates with users to provide tailored exercises and activities and guide them through the daily questionnaires. The application also has an emergency section called the "crisis plan": As soon as symptom worsening is detected, the application prompts users to contact a professional or a relative.
Finally, Medlink is a mobile application to support and monitor MDD patients taking antidepressant medication [65]. The main purpose of the app is to address the failure points that usually occur between professionals and newly diagnosed patients. On the one hand, the application provides users with weekly psychoeducation material and sends suggestions about medication management and how to deal with depressive symptoms. On the other hand, it monitors patients' treatment and depressive symptoms. Every four weeks, personal communication with a professional is scheduled to give patients monthly feedback about disease progression.

Effectiveness of the Intervention
Psymate was tested in a sample of 102 clinically depressed patients in a three-arm randomized controlled trial [62][63][64][67] with an experimental condition (treatment as usual, or TAU, plus a six-week Psymate treatment with weekly face-to-face feedback sessions), a pseudo-experimental condition (TAU and Psymate without face-to-face EMA feedback), and a control condition (TAU). Three different categories of weekly feedback were provided: (1) positive affect, (2) positive affect in relation to events appraised with an internal versus external locus of control, and (3) positive affect in relation to social interactions. Results showed a significant reduction in depressive symptoms in the experimental group that was maintained in the follow-up assessment. Participants in the pseudo-experimental condition reported decreased depressive symptoms in the first weeks of the treatment, but this gain was not maintained across the weeks. Notably, the use of Psymate was associated with increased levels of perceived empowerment, regardless of the presence of weekly feedback, and with increased experienced PA throughout the treatment. Decreased depressive symptoms were also associated with increased positive daily behaviours. Finally, Widdershoven and colleagues observed a significant improvement in the differentiation of negative emotions and a close-to-significant improvement in the differentiation of positive emotions after 6 weeks of self-monitoring, regardless of EMA-derived feedback [68].
Mobylize! was tested in a small pilot study with a sample of 7 MDD patients [61]. According to the results, the use of Mobylize! significantly reduced depressive symptoms, both on a self-rated measure (PHQ-9) and a clinician-based evaluation (Quick Inventory of Depressive Symptomatology-Clinician Rating, QIDS-C), as well as anxiety symptoms, measured with the Generalized Anxiety Disorder Scale (GAD-7). At the end of the treatment, participants were also less likely to meet MDD diagnostic criteria. Nevertheless, the accuracy of the predictive model was low, especially for mood; higher accuracy was achieved by models that predicted location, conversational state, and social interactions (accuracy between 60% and 90%).
A randomized controlled trial was conducted to evaluate Help4Mood [66]. Twenty-eight depressed patients were recruited and randomized into two treatment groups: Help4Mood and TAU. Outcome measures, which included the Beck Depression Inventory (BDI) and Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR), indicated reduced symptoms in both samples. Nevertheless, patients in the TAU group achieved greater clinical improvement compared to patients who used the application. Notably, regular users were more likely to obtain greater clinical improvement compared to users with low compliance.
Finally, a preliminary study tested the efficacy of Medlink with 8 MDD patients [65]. Medication monitoring showed promising outcomes: patients reported taking 84% of their medication, which is significantly higher than medication adherence rates reported in the literature. In addition, depressive symptoms significantly decreased over the course of 4 weeks.

Compliance and Dropout Rates
Regarding Psymate, the mean number of answered prompts in both the experimental and pseudo-experimental groups was 135.5 out of 180 (75.3%); participants completed a mean of 39.7 out of 50 pre-assessments (79.4%) and 23.7 out of 30 (79%) post-assessment observations. Moreover, 27 of the 33 participants (81.8%) allocated to the experimental group completed the intervention, whereas 32 out of 36 participants (88.9%) allocated to the pseudo-experimental group completed it.
Throughout the 8-week treatment with Mobylize!, the mean number of log-ins to the mobile application was 7.9 (approximately one per week), whereas the number of completed lessons on the website was 4.8 out of 9 (53.3%). The number of answered prompts drastically decreased throughout the treatment, from 15.3 in the first week to 4.8 in the last week, due to technical difficulties and connectivity problems. Seven out of eight participants (87.5%) completed the intervention; the only dropout was caused by technical problems with the smartphone.
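For readers who wish to check the arithmetic, the compliance and completion rates can be recomputed from the counts reported above. The short Python sketch below uses only those counts; the helper name `rate` and the dictionary keys are our own labels.

```python
def rate(numerator, denominator, digits=1):
    """Return a percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)

# Counts as reported for Psymate (means for prompts and assessments)
psymate_counts = {
    "answered prompts": (135.5, 180),
    "pre-assessments": (39.7, 50),
    "post-assessments": (23.7, 30),
    "completers, experimental group": (27, 33),
    "completers, pseudo-experimental group": (32, 36),
}

for label, (done, total) in psymate_counts.items():
    print(f"{label}: {rate(done, total)}%")
```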
Regarding Help4Mood, the authors indicated great variability in terms of time of use. Two participants used the application for one or two days, whereas three participants used it between 3 and 7 days. The remaining six participants used it more than 10 times, approximately twice a week. The mean use was 134 min. Eleven out of 13 (84.6%) participants completed the protocol and were assessed for the follow-up. One participant withdrew due to worsening mood.
Finally, participants entered the Medlink application approximately 17.4 times during the 4 weeks of data collection and answered 96% of the prompts. Seven out of nine users read the psychoeducation lessons from the first and second week, whereas only half of them read the third and fourth lessons. No dropouts were reported.

Participants' Feedback and Satisfaction
Using Likert scales ranging from 1 to 7, participants found that Psymate was very simple to use and provided clear instructions (verbal instructions = 6.6 ± 0.7; written instructions = 6.5 ± 1.0; Psymate answers = 2.6 ± 1.5). The number of daily prompts and the time needed to complete assessments were not perceived as stressful (number of beeps per day = 3.1 ± 1.6; time to answer = 2.5 ± 1.5). Finally, ratings of its most important feature, i.e., receiving EMA-derived feedback, indicated that the feedback was highly appreciated (usefulness of feedback = 6.2 ± 0.7) and considered valuable (feedback to improve daily skills = 5.4 ± 1.1). However, participants would have appreciated receiving more specific and practical advice related to the EMA-based feedback (3.2 ± 2.0).
Regarding Mobylize!, satisfaction with the application was rated as 5.71 on a scale from 1 to 7. Criticism was related to technical problems, such as loss of connectivity and subsequent failure to receive prompts. Interestingly, 86% of the participants reported that the intervention was particularly helpful for identifying NA triggers and avoiding distressing and maladaptive behaviours. Participants also suggested lengthening the intervention and adding more activities, such as a blog to talk with other users or a message service between patients and coaches.
Participants involved in the Help4Mood study were quite satisfied with the application. Most of them would use it in everyday life and suggest it to other patients. The idea of a virtual agent to guide participants in completing the assessments was appreciated; however, some participants perceived the agent as too cold, repetitive, and not sufficiently realistic. Among the limitations, patients reported sometimes being bored by excessively long sessions. They would have appreciated receiving more psychoeducational material and a more tailored experience, allowing them to access their preferred materials and activities without restrictions.
Medlink's usability was assessed using 4 items from the Usefulness, Satisfaction, and Ease of Use Questionnaire (USE). On a scale from 1 to 7, participants reported encouraging scores for ease of use (mean = 5.7 ± 1.1) and learnability (mean = 6.1 ± 1.5), but lower scores for perceived usefulness (mean = 4.6 ± 1.0) and satisfaction (mean = 4.8 ± 0.8). Furthermore, encouraging ratings were observed for the weekly psychoeducation lessons (liking = 6.0 ± 1.1; ease of use = 6.6 ± 0.5; learnability = 6.6 ± 0.5; and usefulness = 5.8 ± 1.7), which were also reported to be the most interesting and useful parts of the application. Finally, feedback interviews yielded neutral comments regarding daily self-reports, which were perceived as not very useful; opinions on the feedback graphs were mixed.

Discussion
To date, the scientific literature has mostly been based on studies conducted in laboratory settings, thus understudying the daily dynamics of psychopathology [70]. Therefore, unobtrusively monitoring behavioural (via sensors), physiological (via biosensors), and cognitive/emotional (via self-reports) factors in ecological settings through portable and wearable devices can provide new information about elusive psychological constructs that are usually defined by the complex dynamics of contexts and variability. Accordingly, the research field could benefit from the use of novel technologies to better explore MDD mechanisms and delineate new theoretical models based on ecological observations. Compared to paper and pencil daily diaries, the use of electronic devices, and especially smartphones, could further increase the six EMA advantages identified by Ebner-Priemer (Table 5) [16]: (a) the automation of the entire process directly on a mobile device, such as a smartphone, can provide greater control over backfilling and higher temporal precision in the administration, planning, and randomization of prompts; (b) the use of ICTs can offer additional possibilities for multimodal assessments, with data supplied by embedded sensors and unobtrusive wearable biosensors that can automatically be coordinated with the collection of self-reports; (c) the use of mobile devices reduces the effort required of users in completing daily assessments and prevents errors by researchers and clinicians due to manual data entry; (d) smartphones offer the possibility of providing real-time EMA-derived feedback that can be an important therapeutic tool for patients' self-monitoring, in addition to the possibility of sending real-time alerts to clinicians in case of need. In this regard, smartphones have the potential of becoming global low-cost tools that can also be adopted in the clinical field.
Currently, 2.32 billion people in the world use smartphones, and it has been estimated that, by 2020, 70% of the world's population will own one [71]. The potential of these devices is also supported by the evidence showing that people with serious mental and physical illnesses own and regularly use smartphones [72] and are interested in using applications for their health [26].
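The planning and randomization of prompts mentioned under advantage (a) can be automated in a few lines. The Python sketch below is one possible semi-randomized, signal-contingent schedule and is not taken from any of the included studies: the waking day is split into equal blocks, one prompt is drawn at random within each block, and a minimum gap between consecutive prompts is enforced. The function name, the 08:00-22:00 window, the five prompts, and the 30-minute gap are all illustrative assumptions.

```python
import random

def semi_random_schedule(start_min, end_min, n_prompts, min_gap=30, rng=None):
    """Return prompt times in minutes since midnight, one per equal-width block."""
    rng = rng or random.Random()
    block = (end_min - start_min) // n_prompts  # block width in minutes
    if block < min_gap:
        raise ValueError("blocks are too short for the requested minimum gap")
    times = []
    for i in range(n_prompts):
        lo = start_min + i * block
        hi = lo + block - 1
        if times:
            # Never prompt sooner than min_gap minutes after the previous prompt
            lo = max(lo, times[-1] + min_gap)
        times.append(rng.randint(lo, hi))
    return times

# Example: 5 prompts between 08:00 and 22:00 (seeded only for reproducibility)
schedule = semi_random_schedule(8 * 60, 22 * 60, 5, rng=random.Random(42))
print([f"{t // 60:02d}:{t % 60:02d}" for t in schedule])
```

Stratifying by block keeps prompts spread across the day, while the random draw within each block prevents participants from anticipating the exact prompt time.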
As pointed out in this review, the widespread adoption of EMA for the investigation of depression has led to novel insights into different aspects of the disease, including emotion reactivity, cortisol patterns, or daily rumination. We discussed different sampling methods that can be used in EMA protocols, showing that the signal-contingent design with prompt randomization or semi-randomization is the most widely adopted option when dealing with variables, such as affect and symptom monitoring. We also reported compliance and dropout rates, which showed encouraging results, with most of the studies reporting more than 70% adherence. Nevertheless, the gap between clinical practice and research is still quite wide, as revealed by the low number of studies that adopt this approach to assess and monitor patients for clinical purposes or implement EMA in clinical settings. Accordingly, many issues still need to be addressed. To date, no standard and validated sets of items have been developed for EMA protocols, raising the problem of context validity. Moreover, further research should be conducted to improve patients' compliance and reduce dropout. Due to the intrinsic nature of the disease, depressed patients could be less likely to consistently complete daily assessments. In a previous study, we observed that compliance was higher in EMA administered through a smartphone and when patients were prompted less than 8 times a day [73]. However, a meta-analysis should be conducted to more precisely identify the factors that improve adherence (see, for example, [74]), thus providing some sort of guideline for the design of EMA. Indeed, we strongly believe that clinical practice could benefit from the use of EMAs for several reasons. First, EMAs can be useful for diagnostic purposes. 
Traditional diagnostic procedures usually involve a static moment in time, including semi-structured interviews (e.g., Mini-International Neuropsychiatric Interview) complemented by self-report measures. However, ample evidence shows the dynamic nature of affective states and mood [75]. Furthermore, these dynamics vary greatly from person to person, which is why idiographic approaches may shed light upon the structure of individual symptom dynamics [60]. Consequently, by means of EMAs, a more accurate diagnostic process could be pursued. Likewise, the continuous monitoring of patients' symptoms would allow clinicians to monitor the efficacy of a treatment over time [76], predict short-term mood changes [77], detect worsening of symptoms at an early stage [78], and create continuous communication between clinicians and patients. In addition, the use of daily mood and symptom self-ratings could provide more ecological assessments, overcoming recall bias and capturing the dynamics of human functioning in daily life that cannot be detected with traditional tools.
Table 5. Benefits of using EMA for mood dysregulation and mood disorders as described by Ebner-Priemer [16].
1. Real-time assessments: Reduction in retrospective bias and increase in accuracy.
2. Repeated measurements: Better comprehension of time-dependent processes and dynamic changes in symptoms.
3. Multimodal assessments: Contemporary analysis of behaviours, physiological signals, and subjective experiences.
4. Context-specific information: Assessment of symptoms as context-dependent.
5. Interactive assessments: Real-time customizable and interactive feedback.
6. Generalizability: Higher ecological validity and collection of more representative data.

Our results also highlight the small number of EMIs available for depression. In the current literature, only four ecological interventions have been developed, and only two of them were tested in a randomized controlled trial (RCT). Our review showed promising results in terms of patient satisfaction and clinical efficacy, further supporting the need for more efforts in this direction. However, compliance rates were sometimes not encouraging, and a major challenge is to encourage regular use of these technologies throughout the entire treatment process [79]. Accordingly, future research should focus on users' motivation and engagement, for example by conducting focus groups with patients during treatments, using mixed quantitative and qualitative designs to obtain as much information as possible to guide future developments, and examining the effects of gamification features on adherence and compliance [80]. In other words, greater attention should be paid to the needs and characteristics of the target population. Considering feedback from users, we were able to identify three EMI features that were highly appreciated: the possibility of receiving visual feedback about daily assessments and, therefore, self-monitoring of daily patterns; the availability of psychoeducational material on depression and its mechanisms; and the opportunity to have continuous or periodic communication with a trained clinician.
In this review, we found that most of the EMAs were based only on self-reports, whereas more attempts to integrate this information with data gathered from sensors and biosensors were observed for EMIs. Recent advances in sensor technologies have had an impact on applications for remote health [81], such as postoperative recovery [82], treatment for chronic patients [83], and monitoring of elderly individuals [84]. Consistently, the hierarchical sensing model proposed by Mohr highlights the great revolution that new sensors and biosensors can bring to the field of mental health [85], making it possible to collect raw sensor data (i.e., the lower level of the hierarchy) that can be converted into "behavioural markers" through machine learning and data mining methods [18].
Smartphone sensors further increase the potentially collectable information, allowing the reconstruction of people's habits, sleep patterns, or social life from embedded sensors and usage data, such as accelerometers, call and short message service (SMS) logs, social network data, or geolocation. In other words, it is now possible to infer and collect behavioural information without necessarily asking the person to report it.
Even though they were not investigated in the studies targeted at MDD patients discussed here, several opportunities can be found in the integration of EMA and EMI platforms with behavioural and physiological signal processing, further mediated by machine learning algorithms. On the one hand, several behavioural signals are readily collectable with the use of smartphone sensors, even though they may lack the required specificity for mood recognition and prediction, as found by the Mobylize! study [61]. On the other hand, due to recent advancements in sensor technologies, physiological signals can now be recorded unobtrusively by means of, for example, smartwatches and chest bands. These could provide an EMA and/or EMI platform with additional markers that more closely correlate to a person's affective state, and that can be used as input to the analysis performed [61]. For instance, electrodermal activity (EDA) and heart rate variability (HRV) have been extensively investigated as correlates of users' affective state, and they are considered non-invasive: they do not involve recording sensitive information (as opposed to, for example, cameras and acoustic signals), and the associated sensors do not interfere with users' daily routines. Consistently, patient-specific models can be automatically learned that continuously estimate the patient's affective state by extracting and analysing salient features of physiological signals [86].
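To make the notion of "salient features" more concrete, the following Python sketch computes three widely used time-domain HRV features (mean heart rate, SDNN, and RMSSD) from a window of RR intervals. The choice of features, the function name, and the example intervals are our assumptions for illustration; the cited works may rely on different or additional features.

```python
import math

def hrv_features(rr_ms):
    """Compute basic time-domain HRV features from RR intervals (milliseconds)."""
    if len(rr_ms) < 3:
        raise ValueError("at least three RR intervals are needed")
    mean_rr = sum(rr_ms) / len(rr_ms)
    # SDNN: sample standard deviation of the RR intervals
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (len(rr_ms) - 1))
    # RMSSD: root mean square of successive RR differences
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_hr_bpm": 60000 / mean_rr, "sdnn_ms": sdnn, "rmssd_ms": rmssd}

# Hypothetical RR intervals from a wearable sensor, not study data
window = [812, 790, 805, 830, 798, 776, 801, 820]
print(hrv_features(window))
```

Feature vectors of this kind, computed over sliding windows, are the typical input to the affective-state models discussed in the text.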
Unfortunately, the relation between physiological signals and affective states is not trivial, and mixed results have been reported in the literature [87]. Building on recent advances in machine learning, recent studies obtained promising results by means of model personalisation for stress recognition [88] and deep learning for mood prediction [89] using a combination of behavioural and physiological markers in non-clinical populations. If thoroughly tested and consolidated through experimental validations in EMA settings, a model of this type could provide a finer-grained description of the evolution of the patient's disorder throughout a long-term study, compared to surveys that are usually filled in just a few times a day. Such a model can also be considered less obtrusive to the patient's life because physiological data are recorded passively and do not require extra effort from the patient. Furthermore, in EMI settings, if the recognition algorithm detects that the patient is in a critical state, it can automatically trigger an intervention module associated with the platform or open a communication channel between the patient and his/her therapist. Alternatively, predictive models that combine information from physiological and behavioural signals to estimate the patient's future mood, stress level, and self-reported health (one or a few days in advance) can be automatically inferred [89]. After identifying a risk threshold, these models would make it possible to plan interventions (or involve the therapist) in advance, that is, before the patient's affective state reaches a critical state.
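The threshold-based planning described above could, in its simplest form, look like the Python sketch below. It assumes that some upstream model already outputs a daily risk score between 0 and 1; the threshold of 0.7, the requirement of two consecutive high-risk forecasts, and the function name are hypothetical choices that would need clinical validation.

```python
def plan_interventions(predicted_risk, threshold=0.7, consecutive=2):
    """Return the indices at which an intervention (or therapist alert) would fire.

    An action fires once when the predicted risk has stayed at or above the
    threshold for `consecutive` forecasts in a row, which damps isolated spikes.
    """
    triggers = []
    run = 0  # length of the current streak of high-risk forecasts
    for i, risk in enumerate(predicted_risk):
        run = run + 1 if risk >= threshold else 0
        if run == consecutive:
            triggers.append(i)
    return triggers

# Hypothetical daily risk forecasts from a predictive model
forecasts = [0.2, 0.8, 0.75, 0.9, 0.3, 0.8, 0.6]
print(plan_interventions(forecasts))
```

Requiring a short streak rather than a single exceedance trades a slightly later alert for fewer false alarms; the appropriate balance depends on the clinical context.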
We should, however, recognize that the use of EMAs and EMIs has some limitations. These approaches are time-consuming and may be perceived as invasive by users. Patients are required to complete multiple assessments throughout a day, and protocols often last weeks. Moreover, people might not be willing to share personal information. Finally, the greater ecological validity of these approaches may be advantageous for clinical purposes but disadvantageous for research aims, because it implies less experimental control. Because the data are collected during everyday life and in naturalistic environments, it becomes hard or even impossible to have complete control over the setting, and, therefore, it is not possible to rule out the role of confounding variables. Nevertheless, due to the implementation of novel statistical procedures, a balance between research necessities and clinical utility could be achieved [90]. If this were the case in the near future, EMAs and EMIs would undoubtedly transform the field of mental health, greatly contributing to the bridging of science and practice [91,92].
Overall, this systematic review clearly shows the emergence of ecological assessment and intervention as a promising avenue for clinical psychology. The focus of the review was limited to a specific clinical population. Still, promising results have also been reported for the application of EMA and EMI to anxiety disorders [93,94] and stress-related disorders [95,96], highlighting the potential of these tools to provide psychological support in daily life and to investigate symptom fluctuations across time. However, similar limitations and open issues were also identified, including the need for more high-quality trials, the gap between the clinical and research fields, and the importance of making EMAs and EMIs as engaging and tailored as possible. Altogether, there is evidence showing the feasibility and preliminary efficacy of these approaches, but much more research should be conducted before drawing definite conclusions.