Validating the use of a smartphone app for remote administration of a fear conditioning paradigm.

Fear conditioning models key processes related to the development, maintenance and treatment of anxiety disorders and is associated with group differences in anxiety. However, laboratory administration of tasks is time- and cost-intensive, precluding assessment in the large samples necessary for the analysis of individual differences. This study introduces a newly developed smartphone app that delivers a fear conditioning paradigm remotely using a loud human scream as an aversive stimulus. Three groups of participants (total n = 152) took part in three studies involving a differential fear conditioning experiment to assess the reliability and validity of a smartphone-administered fear conditioning paradigm. The experiment comprised fear acquisition, generalisation, extinction and renewal phases, during which online US-expectancy ratings were collected on every trial, with evaluative ratings of negative affect at three time points. We show that smartphone app delivery of a fear conditioning paradigm results in a pattern of fear learning comparable to traditional laboratory delivery and is able to detect individual differences in performance that show comparable associations with anxiety to the prior group differences literature.

saw a pop-up notification reminding them to continue making their online expectancy ratings.
US onset was at 7.5 seconds and co-terminated with the CS+. Intertrial intervals varied randomly between 5, 10 and 15 seconds. During acquisition, participants were shown 12 presentations each of the CS+ and CS-. The CS+ was reinforced with the US on 75% of trials. During generalisation, each of the six stimuli was shown twice. The CS+ was reinforced with the US on 50% of the trials. During extinction, the CS+ and CS- were shown 16 times each, with no US reinforcement.
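
The acquisition and extinction schedules described above can be sketched as follows. This is an illustrative sketch only, not the FLARe app's implementation: the function name `build_phase`, the shuffled trial order and the choice of which CS+ trials are reinforced are assumptions, since the text specifies only the trial counts, the reinforcement rates and the three possible intertrial intervals. The generalisation phase is omitted for brevity.

```python
import random

ITI_SECONDS = [5, 10, 15]  # intertrial intervals stated in the text


def build_phase(n_per_cs, reinforcement_rate, seed=None):
    """Build a trial list with n_per_cs CS+ and n_per_cs CS- trials.

    Assumption: trial order is a simple random shuffle and the reinforced
    CS+ trials are chosen at random; the source does not specify either.
    """
    rng = random.Random(seed)
    n_reinforced = round(n_per_cs * reinforcement_rate)
    cs_plus = [("CS+", i < n_reinforced) for i in range(n_per_cs)]
    cs_minus = [("CS-", False) for _ in range(n_per_cs)]
    trials = cs_plus + cs_minus
    rng.shuffle(trials)
    return [
        {"stimulus": stim, "us": us, "iti_s": rng.choice(ITI_SECONDS)}
        for stim, us in trials
    ]


acquisition = build_phase(12, 0.75, seed=1)  # 12 CS+ / 12 CS-, 75% reinforced
extinction = build_phase(16, 0.0, seed=1)    # 16 CS+ / 16 CS-, no US
```

Under these assumptions, acquisition contains 24 trials with 9 reinforced CS+ presentations, and extinction contains 32 trials with none.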
A forced ten-minute break occurred between generalisation and extinction. During this time, participants were redirected to another laptop (lab administration) or to an external site (mobile administration) to complete questionnaires. First, participants' contingency awareness was assessed. Participants were shown the question "Did you happen to notice whether the scream occurred with one of the shapes?". If they indicated that they did, they were asked to indicate which circle size the scream had occurred with. Participants who responded in the affirmative to question 1 and correctly identified the circle size of their CS+ were considered contingency aware. Participants were then presented with questionnaires, including the Generalised Anxiety Disorder-7 (GAD-7) (Spitzer, Kroenke, Williams, & Löwe, 2006), the Anxiety Sensitivity Index (Peterson & Heilbronner, 1987), and the Trait subscale of the State and Trait Anxiety Inventory (STAI-T) (Spielberger, 1983).
After generalisation, during the ten-minute break before extinction, participants were asked to rate how unpleasant they found the US (1 = not at all unpleasant; 9 = highly aversive and unpleasant), and whether they noticed if the scream followed a particular shape and, if so, which one. This established whether they were aware of the contingency between the scream and the CS+. After extinction, participants were asked to rate all six experimental stimuli for familiarity, valence, arousal and fear for a second time.

Counterbalancing
Validation study.
The order of sessions (laboratory or app) was randomly assigned, such that approximately half of the participants underwent mobile administration first and half underwent lab administration first. The colour of the circles used as the conditioned stimuli was always blue when the task was administered by app and orange when it was administered in the laboratory. The size of the circle that served as the CS+ was counterbalanced across all participants.
Laboratory and app test-retest.
The colour of the circles used as the conditioned stimuli was counterbalanced across all participants, with the condition that the colour differed between week one and week two.
The size of the circle that served as the CS+ was counterbalanced across all participants.

sFigure 1. Screenshots of the FLARe app set up instructions. Figure showing screenshots from the FLARe app set up phase. Users see these instructions after logging into the app and before the fear conditioning task commences. Panels should be read from left to right for each consecutive row.

sFigure 2. Screenshots of the FLARe app task instructions. Figure showing screenshots from the FLARe app experimental instruction phase. Users see these instructions after logging in and going through initial set up guidance (see sFigure 1). These instructions immediately precede the beginning of the fear conditioning experiment.

Stimuli
Six circles varying in size served as the CS+, CS- and GS. Each circle increased in size by 15%, so that the relative size difference between stimuli was maintained despite differences in screen sizes. For the validation study, circles were orange during lab administration and blue during mobile administration. For the lab and app test-retest studies, the circle colour at week one and week two was counterbalanced for all participants.
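
The point about relative size can be illustrated with a short sketch. It assumes that "size" refers to diameter and that each circle is 15% larger than the previous one (a multiplicative step); the text does not state either detail, and the function name and the 0.2 starting fraction are hypothetical.

```python
def circle_diameters(smallest, n_circles=6, step=0.15):
    """Diameters of n_circles stimuli, each `step` larger than the previous.

    Assumption: the 15% increase is applied to the diameter of the
    preceding circle; the source does not specify the exact rule.
    """
    return [smallest * (1 + step) ** i for i in range(n_circles)]


# Hypothetical example: the smallest circle spans 20% of the screen width.
small_screen = circle_diameters(0.2 * 360)    # 360 px wide screen
large_screen = circle_diameters(0.2 * 1080)   # 1080 px wide screen
```

Because every diameter is defined relative to the previous one, the ratio between neighbouring circles is the same on both screens even though the absolute sizes differ.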
During acquisition and generalisation phases, circles were presented on a context image consisting of an outdoors garden scene. During extinction, circles were presented on a context image of an indoor living room scene.
The US was a human scream, played for a duration of 500 ms. The scream was played through a set of over-ear headphones. The same headphones were used for both lab and mobile administration, and were provided to all participants prior to the mobile administration. During lab administration, the scream was played at 100 dB. During mobile administration, the scream was played at the individual phone's maximum volume.
Data processing.

Imputation of missed values for US-expectancy ratings
If the first rating of any given stimulus type during the acquisition or extinction phase was missed, the starting value was set to the average of all participants' starting values for that stimulus type, administration method and phase.
If the final rating of any given stimulus type during the acquisition or extinction phase was missed, the value was set by carrying forward the last rating made for that stimulus type, administration method and phase. If any rating between the first and last rating for any stimulus was missed during acquisition or extinction, the value was calculated as the average of the preceding and following ratings.
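
The three imputation rules can be sketched as below. This is a minimal illustration rather than the authors' processing code: the function names and example values are hypothetical, missing ratings are represented as `None`, and the handling of several consecutive missed ratings (using the nearest available rating on each side) is an assumption, because the text does not describe that case.

```python
from statistics import mean


def group_start_mean(first_ratings):
    """Mean first rating across participants, ignoring missed ratings."""
    return mean(r for r in first_ratings if r is not None)


def impute_ratings(ratings, start_mean):
    """Impute missed ratings (None) for one participant, stimulus type and phase."""
    filled = list(ratings)
    if not filled:
        return filled
    # Rule 1: a missed first rating takes the group mean starting value.
    if filled[0] is None:
        filled[0] = start_mean
    # Rule 2: a missed final rating carries forward the last rating made.
    if filled[-1] is None:
        made = [r for r in ratings[:-1] if r is not None]
        filled[-1] = made[-1] if made else filled[0]  # fallback is an assumption
    # Rule 3: a missed interior rating is the mean of its neighbours.
    # Assumption: with consecutive gaps, use the nearest available values.
    known = [r is not None for r in filled]
    for i in range(1, len(filled) - 1):
        if not known[i]:
            before = next(filled[j] for j in range(i - 1, -1, -1) if known[j])
            after = next(filled[j] for j in range(i + 1, len(filled)) if known[j])
            filled[i] = (before + after) / 2
    return filled


# Hypothetical ratings for one participant's CS+ trials in one phase.
example = impute_ratings([None, 4, None, 8, None], start_mean=5.0)
```

In this example the first rating takes the group mean (5.0), the interior gap becomes the average of 4 and 8, and the final rating carries forward the 8.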
Creation of anxiety composite score
Trait anxiety was assessed using each participant's total score on the 20-item trait scale from the Spielberger State-Trait Anxiety Inventory (Spielberger, 1983). Generalised anxiety was assessed using the total score from the GAD-7 (Spitzer et al., 2006). Anxiety sensitivity was measured using total scores from the Anxiety Sensitivity Index (Peterson & Heilbronner, 1987). To create the total anxiety composite, z-scores were derived from the total scores from each of the three anxiety measures for laboratory and app task administration data separately. The final composite score was the mean of these three z-scores for each participant.
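
A minimal sketch of this composite follows. The function names and the example scores are hypothetical, and the use of the sample standard deviation for standardisation is an assumption, as the text does not say which estimator was used. The function would be called once for the laboratory data and once for the app data, since the z-scores were derived separately.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise scores within one sample (assumption: sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def anxiety_composite(stai_t, gad7, asi):
    """Mean of the three z-scored anxiety measures for each participant.

    The three lists must be in the same participant order.
    """
    standardised = [z_scores(stai_t), z_scores(gad7), z_scores(asi)]
    return [mean(per_person) for per_person in zip(*standardised)]


# Hypothetical total scores for three participants in one sample.
composite = anxiety_composite(
    stai_t=[30, 40, 50],
    gad7=[2, 4, 6],
    asi=[10, 20, 30],
)
```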

Exclusion for differing contingency awareness
Participants were excluded if they were not contingency aware for either one or both of the two testing sessions. This resulted in the loss of six participants who were only aware of the contingency between the US and the CS+ during one testing session. No participants were contingency unaware for both sessions. Of these six participants, five were not contingency aware during the first test session but were by the end of session two. The remaining individual was contingency aware for the first, but not the second, test session.

CS+ / CS- differential analyses
Secondary analyses were performed using the differential between the CS+ and CS- for each rating type for each phase (calculated by subtracting the mean CS- value from the mean CS+ value). This was to investigate the stability and construct validity of the ability of individuals to discriminate between stimuli.
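
The exclusion rule and the differential score can be expressed in a few lines. This is an illustrative sketch with hypothetical function names and example ratings, not the authors' analysis code.

```python
from statistics import mean


def is_included(aware_session_one, aware_session_two):
    """Keep a participant only if contingency aware in both sessions."""
    return aware_session_one and aware_session_two


def cs_differential(cs_plus_ratings, cs_minus_ratings):
    """Mean CS+ rating minus mean CS- rating for one phase and rating type."""
    return mean(cs_plus_ratings) - mean(cs_minus_ratings)


# Hypothetical US-expectancy ratings for one participant in one phase.
differential = cs_differential([8, 9, 7], [2, 3, 1])
```

A larger positive differential indicates stronger discrimination between the CS+ and the CS-, and a value near zero indicates little discrimination.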

Results
Data processing.
A summary of the proportion of participants who missed any values per phase, by mode of administration, is presented in sTables 1 and 2 below, in addition to a summary of the average and modal number of trials missed per person.

(n=69). Shaded portion shows the absolute difference between US-expectancy rating to the CS+ and CS- and results of t-test comparing the mean US-expectancy for the CS+ and CS- for laboratory and app administration respectively. Significant differences are indicated by a "*". US-expectancy; Average self-reported US-expectancy rating per stimulus across all trials for each phase. Affective ratings; Composite affective rating comprising self-reported feelings of anxiety, fear and unpleasantness for each stimulus at three time points: i) before the experiment begins (baseline), ii) after the extinction phase (post-extinction) and iii) after day two renewal (post-renewal). CS+; the conditioned stimulus that is paired with the aversive sound during acquisition and generalisation. CS-; the conditioned stimulus that is never paired with an aversive sound.

sTable 1. Proportion of participants who missed any trials
Cross-modal validation

sTable 5. Within-person intraclass correlation between week one and week two for the CS differential for all studies
Composite affective rating comprising self-reported feelings of anxiety, fear and unpleasantness for each stimulus at three time points: i) before the experiment begins (baseline), ii) after the extinction phase (post-extinction) and iii) after day two renewal (post-renewal). CS differential; the difference between the CS+ conditioned stimulus that is paired with the aversive sound during acquisition and generalisation and the CS- conditioned stimulus that is never paired with an aversive sound.

US unpleasantness comparisons
We compared the self-reported unpleasantness of the scream US between app and lab administrations of the task and found that the mean rating of unpleasantness (

Table showing the mean (standard error) US-expectancy rating for the CS+ / CS- differential (CS- subtracted from the CS+) stimulus averaged across all trials of each phase for laboratory and app administration for the remote validation study (n=69). Shaded portion shows the absolute difference between US-expectancy rating to the CS+ and CS- and results of t-test comparing the mean US-expectancy for the CS+ and CS- for laboratory and app administration respectively. Significant differences are indicated by a "*". US-expectancy; Average self-reported US-expectancy rating per stimulus across all trials for each phase. Affective ratings; Composite affective rating comprising self-reported feelings of anxiety, fear and unpleasantness for each stimulus at three time points: i) before the experiment begins (baseline), ii) after the extinction phase (post-extinction) and iii) after day two renewal (post-renewal). CS+; the conditioned stimulus that is paired with the aversive sound during acquisition and generalisation. CS-; the conditioned stimulus that is never paired with an aversive sound.
Associations with anxiety

sFigure 6. Correlations between fear conditioning outcomes as CS differential and composite anxiety score. Plots visualising the correlation between the mean difference between the CS+ and CS- per experimental phase and composite anxiety for the first week only in the validation, app test-retest or laboratory test-retest studies. Correlations are presented for the app (n = 89) and laboratory (n = 91) based testing separately. Negative correlations are indicated by the "-" symbol. Error bars represent the bootstrapped 95% confidence intervals. Significant correlations (after correcting for the effective number of independent tests) are indicated by a single asterisk ("*"). Panel A presents bar plots showing the Pearson's correlations between average participant expectancy rating (CS- subtracted from the CS+) during acquisition, generalisation, extinction and renewal testing phases for laboratory (left) and app (right) sessions respectively. Panel B presents bar plots showing the Pearson's correlations between average participant affective composite ratings (CS- subtracted from the CS+) made after the extinction phase (post-extinction) and after the renewal phase (post-renewal) for laboratory (left) and app (right) testing sessions respectively.