Success in reaching affect self-regulation goals through everyday music listening

While music listening on mobile phones can serve many affect-regulatory goals, success in reaching these goals is yet to be empirically assessed. This study aimed to determine how frequently listeners successfully reach their affect-regulatory goals, and the predictors of this success. Data were collected using the experience sampling app MuPsych, from 293 Finnish participants. Goals were successfully reached in less than half of cases, with adults more successful than adolescents. Success was determined largely within contexts, and strongly predicted by an initial low-valenced emotional state of the listener, suggesting that music listening is particularly useful for those in negative states.


Introduction
Young people are avid music listeners. Especially during this age, music listening goes beyond being a pleasant, aesthetic experience: it is an activity deeply connected to the development and negotiation of emotional, social, and cognitive aspects (Baltazar, 2019; Saarikallio, 2017, 2019; Saarikallio & Baltazar, 2018). During adolescence, there is a peak in music engagement (North et al., 2000) and music listening occurs in a significantly wider variety of contexts than during adulthood (Bonneville-Roussy et al., 2013). Previous scholars have argued that the unique connection that young people have with music reflects its contribution to key developmental tasks (e.g. identity and agency, emotional regulation, social connectedness; Laiho, 2004; Miranda, 2013; Schwartz & Fouts, 2003).
Music has been identified as a useful tool for regulating affective states in contexts ranging from stress management (Pelletier, 2004) to sport and exercise (Terry et al., 2020). For adolescents, one of the most relevant functions of music is the regulation of affective states (Baltazar, 2019; Saarikallio & Erkkilä, 2007). Affect regulation through music is a process defined by the engagement with music in order to alter experienced states or create new ones. Since these states can be of different durations and intensities (e.g. moods, emotions, energy levels, stress responses), affect is used as an umbrella term (Baltazar & Saarikallio, 2016; Juslin & Sloboda, 2010). Adolescence, especially in its early years, is a period marked by frequent negative states and increases in affective instability (Larson et al., 2002), which represents a particular need for affect self-regulation. The ability to self-regulate is strengthened during adolescence by the cognitive development characteristic of this stage (Campos et al., 1989), and the repertoire of regulation strategies (i.e. the tools used to achieve affective goals; Koole, 2009) is widened. However, the development of this repertoire is not necessarily linear, and during some years of adolescence the usage of regulation strategies might actually be impaired, leaving adolescents in a more fragile situation (Kovacs et al., 2019). Lennarz et al. (2019) found in their experience-sampling study that, when experiencing negative affect, adolescents used the strategy of acceptance the most, followed by problem-solving, rumination, distraction, avoidance, reappraisal, social support, and suppression. Of these, problem-solving, acceptance, and reappraisal were more successful in downregulating negative affect (when compared to rumination).
CONTACT William M. Randall will.m.randall@gmail.com Department of Music, Art and Culture Studies, University of Jyväskylä, Jyväskylä, Finland
Individual and contextual differences are major influences on the acquisition of regulation skills, such that some adolescents are faced with dysregulation and emotional distress (Cole et al., 1994; Diamond & Aspinwall, 2003; Fox & Calkins, 2003; Rubin et al., 2001). It has been argued that the heightened brain plasticity observed during adolescence can mean both increased opportunities and increased vulnerabilities (Fuhrmann et al., 2015). Importantly, consequences of poor self-regulation reach varied areas of functioning, such as social integration, mental health, and learning (e.g. Chervonsky & Hunt, 2019; Eisenberg et al., 2000; Hatzenbuehler et al., 2008; Silk et al., 2003). Despite the impact of emotion regulation on adolescents' well-being, at the time of publishing, no previous research reporting success rates for adolescents' emotion regulation in daily life could be found.
Published studies mainly report the predictive power of regulation strategies, context, and affective variables such as intensity (DeFrance & Hollenstein, 2022; Hiekkaranta et al., 2021; Lennarz et al., 2019), whereas the percentages of observed change or self-perceived success are omitted. During this developmental stage marked by a constant negotiation between increments in skills and in challenges/responsibilities, music listening seems to offer adolescents a rich platform to explore, extend, and foster self-regulation skills (Baltazar, 2019; Saarikallio, 2019; Saarikallio & Baltazar, 2018). Previous work has shown that adolescents engage in music to achieve their affective goals, be it to feel better, cope with feelings, intensify experienced states, or feel something new (Baltazar & Saarikallio, 2019; McFerran et al., 2018; Miranda, 2019; Saarikallio & Erkkilä, 2007; ter Bogt et al., 2017). As the psychology literature postulates, affect regulation can serve different goals, including to change, create, enhance, or maintain current states (Gross, 1999; Koole & Aldao, 2016; Tamir, 2009), both positive and negative.
Although there is no standard categorisation of affective goals in the literature of musical regulation, one way of characterising them is in terms of desired changes in valence and arousal. Valence and arousal are two components of affect that efficiently map out experiences in a two-dimensional space (affect circumplex; Russell, 1980). Valence represents the hedonic component of affect and is commonly polarised as positive-negative or pleasurable-displeasurable, whereas arousal represents the activation component of affect and is commonly polarised as low arousal-high arousal or calm-energetic (Barrett et al., 2007; Russell, 2003). With regard to valence, young people most commonly use music to improve affective states (towards positive valence) or to maintain their (usually positive) states by listening to affect-congruent music (Bishop et al., 2007; McFerran et al., 2015; Papinczak et al., 2015; Saarikallio & Erkkilä, 2007). With regard to arousal, one of the most common affect-regulatory uses of music is relaxation (Van Goethem & Sloboda, 2011). Goals relating to changes in arousal level also appear to be something of which people are relatively consciously aware, such as calming down for sleep or raising energy before a sports training session (Saarikallio, 2011; Saarikallio et al., 2017; Van Goethem & Sloboda, 2011).
Overall, adolescents regulate their affect through music in diverse ways, by employing a variety of regulation strategies (e.g. coping, problem-solving, diversion, entertainment, solace, venting; Miranda, 2019; Papinczak et al., 2015; Saarikallio & Erkkilä, 2007). It can be argued that music acts as a beneficial resource for affective self-regulation, assisting and supporting it in a multitude of ways. Due to its unique properties, music listening may even facilitate, 'scaffold', or extend regulatory skills that otherwise would not be so easily reachable (DeNora, 2000; Elvers et al., 2018; Krueger, 2018; Saarikallio, 2019). However, depending on certain factors, music listening is sometimes linked to deleterious outcomes (e.g. Miranda et al., 2012; Miranda & Claes, 2004; Reybrouck et al., 2020). Indeed, music listening for affect regulation has been identified in the literature as both a protective and a risk factor.
Previous research has investigated whether the beneficial outcomes of music listening are influenced by individual differences. As a specific tactic of self-regulation (Baltazar & Saarikallio, 2016; Van Goethem & Sloboda, 2011), music listening has been considered to reflect the individual's competences, needs, and vulnerabilities (McFerran & Saarikallio, 2014; Miranda et al., 2012). Even though some researchers have raised concerns regarding some genres such as heavy metal music, further studies have argued that the observed associations between music genres and behavioural and affective problems are rather explained by underlying vulnerabilities or individual traits (Baker & Bor, 2008; Lacourse et al., 2001; Martin et al., 1993; Mulder et al., 2010; Schwartz & Fouts, 2003). For sad music, for instance, the relationship between music listening and experienced emotions seems to relate to individual differences (Eerola et al., 2018; Garrido & Schubert, 2013). The effect of sad music on adolescents ranges from consolation and solace (Hanser et al., 2016; Saarikallio & Erkkilä, 2007; ter Bogt et al., 2017) to increased sadness (ter Bogt et al., 2021). The differentiating factor seems to be underlying affective difficulties, as found by ter Bogt et al. (2021) and supported by studies with the adult population (Garrido & Schubert, 2015; Schubert et al., 2018). Similarly, McFerran et al. (2015) found that, amongst the collected sample, distressed participants were more likely to report feeling worse after listening to music. Thus, listening to sad or aggressive music may be detrimental in terms of intensifying negative states and strengthening maladaptive patterns, particularly for those who already suffer from depression or other affective disorders (Baker & Bor, 2008; McFerran & Saarikallio, 2014; Stewart et al., 2019).
However, individual differences are not the only predictor of music listening outcomes. A growing approach to music and wellbeing has been an integrative one, aiming to take into account the complexity of individual, musical, and also contextual factors that shape this relationship. Recent work using experience sampling methodology (ESM; Csikszentmihalyi & LeFevre, 1989) shows that the emotional outcomes of daily music listening are largely explained by contextual factors such as the situational mood of the listener and activities related to the listening context (Randall & Rickard, 2017a). Marik and Stegemann (2016) further point out that isolated uses of maladaptive or ineffective strategies most likely are harmless, whereas their repeated use in day-to-day listening situations leads to the accumulation of negative outcomes. This is something Miranda et al. (2012) call the gradual development effect, referring to small yet cumulative effects: everyday experiences with music that, over time, shape adolescents' development and mental health.
Taken together, it seems plausible that both individual and contextual factors shape the efficacy and health-impact of musical choices and listening strategies. However, the current literature is particularly sparse concerning the impact of contextual and situational factors. Furthermore, it has been relatively common to address musical affect self-regulation patterns using surveys, questionnaires, and interviews that can identify general tendencies and retrospective personal interpretations, but do not manage to actively track moment-to-moment changes in affective states across the multitude of music listening situations. Two recent studies have addressed this concern, both of which utilised experience sampling of everyday music listening, and multilevel modelling to determine the effects of listening within contexts. The first of these, undertaken by Randall and Rickard (2017a), did not directly assess regulatory success, while the second, performed by Greb et al. (2019), focused on random sampling of music selection behaviours, rather than real-time affective outcomes. As such, no previous study has investigated the ecological and real-time success of regulation strategies within everyday music listening contexts.

Aim and research questions
The current study aimed to investigate the frequencies and predictors of success in affect regulation through personal music listening. In particular, it focused on the situational determinants of affect-regulatory success during moment-to-moment listening episodes of everyday life. Success was defined as alignment/coherence of the self-reported affective goal and the self-reported affective outcome in a given listening episode. Our research questions were:
RQ1. How frequently are adolescents and adults successful in reaching their stated affect regulation goals during everyday music listening?
RQ2. What are the significant predictors of affect-regulatory success for both adolescents and adults, including regulation strategies utilised, initial mood, contextual variables, and individual trait variables?
Due to the lack of any prior research on this, we did not have predictions for exact percentages of success concerning different affect-regulatory goals. However, as prior research generally identifies young people as being competent in using music as a tool for their affect regulation, we hypothesised that listeners would more often succeed than fail in their regulatory attempts (i.e. there would be congruence between goal and outcome). Furthermore, based on prior research (Saarikallio, 2011), we hypothesised that they would have higher success concerning the arousal-related than the valence/intensity-related aspects of regulation. For the second research question, based on prior research (Randall et al., 2014), we expected that the success in reaching different regulation goals would depend particularly on the contextual variables, such as the initial mood of the listener, their listening context, and their reason for listening.

Participants
The main sample consisted of 293 participants, all of Finnish nationality, who were divided into two separate groups: adolescents and adults. The adolescent group (n = 205) were aged from 13 to 19 years (M = 14.8, SD = 1.7), were 64.4% female (33.7% male, 2.0% non-binary), and predominantly secondary school students (93.5%; 5.5% tertiary students, 1.0% employed part-time). The adult group (n = 88) were aged from 20 to 52 years (M = 26.8, SD = 6.5), were 78.4% female (19.3% male, 2.3% non-binary), and consisted of 63.2% tertiary students (21.8% employed full time, 9.2% employed part-time). The majority of adolescent participants were recruited through secondary schools within the Jyväskylä sub-region of Finland, while all other participants were recruited through social media platforms and email lists. Any Finnish-speaking person was eligible to participate if they used a mobile phone with the Android operating system to listen to music. A third group (n = 268) consisted of adolescents (M age = 15.3, SD = 1.7) who did not use an Android phone to listen to music, and instead completed an online survey, for comparative purposes. Participation was incentivised through a draw to win subscriptions to music streaming services. Informed consent was acquired from each participant when they downloaded the data collection app or opened the online survey.

Materials
For the first two groups, all research materials were presented to participants using MuPsych, an app which utilises the experience sampling method (ESM; Csikszentmihalyi & LeFevre, 1989) to measure real-time responses to music listening on mobile phones (Randall & Rickard, 2013). The app collected data in two ways: through the event-based presentation of questions during music listening, referred to as experience sampling reports (ESRs), and through a set of psychological surveys.

Music ESRs: questions at start of listening
Music ESRs consisted of a series of screens presented to participants immediately when they listened to any music on their phone. The first of these screens assessed their subjective affective state, with responses given on two 7-point slider scales: the first titled 'Mood' (labelled at points 1, 4, 7: Negative-Neutral-Positive) and the second titled 'Energy' (labelled at points 1, 7: Very low-Very high). These variables were labelled 'Initial valence' and 'Initial arousal', respectively. The dimensions of valence and arousal have been demonstrated to be efficient and reliable measures of music-induced emotion, explaining a high proportion of variance (Thoma et al., 2012; Vuoskoski & Eerola, 2011). Following this screen, a second screen (titled 'How do you feel?') presented a list of discrete emotional states, from which participants selected a single state that best matched their current mood. This list, taken from a total of 27 emotional states, was shortened to approximately 10 options (presented in alphabetical order), based on the valence and arousal ratings given on the previous screen. On a third screen, participants rated how strongly they felt this selected emotional state, with another 7-point slider, titled 'How < EmoState > do you feel?' (in which < EmoState > was replaced by the discrete emotional state selected by the listener; labelled Not at all-Very). This variable, labelled 'Initial Intensity', was therefore a measure of how strongly the listener felt the emotional state they selected (e.g. intensity of anger or excitement). The following three screens used an option-list format to assess the music listening context, with lists for who was listening to the music ('Listeners'), where they were ('Location'), and what they were doing while listening ('Activity'; shortened from a total of 30 options based on selected responses of 'Listeners' and 'Location').
Time of day was also recorded, and categorised into: Morning (4am-12pm), Afternoon (12pm-5pm), Evening (5pm-8pm), and Night (8pm-4am). Following these three initial affective state screens and three context screens, participants were left to listen to their music for a period of five minutes.
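The time-of-day categories above can be expressed as a simple binning rule. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that each category includes its lower boundary and excludes its upper one (e.g. 12:00 counts as Afternoon), which the text does not specify.

```python
def time_of_day(hour: int) -> str:
    """Map an hour of the day (0-23) to one of the four study categories.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if 4 <= hour < 12:
        return "Morning"      # 4am-12pm
    if 12 <= hour < 17:
        return "Afternoon"    # 12pm-5pm
    if 17 <= hour < 20:
        return "Evening"      # 5pm-8pm
    return "Night"            # 8pm-4am, wrapping past midnight


if __name__ == "__main__":
    for h in (3, 4, 12, 17, 20):
        print(h, time_of_day(h))
```

Note that Night is the only category spanning midnight, so it is handled as the fall-through case rather than as a single interval.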

Music ESRs: questions after five minutes of listening
If music was still playing on the phone after this five-minute period, a second set of ESR screens was presented to participants. The first of these screens assessed Valence, Arousal, and Intensity of the selected discrete state, with the same 7-point slider format and labels as the initial measures. The differences between each of these initial and secondary measures were used to determine how affect changed over the listening experience, and were recorded as the variables 'Valence change', 'Arousal change', and 'Intensity change'. Following this was a screen of questions related to the music, also measured on 7-point slider scales: the subjectively perceived valence of the music (titled 'Mood of the music'; labelled Negative-Neutral-Positive), and arousal of the music ('Energy of the music'; Very low-Very high); along with level of Attention ('How much attention are you paying to the music?'; None-Complete); Enjoyment ('How much are you enjoying this music?'; Not at all-Very much); and Familiarity ('How familiar are you with this music?'; Not at all-Very familiar).
Finally, music ESRs ascertained the main reason participants had for listening. This was assessed using a branching option-list format, with an initial list of 9 reason categories, which branched off to 55 specific primary reasons for listening. Of these primary reasons for listening, six were considered emotion regulation goals: 'To maintain a(n) < EmoState > mood' (labelled Maintain), 'To feel less < EmoState > ' (Diminish), 'To feel more < EmoState > ' (Enhance), 'To feel better' (Improve), 'To raise/boost energy' (Raise), and 'To relax/calm down' (Relax). If one of these primary regulation goals was selected, the next screen asked participants to select the emotion regulation strategy they used in order to reach their emotional goal. If any other primary reason was selected, the next screen instead asked 'Was there also an emotional reason?', with the six emotion regulation goals presented as list options, along with 'There was no emotional reason' (recorded as secondary regulation goal).

Surveys
A set of psychological surveys was presented to participants within the app, which were available to complete at any time. A 'Basic details' survey collected demographic details such as age, gender, and nationality. Personality traits were assessed using a short version of the Big Five Inventory, which retains significant levels of reliability and validity when compared to the longer versions (BFI-10; Rammstedt & John, 2007). Previous research has revealed various links between these personality traits and both how music is used (Chamorro-Premuzic & Furnham, 2007), and emotional reactions to everyday listening. Psychological well-being and flourishing were assessed using the Flourishing Scale (FS; Diener et al., 2010), which measures self-perceived success in important life areas. Prior research has shown that psychological well-being, though measured with other scales, relates positively with musical affect regulation through reappraisal (Chin & Rickard, 2014), but negatively with musical rumination and avoidance (Saarikallio et al., 2015). Trait empathy was measured using the Interpersonal Reactivity Index (IRI; Davis, 1980); specifically, the subscales of 'Fantasy' and 'Empathic Concern', which have been shown to be related to the intensity of music-induced emotions (Vuoskoski & Eerola, 2012). Previous research has demonstrated that scales presented within MuPsych produce similar Cronbach alpha scores to those published from the standard questionnaires (Randall & Rickard, 2013). All music ESR items and surveys were presented in Finnish, then back-translated to English.

Procedure
Participants were instructed, either in person or through recruitment material, to install the MuPsych app on their mobile phone, and launch the 'OmaMusa' study. All components of the study were presented to participants through the app, including the information statement and consent form, all music ESRs, and the set of psychological surveys. Following installation and consent, music ESRs were presented automatically when the participant next started playing music on their phone, in any music player app. Once a music ESR had been completed (including the questions at the start of listening and those after five minutes of listening), no new music ESRs were presented for a period of three hours, to avoid respondent fatigue. Surveys were available within the app, and could be completed at any time of convenience. Data collection continued for a period of one week, after which no more music ESRs were presented. The online survey group received information statements and consent at the beginning of the survey and responded only to the demographic and psychological surveys for comparative purposes.

Data analyses
In comparing the adolescent ESM group (Android users only) and the online survey group, no difference was found on any of the five BFI measures of personality (all

Aggregate analyses
On the listener level, aggregate scores were created for each individual participant, producing overall success scores for each of the regulation goals, along with means for additional measures such as changes in valence, arousal, and mood intensity. For these analyses, only participants with five or more music ESRs with a regulation goal were included (N = 136). This aggregate approach is recommended for ESM (Hektner et al., 2007), and has been utilised in previous ESM studies of music use (Randall & Rickard, 2017b). Due to the differences in numbers and distributions, the two age groups remained separated for all listener-level analyses, with non-parametric tests performed to indicate inter-group differences. Firstly, the frequency at which each of the regulation goals was selected for primary and secondary reasons was determined for each of the age groups (Table 1). Chi-squared tests of independence were performed to determine any association between age group and goal type (primary/secondary) for each of the regulation goals. In addition, as data collection for the adolescent group took place both before and during the national lockdown due to the COVID-19 virus starting in March 2020, total frequencies were compared between these two periods. Secondly, aggregate scores were ascertained for initial states (valence, arousal, and intensity), music measures (valence and arousal), and experience measures (attention, familiarity, enjoyment), for both age groups and each of the regulation goals (primary and secondary combined; Table 2). All initial states, music and experience measures were standardised to provide a score between −1 and +1. Mann-Whitney U tests were performed to determine any difference between the two age groups on these measures. Thirdly, the success of each regulation goal was determined by performing Wilcoxon signed-rank tests on the initial and changed states for each success measure (Table 3).
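The listener-level inclusion rule and the rescaling of slider scores described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text says only that measures were standardised to lie between −1 and +1, so the linear mapping (score − 4) / 3 for a 1-7 slider is an assumption, and all function and variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

MIN_REPORTS = 5  # inclusion threshold stated in the text


def rescale_7pt(score: float) -> float:
    """Linearly map a 1-7 slider score onto [-1, +1] (assumed mapping)."""
    if not 1 <= score <= 7:
        raise ValueError("score must lie on the 1-7 slider range")
    return (score - 4) / 3


def aggregate_by_listener(reports):
    """Average rescaled scores per listener.

    `reports` is an iterable of (listener_id, score) pairs, one per music
    ESR that had a regulation goal. Listeners with fewer than MIN_REPORTS
    such reports are dropped.
    """
    grouped = defaultdict(list)
    for listener_id, score in reports:
        grouped[listener_id].append(rescale_7pt(score))
    return {
        listener_id: mean(values)
        for listener_id, values in grouped.items()
        if len(values) >= MIN_REPORTS
    }


if __name__ == "__main__":
    # Hypothetical data: listener "a" has five reports, "b" only four.
    demo = [("a", 7)] * 5 + [("b", 1)] * 4
    print(aggregate_by_listener(demo))  # only "a" is retained
```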
For the regulation goal of Improve, success was defined as an increase in valence over the listening experience. For Diminish and Enhance, success was defined by a decrease or increase in emotional state intensity, respectively. For Relax and Raise, success was defined by a decrease or increase in arousal, respectively. Failure for each of these five goals was defined as the success measure changing in the opposite direction. For Maintain, success was defined as no change in intensity, and failure as any decrease in intensity. Success frequencies were calculated, and Chi-squared tests of independence were used to determine associations between age group and success for each of the regulation goals. Fourthly, Kruskal-Wallis H tests were performed to determine if there was any difference in success across (the 10 most frequent) regulation strategies, listener emotional states (Table 4a), or concurrent activities (Table 4b). Finally, a supplementary analysis was performed on the listener level to determine if valence, arousal, and intensity significantly changed from their respective initial states (regardless of regulation goal). Mann-Whitney U tests were performed using absolute change scores, for both age groups (adolescent/adult), and each of the goal types (primary/secondary/no goal; Appendix).
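The success and failure definitions above amount to a small decision rule per goal. The Python sketch below encodes them as written; the names are hypothetical, and two points are assumptions because the text leaves them open: a zero change for the five directional goals, and an increase in intensity under Maintain, are both labelled 'neutral' here (neither success nor failure).

```python
# Success measure and intended direction of change for each directional goal.
GOAL_RULES = {
    "Improve": ("valence", +1),
    "Enhance": ("intensity", +1),
    "Diminish": ("intensity", -1),
    "Raise": ("arousal", +1),
    "Relax": ("arousal", -1),
}


def classify(goal: str, initial: dict, later: dict) -> str:
    """Return 'success', 'failure', or 'neutral' for one listening episode.

    `initial` and `later` hold 1-7 ratings under the keys 'valence',
    'arousal', and 'intensity' (start of listening and five minutes later).
    """
    if goal == "Maintain":
        change = later["intensity"] - initial["intensity"]
        if change == 0:
            return "success"
        if change < 0:
            return "failure"
        return "neutral"  # assumption: an increase is not classified in the text

    measure, direction = GOAL_RULES[goal]
    signed_change = (later[measure] - initial[measure]) * direction
    if signed_change > 0:
        return "success"
    if signed_change < 0:
        return "failure"
    return "neutral"  # assumption: no change is neither success nor failure


if __name__ == "__main__":
    start = {"valence": 3, "arousal": 5, "intensity": 4}
    end = {"valence": 5, "arousal": 6, "intensity": 4}
    for goal in ("Improve", "Relax", "Maintain"):
        print(goal, classify(goal, start, end))
```

Keeping a third 'neutral' outcome is consistent with the reported success and failure percentages not summing to 100%.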

Multilevel analyses
The multilevel structural equation models (SEM) nested music listening experiences within individual listeners, and were implemented using Mplus statistical software (version 7.4: Muthén & Muthén, 2015). Separate models were created predicting success for each of the goals Improve, Relax, Raise, and Enhance (Diminish was removed due to insufficient data, and Maintain was removed due to a non-continuous success measure). An overall model was also created, combining the continuous success measures for Improve, Relax, Raise, Diminish, and Enhance. For each model, the continuous measure of success was the only outcome, with the experience level predictors: initial affective states, context (listeners, location, time, activity), music variables, regulation strategies and goals; and listener level predictors: age group (adolescent/adult), gender, flourishing, and subscales from the Interpersonal Reactivity Index and Big Five Inventory. All musical experiences with either primary or secondary regulation goals were included in the models, with the overall model consisting of 2,032 music listening experiences nested within 293 listeners. The models utilised maximum likelihood estimation with robust standard errors (MLR), with an accelerated expectation-maximisation (EMA) optimisation algorithm.

Aggregate analyses
The frequencies of primary and secondary regulation goals are presented in Table 1. Results show that adolescents more frequently had a primary regulation goal when listening, while adults more frequently had a secondary regulation goal. This association was supported by Chi-squared testing performed on the related counts (χ² = 13.20, p < .001). Additional Chi-squared tests revealed that adolescents more frequently used Improve (χ² = 3.96, p = .047), Diminish (χ² = 3.99, p = .046), and Maintain (χ² = 7.21, p = .007) as a primary goal, when compared to adults. Also presented in Table 1 are the pre-Covid and Covid total (primary + secondary) frequencies for adolescents. A significant association between goals and Covid timing was observed (p < .001), with increased use of Improve and regulation goals overall apparent for those listening during the pandemic. The initial affect states, music variables, and experience variables related to each regulation goal are shown in Table 2. Mann-Whitney U tests revealed that adolescents were in a more positive initial state than adults when their regulation goal was Improve (U = 10,163, p = .002), and had both higher initial valence (U = 4000, p < .001) and mood intensity (U = 4358, p = .002) when their goal was Maintain. Adolescents also selected more positively valenced music than adults when their goal was Enhance (U = 3160, p < .001) or Maintain (U = 4070, p < .001), and lower arousal music than adults when their goal was Relax (U = 10,277, p = .001) or Enhance (U = 5547, p = .003). Furthermore, they listened to more familiar music than adults when their goal was Improve (U = 8815, p < .001), Relax (U = 4685, p < .001), Enhance (U = 3300, p = .001), or Maintain (U = 4197, p < .001). Table 2 also shows the two most frequent initial emotional states for each (primary) regulation goal for both adolescents and adults.
These initial emotional states are relatively similar between adolescents and adults, but only adults report initial anxiety when aiming to relax and initial tiredness when aiming to raise energy. For the goal Diminish, adolescents were more frequently bored or tired, while adults were tired or annoyed.
The changes in success measures and frequencies of success for each regulation goal are shown in Table 3. Wilcoxon signed-rank tests revealed a significant increase in valence for the goal Improve for both adolescents and adults, and significant changes for the goals Relax (decrease in arousal) and Diminish (decrease in emotional state intensity) for adults. Both age groups were successful for the goal Maintain, as there was no significant change in intensity for either group. Overall success frequency across all regulation goals was higher for adults (47.7% success; 18.5% failure) than for adolescents (42.2% success; 16.0% failure). Supporting these findings, Chi-squared testing revealed that adults were more frequently successful in reaching their regulation goals (primary and secondary combined) for Relax (χ² = 6.76, p = .034), Diminish (χ² = 9.09, p = .011), and for regulation overall (χ² = 6.81, p = .033). For adolescents, overall success was 40.4% pre-Covid (16.9% failure), and 44.5% during the Covid lockdown (14.4% failure).
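For readers who wish to check the Chi-squared results above, the sketch below computes a Pearson Chi-squared statistic for a table of counts and the corresponding p-value for two degrees of freedom. The degrees of freedom are not reported in the text; two is an inference from the fact that the reported p-values match that case exactly, as would arise from a 2 × 3 table (age group by success, no change, failure). The function names are hypothetical and the example counts are invented purely to exercise the code.

```python
import math


def chi_squared(table):
    """Pearson Chi-squared statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df2(statistic: float) -> float:
    """Upper-tail p-value for a Chi-squared statistic with exactly 2 df.

    With 2 df the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-statistic / 2)


if __name__ == "__main__":
    # Statistics reported in the text, assuming 2 degrees of freedom.
    for reported in (6.76, 9.09, 6.81):
        print(reported, round(p_value_df2(reported), 3))
    # prints 0.034, 0.011, and 0.033 respectively

    # Invented counts (rows: two groups; columns: three outcomes).
    print(chi_squared([[30, 40, 30], [45, 35, 20]]))
```

The closed form holds only for two degrees of freedom; other table sizes would need a general Chi-squared survival function from a statistics library.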
Kruskal-Wallis H tests were performed to determine if there was any difference in success across the 10 most frequent regulation strategies, listener emotional states, or concurrent activities. No emotion regulation strategy was found to result in higher success, for either adolescents or adults. For both age groups, the emotional state 'Tired' was most successfully regulated, followed by 'Annoyed' and 'Bored' for adolescents, and 'Calm' and 'Annoyed' for adults (Table 4a). No concurrent activity resulted in higher success for adults, while for adolescents, 'Focussed music listening' resulted in significantly higher success, followed by 'Housework' and 'Walking' (Table 4b).
To determine if five minutes of music listening was able to cause changes in emotional states, Mann-Whitney U tests were performed on the absolute change scores for valence, arousal, and intensity. Across all emotional goal conditions (primary, secondary, and no emotional goal), the mean absolute change of each of these affective measures was significantly greater than zero (p < .001 for all; see Appendix A).

Multilevel analyses
Results of the multilevel structural equation models of success are presented in Table 5. Predictors with no significant effect on success for any regulation goal were removed from the table, including gender, and many of the individual activities, strategies, and trait subscales. Success for the goal Improve was predicted by high initial arousal and intensity, along with music familiarity and attention. For Relax, success was strongly predicted by a negative initial valence, with success lower in the evening (5-8pm), and higher while going to sleep. Success for Raise was positively predicted by adolescence, agreeableness, initial valence, music arousal, and the activity 'Nothing/waiting'. In the overall model (combining Improve, Relax, Raise, Diminish, and Enhance), success was negatively predicted by initial valence and the goal Enhance, and positively predicted by initial arousal, initial intensity, and music arousal. Intraclass correlations (ICCs) were notably low for all models except for Raise, suggesting that success was predicted almost entirely on the experience level.
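To make the interpretation of the ICCs explicit: in a two-level model, the intraclass correlation is conventionally the proportion of outcome variance located at the higher level, here the listener. The notation below is ours rather than the source's, and is offered only as a sketch of that standard definition.

```latex
\[
\mathrm{ICC} \;=\; \frac{\sigma^{2}_{\text{between}}}{\sigma^{2}_{\text{between}} + \sigma^{2}_{\text{within}}}
\]
```

Here \sigma^{2}_{\text{between}} denotes variance in success across listeners and \sigma^{2}_{\text{within}} denotes variance across listening experiences within the same listener. An ICC close to zero therefore indicates that nearly all variation in success lies between episodes rather than between people.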

Discussion
The current study investigated the frequency and predictors of short-term success in reaching affective goals while listening to music on mobile phones. Generally, both adolescent and adult listeners had difficulty in reaching their stated regulation goals within the observed five minutes. While levels of valence, arousal and affective state intensity changed regardless of goal type, these changes were often not in the intended directions. In terms of predictors of success, the affective state of the listener at the moment they started listening to music emerged as a key variable. Specifically, more negative initial valence and specific low-valence states led to higher levels of success in reaching goals.

RQ1: regulation goals were successfully reached in less than half of cases
The first research question was: how successful are young people in reaching different affect regulation goals? Results revealed overall success rates to be low for both age groups, with listeners reaching their regulation goals in less than half of the listening episodes in which they had one (42.2% for adolescents, 47.8% for adults). This was also the case for most individual goals when regulation was the primary reason for listening, with the only exceptions being Maintain ('To maintain a <EmoState> mood') and, for adult listeners, Improve ('To feel better'). The only goal for which adolescents significantly changed their state in the intended direction was Improve, while for the adult group this change was significant for Improve, Relax ('To relax/calm down'), and Diminish ('To feel less <EmoState>'); both groups also had success for Maintain. Taken together, these results indicate that when listening to music, people fail to meet their regulation goals in the short term more often than they succeed. These low regulatory success rates may have implications for mental health outcomes. Maladaptive emotion regulation is a central component in the development of many forms of psychopathology, including mood and personality disorders (e.g. Berking et al., 2012; Gross, 2002). Inability to reach explicitly declared regulation goals may be an indicator of regulatory deficits, particularly in cases in which affect measures move in the opposite direction to what the listener intended. These cases could have adverse consequences for mental health, particularly when a listener is unable to use music to make themselves feel better, or to reduce stress. As these regulation goals were specified by the listeners during the listening episode, failure to reach them, or having the opposite outcome to that intended, is of some concern.
There are alternative explanations for the observation of low regulation success rates. It is possible that five minutes is not sufficient time for these regulation goals to be fulfilled. This concern is countered by the finding that affective measures changed significantly from their initial states across all goal types, suggesting that states did have enough time to change, but that this change was not always in the intended direction. Furthermore, as the mechanisms that induce emotions from music listening do so over a timeframe of a few seconds, and most experiments that induce emotions with music use excerpts of less than two minutes (median time of 90 s; Eerola & Vuoskoski, 2013), it is expected that five minutes is sufficient time to induce new emotional states. However, specific regulation processes such as the reduction of stress have been shown to occur over a longer timeframe (Linnemann et al., 2018), so the current results must be interpreted in terms of short-term success only.
A related possibility is that over time, listeners may reverse the direction of affective change, so what may be considered a failure in the short term could become a success over an extended listening session. It is also important to clarify that short-term hedonic success may not predict long-term benefits of music listening. For example, regulation strategies such as distraction may provide short-term relief from negative experiences, but may be maladaptive in the long term, as they prevent deep processing of emotional stimuli (Sheppes & Gross, 2011). Conversely, strategies such as reappraisal may involve the painful confrontation of negative feelings, but may be beneficial over time, as they provide semantic meaning to emotional information (Sheppes & Gross, 2011).
For some regulation goals, the lack of successful outcomes could be partially due to a ceiling effect. This is particularly relevant for Raise ('To raise/boost energy') and Enhance ('To feel more <EmoState>'), for which respective levels of arousal and intensity were already high at the start of listening. This potential ceiling may also explain why 'Confident' and 'Excited' were the only emotional states that had mean negative success scores, indicating general failure, for adolescent listeners. For the goal Relax, low success rates may be related to an ambiguity in how adolescents define relaxation. While success in relaxation in this study was defined as a reduction of arousal, adolescents may actually conceptualise it as a sense of decreasing feelings of stress and tension while becoming energised: it has been reported that when adolescents are asked to listen to their own relaxation music they report 'energize' almost as commonly as 'calm down' as an outcome of their music listening. Several current findings support this notion that success in relaxation is not as simple as decreasing arousal: 30% of experiences in which Relax was the primary goal resulted in an increase in arousal, 'Tired' was the most frequent state, and adults were more successful in relaxing when listening to music with significantly higher energy.
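The directional logic behind these success scores can be sketched as follows. Only the definition of Relax as a reduction in arousal is stated explicitly above; the remaining goal-to-dimension mappings are assumptions inferred from the goal wording, Maintain is omitted, and all names and values are hypothetical.

```python
# Assumed mapping from regulation goal to (affect dimension, intended direction).
# Only the Relax entry is stated in the text; the others are inferred.
GOAL_TARGETS = {
    "Improve": ("valence", +1),
    "Relax": ("arousal", -1),
    "Raise": ("arousal", +1),
    "Diminish": ("intensity", -1),
    "Enhance": ("intensity", +1),
}


def success_score(goal, before, after):
    """Signed change on the goal's dimension, positive when in the intended direction."""
    dimension, direction = GOAL_TARGETS[goal]
    return direction * (after[dimension] - before[dimension])


def outcome(goal, before, after):
    """Classify an episode as 'success', 'opposite', or 'no change'."""
    score = success_score(goal, before, after)
    if score > 0:
        return "success"
    if score < 0:
        return "opposite"
    return "no change"


# Hypothetical episode: the listener wanted to relax but arousal rose.
before = {"valence": 3, "arousal": 4, "intensity": 5}
after = {"valence": 4, "arousal": 6, "intensity": 5}
print(outcome("Relax", before, after))
```

Under this scoring, an episode like the one above counts as a failure for Relax even though valence improved, which shows how an arousal-only definition can misclassify listeners who feel better while becoming more energised.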

RQ2: success was predicted by negative initial affective states
The second research question was: what are the significant determinants of affect-regulatory success? Results from the combined multilevel model revealed the strongest predictor of success to be low initial valence, indicating that those who had a regulation goal when in a negative affective state had greater success in reaching that goal. This finding supports previous ESM research on reasons for listening (Randall & Rickard, 2017b), which concluded that personal music listening is utilised to fulfil specific emotional needs that are largely determined by initial mood. Initial affective states also played a central role in the multilevel models of individual goals, with initial arousal and initial intensity the strongest predictors of success for Improve. Furthermore, initial valence positively predicted success for Relax, and negatively predicted success for Raise. This finding suggests that arousal regulation is linked to initial valence, with those in a positive state more able to relax and less able to raise energy, and should be investigated further in future research. Low-valence states also featured in the comparison of success across the most frequent emotional states, which revealed that 'Tired' was the most successfully regulated state for both age groups. This was followed by 'Annoyed' and 'Bored' for adolescents, and by 'Calm' and 'Annoyed' for adults. Each of these states was associated with a negative initial valence level. This result was also reflected in the goal frequencies during the COVID-19 pandemic, a time of increased negative affect, which saw an increase in the use of Improve, and of regulation goals overall. This is aligned with previous findings that people experiencing increased negative emotions during the pandemic used music more for solitary emotional regulation (Fink et al., 2021).
The main finding that listeners in negative initial states had more success in reaching their regulation goals carries several implications. It supports the notion that listeners use music specifically and deliberately for regulating their negative affective states, as has been found in previous ESM studies on mobile phones (Randall & Rickard, 2017b). It seems logical that musical affect regulation is at its most successful when the initial state is negative. An initially negative state is likely to present a concrete need to act upon that state, to diminish it or change it for the better. In contrast, an initially positive state may not need a particular action; regulation (conscious or unconscious) is not necessary as the current state is already acceptable. Although success was only assessed for listening episodes in which a regulation goal was stated by the listener, reaching these goals may be of less importance when in an already positive state. This brings us back to the question of how music listening may relate to mental health and wellbeing, and offers an important perspective on our first research question and our observation of relatively low success rates. These findings may not be as alarming after all, when we consider the context of the experiences. These experiences represent mundane everyday situations that, for the majority of the time, are not likely to consist of intense negative emotions. Thus, regulation attempts in these situations may generally be mild and non-critical for mental health. However, in cases when music episodes start with negative affect, music does seem to work as a regulatory aid in a desirable manner.
The one apparent exception to the finding that negative initial states were successfully regulated was the emotional state 'Anxious'. While all other negative states were successfully regulated by both groups, 'Anxious' was the only state with a mean negative success score for adults, and the least successfully regulated negative state for adolescents. This finding suggests that anxiety may be more difficult to self-regulate through music listening, when compared to other negative states. This appears to go against a meta-analysis of anxiety reduction through music listening, which found decreases in self-reported anxiety levels (Panteleeva et al., 2018), although research from the COVID-19 pandemic found that high anxiety was related to having negative responses to music (Carlson et al., 2021). While both the current and latter findings were based on a small number of cases, this is an interesting finding that warrants further investigation.
The multilevel models revealed that individual-level variables such as personality traits were not significant predictors of success, which was reflected in notably low intraclass correlations (ICCs). The only exception to this was observed for the regulation goal Raise, for which success was predicted by the adolescent age group, and higher levels of agreeableness. Previous experience sampling studies on everyday music listening have shown that ICCs are generally very low for emotional outcomes (∼.05; Randall & Rickard, 2017a), and also low for the characteristics of selected music (∼.15; Randall & Rickard, 2017a; Greb et al., 2019). This finding is in line with these previous studies, and suggests that success in emotion regulation through music listening is determined largely within the listening context, and has little to do with the traits of the listener. This calls into question previous research that has attempted to explain regulation outcomes through their association with trait measures (e.g. Garrido & Schubert, 2015), and suggests that surveys, with their retrospective recall and lack of contextual considerations, are limited in their assessment of everyday self-regulation with music.

Regulation may require more effort for adolescents
A finding of particular interest that emerged from the results is the association between age group and goal type. Adolescent listeners more frequently had a regulation goal as their primary reason for listening, and when this was the case, they had overall higher success rates, and lower failure rates, than adults. Conversely, adults more frequently had a secondary regulation goal, and were more successful in reaching these goals. The effect of this was apparent for the goals Relax and Diminish, for which only adults were able to successfully regulate their states, largely due to their higher secondary success frequencies. A possible interpretation is that as regulation abilities develop with age, regulation requires less active effort, and is able to occur passively, without being the specific focus of the listening experience. Support for this notion is found in the results for concurrent activities: regulation success for adults was unrelated to their activity, while adolescents had greater success during focused music listening. Following this, success was highest for housework and walking, activities that may not require active cognitive effort, while the only negative mean success score was recorded for gaming, which usually requires attention and effort. Increased efforts for regulation may also be related to the finding that adolescents enjoyed music significantly more than adults when they were listening with no regulation goal. However, it would appear that this effect is not directly related to attention, with no significant differences between age groups in attention for any goal, and no major influence of attention seen in the multilevel models (only a weak effect in the Improve model). Therefore, regulation success for adolescents might not require attention to the music, but rather a lack of distraction from concurrent activities. This may involve a greater focus given to regulatory thoughts and cognitions, although this was not directly assessed in the current study, and so requires further investigation.

Limitations and future research
There are several design and sampling issues that leave room for improvement in future studies. As previously discussed, some results or lack thereof could have been due to measurement of success over only five minutes. Future ESM studies can extend this timeframe, and include a third time point, to investigate how affective states change beyond this short-term hedonic change. Another potential design concern is that all regulation goals were assessed at the end of this five-minute period, rather than at the start of listening. This design choice was made to minimise intrusiveness on the initial listening experience; however, it may have introduced some retrospective recall biases. It is possible that listeners reported how they thought their states had changed, rather than how they intended them to change, which may have been particularly influential on secondary regulation goals. However, despite this potential for retroactive goal changing, the main finding that overall success rates were low suggests that the impacts were not substantial. A recent addition to MuPsych studies is a button interface presented at the start of listening, to quickly assess intention to change valence (with buttons for: 'Worse'/'Same'/'Better'), arousal ('Less energetic'/'Same'/'More energetic'), and intensity ('Less <EmoState>'/'Same'/'More <EmoState>'). In addition to the recall issue, this interface helps to address the previous finding that music listening is used to fulfil different functions simultaneously (an average of 3 per episode; Greasley & Lamont, 2011). Furthermore, the use of specific wording for arousal goals ('Less energetic'/'More energetic'), and coinciding intensity goals (e.g. 'Less Anxious') will allow for more direct measures of success, and avoid problematic definitions such as that for Relax as only a decrease in arousal.
Future experience sampling research on regulatory success could also incorporate musical feature analysis, to assess the effects of these more objective music variables on success.
Some potential sampling issues were also evident in this study, with the sample consisting entirely of Finnish Android users. While the differences between Android (ESM) and non-Android (survey) groups may have only been age- and gender-based, future studies should recruit a larger sample from a wider range of countries and demographic groups. A specific issue for this study is that while data collection occurred both prior to and during the COVID-19 pandemic for the adolescent group, all data from adults were collected during the pandemic period. Finally, while the analysis grouped listening experiences within listeners, it did not consider the longitudinal effects of these episodes. Gradual development (the cumulative effects of several subsequent listening episodes) should be taken into account, as these may indicate particularly unhealthy listening patterns (Miranda et al., 2012).

Conclusions
This study investigated the frequency and predictors of success in reaching regulatory goals during music listening. It found that while adults were more successful in reducing unwanted states, listeners were able to reach their affect regulation goals in less than half of the listening episodes in which they declared one. An association between age group and goal type was observed, suggesting that regulation with music may become more passive as skills develop. The finding that listeners may not be selecting music to reach their affective goals provides support for the use of hyper-personalised curation algorithms that take into account initial affective state and specific goals. Success was strongly predicted by negatively valenced emotional states at the start of listening, with the notable exception of anxious states. Listener-level variables had very little effect on success, meaning that success is largely determined within contexts, and not pre-determined by listener traits. These findings indicate that when listeners are in a negative mood, music listening can become a powerful self-regulation resource to reach their affective goals.

Disclosure statement
No potential conflict of interest was reported by the author(s).