The neuroscience of music – towards ecological validity

Studies in the neuroscience of music gained momentum in the 1990s as an integrated part of the well-controlled experimental research tradition. However, during the past two decades, these studies have moved toward more naturalistic, ecologically valid paradigms. Here, I introduce this move in three frameworks: (i) sound stimulation and empirical paradigms, (ii) study participants, and (iii) methods and contexts of data acquisition. I wish to provide a narrative historical overview of the development of the field and, in parallel, to stimulate innovative thinking to further advance the ecological validity of the studies without overlooking experimental rigor.

From experimental laboratory studies toward real-world settings
Studies in the neuroscience of music were initiated about 30 years ago within the well-controlled, laboratory-based scientific tradition of the auditory modality. More recently, these studies have begun to move toward the practices and procedures of naturalistic, ecologically valid real-world neuroscience. This move is well justified, as the generalizability of studies conducted with the most rigid auditory paradigms may be equivocal. For instance, from the neuroscientific viewpoint, several brain areas are activated only, or more robustly, by multimodal input. In parallel, from the viewpoint of music psychology, live music performance and listening contexts in real life are far from laboratory contexts. Finally, from the viewpoint of societal inclusion, having study participants from different age groups and from various ethnicities, educational backgrounds, and health conditions is of fundamental importance.
In this paper, I introduce the move from the laboratory-based tradition toward real-world neuroscience in three frameworks: sound stimulation and empirical paradigms, study participants, and the methods and contexts used in data acquisition. My emphasis in this opinion paper is on music studies, with some reflections from social and affective neuroscience. These reflections are motivated by the many commonalities shared between the fields: according to prevailing theories, music has survived in human culture throughout history thanks to its implicit social and affective functions, even if it has no explicit evolutionary function.

Sound stimulation and empirical paradigms
In the pioneering years of the neuroscience of music in the 1990s, the most important brain research methods were electroencephalography (EEG) and positron emission tomography (PET) (see, e.g., [1,2], respectively). These early studies differed greatly in the acoustic and musical attributes of their stimulation material. The EEG method, when used as a platform for event-related potentials (ERPs) time-locked to sound encoding and cognition, requires the repetition of one or several tones of interest tens or hundreds of times. This procedure unavoidably leads to unnatural sound sequences that bear little or no resemblance to music in actual practice. However, the EEG method is free of this limitation when used as a platform for frequency spectrum, phase, and coherence analyses. Using these analytical techniques, the experimental stimulation can consist of actual musical works (e.g., see the pioneering work [3], whose participants listened to classical music). Already in the earliest ERP studies, naturalistic music excerpts were also in use, although preparing and producing such stimuli required highly complicated technical procedures [4,5]. Similarly, in PET [and later in studies using functional magnetic resonance imaging (fMRI)], participants can listen to entire songs and long excerpts of music (e.g., see foundational works on the brain basis of musical pleasure [6] and tonality [7]). Due to these differences in the stimulation and in the brain processes indexed by the resulting data, these methods have made complementary contributions to the scientific literature (described next).

Highlights
Sound stimulation and empirical paradigms are now closer to real music, regardless of the research method being used.
Study participants cover the whole lifespan and include participants with typical and atypical behavioral and neuropsychiatric profiles.
Methods and contexts of data acquisition are more versatile and also include everyday settings, such as daycare centers and schools.
Further needs for improved practices have been identified, such as self-reports based on digital data collection tools and inclusion of music cultures not based on Western music practices.
Thanks to their excellent, millisecond-accurate time resolution, ERP studies have illuminated not only initial auditory processing but also subsequent stages of auditory prediction [8] and music cognition [9]. Moreover, when time-locked brain signals are investigated with magnetoencephalography (MEG), their origin in the human neocortex can be reliably modeled [10]. In parallel, studies using spectral analyses of the EEG averaged across a longer time window (e.g., 1 or 5 min) have enabled the recognition and separation of various subsequent stages of music cognition (e.g., [11]) and the monitoring of their development [12]. PET and fMRI, in turn, are powerful tools for identifying the cortical and subcortical areas of the human brain involved in the cognitive and emotional processing of music excerpts (for reviews, see, e.g., [13,14]). However, the temporal resolution of these methods, based on the detection of radiotracers or the monitoring of hemodynamic signals, is considerably lower than that of ERPs.
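The core logic of ERP averaging mentioned above can be made concrete with a minimal sketch (the synthetic data and parameter values below are illustrative assumptions, not taken from any cited study): epochs time-locked to repeated stimulus onsets are averaged, so that a small evoked deflection emerges from the background noise.

```python
import numpy as np

def average_erp(eeg, event_samples, pre, post):
    """Average EEG epochs time-locked to event markers.

    eeg           : 1-D array, single-channel EEG signal
    event_samples : sample indices of stimulus onsets
    pre, post     : samples to keep before/after each onset
    """
    epochs = [eeg[s - pre:s + post] for s in event_samples
              if s - pre >= 0 and s + post <= len(eeg)]
    return np.mean(epochs, axis=0)

# Synthetic demo: ~100 repetitions of a small evoked response in noise.
rng = np.random.default_rng(0)
fs = 250                                     # sampling rate (Hz)
evoked = np.sin(np.linspace(0, np.pi, 50))   # a 200-ms "component"
eeg = rng.normal(0.0, 1.0, fs * 120)         # 2 min of unit-variance noise
onsets = np.arange(500, len(eeg) - 500, 250) # one stimulus per second
for s in onsets:
    eeg[s:s + 50] += evoked                  # add the response at each onset

erp = average_erp(eeg, onsets, pre=25, post=100)
# Single-trial noise (SD 1) is attenuated by roughly sqrt(n trials),
# so the averaged waveform recovers the evoked deflection.
```

The same epoching machinery, with feature values replacing discrete trigger codes, underlies the regression-style analyses of continuous music discussed later in this section.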
Two influential paradigms were developed to investigate the brain basis of music-syntactic processing using ERP methodology. They were established by presenting participants, during brain recordings, with musical phrases [15] or chord cadences [16] whose tones or chords either fulfilled or violated the participants' predictions. The results showed that prediction-violating endings evoked distinct responses: a P600 for ending tones in musical phrases and an early right anterior negativity for ending chords in chord cadences. These findings indicated the existence of musical predictions resembling those generated in the linguistic domain. In parallel, this line of research brought ERP paradigms a significant step closer to naturalistic music.
In parallel, the neural accuracy of sound perception and cognition has been mapped to a large extent using an ERP called the mismatch negativity (MMN), which (in the context of auditory processing) is an index of a memory-based sound-prediction mechanism. To evoke the MMN, the sound sequence must include a deviant: a sound that violates the prediction. In traditional oddball paradigms, each sequence included between one and three types of deviants, making the studies rather long and repetitive. The stimulation was also far from music, or even from music-like material. As recently summarized [17], systematic development of the MMN stimulation paradigms to improve their ecological validity was initiated in the 2010s. The first paradigm was based on the Alberti bass, an arpeggiated chordal accompaniment used in the classical era. In this paradigm, the sounds are presented in a looped manner in randomly varying keys with six different deviants [18]. Notably, an atonal version of the paradigm has also recently been launched [19]. A second MMN paradigm with a similar justification is based on a brief repetitive melody [20]. This melody also includes a total of six deviants, three of which modulate the structure of the melody across its successive presentations. Data collection in both paradigms is notably faster than in traditional oddball paradigms using single stimuli, and despite their repetitiveness, the chordal and melodic paradigms are also closer to a musical context. Moreover, studies employing these paradigms have shown that the MMN reflects the musical expertise and background of the participants in a genre-specific manner: the sound parameters that are most important in a musician's genre evoke the largest MMN or P3a responses (for a review, see [21]). Together, these paradigms thus offer sensitive means to probe (at the group level) the expertise profiles of musicians, which would not be available with a traditional oddball paradigm.
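The structure of a traditional oddball sequence can be sketched generically as follows (the deviant probability and gap constraint are illustrative assumptions, not the stimulus parameters of the cited studies): rare deviants are embedded among frequent standards, with enough standards before each deviant to establish the regularity that the deviant then violates.

```python
import numpy as np

def oddball_sequence(n_tones, deviant_prob, rng, min_gap=2):
    """Label each tone in a sequence as 'standard' or 'deviant'.

    Deviants occur with probability `deviant_prob`, but never within
    `min_gap` tones of a previous deviant, so each deviant is preceded
    by enough standards to (re)establish the prediction it violates.
    """
    labels = []
    since_deviant = min_gap
    for _ in range(n_tones):
        if since_deviant >= min_gap and rng.random() < deviant_prob:
            labels.append("deviant")
            since_deviant = 0
        else:
            labels.append("standard")
            since_deviant += 1
    return labels

rng = np.random.default_rng(1)
seq = oddball_sequence(1000, deviant_prob=0.1, rng=rng)
# Roughly 8-10% of tones are deviants, each separated by standards.
```

The multi-feature paradigms described above gain their speed by interleaving several deviant types in one such sequence, so that every deviant position probes a different sound feature.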
To support audio and music-related studies more generally, the music information retrieval (MIR) toolbox was created some 15 years ago, and its use has since widened [22]. This MATLAB-based platform enables researchers to quantify over 100 acoustical and musical features, such as pitch, tonality, rhythm, and structure, from an audio excerpt. In an early brain study using the MIR toolbox, the authors quantified the audio excerpt presented to participants during EEG recording and analyzed the resulting EEG as a function of the sound content in a given context [23], for instance, in terms of loudness or pitch, after computationally comparing the sound attributes of each sound with its context. (Note that in traditional ERP analyses, there are codes in the sound sequence files to index the sound content.) This procedure was recently used in an advanced manner to further improve data quality by increasing the number of ERP epochs available for each analysis [24]. In another, even earlier study using the MIR toolbox, the authors analyzed their fMRI data as a function of the sound parameters extracted from entire pieces of music, for instance, a Piazzolla tango [25] (for video visualization, see i).
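The MIR toolbox itself is MATLAB-based; the sketch below is a simplified Python analogue (not the toolbox's actual API) of the kind of frame-wise feature extraction it performs, computing loudness (RMS energy) and spectral centroid time series of the sort that can then be related to EEG or fMRI data.

```python
import numpy as np

def frame_features(audio, fs, frame_len=1024, hop=512):
    """Per-frame RMS energy and spectral centroid of an audio signal."""
    window = np.hanning(frame_len)           # reduce spectral leakage
    rms, centroid = [], []
    for start in range(0, len(audio) - frame_len + 1, hop):
        frame = audio[start:start + frame_len]
        rms.append(np.sqrt(np.mean(frame ** 2)))
        spec = np.abs(np.fft.rfft(frame * window))
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
        centroid.append(np.sum(freqs * spec) / np.sum(spec))
    return np.array(rms), np.array(centroid)

# Demo: for a pure 440-Hz tone, the spectral centroid sits near 440 Hz
# and the RMS of a 0.5-amplitude sine is about 0.5 / sqrt(2).
fs = 22050
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
rms, centroid = frame_features(tone, fs)
```

For real music, such feature time series (rather than discrete trigger codes) serve as the regressors against which continuous EEG or fMRI responses are analyzed, as in the studies cited above.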
This section on the development of auditory stimulation paradigms concludes with a discussion of the music genres used in studies employing complete music excerpts or longer sound sequences, mainly in fMRI. The tradition of using classical music in music emotion and music psychology studies was broadened by brain-scanning studies on chills as the ultimate index of musical pleasure [7,26]. In these studies, the participants were free to choose the music to be listened to in the laboratory, provided they could predict which music would produce chills. These studies implicated limbic and paralimbic brain areas as integral elements of the neural basis of music preference and emotion. In a more recent study in a similar vein, the authors instructed their participants to listen to self-selected music when investigating the intersection of music preference and music-induced emotions [27]. The participants showed remarkable variation in the repertoire they chose, in terms of genre as well as the language of the lyrics (for participants who chose vocal rather than instrumental music). Continuous (preselected) music was also used in pioneering EEG studies investigating music-induced emotions with music excerpts from various genres, alone and in interaction with visual stimuli [28][29][30]. Relatedly, in neighboring fields such as social and affective neuroscience, the use of movies has become increasingly common; movies (in either original or edited form) and narratives offer a good means to investigate the brain basis of emotional processes and social interaction when the laboratory setting prevents the presence of other people during brain scanning [31].
Looking ahead, it is likely that experimental stimuli will more often be real music (either live or recorded) instead of isolated sounds or repetitive computer-generated sound sequences. With this arrangement, the integration of auditory and motor processes can be traced in a naturalistic context [32]. This is an important improvement also for the ecological validity of the emotional processes active during the studies; live music can evoke stronger musical emotions than recorded music, even in terms of body movement while listening [33]. However, to impose some necessary control on the natural music used in brain studies, it is recommended to collect behavioral ratings for each music excerpt (during the brain recording, or before or after it) to quantify the affective and aesthetic musical experience.

Study participants
I now illustrate how the profile of the participants recruited in these studies has changed during the past decades. My examples originate from studies illuminating the brain basis of musical expertise; these studies had a central role in the 1990s, when the field of the neuroscience of music emerged. At that time, the participants were typically healthy adult volunteers. There were seminal studies comparing adult classical musicians with participants without expertise as music performers [34][35][36][37][38][39]. Similarly, musicians with and without absolute pitch were compared using ERPs ('absolute pitch' refers to the ability to name the pitch of a sound within the Western musical scale) [40][41][42][43]. Later, classical amateur musicians were also recruited [44], followed by amateur rock musicians [45]. More recently, comparisons between self-taught musicians, traditionally trained musicians, and nonmusicians have been reported as well [46]. Musicians specializing in global (folk, world) music, rock, classical, and jazz music were invited to various studies in the 2010s [18,47,48]. Studies on music performers originally focused primarily on instrumental musicians, but singers were later also recruited into EEG and fMRI studies [49][50][51][52]. More recently, participants have even included specialists with less common music performance skills, such as beatboxing [53]. Thus, current recruitment practices include musicians of various backgrounds, reflecting the broad variety of musical expertise profiles.
In addition to musically advanced participants, adult participants with congenital amusia ('tone deafness') [54][55][56][57][58][59] or with neurological or psychiatric disorders have been target populations in several pioneering neuroscientific studies. These disorders include dystonia [60,61], stroke with or without acquired amusia [62][63][64], and depression [65]. Later, large cross-sectional and longitudinal studies of these and other populations have been launched in music medicine and music therapy contexts with the methods of cognitive neuroscience (for a review, see [66]). In some cases, the behavioral methods (including observations made by the healthcare professionals and family members) may be sufficient indicators of rehabilitation and therapy efficacy. Yet, the use of neuroscientific methods is invaluable when the brain basis of recovery is the focus. Here, it is also worth noting that not only individuals with neurocognitive or psychiatric disorders but also healthy elderly participants are currently of increasing interest (e.g., [67,68]; for a review, see [69]).
Similarly, an increasing number of studies have also been initiated in children to investigate the brain basis of music learning [70] and to understand the transfer effects caused by music learning in improving nonmusical auditory and cognitive skills [71][72][73]. These approaches have been extended to language-learning studies [74] and to studies on underprivileged populations [75,76]. Subsequently, studies have assessed the impact of music in the treatment of neurocognitive disorders, such as dyslexia [77] and hearing impairment [78]. In terms of age groups, some of the studies were performed with children aged 4-5 years [70], later extended to toddlers [79] and to older school-aged children [20], up to high school age [80].
When the earliest aspects of auditory learning were investigated, the studies first focused on full-term infants (e.g., [81]) and more recently on preterm infants with and without music-enriched care [82,83]. In addition, mothers and their infants have been recruited into studies to determine whether auditory learning in the fetal phase manifests in the EEG of newborn infants, in ERPs [84,85] and in subcortical frequency-following responses [86].
Thus, data collection has now been extended to cover the entire lifespan and to include both neurotypical and neuroatypical individuals. However, it is worth noting that most of these studies have been, and are being, conducted in Western countries and in some non-Western countries with sufficient resources and research infrastructure. The contributions of non-Western countries have been crucial, for instance, for music intervention studies involving tonal-language speakers (e.g., see [87,88]). The limited global inclusion of participants and research organizations has been noted particularly within the community of music psychologists using behavioral research methodology (e.g., [89,90]). Conducting cross-cultural studies within the neuroscience of music will take time, because many non-Western countries lack the necessary research infrastructure for studies in auditory cognitive neuroscience or the neuroscience of music. However, thanks to the current rapid development of mobile technology for recording physiological signals (EEG and near-infrared spectroscopy [91,92]), it will become possible to conduct relevant data collection from a more global perspective.
As discussed next, technical advancements have promoted the usefulness of EEG outside the laboratory. Such nonlaboratory settings are more readily accessible than academic or clinical recording sites, which may be of particular relevance to certain populations (e.g., children and individuals from underserved populations). Thus, particularly for longitudinal intervention studies, in which several recordings are a prerequisite for successful data collection, such settings may minimize participant dropout (for discussion, see [93]). Moreover, as discussed above, mobile technology will also considerably increase the inclusion of participants from broader geographical regions.

Methods and contexts of data acquisition
Most of the EEG studies described in this paper were conducted in laboratory settings because, at the time, EEG recording technology required electrical isolation to achieve a sufficient signal-to-noise ratio. However, such an environment can be considered suboptimal, particularly for studies with children and clinical populations, for whom an unfamiliar environment may be more disturbing than it would be, for instance, for healthy adult volunteers.
Fortunately, auditory ERP studies have recently been conducted successfully at other recording sites, such as schools [73,94,95] and daycare centers [96]. This was made possible by novel recording technology: each EEG electrode has a preamplifier, making the recordings more tolerant of electrical interference, so electrical shielding is no longer necessary. In parallel, current amplifiers are considerably smaller than those previously used, often similar in size to a mobile phone. Moreover, commonly available wireless data transfer makes technical preparations during data collection considerably less complicated than in the past.
Live music has been performed during EEG recordings for small audiences in concert-like settings [97,98]. Other biosignals, such as the electrocardiogram, skin conductance, and facial muscle responses, have recently been recorded from large groups of participants during a live classical music concert [99]. The physiological concomitants of stress reduction caused by music listening have also been investigated in participants' homes using salivary cortisol measurements [100,101]. Perhaps the most challenging environment for EEG recordings was established in the field of social neuroscience: in one study, the authors reported data from close to 2000 museum visitors on the impact of face-to-face interaction on their brain synchrony [102]. Even though the study lacks a musical element, it is worth mentioning in this context because the paradigm and procedures would be applicable to future studies of joint music listening.
Of note, many of the studies reviewed above have focused on the perceptual, cognitive, and emotional aspects of music listening (paralleled by an emerging interest in investigating mental imagery (for a review, see [103]), musicians' brain activity in silence [104], and brain activity while listening to music in the background [105]). In traditional settings, most studies were performed while participants were instructed to listen to experimental stimuli or, to a lesser extent, live or recorded music, often doing so alone. This research approach has sometimes been criticized for overlooking the essential aspect of music making. However, this empirical emphasis, particularly on music perception, is partly a consequence of the unfortunate limitations of brain research methods: if participants perform music during the recordings, excessive muscle artefacts reduce the signal-to-noise ratio of the data, compromising the ability to draw refined inferences about the brain functions of interest. In the context of EEG studies, progress in recent decades has allowed EEG data to be recorded and analyzed during a greater range of participant movement than in the past and, in some studies, even while participants performed live music [97,106,107] (for further discussion of advances in data analyses in this context, see [108]). Multimethod recordings have also been conducted successfully in studies in which participants repeated well-learned sequences on a keyboard while their finger movements, Musical Instrument Digital Interface (MIDI) data, and EEG were monitored [109]. Moreover, even though MEG or fMRI scanning cannot be performed with musical instruments containing metal parts, special instruments have been constructed for use with MEG (new musical instruments without metal objects [110]) and fMRI (e.g., keyboard [111][112][113]; cello [52]).
Studies using PET and fMRI with singing participants should be mentioned in this context as well (e.g., [49,52]).
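One of the simplest data-analysis safeguards against the movement-related artefacts discussed above is amplitude-based epoch rejection; the sketch below is a generic illustration (the threshold and data scales are assumptions, not the pipeline of any cited study).

```python
import numpy as np

def reject_artifacts(epochs, threshold_uv=100.0):
    """Drop epochs whose peak-to-peak amplitude exceeds a threshold.

    epochs : array of shape (n_epochs, n_samples), in microvolts
    Returns the retained epochs and a boolean mask of kept trials.
    """
    ptp = epochs.max(axis=1) - epochs.min(axis=1)
    keep = ptp <= threshold_uv
    return epochs[keep], keep

# Synthetic demo: 40 clean epochs plus 10 with large movement bursts.
rng = np.random.default_rng(2)
clean = rng.normal(0, 10, (40, 200))   # plausible EEG scale (~10 uV SD)
noisy = rng.normal(0, 10, (10, 200))
noisy[:, 50:60] += 300                 # simulated muscle/movement bursts
epochs = np.vstack([clean, noisy])

kept, mask = reject_artifacts(epochs)
# The 10 contaminated epochs are dropped; the 40 clean ones survive.
```

Simple thresholding discards data along with the artefacts; the more advanced analyses cited above (e.g., [108]) instead model and remove artefact components so that trials recorded during movement can be retained.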
New avenues in studies of music listening and performance are opening in modern concert halls equipped with technology for recording physiological signals during live performances. Among the places harboring such halls are the LIVELab ii at McMaster University, Hamilton, Ontario, Canada, and the ArtLab iii at the Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany. Both locations offer the possibility to measure EEG and physiological data (such as electromyography, heart rate, breathing rate, and skin conductance) in the auditorium and on stage. Other labs focus on multimodal data collection in real-world venues, such as the fourMs Lab at the University of Oslo iv and Casa Paganini at the University of Genova v. One recent initiative, Magics vi at Aalto University, Espoo, Finland, focuses on the use of virtual reality as part of artistic performance and study. Both in-lab and out-of-lab procedures allow video and audio to be recorded alongside the physiological measurements, together with self-reported data, for example, about the emotional state and immersion of the artist and the audience. Although data collection and analyses in such innovative contexts are challenging, future work using these approaches should offer stimulating insights into both the artistic and scientific aspects of music (see Outstanding questions).
Thus, even if there are several constraints on planning and conducting brain and physiological studies during music performance, the field has also developed notably in many ways. This development includes mobile research technology that can be used during music performance, musical instruments that meet the challenges of data acquisition, and advanced data analyses that make it possible to analyze noisy brain signals.

Concluding remarks
As discussed above, the neuroscience of music is likely to be more versatile in the future than it has been. Among the likely changes are a move from mostly passive to active listening paradigms and broader inclusion of various types of live music performances. It is also hoped that greater diversity in study participants will become commonplace.
Some of the key factors for further progress include technical development of EEG recording setups, more advanced data analysis techniques, and new musical instruments without metal objects for MEG and fMRI. Moreover, recordings of physiological signals can now be conducted in nearly every setting and location, allowing the collection of additional useful data. Last, even if a given methodology requires technical solutions feasible only in a fixed space, as in the case of (research-based) motion-capture systems, their innovative use can lead to breakthroughs in the neuroscience of music. Notably, merging these researcher-driven techniques with user-driven data collection techniques will open a new era in music neuroscience. These new forms of data collection include digital data indicating the time course and content of music listening and learning to play, social media activity, and 'everyday' physiological data (such as heartbeat) acquired from activity wristbands and rings. By these means, a holistic picture of the use of music in daily life, paralleled with the underlying neural and physiological processes, can emerge. However, careful consideration of the ethical aspects of data privacy and ownership is needed when designing studies that combine large digital user-driven datasets with sensitive physiological data.
To conclude, together with already well-established neighboring fields, such as social and affective neuroscience, and emerging ones, such as educational neuroscience [114,115] and the neuroscience of dance [23,116,117], music neuroscience is likely to continue to flourish in revealing the secrets of the brain functions behind music perception, cognition, emotion, and appreciation. The progress of the field depends on methodological development that enables easy-to-conduct empirical investigations outside laboratory-like environments and with naturalistic stimuli. By using the means discussed in this opinion paper in a holistic manner, it is possible to continue taking studies closer to real-life settings while maintaining sufficient control over research quality.

Outstanding questions
What are the commonalities between the neural imprints of musical expertise in perception and performance? In other words, when investigated using perceptual, cognitive, and motor paradigms in naturalistic settings in a within-participant design, are musicians advanced in all of these domains, as indexed by enhanced brain functions, when compared with participants without training in musical performance?
Can we determine the most reliable near- and far-transfer effects in cognition and socioemotional processing caused by music training by using neuroscientific methods in conjunction with behavioral indices? In this context, near-transfer effects are the benefits of music training that remain within the auditory or motor modalities; far-transfer effects are the benefits that go beyond the music training as such (e.g., memory and attentional functions).
Can we identify and optimize neuroscience-based means to improve cognitive, socioemotional, and motor processing through music learning, music education, and music-based neurorehabilitation? With this deliberately generic research question, I encourage scientists to design research paradigms that would enable conclusions beyond one specific domain of human action.
What are the neural determinants of optimal music skill learning for typical and atypical learners? In other words, how should music educational practices (such as individual or group based) be tailored, based on music-specific neural functions, for individuals with different learning capabilities and goals for their music making?
What music functions are based on universal perceptual, cognitive, and neural processes? What music functions are (more) specific to a given culture? With these questions, I encourage scientists to adopt a global mindset and include a variety of musical cultures and behaviors in theoretical and empirical endeavors.

Declaration of interests
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.