Creating a common language for soundscape research

Much of the work into the understanding of our auditory environment, referred to as soundscape research, has emerged from international and interdisciplinary research. This has enabled growth in understanding and increased opportunities for optimising shared environments but has also formed one major obstacle: a lack of a common language to describe soundscapes. Therefore, the purpose of this study is to validate translated soundscape descriptors in Dutch as part of the Soundscape Attributes Translation Project (SATP). For this


Introduction
Human auditory perception is integral to how we attend to, interact with, respond to, and transform our surroundings [1]. It is therefore no wonder that much of the work on the perception and understanding of the auditory environment (referred to as soundscape research) has emerged from interdisciplinary research in the fields of acoustics, architecture, environmental studies, and psychology [2]. Such an interdisciplinary approach has enabled growth in understanding soundscape perception, increasing opportunities for application by urban planners and others involved in constructing and optimising shared environments. For example, the acoustic environments of schools, parks and other facilities can be investigated to create optimal soundscapes that facilitate learning, relaxation and other purposes [2][3][4]. Besides creating optimal environments for various purposes, soundscapes allow for research on sociocultural, attitudinal, and physiological factors in sound perception. For instance, when the relationship between attitudes toward COVID-19 and outdoor soundscape appraisal was investigated, people who were more concerned with COVID-19 were also found to be more sensitive to sound exposure [5].
Research on the perception of soundscapes faces a major obstacle: the absence of a common language. Previously, Axelsson and colleagues developed eight attributes to describe soundscapes in Swedish and translated them into English [6,7] as "Pleasant, Annoying, Eventful, Uneventful, Vibrant, Calm, Chaotic, and Monotonous". These attributes became the de facto standard of soundscape description, as noted by Nagahata [8]. They were subsequently translated into over ten languages, including Portuguese, Indonesian, and Greek. However, several soundscape researchers noted the lack of validation of this standard across these languages [9,10]. Validation of such measures is crucial to ensure their cross-cultural and cross-linguistic applicability, since evidence suggests that similar words, even within the same language, can be used in functionally different ways to describe different scenarios [11]. For example, previous research into the interlingual compatibility of the descriptors "noise" and "sound" found that the neutral term "sound" could obtain a negative connotation when directly translated into Japanese, where it was observed to be synonymous with the traditionally negative descriptor "noise" [9]. Further, a study by Cao & Gross [12] supports the idea of cultural differences between participants in the processing of sensory stimuli. This is rooted in the social orientation hypothesis, which describes cognition as a source of cultural differences and social context as a determinant of the perception of sensory consequences. In a study with Dutch and Norwegian participants, it was found that differing political, historical, and cultural contexts influence the understanding of apparently straightforward notions [13]. Linguistic equivalence of words therefore does not straightforwardly imply equivalence in meaning.
Given the lack of validated cross-linguistic descriptors to assess soundscapes, the Soundscape Attributes Translation Project (SATP) was conceived to validate the above-mentioned attributes as proposed in ISO/TS 12913-2:2018 [14,15]. Although English is the most spoken language on the planet, it is spoken by only 18 percent of people, of whom fewer than a third are native speakers [16]. With all intended translations within the SATP, about 2.53 billion native speakers from all around the world would be represented, enhancing the international use of validated soundscape attributes in research as well as in application [14].
The current study is concerned with the validation of the Dutch soundscape attributes as part of the SATP, through a standardised listening experiment following ISO recommendations and the SATP protocols. The field of soundscape studies has been avidly adopted in The Netherlands and Flanders (the Dutch-speaking part of Belgium). Publications range, to name a few, from applications in healthcare for people with disabilities [17] and dementia [18,19] to urban planning [20,21]. Numerous studies related to public health [22] and noise abatement policies [23] have also been published, and even historical soundscape studies have been conducted in The Netherlands [24]. While some of these studies use Dutch translations of the ISO soundscape attributes, and Weber [23] included a principal component analysis as part of one of them, there appear to be no explicit formal attempts at validating the vernacular. Therefore, the aim of this study is to test the conjecture that the proposed Dutch translations of the soundscape attributes are employed similarly to the English attributes. If they indeed match on perceived meaning, the Dutch translations should lead to average ratings of soundscape appraisal similar to those obtained with the English attributes, indicating that they are suitable for use in future soundscape research in The Netherlands and Flanders.

Participants
The study involved 32 participants (22 women, 10 men). All participants were adult Dutch native speakers and first-year psychology students at the University of Groningen, in the Netherlands. Thirteen participants were below the age of 20, 18 participants were between the ages of 20 and 25, and one participant was between 31 and 35. The recruitment of participants took place through the SONA participant pool used by the Psychology department of the University of Groningen. All participants signed up voluntarily. Participation was compensated through the assignment of SONA credits, which the participants needed to fulfil their study program's requirements. All participants indicated having no history of hearing loss, though normal hearing was not assessed through audiometry. Many participants reported that the stimuli were louder than they had expected.

Translations
The first step of the standardisation process involved translating the attributes from English to Dutch. To obtain these translations, two expert panels of soundscape researchers were formed, one in The Netherlands (N = 4) and one in Flanders (N = 3). All members had previous experience with translating or employing the Dutch attributes in scientific studies. Both expert panels held group discussions independently of each other and provided two or three translations per attribute. Subsequently, the chairs of the Flemish and Dutch groups discussed the proposed translations, which contained as many differences as similarities. These differences were somewhat expected, as Flemish can be considered a dialect of Standard Dutch with some lexical and grammatical differences. When the translations of the two groups overlapped, those words were immediately selected for further use in the validation procedure (see Table 1). After thoughtful consideration, consensus was reached on the other attributes as well. For this, the chairs of the expert groups chose words that are common in both Standard Dutch and Flemish and are likely to be used by laypersons (as opposed to picking highly technical terms).
Throughout the process, the main focus of the translation was to secure the contextual meaning of the initial English attributes, not the literal linguistic meaning. For example, the adjective "eventful" may be applicable to sounds in English, but the literal translation "veelbewogen" may not be prevalent in Dutch. This approach was intended to eliminate variation in soundscape description due to translation errors and word interpretation. Therefore, two translations for each attribute were developed. Furthermore, effort was made to capture the antipodal nature of the attribute pairs belonging to the orthogonal dimensions in the circumplex model (e.g. eventful-uneventful; see Fig. 1) [6,7]. For one of the attributes this led to more neutral terminology in Dutch than in English: the term annoying could be literally translated as "irritant" or "hinderlijk", but both expert groups suggested translations that are closer to antonyms of pleasant than to literal translations of annoying, namely "onaangenaam" and "onprettig".

Stimuli
The SATP is led by University College London (UCL), which provided the audio files of multiple soundscapes that were used in this study. The final 27 stereo audio files were recorded between the spring and autumn of 2019 in London. They were intended to cover a broad range of different sources and compositions of sound, such as a quiet park, a busy shopping street, or a construction site [25]. In a pilot study, the final 27 audio files were selected from over 50 recordings to evenly cover all soundscape attributes. The duration of all audio files was 30 s. The audio files captured auditory environments containing mechanical, natural, and human activity sounds. Each SATP team was encouraged to use the same audio stimuli [26].

Headphones
Following the SATP guidelines of standardisation, we used Sennheiser HD650 over-ear, open headphones connected to a Windows PC. To calibrate the headphones' playback level, we connected a Focusrite Scarlett 2i2 as an external sound card to the headphones and used a multimeter to set the volume of the system to a standardised voltage of 355 mV.

Rating scales
The web-based software Qualtrics was used for the response collection and was displayed on a monitor. The instructions provided at the top of the page were in line with the ISO recommendations and read as follows: "For each of the 8 scales below, to what extent do you agree or disagree that the present surrounding sound environment is…". The eight slider scales were labelled with the Dutch translations of the soundscape attributes (see Table 1). The order of the attributes on the eight scales was the same for all participants. Every scale used a 100-step slider ranging from 0 to 100, with the slider's default position at 50.

Procedure
The procedure of the study was standardised and a detailed guide was provided by UCL [14]. Ethical procedures were followed, with formal ethical approval to conduct this study obtained from the Ethics Committee of the Psychology Department at the University of Groningen, in the Netherlands. The flowchart in Fig. 2 provides a visual overview of the entire methodological process.
The study took place in a sound-attenuated room. When arriving at the lab, the participants were asked to leave their phones outside and then sit down in front of a screen on which the study would be displayed. After informed consent was obtained, a small test run was conducted to give the participants an impression of the study's trial procedure and rating scales and an opportunity to ask questions. Then, the participants were informed that the actual study would start. It began with a short survey assessing the participants' age, gender, nationality, and which languages they spoke. After that, each participant listened to the 27 audio files in random order and rated them on the eight different attributes. They were required to listen to each stimulus for thirty seconds before proceeding to their evaluation. While the rating scales were displayed, participants were allowed to replay the stimulus as many times as desired. After the rating, the participants sat in 30 s of silence before they could proceed to the next sound stimulus, to reduce interference between two consecutive stimuli. A timer was visible to them. The study was completed once every audio file had been rated; subsequently, the participants were asked questions regarding their experience and thoughts on the experiment. Except for the informed consent form shown at the very beginning, the whole study was conducted in Dutch. Most participants completed the study within 50 min, with a few participants taking a few minutes longer.

Results
Table 3 shows the means and standard deviations for the ratings on each attribute in Dutch for each audio file. These results are also visualised in Fig. 3, along with the average ratings of the English sample obtained in the study by Oberman and colleagues [26]. Visual inspection shows that, with a few minor exceptions, the ratings are fairly well matched between the two languages, indicating that the Dutch translations were used in a similar fashion to the original English attributes. Furthermore, Formulas 1 and 2 were employed to calculate a pleasantness score and an eventfulness score for each audio file. These formulas are an adaptation of the formula provided in ISO/TS 12913-3:2019 [27], which converts five-point Likert scale responses into coordinates. Our adaptation consisted of adjusting the formula to function with 100-point scales. We applied the same transformation to the original English data collected by University College London [26].
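As an illustration of the coordinate transformation described above, the following Python sketch projects a set of eight attribute ratings onto pleasantness and eventfulness coordinates. It follows the general form of the ISO/TS 12913-3:2019 projection, in which the four diagonal attributes are weighted by cos 45°, and rescales the normalisation constant to the width of the response scale so that both coordinates fall between -1 and 1. The exact adjustment used in this study is not reproduced in the text, so the rescaling, the function name `iso_coordinates`, and the dictionary keys are assumptions made for illustration only.

```python
import math

COS45 = math.cos(math.radians(45))


def iso_coordinates(ratings, scale_min=0.0, scale_max=100.0):
    """Project eight attribute ratings onto (pleasantness, eventfulness).

    `ratings` maps the English attribute names to a rating on the response
    scale. The normalisation below is an assumption: for a 1-5 scale it
    equals the constant 4 + sqrt(32); for 0-100 it becomes 100 * (1 + sqrt(2)).
    """
    span = scale_max - scale_min
    norm = span * (1 + math.sqrt(2))

    pleasantness = (
        (ratings["pleasant"] - ratings["annoying"])
        + COS45 * (ratings["calm"] - ratings["chaotic"])
        + COS45 * (ratings["vibrant"] - ratings["monotonous"])
    ) / norm

    eventfulness = (
        (ratings["eventful"] - ratings["uneventful"])
        + COS45 * (ratings["chaotic"] - ratings["calm"])
        + COS45 * (ratings["vibrant"] - ratings["monotonous"])
    ) / norm

    return pleasantness, eventfulness


# Invented example: every slider left at its default position of 50
# gives a point at the origin of the circumplex.
neutral = {name: 50 for name in (
    "pleasant", "annoying", "eventful", "uneventful",
    "vibrant", "calm", "chaotic", "monotonous")}
print(iso_coordinates(neutral))
```

Because only differences between opposing attributes enter the projection, the default slider position of 50 maps to the origin, and the same function can be applied to five-point data by passing `scale_min=1, scale_max=5`.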
Plotting both sets of scores shows small differences between the Dutch and English data, but overall presents a coherent pattern between both languages (Fig. 4). For illustration, the data from the Dutch sample were subtracted from the English sample to create a difference plot, showing that Dutch participants rated the audio files as slightly more pleasant and eventful than the English participants, driven by lower ratings on the attributes Annoying and Uneventful (see Fig. 5; mind the scale difference for readability). The difference plot in Fig. 5 also helps to identify which audio files led to the largest differences in ratings between the two samples. It shows four purple markers that indicate audio files where the difference score resulted in a change in quadrant in the circumplex. In all four cases, the change in quadrant was borderline and not significant: an audio file was never rated categorically differently on the Pleasantness or Eventfulness attribute. The files in question are W09, W15, W23a, and E10, of which the former two are dominated by mechanical and traffic sounds and the latter two clearly feature human sounds. The six audio files with the largest differences between the translations were (in order of largest to smallest Euclidean distance, with a cut-off point >0.15) W09, CT301, W01, E01b, E12b, and HR01. All these audio files primarily feature (monotonous) mechanical sounds. Consistent with the overall trend, these files are rated as more pleasant and eventful by the Dutch sample.
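The comparison described above can be sketched in a few lines of Python: for each audio file, take the difference between the English and Dutch coordinates, compute its Euclidean length, and check whether the two points fall in different quadrants of the circumplex. This is a minimal sketch rather than the analysis code used in the study; the function names, the data layout, and the choice to treat a coordinate of exactly zero as positive are assumptions, and the example values are invented.

```python
import math


def quadrant(pleasantness, eventfulness):
    """Name the circumplex quadrant of a (pleasantness, eventfulness) point."""
    if pleasantness >= 0:
        return "vibrant" if eventfulness >= 0 else "calm"
    return "chaotic" if eventfulness >= 0 else "monotonous"


def compare_samples(dutch, english, cutoff=0.15):
    """Compare per-file coordinates of two samples.

    `dutch` and `english` map a file label to (pleasantness, eventfulness).
    Returns one record per file, sorted from largest to smallest distance.
    """
    records = []
    for label, (p_nl, e_nl) in dutch.items():
        p_en, e_en = english[label]
        # Difference score: Dutch subtracted from English, as in the text.
        dp, de = p_en - p_nl, e_en - e_nl
        distance = math.hypot(dp, de)
        records.append({
            "file": label,
            "difference": (dp, de),
            "distance": distance,
            "above_cutoff": distance > cutoff,
            "quadrant_changed": quadrant(p_nl, e_nl) != quadrant(p_en, e_en),
        })
    return sorted(records, key=lambda r: r["distance"], reverse=True)


# Invented coordinates for two hypothetical files (not data from this study).
dutch_example = {"file_a": (0.10, 0.10), "file_b": (0.30, -0.20)}
english_example = {"file_a": (-0.10, 0.10), "file_b": (0.30, -0.25)}
for record in compare_samples(dutch_example, english_example):
    print(record)
```

With English minus Dutch as the difference score, a negative pleasantness difference means the Dutch sample rated the file as more pleasant, which matches the direction of the trend reported above.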
Since a Null Hypothesis Significance Testing (NHST) framework makes the interpretation of a null effect difficult, we also compared the Pleasantness and Eventfulness ratings in the two samples using independent t-tests from a Bayesian framework (which does provide the possibility to evaluate the relative evidence for the null and alternative hypotheses). We performed this analysis with JASP [28]. In both cases, there was moderate evidence [14] in favour of the null hypothesis (of no difference), with Bayes Factors of BF10 = 0.302 and BF10 = 0.294 for Pleasantness and Eventfulness, respectively. We repeated this analysis for all eight attributes; the results are shown in Table 2. For most of the attributes there is moderate evidence in favour of the null hypothesis, except for the attributes Uneventful (BF10 = 0.361) and Annoying (BF10 = 0.377), for which only anecdotal evidence was found.
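To make the verbal labels attached to these Bayes Factors explicit, the sketch below maps a BF10 value onto the commonly used evidence categories (boundaries at 3, 10, 30 and 100, and their reciprocals for evidence favouring the null hypothesis). The text does not state which classification scheme was applied, so the thresholds and the function name `describe_bf10` are assumptions; the four example values are the Bayes Factors reported above.

```python
def describe_bf10(bf10):
    """Return a verbal label for a Bayes Factor BF10 (H1 relative to H0)."""
    if bf10 <= 0:
        raise ValueError("A Bayes Factor must be positive.")
    if bf10 == 1:
        return "no evidence either way"

    # Values below 1 favour H0; invert them so one set of thresholds applies.
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1 / bf10

    for upper, label in ((3, "anecdotal"), (10, "moderate"),
                         (30, "strong"), (100, "very strong")):
        if strength < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


# Bayes Factors reported in the text.
reported = {"Pleasantness": 0.302, "Eventfulness": 0.294,
            "Uneventful": 0.361, "Annoying": 0.377}
for attribute, bf10 in reported.items():
    print(attribute, bf10, describe_bf10(bf10))
```

Under these thresholds a BF10 below 1/3 (about 0.333) counts as moderate evidence for the null hypothesis, which is why 0.302 and 0.294 fall in the moderate category while 0.361 and 0.377 remain anecdotal.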

Discussion
The outcomes of this study show moderate evidence in favour of the conjecture that the Dutch translations of the soundscape attributes met their objective of being employed similarly to the English attributes, indicating that the contextual meaning of the attributes has largely been preserved during the translation process. Although not statistically significant, some differences were found, indicating that the Dutch sample rated the audio files as slightly more pleasant and eventful compared to the English sample, driven by lower ratings on the opposing attributes Annoying and Uneventful. These differences are largest for audio files featuring monotonous and mechanical sounds, which is supported by the outcomes of the Bayesian analysis showing that the translations for the attributes Uneventful and Annoying do not fit as well as the others. We hypothesise that this relates to the translation of the attribute Annoying as "onaangenaam" and "onprettig" (which are closer to "unpleasant" as a more neutral antonym of "pleasant"), rather than the more literal translations "irritant" or "hinderlijk". A categorical principal component analysis included in a study by Weber [23] suggests that the translation "hinderlijk" might indeed be a good candidate. Further research could focus on a more rigorous analysis of the translations of these specific attributes.
At the moment of writing, researchers within the SATP have published five papers on the translation process, each adopting their own language and methodology. While the SATP provided standardised protocols and materials for the listening experiments [14,26], each research group was free to obtain the translations of the soundscape attributes as they saw fit, leading to large differences in methodologies. Some studies describe a rather straightforward approach based on some kind of (expert) panel discussion like our own, such as the papers on the Indonesian [29] and German [30] translations. Other papers critically evaluate such approaches, which could be prone to translation errors and deviations in non-expert settings, and therefore describe more elaborate procedures. For example, the authors of the Malaysian study [31] mention a combination of qualified translators, focus group discussions, in situ evaluations, and quantitative analysis to arrive at the most accurate translations. The paper on the Thai translations [32] presents a quantitative evaluation method to assess the psychometric equivalence between original and translated attributes, or in other words the translation quality. Within this mathematical endeavour, the authors focus on various evaluation criteria such as understandability, clarity and antonymity. Like Gudmundsson [33], they advise that specific translation protocols should be implemented when translating psychometric instruments. Lastly, the Greek team [34] proposes an elaborate cross-cultural adaptation methodology, specifically meant to maintain meaning between both languages. It consists of using bilinguals in a combination of forward and backward translations, synthesis, pre-tests and a committee approach, and is recommended for use prior to listening experiments. In the light of these rigorous methodologies, it may be a fair criticism to question to what extent our expert panels were appropriate for translating the soundscape attributes, since the panel members did not have a professional background in translating or interpreting, nor were they representative of the target audience. Suboptimal selections made by the expert panels in the initial translation process could thus very well have led to the differences found in this study.
Furthermore, considering that the translation process was a joint effort of The Netherlands and Flanders (Belgium) and that it was specifically designed to suit the populations of both regions, we advise including Flemish participants in future listening experiments, as they were not part of the present study. Studies on demographic factors found that factors like gender and age are related to soundscape perception [35,36], but also that factors like social interaction and noise sensitivity influence the way people perceive their surroundings [37,38]. As our sample may have been rather homogeneous, it would also be advisable to continue these listening experiments with a more heterogeneous sample.

Conclusion
The purpose of this study was to translate eight attributes to describe soundscapes from English to Dutch and to ascertain the validity of these translations, as part of a large international effort to establish a common language within the field of soundscape research. After comparison of the data between the Dutch sample and the English sample, results show moderate evidence indicating that the Dutch translations were used similarly to the original English attributes when rating 27 audio files. These findings imply that the contextual meaning of the attributes was largely preserved during the translation process. Despite some limitations, and while further research is necessary (specifically for the attributes Uneventful and Annoying), our findings are encouraging. They suggest that, although not perfect, the Dutch translations of the English soundscape attributes could already be useful for describing the general appraisal of a person's soundscape in The Netherlands.

Fig. 1. Circumplex Model of Soundscape Attributes, including the English and Dutch terminology.

Fig. 3. Average Attribute Ratings for Each Audio File. Note. Average ratings on the eight attributes for each of the 27 audio files, as a function of language (Dutch in blue, English in red). The attributes are labelled only for the last radar plot, showing the mean overlap of all audio files combined.

Fig. 4. Average Dutch and English Ratings of Soundscapes in Comparison. Note. Ratings of the 27 audio files plotted onto the Eventfulness and Pleasantness attributes. Each dot represents the mean rating of an audio file (Dutch in blue, English in red). Crosses represent means across the 27 audio files.

Fig. 5. Average Differences Between Dutch and English Ratings of Soundscapes. Note. The data of the Dutch sample were subtracted from the English sample. The cross represents the mean difference across all audio files and indicates the average tendency in rating difference on the Eventfulness and Pleasantness attributes. Purple markers indicate audio files where the difference score resulted in a change in quadrant in the circumplex.

Table 1
Dutch Translations of the English Soundscape Attributes. Table includes the initial proposals of the two expert groups. Overlap between the groups is indicated in bold.

Table 2
Outcomes of Independent T-Test between the Dutch and English samples using a Bayesian framework on each Attribute.
K.A. van den Bosch et al.

Table 3
Mean and Standard Deviations for the Ratings in Dutch on each Attribute for each Audio File.