Mandarin Chinese translation of the ISO-12913 soundscape attributes to investigate the mechanism of soundscape perception in urban open spaces

This study is part of a global collaboration to translate the soundscape attributes proposed in ISO/TS 12913-2:2018, which aims to standardise soundscape research globally. Cross-referencing results from two independent expert panels produced a set of eight soundscape attributes that were subsequently experimentally verified, forming a reliable questionnaire for soundscape characterisation in Mandarin Chinese. Employing the as-developed questionnaire, ex situ auditory-only soundscape perception experiments were carried out with 27 soundscape recordings from urban open spaces in the United Kingdom. The soundscape perception scale was used to evaluate participants’ experiences, which were then projected into two dimensions of soundscape perception, pleasantness and eventfulness, following protocols from ISO 12913-3:2019. Physical and psychoacoustic parameters, as well as the characteristics of the sound sources, were extracted from the recordings. These parameters were used together to describe the soundscape characteristics of the recordings. Principal component analysis revealed that, when individuals are exposed to urban open spaces, the salient sound source becomes the foreground focus of attention and informs their perception of the soundscape. Beyond this, perception is further stimulated by the acoustic characteristics of the soundscape. Regression analysis investigated factors for pleasantness and eventfulness. For pleasantness, the overall S_95 had a significant negative effect while birdsong was beneficial. With regard to eventfulness, mechanical sound had a detrimental impact, while the number of salient sound source types and the overall F_50 had a positive impact. Furthermore, this study found that certain types of sound sources make the sound more recognisable as a foreground sound, thereby stimulating perception, while others may be ignored as background sounds but still contribute to the perception through their acoustic characteristics.


Introduction
There is growing interest in research on the ability of urban open spaces to enhance visitors' experience and promote general public health while ensuring equal access for urban residents. Soundscape is defined as the "acoustic environment as perceived or experienced and/or understood by a person or people, in context" [1]. An acoustic environment is described by the sounds in it and their physical characteristics. Extensive research has been conducted on the feelings that the soundscape of urban open spaces can impart to individuals, the dimensions in which these feelings cluster, and the factors that affect these dimensions. In a circular model proposed by Axelsson et al., soundscape has two perceptual dimensions, with pleasantness on the X axis and eventfulness on the Y axis [5]. ISO/TS 12913 formally adopted this model, defined pleasantness and eventfulness [6], and then gave their calculation methods from the eight soundscape attributes, namely pleasant, annoying, eventful, uneventful, calm, chaotic, vibrant, and monotonous [7].
Additionally, a pressing issue impeding the global acceptance of this model, and thereby the standardisation of soundscape perception studies, is the accurate translation of the eight soundscape attributes into various languages. In response, an international collaboration was launched, the "Soundscape Attribute Translation Project" (SATP), which brought together soundscape researchers from all over the world fluent in 19 languages [8,9]. Accordingly, as the contribution of the Mandarin Chinese working group of the project, this work starts with the aim of producing a verified translation for the benefit of the Mandarin Chinese soundscape research community, before proceeding to a more detailed perception study with the translated tool.

Soundscape and auditory streams
With the development of cities, the auditory environment in urban open spaces is becoming increasingly complex and non-directing. The sounds that individuals notice are not solely determined by the composition of the auditory environment, but are also influenced by factors such as their attention, current activity, expectations, and prior knowledge of sounds [2]. Hearing is a complex process that involves multiple levels of attention and higher-level cognitive functions [10,5]. Gestalt psychologists believe that there are certain organisational rules for perception. Initially, these rules of perceptual organisation were used to explain visual phenomena, but they have gradually been shown to play a role in the auditory field as well [11]. They argue that when people listen to complex sounds, they use physical cues from the environment to group the sounds into multiple auditory streams. These auditory streams can be categorised into an attention stream and a non-attention stream, and this phenomenon is called the "foreground-background phenomenon" [12]. In non-directing environments, where individuals are immersed in multiple sounds, their physical and psychological systems selectively focus on specific sounds [2,13]. These sounds become the main source of stimulation in soundscape perception and possess certain soundscape traits. The formation of an auditory stream restricts the individual's attention, which, conversely, also influences the formation of the auditory stream. At the same time, this bidirectional influence between sound and attention is itself influenced by physical environmental cues [14].
For the intricate and diverse soundscapes of urban open spaces, might there be certain traits that can be named, characterised, or quantified which serve as cues for triggering soundscape perceptions? Identification of such traits could help elucidate why people focus their auditory attention on particular elements within the complex composition of soundscapes in urban open spaces, leading to the "foreground-background" phenomenon of soundscapes. Additionally, it is vital to identify appropriate and concise multivariate acoustic indices that encompass the complex acoustic factors influencing soundscape perception [15]. Potential acoustic parameters are often interrelated and convey overlapping information [16]. For example, the degree to which a certain sound is perceived depends not only on its inherent characteristics but also on the presence and characteristics of other sounds, due to auditory masking [17]. The overall acoustic characteristics of the environment, depicted by physical acoustic parameters and psychoacoustic parameters, are predominantly determined by the compositional relationships and characteristics of the sound sources [18]. By making use of standardised measurement tools and experiments, finding a suitable suite of indices to characterise the acoustic factors affecting urban open space soundscape perception is an effective approach to soundscape assessment, prediction, and optimisation [19].

Sound sources and their influence on soundscape perception
Sound sources and the information they contain play a crucial role in the perception of the soundscape [20]. In studies of soundscape perception in urban open spaces, when researchers add questions about sound source identification to the questionnaire, the participants' perception evaluation results may become more regular because they have considered the information contained in the sound sources [21]. Sound sources are usually classified as negative, positive, or neutral according to the emotional experience they bring to people. For example, natural sounds such as birdsong [22][23][24] and water sounds [25,26,17] are considered positive components of the soundscape. In contrast, mechanical sounds such as traffic sounds [23,[27][28][29] and construction sounds [30] are usually considered negative sounds. The sounds of people talking and children playing are generally positive components of pleasantness [24] but neutral in terms of acoustic comfort [31]. The same type of sound may also have different degrees of positive or negative impact on people, which may be due to differences in the acoustic performance of particular sound sources. For example, among water sounds, splashing water can be more pleasant than purling water due to its higher loudness [17]. Among different types of construction machinery on a construction site, although electric drills and jackhammers have similar levels of loudness, the sounds of electric drills could be more annoying due to their higher sharpness, which makes them harder to bear [30].
In addition to the type attribute of the sound source, its presence characteristics also have a non-negligible impact on people's perception. The saliencies of sound sources [32,33] and temporal existence [33][34][35] are usually considered, and are measured primarily by the assessment of the researcher [32] or by questionnaires [33,35]. Some researchers have also paid attention to the influence of the distance between individuals and sound sources [33]. It is worth noting that the parameters that characterise the presence characteristics of sound sources usually come from the subjective judgements of researchers or participants, and people tend to misunderstand the meanings of different features when making such judgements. For example, subjective judgements such as sound source saliency are likely to be influenced by the duration of the sound or the proximity of the listener to the sound source. In their questionnaire, Lavandier et al. asked relevant questions about the three aforementioned presence characteristics and explained the meaning of each characteristic. With this prompting, the independence of the participants' evaluation of the presence characteristics of different sound sources was ensured [33].
Furthermore, the physical acoustic properties of the environment are important objective factors in perception. In the early days, the sound pressure level was the central parameter for evaluating the urban acoustic environment [36]. L_Aeq was commonly used to evaluate the acoustic environment of urban streets [37]. Its 95th percentile exceedance level L_95 was interpreted as the background sound pressure level, which has also been shown to be an important parameter to measure the quality of the acoustic environment of urban open spaces [38]. However, as research advanced and the acoustic environment became increasingly complex with urban development, sound pressure level alone was no longer sufficient to fully describe the impact of urban acoustic environments on residents [39]. Therefore, researchers tried to incorporate additional psychoacoustic parameters into soundscape perception models.
Psychoacoustic parameters were originally developed to describe steady-state sounds and are often used to evaluate the sound quality of a specific sound source. For the complex and dynamic sounds in urban areas, whether psychoacoustic parameters can adequately describe and predict soundscape perception has become a new research focus.
Raimbault et al. used these parameters, such as sharpness and roughness, to describe the most typical sounds in cities [40]. It was found that both human activity sounds and birdsong have a high degree of fluctuation. At similar loudness, birdsong is easier to identify from urban sounds due to its higher sharpness. Traffic sounds, water sounds, and wind sounds usually have similar degrees of fluctuation. These findings imply that while it is difficult to identify sounds in cities using a single psychoacoustic parameter, it could be achieved using a combination of different psychoacoustic parameters, including fluctuation strength, loudness, and sharpness [41]. Hong et al. observed that a radial basis function (RBF) model trained on ambient sounds that were louder and had higher sharpness could more consistently predict people's perception results [42]. Some studies have also attempted to describe the acoustic characteristics of the environment using percentile exceedance levels of psychoacoustic parameters. It was found that the traffic sound of a street with faster vehicles has a higher 95th percentile exceedance level of sharpness (S_95) [43] than that of a street with slower vehicles, and that the sound of people talking increases the 50th percentile exceedance level of fluctuation strength (F_50) of the overall environment. This suggests that the time-domain percentile exceedance levels of psychoacoustic parameters can describe the fluctuating urban acoustic environment more accurately.
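To make the notion of a percentile exceedance level concrete, the following minimal Python sketch computes the value that is exceeded for a given percentage of the time frames of a parameter series. The function name, the linear interpolation between order statistics, and the sample values are assumptions made for illustration; commercial analysis software may use a different percentile convention.

```python
def exceedance_level(samples, percent):
    """Return the value exceeded for `percent` % of the samples.

    The n % exceedance level equals the (100 - n)th percentile of the
    series. Linear interpolation between order statistics is an
    assumption made in this sketch.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in [0, 100]")
    ordered = sorted(samples)
    position = (100 - percent) / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical short-term levels in dB(A), one value per time frame.
levels = [52.0, 55.0, 61.0, 58.0, 54.0, 70.0, 53.0, 56.0, 57.0, 59.0, 60.0]
l5 = exceedance_level(levels, 5)    # level exceeded 5 % of the time (peaks)
l50 = exceedance_level(levels, 50)  # median level
l95 = exceedance_level(levels, 95)  # level exceeded 95 % of the time (background)
```

The same function applies unchanged to a sharpness or fluctuation strength series, which is how quantities such as S_95 or F_50 can be read.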

Perceptual dimensions of the soundscape
Since the publication of ISO/TS 12913-2:2018, researchers have used this model to explore the factors influencing these two important perceptual dimensions. Some researchers have also tried to establish a soundscape perception model targeting these two dimensions, with the aim of finding a set of soundscape description indices suitable for urban open spaces, which could be used to predict soundscape perception. For example, some researchers have established a vibrancy descriptor prediction model based on aural and visual parameters, which showed a statistically significant correlation [44] with eventfulness. Other studies used a series of acoustic indices suitable for describing forest soundscapes (such as the Acoustic Complexity Index, ACI) to predict pleasantness and eventfulness while considering temporal variations [45].
Regarding the effect of various sound events on the two dimensions of soundscape perception, natural sounds such as water sound [26,17] and birdsong [46][47][48] have been shown to have a significant positive impact on pleasantness, whereas birdsong [46] and human activity sound [5] have a greater correlation with eventfulness. Researchers also found that sound sources in different scenes may have different effects on people's perception. On streets, traffic sounds can promote eventfulness [49,50], while in residential areas, the dominant impact of traffic sounds is the reduction in pleasantness [17]. This shows that the function of the setting can allow the same sound event to have different effects on the perception of the soundscape.
In most of the studies where soundscape perception models were constructed, physical acoustic parameters and psychoacoustic parameters were widely used [51]. To enhance the predictive ability of the model and explore more comprehensive factors affecting soundscape perception, dynamic temporal parameters of sound sources [33], overall perception parameters of the acoustic environment from subjective evaluation (such as the overall loudness of the environment) [35], visual information [52], contextual information [32], individual information [49], and other factors were introduced into the models as variables. When constructing a perception model for the soundscape at the auditory level, it is likewise worth exploring factors more comprehensively and incorporating them into the model.

Aim and structure of this work
In this study, we hope to achieve the following aims:
1. Translate the soundscape attributes into Mandarin Chinese.
2. Identify the traits of the soundscape that stimulate feelings of pleasantness and eventfulness.
3. Identify the factors that affect the extent of these feelings.
Herein, we present this composite study. First, the soundscape attributes were translated into Mandarin Chinese and verified, achieving aim 1. Next, an ex situ experiment was conducted in which participants were given the opportunity to perceive, experience, and evaluate binaural soundscape recordings from urban open spaces. From the recordings, the physical and psychoacoustic parameters, their percentile exceedance levels, and parameters about sound sources were extracted. These parameters were subjected to principal component analysis to achieve aim 2. Subsequently, in Section 3.3, with pleasantness and eventfulness as the main dimensions of soundscape perception, potential influencing factors related to soundscape perception were identified based on correlation and regression analysis to achieve aim 3.

Translation of the eight soundscape attributes
As the soundscape perception scale asks about the subjects' experience of the eight soundscape attributes in ISO/TS 12913-2:2018 [6], translating the eight attributes amounts to translating the scale. Two expert panels composed of bilingual researchers with more than five years of experience in soundscape research were set up at two separate institutions (Shenyang Jianzhu University and Chongqing University). First, each panel conducted focus group discussions to form a set of potential candidates by considering the parallelism with the corresponding English terms. Following this, the results of the two panels were cross-compared and discussed to create a final set, ensuring broad applicability and accuracy.
To verify the efficacy of the translation, a listening experiment was conducted. Following auditory exposure, participants were asked to evaluate the eight attributes. The results of these tests were used to perform an internal consistency test. Then, the eight attributes were further simplified to two perceptual dimensions of pleasantness and eventfulness using the method given in ISO 12913-3 [7].
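As an illustration of this reduction, the sketch below projects one set of eight attribute ratings onto the pleasantness and eventfulness axes using the trigonometric weighting given in ISO 12913-3, in which the diagonal attribute pairs contribute with a factor of cos 45°. The function name, the dictionary keys, and the normalisation by scale_range × (1 + √2), which bounds both coordinates to [-1, 1] for a 0-100 slider, are assumptions of this sketch and not necessarily the exact standardisation applied in this study.

```python
import math

COS45 = math.cos(math.radians(45))


def iso_coordinates(ratings, scale_range=100.0):
    """Project eight attribute ratings onto (ISO-P, ISO-E).

    `ratings` maps attribute names to slider values. Dividing by
    scale_range * (1 + sqrt(2)) bounds both coordinates to [-1, 1];
    this normalisation is an assumption for a 0-100 slider.
    """
    r = ratings
    pleasantness = (
        (r["pleasant"] - r["annoying"])
        + COS45 * (r["calm"] - r["chaotic"])
        + COS45 * (r["vibrant"] - r["monotonous"])
    )
    eventfulness = (
        (r["eventful"] - r["uneventful"])
        + COS45 * (r["chaotic"] - r["calm"])
        + COS45 * (r["vibrant"] - r["monotonous"])
    )
    norm = scale_range * (1 + math.sqrt(2))
    return pleasantness / norm, eventfulness / norm


# Hypothetical rating: maximally pleasant, neutral on the eventful pair.
example = {
    "pleasant": 100, "annoying": 0, "calm": 100, "chaotic": 0,
    "vibrant": 100, "monotonous": 0, "eventful": 50, "uneventful": 50,
}
iso_p, iso_e = iso_coordinates(example)
```

A participant who leaves every slider at its default of 50 maps to the origin, since every attribute pair then cancels.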

Soundscape recordings
The stimuli for the experiment were provided by collaborators at University College London and consisted of 27 binaural recordings made in London during summer and autumn 2019 [53]. Each recording lasted 30 seconds, and sounds such as traffic, construction, high-pressure water spraying, streams, birdsong, conversations, footsteps, children playing, and street performances can be identified in them. These recordings included typical sounds that may appear in most urban open spaces, representing a variety of different acoustic environments. Sounds from the recordings were categorised as human sounds (conversations, footsteps, and sounds from human activities), bird sounds, music sounds (performing music), water sounds (streams, fountains, irrigation sounds), and mechanical sounds (traffic sounds, construction sounds).

Participants
In total, 68 participants were recruited from Shenyang Jianzhu University after passing three preliminary screening criteria: (1) staff or students aged 18-55; (2) no hearing impairment; (3) no other significant health problems. Participants were administered the 21-item Weinstein Noise Sensitivity Scale prior to the experiment. The basic demographic information and noise sensitivity data for the participants are shown in Table 1. All procedures complied with relevant ethical regulations and were approved by the Ethics Committee at the School of Architecture and Urban Planning, Shenyang Jianzhu University (No. 20200901). Informed consent was obtained from all participants.

Experimental design
The experiment was carried out in a semi-anechoic chamber, as shown in Fig. 1. A within-subjects design was adopted. A Lenovo 310 laptop was used for controlling the playback of the recordings. An external sound card (Headphone Console 6S) was used with a headphone output (Sennheiser HD650). An iPad was provided to fill out the scale. To recreate the acoustic environment in the soundscape recordings, a multimeter (FLUKE15B+ Digital) and a calibration signal (a recording of a 1 kHz sinusoidal signal at 94 dB SPL) were used for calibration. In addition, during the experiment, we used a sound level meter (Optimus cr:160) to monitor the background noise of the semi-anechoic chamber, which was around 15.5 dB(A), low enough to ensure that the experiment was not disturbed.
Before starting the actual experiment, a practice trial was offered to every participant. The recording for the practice trial was not among the 27 soundscape recordings used in the formal experiment. At the beginning of an experiment, the main instructor randomly ordered the 27 soundscape recordings for the participant, who then listened to each recording and evaluated it. There was a gap of at least 30 seconds between consecutive recordings, and each participant was able to relisten to any recording at their discretion. All questions on the scale were phrased as suggested by ISO 12913-2 [6]: "To what extent do you agree or disagree with the following statements describing the acoustic environment at the moment?" Participants moved a slider on a scale from 0 ("strongly disagree") to 100 ("strongly agree") to indicate their rating, with a default position of 50 (neutral) when no rating was given.

Data extraction and initial processing
Here, parameters of two classes were extracted to characterise the soundscape recordings. Acoustic parameters, which include physical and psychoacoustic parameters, were calculated using established methods, while sound source characteristic parameters were extracted through subjective evaluation by researchers.
Referring to the guideline for extracting acoustic parameters of binaural audio in ISO 12913-3 [7], this study calculated the following acoustic parameters: (1) the physical acoustic parameter L_Aeq (dB(A)) and its percentile exceedance levels (L_5, L_95); (2) the psychoacoustic parameters loudness N (sone), sharpness S (acum), fluctuation strength F (vacil), and roughness R (asper), together with their percentile exceedance levels N_5, N_95, S_5, S_95, R_10, R_50, F_10, and F_50. These parameters were calculated with the BK Connect software. Loudness was calculated according to the method in ISO 532-1 [54]. Sharpness values were calculated according to the method in DIN 45692 [55]. Roughness and fluctuation strength values were calculated according to the method developed by Zwicker and Aurés [16].
Some parameters for the characteristics of the sound sources were extracted with a jury test, similar to that used by Hong et al. [32]. They include the saliencies of the five sound source types (bird sounds, water sounds, human sounds, music sounds, and mechanical sounds), the number of types of sound sources present, and the number of types of salient sound sources. The jury consisted of four graduate students (two of each gender). Each member was asked to respond to the following prompt about the five types of sound sources: "Please rate the saliencies of the five sound sources based on the audio you hear". The answer options were set on a scale of 0-3 (0 = do not hear at all, 1 = hear a little but not dominant, 2 = dominates moderately, 3 = dominates completely). A consistency test was carried out on the evaluation results of the four jury members. Kendall's coefficient of concordance W exceeded 0.8 for human sounds, birdsong, and music sounds (0.886, 0.806, and 0.891, respectively; all p < 0.001). The W values for mechanical sounds and water sounds were 0.784 and 0.738, respectively (both p < 0.001). This showed that the evaluation results of the four jury members were consistent. The mean value of the evaluation results of the four jury members was then calculated to represent the saliency of each sound source type. All the parameters extracted so far are summarised in Table 2.
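The jury consistency check can be sketched as follows. This is a minimal pure-Python illustration of Kendall's coefficient of concordance W with the usual correction for tied ranks, which matters here because a 0-3 scale produces many ties. The function names and the ratings are hypothetical, and the significance test is omitted.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) who each rate the same n items."""
    m = len(ratings)
    n = len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over every group of tied values.
    ties = 0
    for row in ratings:
        counts = {}
        for value in row:
            counts[value] = counts.get(value, 0) + 1
        ties += sum(c ** 3 - c for c in counts.values())
    denominator = m ** 2 * (n ** 3 - n) - m * ties
    if denominator == 0:
        raise ValueError("W is undefined when every rater gives one value")
    return 12 * s / denominator


# Hypothetical saliency ratings (0-3) from four jurors for six recordings.
jury = [
    [0, 1, 3, 2, 0, 1],
    [0, 1, 3, 2, 1, 1],
    [0, 2, 3, 2, 0, 1],
    [1, 1, 3, 2, 0, 0],
]
w_example = kendalls_w(jury)
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings), so the values of 0.738 to 0.891 reported above sit toward the upper end of that range.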

Translation of soundscape attributes
The eight attributes and their Mandarin Chinese translations are summarised in Table 3. Due to the inherent difficulty in expressing precise meanings arising from cultural variations, certain soundscape attributes were conveyed using multiple Mandarin Chinese adjectives, as was also done in the translations into Portuguese [56], Bahasa Melayu [57], Spanish and French [58].
The intra-class consistency of the translation results was confirmed in two steps. The first step involved evaluating the translations for bias by examining the corresponding relationships of the descriptors. The average scores of the eight attributes for each of the 68 participants were calculated. Subsequently, an ICC (intraclass correlation coefficient) consistency test was conducted on each pair of attribute descriptors, after reversing the signs of the negative attributes (a, ch, m, u). The results, presented in Table 4, revealed that all pairs exhibited ICC values greater than 0.9, indicating a very high level of consistency. These findings suggest that the translations successfully preserved the original semantic matching relationship of the eight soundscape attributes.
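The pairwise check can be illustrated with a short sketch. The text does not state which ICC form was used, so the code below assumes the two-way, consistency, single-measure form, often written ICC(3,1), computed from the mean squares of a two-way ANOVA without replication. The function name and the scores are hypothetical. Because the consistency form ignores constant offsets between columns, negating the negative attribute is sufficient to align the two descriptors of a pair.

```python
def icc_consistency(table):
    """ICC(3,1): two-way, consistency, single measure (assumed form).

    `table` is a list of rows (targets), each holding one score per
    column (here, the two descriptors of an attribute pair).
    """
    n = len(table)
    k = len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_cols = n * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical mean scores per participant on a 0-100 slider.
pleasant = [70.0, 55.0, 40.0, 62.0, 48.0]
annoying = [28.0, 47.0, 61.0, 35.0, 50.0]
# Reverse the sign of the negative attribute before pairing.
pairs = [[p, -a] for p, a in zip(pleasant, annoying)]
icc_pa = icc_consistency(pairs)
```

An ICC close to 1 for a pair indicates that participants who rated one descriptor high rated its semantic opposite correspondingly low.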

Table 3
The eight attributes tested in the scale for soundscape perception and their Mandarin Chinese translation (in which the scale in the experiment was presented).

In the second step, the accuracy of the individual participants' understanding of the translated attributes was assessed. Using the same method, the evaluation scores of the 68 participants for the 27 audio recordings were subjected to an ICC intra-group consistency test. After eliminating four sets of missing values, the final sample size was N = 1832. The results, shown in Table 4, indicated that the ICC value for the p / a pair exceeded 0.6, while the ICC values for the other three pairs of attributes closely approached 0.6. These findings demonstrate a high level of consistency, suggesting that the Mandarin Chinese translation maintained the original attribute correspondences in terms of individual participants' understanding.
After standardisation [59], ISO-P and ISO-E were plotted in a scatter plot to observe their distribution (Fig. 2). Looking at either dimension individually, ISO-P and ISO-E both approximately follow a normal distribution. ISO-P spanned almost the entire space (95% CI: -1.0 to 0.9). Despite having a peak close to 0, it is skewed towards the negative side. On the other hand, ISO-E has a narrower distribution (95% CI: -0.5 to 0.9) with a peak around 0.25. Overall, the soundscape perception appeared to have relatively high eventfulness.
Looking at the two dimensions together, the majority of samples have relatively high ISO-E and are distributed in the 1st and 2nd quadrants (with high and low ISO-P respectively). Among these, the overall ISO-E of the samples in the 2nd quadrant was significantly lower than that of the samples in the 1st quadrant. The 3rd quadrant contains samples with low ISO-P and ISO-E. The 4th quadrant also has low ISO-E, but high ISO-P. The very low number of samples in this quadrant means it is difficult for a soundscape with low eventfulness to have a high pleasantness, at least for the 27 soundscapes used for this study.

Traits from sound sources
To explore the soundscape features that play a decisive role in the stage of people's perception, principal component analysis was carried out to classify the soundscape recordings using ISO-P and ISO-E values from the 68 participants as the descriptors, respectively, yielding two sets of results (Table 5). Varimax rotation was applied to extract orthogonal factors. For ISO-P, the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was 0.646 > 0.6, and Bartlett's test of sphericity was also significant (χ² = 749.97, p < 0.001), which indicated that the data set was appropriate for principal component analysis (PCA). Eight components with eigenvalues greater than 1 were obtained. For ISO-E, the KMO measure (0.707 > 0.6) and Bartlett's test of sphericity (χ² = 890.97, p < 0.001) again both supported the use of PCA. Six components with eigenvalues greater than 1 were extracted. Table 5 reveals that the PCA conducted using ISO-P or ISO-E scores yielded different clustering results. This discrepancy indicates that the participants made their judgements about the attributes related to the two dimensions based on different criteria. Specifically, the traits that triggered attention and shaped perceptions varied.
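For readers unfamiliar with the suitability check, the sketch below computes the statistic of Bartlett's test of sphericity from a correlation matrix R of p variables observed n times, using χ² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom. The function names and the toy matrix are assumptions for illustration; the p-value lookup against the chi-square distribution is left out to keep the example within the standard library.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, degrees_of_freedom) for a correlation matrix."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    return chi_square, dof


# Hypothetical 2 x 2 correlation matrix from 27 observations.
chi2, dof = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_obs=27)
```

A large statistic relative to its degrees of freedom indicates that the variables are sufficiently intercorrelated for component extraction to be meaningful, whereas an identity correlation matrix gives a statistic of zero.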
By examining the sound sources present in each recording, we identified that the soundscapes within each cluster exhibited similar salient sound source types (Table 5), which could also be called the salient sound source label. This suggests that when participants evaluated traits of the soundscape, the information pertaining to the type of sound source contained cues that triggered attention and subsequently influenced perception.
An analysis of these salient sound source types highlights the prevalence and significant presence of mechanical and human sounds in all recordings. People's activities in urban open spaces created human sounds, including conversations, footsteps, and laughter, and mechanical sounds such as traffic and construction noises. Music sounds, while also associated with human activities, form a distinct category due to the specific nature of their performance. Birdsong, although relatively uncommon, represents an element of the natural soundscape in urban open spaces that is easily discernible. Water sounds can be attributed to both rain and water features such as fountains, and in some recordings, they can be challenging to differentiate from mechanical sounds.
It is worth noting that in the clustering according to both dimensions, different clusters sometimes share the same set of salient sound source types (clusters 1 and 8 according to ISO-P, as well as clusters 2 and 5 according to ISO-E). This implies that additional cues, beyond those accounted for by the salient sound source types, might influence attention and perception.

Traits from acoustic characteristics
To further explore the traits that stimulated perception, cluster analysis was conducted according to ISO-P and ISO-E to yield the mean value and coefficient of variation (CV) of the acoustic parameters of the soundscape recordings in each cluster. These were contrasted with those of all soundscape recordings, as shown in Fig. 3 and Fig. 4. The parameters included in this analysis were those related to SPL (L_Aeq, L_5, L_95), loudness (N, N_5, N_95), sharpness (S, S_5, S_95), roughness (R_10, R_50), and fluctuation strength (F_10, F_50).
Across all 27 urban open space recordings, all of the acoustic parameters mentioned above except S_95, R_10, and R_50 (CV < 0.2) displayed a certain degree of variation (CV ≥ 0.2). It can be seen that for urban open spaces, the complex sound source composition makes the acoustic parameters behave rather differently. In contrast, the clusters obtained according to ISO-P and ISO-E displayed considerable convergence. That is to say, soundscapes that were perceived similarly also showed relatively similar physical acoustic and psychoacoustic characteristics.
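The coefficient of variation used in this comparison is the standard deviation divided by the mean. The sketch below computes it and applies the 0.2 threshold quoted above; the use of the sample (n - 1) standard deviation, the function name, and the parameter values are assumptions made for illustration.

```python
import math


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / abs(mean)


# Hypothetical per-recording values for two parameters.
parameters = {
    "N (sone)": [10.0, 20.0, 30.0],
    "S_95 (acum)": [1.0, 1.1, 0.9],
}
variable = {
    name: coefficient_of_variation(values) >= 0.2
    for name, values in parameters.items()
}
```

Computing the same quantity within each cluster and comparing it against the value for all recordings shows whether a cluster is acoustically more homogeneous than the full set.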
Upon further inspection, it can be seen that for the clusters according to ISO-P, clusters 1 and 8, which exhibited the same salient sound sources, showed significant differences in the parameters for SPL, loudness, and fluctuation strength. Similarly, for clusters 2 and 5 according to ISO-E, which also exhibited the same salient sound sources, significant differences in the parameters for SPL, loudness, and fluctuation strength could also be seen. Hence, even when the salient sound sources remain unchanged, variations in the sound energy and temporal features can elicit differences in the soundscape perception.
Meanwhile, in clusters with different salient sound sources, similar acoustic characteristics can be found. For example, for clusters 2, 3 and 6 according to ISO-E, all acoustic parameters were relatively similar. They belonged to different clusters because human sounds were mixed with different sounds (Table 5). This further indicated that the dominant sounds acted as important cues in soundscape perception.

Modelling the multi-factor mechanism of soundscape perception

Factors for soundscape perception
To identify which soundscape factors were associated with the perception of pleasantness and eventfulness, we performed a Pearson correlation analysis (Table 6). In addition to the basic parameters set out in Table 2, two parameters from secondary calculations (N_5/N_95 and L_5 - L_95) were included. N_5/N_95 was suggested as potentially useful by ISO 12913-3 [7], and L_5 - L_95 characterises the degree of variability of SPL, which was also shown to be related to soundscape perception [60][61][62].
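A minimal sketch of this step is given below: the two secondary parameters are derived per recording and then correlated with a perceptual score using Pearson's r. All numerical values and variable names are hypothetical, and the significance test that accompanies each coefficient in Table 6 is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-recording values.
l5 = [68.0, 72.5, 60.1, 75.3, 64.8]    # dB(A)
l95 = [55.2, 61.0, 52.3, 58.9, 57.5]   # dB(A)
n5 = [22.0, 30.5, 14.2, 35.1, 18.9]    # sone
n95 = [9.5, 14.8, 8.1, 12.0, 11.3]     # sone
iso_p = [0.35, -0.20, 0.50, -0.45, 0.10]

level_variability = [a - b for a, b in zip(l5, l95)]  # L_5 - L_95
loudness_ratio = [a / b for a, b in zip(n5, n95)]     # N_5 / N_95

r_variability = pearson_r(level_variability, iso_p)
r_ratio = pearson_r(loudness_ratio, iso_p)
```

Both derived quantities describe how much the sound fluctuates between its peaks and its background, one as a level difference in dB and the other as a loudness ratio.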
ISO-P and ISO-E showed correlations of varing degrees with the saliencies of various sounds.Both showed significant negative correlation with the negative sound S MS , with the former having a higher correlation coefficient (r = -0.693,p = 0.000 and r = -0.444,p = 0.020).ISO-P was significantly positively correlated with all positive sound sources, with the strongest correlation with S BS (r = 0.549, p = 0.005) and similar correlations with S HS (r = 0.475, p = 0.012) and S MuS (r = 0.471, p = 0.013).Among positive sounds, ISO-E was positively correlated with S HS (r = 0.626, p = 0.000) and S MuS (r = 0.623, p = 0.001).
The uncorrelated results shown by S WS may be due to the diversity of water sounds in the soundscape recording.From these recordings, the water sounds heard by the participants were not completely from nature, but might also include the sound of high-pressure water spray, fountain, and irrigation.These sounds could bring different feelings and were not always pleasant to the participants.
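The significance markers in the correlation analysis can be approximated with a short sketch. It assumes that each correlation is computed over the 27 recordings (so 25 degrees of freedom), and it uses critical values from standard two-tailed t-tables rather than exact p-values. The helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long series with at least 3 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Two-tailed critical t values for 25 degrees of freedom (standard tables).
T_CRIT_DF25 = {0.05: 2.060, 0.01: 2.787}


def significance_stars(r, n=27):
    """Return '**', '*' or '' following the convention of Table 6.

    Assumption: n = 27 recordings, hence the df = 25 critical values.
    """
    if n != 27:
        raise ValueError("critical values are only tabulated here for n = 27")
    t = abs(t_statistic(r, n))
    if t >= T_CRIT_DF25[0.01]:
        return "**"
    if t >= T_CRIT_DF25[0.05]:
        return "*"
    return ""


print(significance_stars(-0.693))  # reported correlation of S MS with ISO-P
print(significance_stars(-0.444))  # reported correlation of S MS with ISO-E
```

For exact p-values, a statistics package with a t-distribution implementation would be needed; the critical-value lookup is only a stand-in.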

Table 6
Pearson's correlation results between the two perception dimensions and the parameters for sound characteristics. ** and * indicate correlation is significant at the 0.01 level and 0.05 level (2-tailed), respectively.

Influencing mechanism of soundscape perception
As shown in Table 7, this section established two regression models to explore the concurrent influence of multiple factors on soundscape perception. Through stepwise regression, the significant factors obtained from the correlation analysis were input into the corresponding models for ISO-P and ISO-E. Since there was strong collinearity between the SPL-related parameters and the loudness-related parameters, one of them was to be selected to be included in the model. While the correlations between the two sets of parameters and ISO-P were both generally significant, ISO-E was significantly correlated with L 5 -L 95 but not with any loudness-related parameters. Therefore, the SPL-related parameters were chosen to be included in the regression model so that richer information could be considered.

Table 7
Regression models for the two perceptual dimensions ISO-P and ISO-E with stepwise regression with input independent variables identified from prior correlation analyses.

Inspecting the results of the two regression models, it was found that soundscape perception was affected by both sound source characteristics and acoustic parameters. The perception of both pleasantness and eventfulness was subject to the significant negative influence of negative sound sources. In particular, S MS had a significant negative impact on ISO-P. On the other hand, the positive sound source of birdsong had a significant positive impact on the perception of pleasantness.
The percentile exceedance values of psychoacoustic parameters appeared in both models. S 95 negatively impacted ISO-P. Both L 5 -L 95, which represented the variability of sound energy, and F 50, which represented temporal characteristics, stimulated people's perception of eventfulness.
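To make the idea of entering candidate predictors step by step more concrete, the following is a simplified sketch of forward selection with ordinary least squares. It is not the authors' procedure: the entry criterion here is a gain in adjusted R², which is swapped in for the significance-based criteria that statistical packages typically use, and all data and predictor names are synthetic and hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (perfect collinearity)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(columns, y):
    """Fit y on the given predictor columns; return (coefficients, adjusted R^2).

    The first coefficient is the intercept.
    """
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return beta, 1.0 - (1.0 - r2) * (n - 1) / (n - k)


def forward_select(predictors, y, min_gain=0.01):
    """Add the predictor giving the largest adjusted R^2 gain until gains stall."""
    selected, best_adj = [], 0.0
    remaining = dict(predictors)
    while remaining:
        scores = {
            name: fit_ols([predictors[s] for s in selected] + [col], y)[1]
            for name, col in remaining.items()
        }
        name = max(scores, key=scores.get)
        if scores[name] - best_adj < min_gain:
            break
        selected.append(name)
        best_adj = scores[name]
        del remaining[name]
    return selected, best_adj


# Synthetic, hypothetical data: the response depends on two of three predictors.
predictors = {
    "mechanical_saliency": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "birdsong_saliency": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "unrelated": [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
}
pleasantness = [
    3.0 - 1.0 * m + 2.0 * b
    for m, b in zip(predictors["mechanical_saliency"], predictors["birdsong_saliency"])
]

selected, adj_r2 = forward_select(predictors, pleasantness)
coefficients, _ = fit_ols([predictors[name] for name in selected], pleasantness)
print(selected, coefficients)
```

The singular-matrix check in `solve_linear_system` shows why only one of two strongly collinear parameter families should enter the model: perfectly collinear columns make the normal equations unsolvable, and nearly collinear ones make the coefficients unstable.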

Translation of the soundscape attributes in support of soundscape research in Mandarin Chinese
Through a rigorous translation process, this study ensured the semantic consistency of the Mandarin Chinese version of the soundscape perception scale with the original version. Results from the ICC test confirmed that the Mandarin Chinese translation of the attributes maintained the original matching. The soundscape dimensions of ISO-P and ISO-E calculated from the attributes showed reasonable results of soundscape perception, as visualised by Soundscapy (Section 3.1). The Mandarin Chinese version of the scale can be used to measure the perceptual attributes of the soundscape as rated by native Mandarin Chinese respondents, forming a measure of both dimensions. Using the measures obtained from this scale as parameters, the subsequent principal component analysis (Section 3.2) and multiple regression analysis (Section 3.3) produced sound results, supporting further soundscape research in Mandarin Chinese cultural contexts in the future.
Based on the translation results, each of the eight perceptual attributes required two or more words to accurately express the semantic information of the English attributes, except for pleasant and calm. This finding was consistent with the translations into other languages within our international collaboration, such as Portuguese [56], Bahasa Melayu [57], and Spanish and French [58]. This may be due to the lexical complexity inherent in cross-cultural translations. However, this also suggests that it might be possible to find a more culturally appropriate semantic equivalent with a bottom-up approach, where native Mandarin Chinese respondents are asked for their perceptual descriptions of familiar soundscapes around them. This approach may yield different results not only because the descriptions of soundscape perceptions come from the natural use of the native language, but also because the perceived objects are urban open space soundscapes in a Mandarin Chinese cultural context, which could be significantly different from those in other countries. However, this approach also carries the risk that the dimensions of open space soundscape perception in a Mandarin Chinese socio-cultural context differ from those in ISO/TS 12913-2:2018 [6], which suggests the potential for further investigation.
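The projection of the eight attribute ratings onto ISO-P and ISO-E can be sketched as follows, using the coordinate formulas given in ISO/TS 12913-3:2019. Scaling the result to the range [-1, 1] by dividing by 4 + √32 for a five-point scale is an assumption here, since the text does not state which normalisation was applied. The dictionary keys use the English attribute names for readability.

```python
import math

COS45 = math.cos(math.radians(45))
# Assumed normalisation for ratings on a 1-5 scale, so that both
# coordinates fall within [-1, 1].
SCALE = 4 + math.sqrt(32)


def iso_coordinates(ratings):
    """Project eight attribute ratings (1-5) onto (ISO-P, ISO-E)."""
    pleasantness = (
        (ratings["pleasant"] - ratings["annoying"])
        + COS45 * (ratings["calm"] - ratings["chaotic"])
        + COS45 * (ratings["vibrant"] - ratings["monotonous"])
    )
    eventfulness = (
        (ratings["eventful"] - ratings["uneventful"])
        + COS45 * (ratings["chaotic"] - ratings["calm"])
        + COS45 * (ratings["vibrant"] - ratings["monotonous"])
    )
    return pleasantness / SCALE, eventfulness / SCALE


# Hypothetical response: a maximally pleasant, neutrally eventful soundscape.
response = {
    "pleasant": 5, "annoying": 1, "calm": 5, "chaotic": 1,
    "vibrant": 5, "monotonous": 1, "eventful": 3, "uneventful": 3,
}
iso_p, iso_e = iso_coordinates(response)
print(iso_p, iso_e)
```

Because calm and chaotic enter the two coordinates with opposite signs, a soundscape rated as calm raises ISO-P while lowering ISO-E, which is what places it in the pleasant and uneventful quadrant.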

The soundscape traits that stimulate soundscape perception
In this study, we extracted clusters from the 27 soundscape recordings according to ISO-P and ISO-E by principal component analysis and found that the most definitive commonality of the recordings within each cluster was the salient sound source types they contained, centred on the two main types of human and mechanical sounds, which were among the most salient sound sources in cities. This agrees with the "foreground-background" phenomenon proposed by Gestalt psychologists. This result means that in urban open spaces, all sounds initially exist as the background, while the salient sound source type captures people's undirected attention and stimulates feelings of pleasantness and eventfulness. This perceptual interaction draws on the individual's prior knowledge and expectations to judge whether the current acoustic environment is harmonious and meets those expectations. Subsequently, rich auditory mappings are generated in people's brains through the participation of the auditory organs and subsequent perceptual processing [63]. These rich mappings require input from the acoustic characteristics of the soundscape. In different environments with the same or similar salient sound source types, differences in the acoustic performance (such as SPL-, loudness-, and fluctuation strength-related parameters) will lead to different perception results. This is probably why some recordings were not classified into the same component even though they had the same sound source type.

Multiple influencing factors of soundscape perception
According to the results of the correlation analysis and the regression models, in typical urban open spaces represented by the 27 soundscape recordings, soundscape perception demonstrated a strong correlation with sound source information. For both pleasantness and eventfulness, the acoustic parameters and the presence characteristics of the sound source showed a certain influence. Both factors come from the information released by the sound source, and both are physical cues provided by the sound source when people perceive it. Used together, these two physical cues can describe the characteristics and changes of the acoustic environment over a period of time as completely as possible in the form of objective parameters.
The presence characteristics of sound sources obtained by people's subjective identification can describe the content of the auditory environment and the saliency of different sound sources, which allows researchers to understand where the participants' attention lies. It is worth noting that mechanical sound has a negative impact on both dimensions. The mechanical sounds in this study cover the sounds of traffic and construction machinery, which makes them rich in acoustic performance, and this negative effect is likely to come from their higher sharpness or loudness [30]. Birdsong can enhance people's pleasure, which agrees with previous research [64].
Acoustic parameters represent the particular acoustic performance of a certain type of sound source in the combined environment and reflect the energy and frequency of the acoustic environment. The S 95, which describes the high-frequency components of the sound, is included in the pleasantness model. It represents the level of sharpness in the stable state of the environment. On the other hand, L 5 -L 95, which represents the temporal variability, is included in the model for ISO-E. Such variations in the sound energy are usually triggered by high-SPL sound events, which attract people's attention and can enhance people's sense of eventfulness. The F 50 parameter in the eventfulness model indicates the median level of fluctuation over a period of time, and represents the acoustic performance of traffic sounds and human footsteps. Compared with other urban sounds, these types of sounds have a more obvious fluctuation pattern [41,42] and can enhance people's sense of eventfulness. This is consistent with previous research which found that traffic sounds [49,64] and human sounds [17] enhance people's sense of eventfulness. In addition, the sound pressure level and loudness of the environment were found to have a strong correlation with pleasantness, which is consistent with previous research [34,65]. However, in the subsequent regression analysis, no parameter related to sound pressure level or loudness appeared in the perception model of pleasantness. This re-emphasises the current paradigm shift from controlling noise to focusing on content in urban open spaces.
The percentile exceedance level of a psychoacoustic parameter appeared in each of the two models. Psychoacoustic parameters represent the acoustic performance of certain events that can cause changes in perception. In a previous study [33] that modelled perception for different venues, traffic sound appeared in the perception model of urban streets in the form of a loudness parameter and was interpreted as a part of the street. In contrast, in the perception model of urban parks, traffic sounds were interpreted as "events" that affect pleasantness, appearing in the model in the form of sound source parameters. This means that when traffic sound occurs in the street, people understand it as background sound, while in a more pleasant place such as a park it is understood as an event, or a foreground sound. This result further explains the significance of the acoustic parameters that appear in the two perceptual models in this study.
For all the recordings in this study, the mean S 95 was 1.21 acum (Std Dev = 0.22 acum). The sound of mild wind, a small number of people (footsteps, quiet conversations) or a small number of slow-moving vehicles could cause the acoustic environment to have such a low sharpness value. When the number of people increases, the traffic is busier, the speed of vehicles is higher, or there are fountains or other high-pressure water sounds in the venue, the S 95 level would increase. This is also consistent with previous work on using S 95 to characterise places [42]. Therefore, in the pleasantness model, S 95 accounts for natural sounds such as wind and water in urban open spaces, or the whispers of crowds, footsteps and calm traffic sounds, which are easily overlooked as background sounds. When the traffic sound in the environment reaches a state that cannot be ignored, or the annoying sound of construction machinery appears in the environment, it will become a sound source event that affects people's sense of pleasantness and eventfulness, corresponding to the saliency of mechanical sounds that appears in both models.
The 27 soundscape recordings in this study contain typical sounds found in most cities. The scenes include parks, green spaces, squares, streets, etc. The overall L Aeq has an average of 56.93 dB(A) (Std Dev = 11.85 dB(A)). Although construction sounds, signal sounds, and cars passing at high speed can make the sound pressure level of the city fluctuate and attract attention, thereby affecting perception, most of the time human intervention can help attention-grabbing foreground sounds to play as positive a role in perception as possible. In environments with high sound pressure levels, the sound pressure level or loudness is an important factor affecting pleasantness. In relatively calm places with fewer sound events, all sounds can be considered as background sounds, but due to certain information released by the sound sources, some sounds will become circumstantial foreground sounds and stimulate perception. Improving their content can directly improve people's experience. At the same time, although the background sound is often not the focus of people's attention, it still takes effect together with the foreground sounds through its acoustic characteristics. Thus, background sounds can also make a positive contribution if certain acoustic parameters are fine-tuned. The negative effect of S 95 on ISO-P suggests that reducing the sharpness of background sound can improve soundscape pleasantness. For example, depending on the spatial function and landscape characteristics, adding soft background music or an appropriate masking sound could reduce the sharpness of the background sound and thus improve pleasantness. Correspondingly, in the model for ISO-E, the positive effect of L 5 -L 95 suggests that reducing the SPL of the background sound makes the foreground sound easier to perceive and improves the perception of surrounding auditory information.

Concluding remarks
In summary, this study first translated the soundscape attributes in ISO/TS 12913-2:2018 to Mandarin Chinese, which were subsequently verified, thereby creating a reliable questionnaire for Mandarin Chinese soundscape researchers to investigate soundscape perception. Using the questionnaire developed, the relationship between soundscape characteristics and people's perceptual experience was examined with an ex situ soundscape perception experiment. Principal component analysis identified the soundscape traits that stimulate the feelings of pleasantness and eventfulness, the two dimensions of soundscape perception. Subsequent regression analysis identified various soundscape characteristics that affected the extent of the perception of pleasantness and eventfulness.
The soundscape recordings used in this study were all recorded in London, United Kingdom. Due to socio-economic, cultural, and lifestyle differences, these soundscapes will inevitably be somewhat different from those native to Mandarin Chinese-speaking regions [66]. Such differences could cause deviations in the evaluation of the soundscape attributes. Nevertheless, this research provides valuable insight into the perception mechanism in general, and future research could follow up to investigate the potentially varying mechanisms in country-specific socio-cultural settings.
Furthermore, a vital prerequisite for the translation work is that the attributes in the ISO specification apply to all countries involved. However, this may not be entirely accurate due to the country-specific contextual and individual characteristics that could alter the outcome of perception [66]. Therefore, to create an even more accurate set of soundscape attributes for the Chinese context, a bottom-up study could be conducted where the original experiment by Axelsson et al. is replicated in the Chinese language and setting. Comparing the findings

Fig. 1. (a) Experimental scene and equipment (written consent acquired from identifiable person for publication). (b) Experimental process.

Fig. 3. The mean and CV of the acoustic parameters of the soundscape recordings in each cluster according to ISO-P. C1(5) means cluster 1, which contains five soundscape recordings. All refers to all 27 soundscape recordings.

Fig. 4. The mean and CV of the acoustic parameters of the soundscape recordings in each cluster according to ISO-E. C1(7) means cluster 1, which contains seven soundscape recordings. All refers to all 27 soundscape recordings.

Table 1
The sample demographic characteristics.

Table 2
All parameters extracted from the soundscape recordings and the perception scales. a.u. stands for arbitrary unit.

Table 4
Intraclass correlation coefficient (ICC) analysis for the eight attributes.

Table 5
Clustering with Principal Component Analysis and the mean saliencies of the five sound source types.