Understanding how multi-sensory spatial experience influences atmosphere, affective city image and behavioural intention

This article first emphasizes the perspective that views public spaces as places where a meaningful spatial quality, i.e., atmosphere, is generated through multi-sensory spatial experiences. Second, it shows that atmosphere has a positive direct impact on affective city image and a positive indirect impact on behavioural intention. Finally, it proposes strategies for designing, managing and representing architecture and urban spaces for city image formulation and communication. Nanjing, a historical Chinese city eager to re-image itself, is chosen as the case area to test the significance of multi-sensory spatial perception in shaping one's affection for a city. The study reviews the key dimensions composing multi-sensory experience in public spaces and interviews 162 visitors and 201 residents. The results suggest that, for sustainable urban development, the design, management and promotion of iconic public spaces should holistically enhance people's haptic, audible and visual experience in motion to facilitate the perception of atmosphere.


Introduction
The significant role played by public buildings and urban areas in formulating and shaping the city image has been extensively discussed and acknowledged (e.g., Hristova, 2019; Ingersoll, 2000; Lai, 2010; Macarthur, 2015; Nolasco-Cirugeda et al., 2020; Yüksel and Akgül, 2007). Buildings, public squares, parks and streets, as urban characters, as fixed environment, as service providers, and as recreation (Kotler et al., 1999), deliver various messages of the city to each individual receiver during one's spatial experience, and help form the city's image in one's mind. Dai et al. (2018) argued that previous studies discussing the link between public spaces and the city image focus obsessively on the stylistic or morphological dimension of public buildings and urban areas, devaluing people's bodily experience in the spaces. The spatial experience, derived from the multi-sensory interaction with all the details constructing the public spaces (Pallasmaa, 2016), actually affects people's overall impression of a place and has corresponding consequences in shaping the "affective tone" of a city (Hasse, 2016).
During one's spatial experience, the most overwhelmingly perceived character of the built environment is the atmosphere (Pallasmaa, 2014). An atmosphere is simultaneously influenced by the perception of various sensory dimensions of a space, i.e., not only visual, but also haptic and audible ones (Böhme, 2016). It is also a subjective and composite reflection on these dimensions, revealing the overall emotional impression (Böhme, 2013a, 2013b) that a space makes on people. Therefore, it is one of the key spatial characteristics that influence the affective mental image of a place or a city in people's minds (Hasse, 2016).
Influenced by the atmosphere, the affective city image, i.e., the individuals' subjectively experienced feelings attached to a city (e.g., Hosany et al., 2007), is vital in evaluating a city (e.g., Sahin and Baloglu, 2011). It acts as a determinant predicting people's attitudes towards a city and intention to recommend a city (e.g., Ekinci and Hosany, 2006; Gilboa et al., 2015; Hosany, 2012). Thus, affective city image can be used as a powerful tool to help identify and differentiate the city in people's minds, so as to attract tourists, investments, companies and talented employees, and to retain high-income, well-educated migrants and residents (Cleave and Arku, 2020; Doucet et al., 2011), aiming at the sustainable development of the social and economic function of the city (Ashworth and Voogd, 1990).
Given the importance of the atmosphere in the formulation of the affective city image in theory, it is worth discussing how this spatial characteristic is constructed and how it is causally related to the affective city image in real situations. In this article, we aim to answer a series of research questions. Firstly, given the earlier substantial discussions on the intimate relationship between sensory spatial dimensions and the atmosphere (Bille and Sørensen, 2016; Böhme, 2016), is the perceived atmosphere of a space significantly influenced by people's perception of various sensory dimensions? Meanwhile, considering that sociodemographic factors can affect people's preferences towards the spatial characteristics of the built environment (Hami et al., 2016), do the correlations between the perceived sensory dimensions and the perceived atmosphere vary across different sociodemographic groups? Secondly, considering the emotionality of the atmosphere and its overarching role in encapsulating the general impression of a space, how is it causally related to the affective dimensions of the city image? Thirdly, given that people's behavioural intentions in a city can be highly influenced by the affective city image, if the atmosphere affects the shaping of the affective city image, does it also indirectly affect behavioural intentions?
In order to answer these questions based on statistical analysis in real situations, the research was embedded in the case study of Nanjing, a historical city in China. Known as the "ancient capital of China for six dynasties", and the current provincial capital of Jiangsu, Nanjing possesses rich historical, cultural, economic, technological and educational resources for urban development (De Jong et al., 2018; Yang et al., 2019). Nevertheless, according to the local authorities and media, the city of Nanjing has been facing a crisis of insufficient appeal to potential high-income and highly-educated migrants as well as residents, which might have negative consequences for the industrial transformation and innovation of the city (The Chinese People's Political Consultative Conference, 2019a, 2019b; Zou, 2019). To attract and retain the target groups for sustainable urban development, the city of Nanjing urgently needs re-imaging. In total, 162 visitors and 201 residents were interviewed through a questionnaire in 2019. The statistical analysis was performed on 363 responses on the sensory experience and atmosphere perceived in the selected architectures/urban areas, the affective city image, and behavioural intention, to uncover the correlations between all variables. The results shed light on branding strategies for Nanjing and other historical cities in terms of architectural/urban design, management and representation, stressing the significance of architectural sensory dimensions as well as multi-sensory experiences in shaping the affective city image, targeting both visitors and residents.

Perception of public space and its impact on city image
The sub-field of place branding, city branding, has substantially developed during the past few decades, especially regarding the communication of city images to target groups (Braun et al., 2014; Zenker et al., 2017). Kavaratzis (2004) developed a city image communication model that includes three levels, with urban landscape, structure, infrastructure and the city's behaviour constructing the primary communication. Braun et al. (2014) argued that city images are mostly communicated through their physical features, and the urban environment can make a strong communicator with experiential features (Brakus et al., 2009). These insights, emphasizing the vitality of public spaces in city image communication, are evident in diverse cases. The Guggenheim Museum, signifying urban regeneration, was once almost the synonym of Bilbao (Ockman, 2004). Similarly, "the City of Arts and Science" was once another name of the Spanish city, i.e., Valencia (Godfrey and Gretzel, 2016). The iconic, high-tech architectures such as the "Bird's Nest" and the "Water Cube" built particularly for Olympic Games in Beijing became new tourism resources and created a new sense of place for the city (Zhang and Zhao, 2009). Iconic architectures and urban areas in these cases act as a powerful tool to build a differentiated image and identity in response to the demands of target groups to boost the city's competitiveness (Gilboa et al., 2015; Lai, 2004).
Commissioned to construct key elements shaping a distinctive city image, many big names of the late twentieth century, including Frank Gehry, Zaha Hadid, and Santiago Calatrava, began to design architectures and urban areas functioning as urban landmarks as well as icons of their own individuality. Those landmarks are often perceived more as pieces of art than as living spaces (Winkenweder, 1999). Godfrey and Gretzel (2016) have criticized this biased view of public buildings and urban areas, which overlooks other relevant sensory dimensions composing a whole space that makes sense to the people who visit and use it. Their argument is in line with Pallasmaa's (2016) opinion that the biased focus on visual form is responsible for the weak atmospheric quality of many contemporary spaces.
Apart from the visual factors, such as the scale and the form of the public space, there are other factors that designers and critics frequently employ to describe and criticize architectural and urban design projects, given their importance in composing the meaning of a space. The atmosphere is the dominant one used by architects, urban planners and researchers focusing on people's perception of buildings and urban areas from the perspective of phenomenology (e.g., Lynch, 1984; Norberg-Schulz, 2019; Seamon and Sowers, 2008; Trancik, 1986). The atmosphere is also the key factor in transforming a "space" into a meaningful "place". Many researchers understand a "place" as an amalgam of the physical space and the reflective social space (e.g., Heidegger, 1971; Hill, 2006; Merleau-Ponty, 2013; Norberg-Schulz, 1988; Raban, 2017). The physical space refers to the objective, measurable physical foundation for the perceptual experience, while the reflective social space refers to the image of the perceived space reflected in people's minds. Only through real or ideated bodily confrontations derived from multi-sensory experiences, rather than mere visual observation, can one grasp the atmosphere and build up the reflective social space in one's mind (Pallasmaa, 2000), to establish a sense of place (Leatherbarrow, 2015). An atmosphere, in this context, is a complex spatial quality, a subjective reflection of the multi-sensory fusion of various perceptible factors in a space (Boiné et al., 2018). An atmosphere can be rapidly and holistically grasped as a general ambience of the public spaces to generate meanings for each individual (Pallasmaa, 2014).
Various sensory dimensions of spaces generating atmospheres have also been extensively discussed by architects and critics (e.g., Leatherbarrow, 2015; Zumthor, 1999). The proposed dimensions overlap considerably in the existing literature. Among all the authors, Böhme (2016), based on elaborate reasoning from the phenomenological perspective, has systematically proposed a series of factors, as generators of the atmosphere, that can be perceived through multi-sensory spatial experiences. These factors include materiality, lighting, sound, and serial scenes seen through bodily movement. Böhme (2016) has also mentioned metaphor in the built environment as one of the generators. However, considering that not every piece of architecture or urban space is associated with or discussed in terms of a metaphor, and that metaphors are not directly perceived through sensory experience, we choose to discuss only the former four dimensions in this article. The connotations of atmosphere and the four sensory dimensions generating atmosphere are presented in Table 1, based on Böhme's study.
According to Böhme (2016), the impressions of serial scenes seen in spaces and of lighting rely very much on the motional visual experience, through bodily movement in the built environment. The motional visual experience provides the experiencer, through physical presence, with constant changes of perspective and focal point that can convey an impression of space, including its size and physical boundary, as measured by the moving body. Throughout the movement, the functioning of light relies heavily on people's perception of colour in motion and its modification under dynamic illumination conditions. Light, by characteristically modifying the total colour impression, stimulates the "sensual-moral effect of colour" (Goethe, 1970), adding to the overall visible emotional quality of a space that influences how one feels. Materiality, by contrast, is perceived through haptic experience, based on people's knowledge of the haptic qualities and synesthetic characters of the raw materials and of how they are articulated in construction, which helps people perceive the materials and their concomitant building techniques. Sound, as the only factor perceived mainly audibly, shapes the feelings of listeners in spaces by generating acoustic atmospheres. As Böhme (2016) argues, the characteristic feeling of an urban or rural atmosphere is very significantly determined by the related acoustic space.
The multi-sensory spatial experience and the resulting atmosphere also serve as important constituents of a city that lead people to create associations, which are internalized and formed as a city brand and further simplified as an image in people's minds (Kavaratzis and Kalandides, 2015). As theorized by Kavaratzis and Kalandides (2015), "materiality", "practices", "institutions", and "representations" constitute a city. "Practices" denotes the production, use, and appropriation of the material substrate of a city, which corresponds to the multi-sensory spatial experience in public spaces emphasized in this research. Through "practices" in architectural and urban spaces, people can obtain a thorough bodily perception of the physical spaces. Such perception can be abstracted as the affective part of the city brand and the city image. Therefore, when discussing the role of public spaces in shaping the city image, it is sensible to regard the public space as an amalgam of the tangible built substance and places that trigger multi-sensory spatial experiences.

Affective city image
To discuss the correlations between the multi-sensory experience in a space and the affective city image, we should clarify the connotation and components of the affective city image. Many studies have examined the affective side of city image, which contains feelings and emotions related to a place (e.g., Baloglu and McCleary, 1999; Hosany et al., 2007), highlighting the crucial role the affective image plays in evaluating a place (e.g., Sahin and Baloglu, 2011). Its determining effect on behavioural intention, including positive attitudes and intention to recommend a place to others, has also been stressed by many researchers (e.g., Ekinci and Hosany, 2006; Hosany, 2012). Among the most recent quantitative studies of city image or destination image, affective components of an image have been widely developed and employed (e.g., Almeida-García et al., 2017; Dragova et al., 2014; Manyiwa, 2018; Papadimitriou et al., 2015). The affective items used to compose affective image again overlap considerably in the existing literature. The set of four items employed by Papadimitriou et al. (2015), i.e., Unpleasant/Pleasant, Distressing/Relaxing, Ugly/Pretty, Gloomy/Exciting, is adopted in this article, as these items were not developed within a case-based context and have been tested in several previous cases across different countries, and hence can be generically employed in new case studies.

Hypotheses
Corresponding to the questions raised in the Introduction, we propose a series of hypotheses grounded in the literature review. First, regarding the relationship between the atmosphere of a piece of architecture or an urban space and people's perception of materiality, lighting, sound, and serial scenes seen through bodily movement, Böhme (2016) has elaborated how an atmosphere is generated while a human body senses the space, and how these four sensory dimensions constitute this process through visual, haptic and audible interactions. Havik and Tielens (2013b) also argued that the experience of atmosphere actually lies in the simultaneity of all these sensory dimensions, and in the way they are perceived together. Hence, materiality, lighting, sound, and serial scenes seen through bodily movement should simultaneously influence people's perception of atmosphere.
As emphasized in Section 2.2, public spaces play a vital role in shaping the city image. The atmosphere, one's emotional impression of the built environment resulting from bodily experience, is the dominant spatial quality that endows a place with meaning. The more impressive one deems the atmosphere to be, the more likely a sense of place will be built within one's mind (Havik and Tielens, 2013a). Since the sense of place mainly describes the affective connection between a person and a place (Raymond et al., 2017), it can largely predict an individual's affective image of a place (Spencer and Dixon, 1983). Hence, the atmosphere perceived in architectures and urban spaces in a city should be able to determine the affective city image.

Table 1. Key dimensions composing multi-sensory experience in architecture / urban area. Source: Böhme (2016).
Dimension | Connotation
Atmosphere | "A tempered space of mindful physical presence", assessed by sensation, into which one enters or finds oneself and perceives a sense of place, described and characterized by expressions for sensitivities, such as "oppressive", "elevating", "open", and "confining", etc.
Materiality | The applied use of various materials or substances in the medium of building, shaping the atmosphere, perceived through visual and haptic experience.
Lighting | Illumination for architectural design and function, perceived through visual experience and bodily movement, that modifies the total colour impression of the space, adding to the overall visible emotional quality of a space that influences how one feels, such as "cold", "warm", "gloomy" and "welcoming", etc.
Sound | Acoustics and music, key elements in the design of built environment, shaping the acoustic atmosphere, that can be described and characterized by expressions for sensitivities, such as "cheerful", "grave", "strident", and "soft", etc.
Serial scenes seen through bodily movement | Visions of a space when the body is in motion (changes of perspective and focal point) that convey an integrated and continuous impression of the public spaces.

The behavioural intention can be understood as one of the outcome variables of the city image (Oshimi and Harada, 2019). The construct of behavioural intention, from tourists' perspective, encompasses a tourist's willingness to revisit and recommend the destination (Wang and Hsu, 2010). From residents' perspective, similarly, the behavioural intention can include the intention to leave and to provide positive word-of-mouth recommendation (Zenker and Rütter, 2014). Hence, in this research, we employ intention to stay (resident)/revisit (visitor) and positive word-of-mouth recommendation as the items composing the construct of behavioural intention. The relationship between the city image and the behavioural intention is widely recognized from visitors' viewpoint (Bigné et al., 2001; Kaplanidou and Vogt, 2007; Moon et al., 2013). Some studies have also demonstrated this correlation from residents' viewpoint (Oshimi and Harada, 2019; Schroeder, 1996). As stated in Section 2.3, many researchers have shown that the affective dimensions have a determining effect on the behavioural intention (e.g., Ekinci and Hosany, 2006; Hosany, 2012). Thus, our hypotheses are proposed as follows:
H1. Materiality, lighting, sound, and serial scenes seen through bodily movement simultaneously influence people's perception of atmosphere.
H2. Atmosphere has a positive impact on affective city image.
H3. Affective city image has a positive impact on behavioural intention.
Given that sociodemographic factors, such as income, ethnicity (Hami et al., 2016), gender (Stamps III, 1999), occupation (Zhen et al., 2020), and residential status (Faggi et al., 2013; Orland, 1988), actually affect people's preferences towards the aesthetic characteristics of public spaces, we should take these influences into account in decision making on architectural and urban design and management for city re-imaging aimed at target groups. Since high-income visitors and residents form a very important part of the target group in city branding, it is vital to understand the differences between the preferences of visitors and residents with different income levels. Visitors and residents in various classes of annual income hold different motivations, expectations, and behavioural patterns in a city. As a consequence, their perception of multi-sensory experiences and its impact on atmosphere may vary to a certain degree. Hence, in this research, the residential status and the income level of respondents are employed as moderators to test their moderation effect on the relationship between the perception of the four sensory dimensions and the atmosphere.
H4. The strength of the relationships respectively between materiality, lighting, sound, serial scenes seen through bodily movement and atmosphere will be moderated by the residential status.
H5. The moderation effect of the residential status for the strength of the relationships, respectively between materiality, lighting, sound, serial scenes seen through bodily movement and atmosphere, differs for different income levels.

Case of Nanjing
To test the five research hypotheses, the city of Nanjing was chosen as the case area, considering the desire of the local government to improve its social and economic function. With a history dating back 2500 years, Nanjing has a prominent place in Chinese history and culture. The city has been recognized as "The City of Literature" by UNESCO as it boasts 100 cultural centres and over 300 book stores (UNESCO Creative Cities Network, 2019). Nevertheless, according to the study on the city image of Nanjing conducted by Tan (2014), although the brand of Nanjing has been measured as one of the most valuable ones worldwide (Global City Lab, 2019), it should be seen as a result of people's hereditary recognition of the rich historical, cultural and natural resources possessed by Nanjing, not the result of elaborate manipulation by the local government in terms of city branding (Tan, 2014). As one of the consequences of inadequate city branding, the city of Nanjing is facing the awkward situation of losing its high-income and highly-skilled workforce. Only around 30% of graduates choose to work and live in Nanjing, a city where 83 universities and colleges are located and 562,000 students are trained annually (Wang, 2016). With its ever-increasing aging population, Nanjing is in urgent need of attracting more tourists, investment, talented workforce and high-income migrants, and of retaining high-income residents, so as to facilitate the prosperity of local industries, which is vitally important for strengthening its economic competitiveness in the upcoming decades (The Chinese People's Political Consultative Conference, 2019a, 2019b; Zou, 2019).
Hence, taking Nanjing, a city full of well-known architectures / urban areas, as the case area, uncovering the causal relationship between perceptions of sensory dimensions, atmosphere, affective city image and behavioural intention, and proposing strategies in terms of architectural / urban design and management based on the findings, will contribute to reimaging and rebranding of Nanjing and other historical cities with similar dilemmas and goals of urban development.

Data collection
A questionnaire (see supplementary materials) was designed to collect people's opinions on sensory perception, atmosphere, affective city image and their intentions of future behaviours to estimate the proposed theoretical model. Apart from the questions concerning interviewees' sociodemographic data, including residential status, age, gender, level of income, educational level, four main questions were included in the questionnaire.
To maximize the validity of the collected data about interviewees' sensory perceptions, with regard to their possible causal connections to evaluations of affective city image, it was important to make sure that interviewees were able to provide answers based on their experiences in public spaces that contribute strongly to the affective city image of Nanjing in their minds. A preliminary survey of the most representative architectures/urban areas in Nanjing was conducted online in August 2019. Forty-four architectures / urban areas were listed as options in the question, based on the most recognized iconic architectures / urban areas selected by the influential media (Sohu, 2017), local residents (Jinling Evening News, 2018) and domestic visitors (ctrip.com, 2019; Fliggy.com, 2017; qunar.com, 2020). Twelve were selected in the preliminary survey as the iconic ones that can best represent the city image of Nanjing (see supplementary materials), according to the votes of 53 participants who stated that they had visited all the architectures / urban areas listed in the survey. These 12 buildings / built areas were then listed as the options of Question 1 in the questionnaire, which asked people to choose the 3 buildings / built areas that most represent the image of Nanjing in their minds. Compared with a single option, 3 options taken together could provide a more comprehensive knowledge of interviewees' spatial perceptions, counteracting the biases in the perception generated in any single public space. Question 2 measures the respondents' perceptions of materiality, lighting, sound, serial scenes seen through bodily movement, and atmosphere, based on the spatial experience in each of the three chosen buildings/built areas.

Fig. 1. Theoretical model.
As shown in the translated questionnaire (see supplementary materials), each sensory dimension was measured with a single-item indicator, because the measurement of sensory experience requires abstract thinking (Petrescu, 2013) and recall. Such dimensions can be measured more efficiently with single-item indicators than with multiple-item scales. Questions 3 and 4 measure the constructs of city image and behavioural intention.
Among the four questions, only Question 1 was designed as a multiple-choice question. The other three were all set as Likert-scale questions, measured on a five-point scale. Questionnaires were distributed in an electronic version in a face-to-face manner, from 1st September to 15th December 2019, in Nanjing. All interviewees declared that they had visited every building / built area listed in Question 1. In total, 401 questionnaires were completed, of which 363 were deemed valid, giving a valid response rate of 90.5%. Among the 363 interviewees, there were 162 visitors and 201 residents. As shown in Table 2, most of the interviewees were young and middle-aged people: 58.7% of them were under 40 years old. The numbers of male and female respondents were almost equal. 93.4% of the respondents held a bachelor's degree or higher.
The categories of annual income were set according to "The Classified Per Capita Spendable Income of Urban Residents" published in "The Statistical Yearbook of China" (National Bureau of Statistics of China, 2018), an annual statistical publication that comprehensively reflects the economic and social development of China. Following the published classification, the Per Capita Spendable Annual Income could be categorized into five groups. Due to the considerable number of young respondents who were still studying at universities for their bachelor, master or Ph.D. degrees, with almost no income or only a small scholarship, the option of Very-Low-Income (under 13,723 RMB annually) was added to the five options in this question. Among the respondents, 41.9% were High-Income individuals, 11.6% held Above-Average-Income, 7.2% held Average-Income, 5.8% held Below-Average-Income, 8.5% held Low-Income and 25.1% held Very-Low-Income. The large proportion in Very-Low-Income was mainly caused by the number of undergraduate and postgraduate students involved in this survey.
Considering that the city of Nanjing is very eager to retain and attract highly-educated employees with preferential employment, housing and migrant policies (Nanjing Municipal Human Resources and Social Security Bureau, 2020; Nanjing Municipality, 2020), university students also form an important part of the target group in this case. Given that the city needs their talents, they can also be potential high-income migrants or residents in the future. Hence, the interviewees were largely young, highly-educated, and of relatively high consumption capacity in the Chinese context. Although this sample did not entirely represent the visitor and resident population in Nanjing, it constituted a reasonable target group for this city.

Data process
The mean score of each perceived sensory dimension across the 3 representative public spaces was adopted as the value of the variable indicating how impressive each dimension composing the atmosphere was. We employed structural equation modelling (SEM) to test hypotheses 1-3. SEM is a comprehensive statistical approach for testing hypotheses about directional and nondirectional relations among a set of observed and latent variables (Hoyle, 1995; MacCallum and Austin, 2000). Among nine key variables in the model, materiality, lighting, sound, serial scenes seen through bodily movement, atmosphere, residential status and income level are single-item variables. Although there have been debates, especially regarding the measurement reliability of single-item variables (Fuchs and Diamantopoulos, 2009), the inclusion of single-item indicators in SEM has been tested in many previous studies with successful outcomes (e.g., Adjei et al., 2010; Bettencourt et al., 2005). The reliability estimate for single items in this research was fixed to 1. The error variances of single items were calculated as "sample variance of the indicator * (1 − scale reliability estimate)" (Petrescu, 2013).
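The two preparation steps described above (averaging each dimension over the three chosen spaces, and deriving the error variance of a single-item indicator) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names and the ratings are hypothetical, and the use of the unbiased (n − 1) sample variance is an assumption.

```python
from statistics import mean, variance


def dimension_score(ratings):
    """Average one respondent's ratings of a dimension over the chosen spaces."""
    return mean(ratings)


def single_item_error_variance(scores, reliability):
    """Error variance = sample variance of the indicator * (1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return variance(scores) * (1.0 - reliability)


# Hypothetical five-point ratings of one dimension (three spaces per respondent).
ratings_by_respondent = [(4, 5, 3), (2, 3, 4), (5, 5, 4), (3, 3, 3)]
scores = [dimension_score(r) for r in ratings_by_respondent]

# With the reliability fixed to 1, as stated in the text, the error variance is 0.
print(single_item_error_variance(scores, reliability=1.0))
# A lower (assumed) reliability would give a positive error variance.
print(single_item_error_variance(scores, reliability=0.85))
```

Fixing the reliability to 1 amounts to treating each single-item indicator as measured without error, which is why the first call returns zero regardless of the data.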
Prior to estimating the SEM, we tested the normality of the data and the construct validity of the measurement model. The correlations between all variables, Cronbach's α, shared variances, means, standard deviations, skewness and kurtosis were calculated and are shown in Appendix A and B. Appendix C presents the factor loadings (β), composite reliability (CR) and the average variance extracted (AVE). The values of skewness and kurtosis fell within the range proposed by George and Mallery (2010), indicating that normality assumptions have been met (Kline, 2005). All items loaded significantly (p < 0.001) on to the constructs and all factor loadings were higher than 0.78, conforming to the criteria employed by Hair et al. (2010). The values of Cronbach's α were all higher than 0.7 (Cortina, 1993). The values of CR and AVE ranged from 0.67 to 0.89. According to Fornell and Larcker (1981), the internal consistency and the convergent validity of the constructs were adequate. As shown in Appendix A and C, the AVE of each latent construct was higher than its highest squared correlation with any other latent variable (Fornell and Larcker, 1981). Thus, discriminant validity was established on the construct level.
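The validity checks above can be illustrated with the commonly used formulas for standardized loadings: AVE as the mean squared loading, CR as (Σλ)² / ((Σλ)² + Σ(1 − λ²)), and the Fornell-Larcker criterion comparing AVE with squared inter-construct correlations. The sketch below assumes these standard formulas; the loadings and correlations are hypothetical and are not the values reported in the appendices.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized factor loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def discriminant_validity_holds(ave, correlations_with_other_constructs):
    """Fornell-Larcker: AVE must exceed the highest squared correlation."""
    return ave > max(r * r for r in correlations_with_other_constructs)


# Hypothetical standardized loadings for a four-item construct.
loadings = [0.80, 0.85, 0.82, 0.79]
ave = average_variance_extracted(loadings)
cr = composite_reliability(loadings)

# Hypothetical correlations of this construct with two other constructs.
print(round(ave, 3), round(cr, 3), discriminant_validity_holds(ave, [0.55, 0.61]))
```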
We checked the Pearson correlations between the key variables and possible control variables (age, gender, educational level). While gender did not show a strong correlation with our key variables, age correlated significantly with behavioural intention (0.14**), and educational level correlated significantly with affective city image (−0.19**) and behavioural intention (−0.22**). Thus, both age and educational level were included as control variables in our model (Fig. 1).
After estimating the SEM with AMOS, we tested hypotheses 4 and 5 with a moderated moderation model (Hayes and Preacher, 2014), using regression analysis and the PROCESS tool (Hayes, 2017). In this model, the relationships between materiality (X1), lighting (X2), sound (X3), serial scenes seen through bodily movement (X4) and atmosphere (Y) were hypothesised to be moderated by both residential status (M) and income level (W). In addition, the moderating impact of M was hypothesised to be conditional on W. In other words, in one condition (e.g., high W), we expect M to moderate the relationship between X1, X2, X3, X4 and Y, whereas in another condition (e.g., low W), we expect M either not to moderate it or to moderate it to a lesser extent (Lam et al., 2019). The moderated moderation model was estimated separately for X1, X2, X3, and X4 with the PROCESS tool, version 3.5, model 3.
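The logic of moderated moderation can be sketched as follows, assuming the usual three-way interaction regression Y = b0 + b1·X + b2·M + b3·W + b4·XM + b5·XW + b6·MW + b7·XMW. The class name and the coefficient values are hypothetical and are not estimates from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreeWayCoefficients:
    """Coefficients of the terms involving X in the three-way interaction model."""
    b_x: float    # b1: X
    b_xm: float   # b4: X * M
    b_xw: float   # b5: X * W
    b_xmw: float  # b7: X * M * W

    def effect_of_x(self, m: float, w: float) -> float:
        """Conditional effect of X on Y at given values of M and W."""
        return self.b_x + self.b_xm * m + self.b_xw * w + self.b_xmw * m * w

    def moderation_by_m(self, w: float) -> float:
        """Conditional X-by-M interaction at a given W (the 'moderated moderation')."""
        return self.b_xm + self.b_xmw * w


# Hypothetical coefficients, for illustration only.
coefs = ThreeWayCoefficients(b_x=0.5, b_xm=0.2, b_xw=-0.1, b_xmw=0.3)
effect = coefs.effect_of_x(m=1.0, w=2.0)
moderation_low_w = coefs.moderation_by_m(w=0.0)
moderation_high_w = coefs.moderation_by_m(w=2.0)
```

In this sketch, the extent to which M changes the effect of X grows with W whenever the three-way coefficient is positive, which is the pattern described above for high versus low W.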

Results
The model fits the data well: χ² = 34.94, p = 0.089. The CMIN/DF of 1.40 is lower than the benchmark of 5 (Arbuckle and Wothke, 1999); the RMSEA of 0.033 (CI 95% = 0.042-0.075) is below the cut-off value of 0.06 (Hooper et al., 2008) and PCLOSE = 0.86; the SRMR of 0.03 is lower than the threshold of 0.08 (Hu and Bentler, 1999); and finally, the TLI (0.989) and the CFI (0.996) are higher than the acceptable threshold of 0.95 (Hooper et al., 2008). Educational level (on behavioural intention = 0.12*) and age (on behavioural intention = 0.11***) are control variables in the model. As illustrated in Table 3, the results demonstrate that materiality, lighting, sound, and serial scenes seen through bodily movement simultaneously influence people's perception of atmosphere (H1), and that atmosphere determines affective city image (H2). Affective city image has a positive direct effect on behavioural intention (H3).
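Two of the reported fit indices can be cross-checked with a short sketch. It assumes 25 degrees of freedom (not stated in the text; inferred here from the reported CMIN/DF) and a sample of 363 respondents (the 162 visitors and 201 residents mentioned in the abstract, assuming all responses entered the model). The RMSEA formula is the standard point estimate with N − 1 in the denominator.

```python
import math


def cmin_df(chi2: float, df: int) -> float:
    """Relative chi-square: the model chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


CHI2 = 34.94   # reported in the text
DF = 25        # assumption: inferred from the reported CMIN/DF, not stated in the text
N = 162 + 201  # assumption: all visitors and residents were included in the model

ratio = cmin_df(CHI2, DF)
point_estimate = rmsea(CHI2, DF, N)
```

Under these assumptions the sketch gives values that round to the reported CMIN/DF of 1.40 and RMSEA of 0.033.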

Discussion
This paper uncovers the causal relationships between the multi-sensory experience in a space, the atmosphere of the space, the affective city image, and the behavioural intention. The results have confirmed that the more impressed people are with the materiality, lighting, sound, and serial scenes seen through bodily movement in a space, the more impressive the atmosphere generated during the spatial experience will be. The perceived atmosphere affects the affective city image shaped in people's minds and, through it, their behavioural intention. Meanwhile, residential status and income level moderate the influence of sound and of serial scenes seen through bodily movement on the atmosphere.

Theoretical contributions
The findings have emphasized the vital role of multi-sensory spatial experiences and the atmospheres generated in architecture and urban spaces, in terms of affective city image formulation and communication, as well as behavioural intention prediction. The overarching role of atmosphere in spatial experience (Havik and Tielens, 2013a) and the generative roles of the key factors producing the atmosphere (Böhme, 2016), which had previously only been proposed theoretically, are statistically supported in this research. Meanwhile, this research has also statistically demonstrated the effects of multi-sensory architectural perceptions on building up one's affection towards a city and on predicting one's willingness to stay in or migrate to the city, and to recommend the city to others.
The results have also revealed that visitors and residents of various income levels differ in how atmosphere is generated (Figure 2). For instance, low-income visitors tend to perceive a very impressive atmosphere when they find the sound of a space impressive, while the reverse tendency is found for high-income visitors. Low-income visitors also tend to perceive the atmosphere as impressive when they find the serial scenes seen through bodily movement very impressive; this tendency is less obvious for visitors with moderate and high income. For residents, the tendency still exists, but it is more obvious for high-income residents than for moderate- and low-income ones. Meanwhile, the differences across income levels are much less obvious for residents than for visitors. Hence, residential status and income level should be incorporated as crucial factors in city re-imaging from a perspective of multi-sensory architectural and urban experience.

Notes: **p < 0.01; ***p ≤ 0.001. Standardized coefficients are reported, with standard errors for direct effects in parentheses.

Practical implications
To improve the target audiences' affective images of the city, it is viable to strengthen people's sensory impressions of the spaces they visit and use, through architectural design, urban design, urban space management and place representation. In general, to impress people with the atmospheres in architecture and urban spaces, designers and planners can endeavour to produce spaces that are more likely to attract people to visit and use them, grounded in proper design quality, ambient environmental conditions and support structures (Jens and Gregg, 2021), so as to activate motional-visual and haptic experiences within and around the spaces.
To provide people with a deep impression of the motional-visual and haptic experiences, our knowledge about people's perception of the sensory dimensions can be applied jointly in spatial design and management. For instance, materials with good haptic qualities should be chosen in design. Compared to industrially produced materials, natural materials, with their unending variety, typical irregularity and non-conceptual recognizability, offer better haptic qualities (Böhme, 2016). Lighting can be designed to overlay the entire space with an inspiring and dynamic colour-modifying hue that refines the characteristic mood of the space. The acoustic atmosphere or the soundscape (Ge and Hokao, 2005) can also be elaborately designed to endow a building, a square, or a pedestrian precinct with a unique character. According to the existing literature on the soundscape and sound preferences in urban spaces (e.g., Kang and Zhang, 2010; Yang and Kang, 2005), natural sounds, such as birdsong and the sound of water in the form of fountains, springs or cascades, are much preferred by users. Dynamic sound levels can also interest users more than constant ones. Hence, designers can intersperse public spaces with natural elements that generate more natural sounds and create a preferable soundscape.
Further design and management strategies should also be developed separately for different sociodemographic groups. Corresponding to the differences in spatial perception stated in Section 5.1, in order to impress high-income visitors positively with the atmosphere through sound, their preferences regarding the acoustic atmosphere should be further studied so that the current acoustic designs in the spaces they visit most can be modified. In addition, to build a more positive link between the serial scenes seen through movement and the atmosphere for both high-income visitors and residents, it is vital to explore the reason behind the current correlations. Considering that visitors of different income levels diverge much more than residents in this respect, one possible reason is that, in some of the most visited urban areas in Nanjing, traffic delay and overcrowding have been a severe issue, hampering smooth and pleasant spatial perception through bodily movement, whether in a car, on a bus or on foot. Residents who live in the city have better knowledge of the traffic conditions, so they can choose routes more wisely and make their visual experience of space in motion smoother. High-income visitors, who often travel by car, are by contrast more likely to be affected by the traffic issues. Therefore, to improve this condition, local traffic space should be managed to balance the motional-visual and travel demands of both residents and visitors. For instance, specific traffic routes connecting the places most visited by visitors and by residents can be developed and recommended separately to the two groups.
As concluded by Kavaratzis and Ashworth (2005), apart from the physical settings and people's experience in the city, various forms of city representation also play a crucial part in the city brand communication process. These representations can be utilized to strengthen people's impressions of the spatial atmosphere. For instance, personal spatial experiences recorded as video and shared on microblogs can be used to disseminate the multi-sensory characteristics of a public space, overcoming spatio-temporal constraints. The emerging videoblogs posted on social network sites function well in representing and spreading motional-visual and haptic experiences from personal perspectives. To help impress more potential migrants and residents with the multi-sensory spatial qualities through videoblogs, city marketers can choose to collaborate with notable bloggers to share multi-sensory spatial experiences that highlight the haptic, visual, and audible qualities of iconic urban spaces.

Limitations and future studies
Our study has a few limitations. First, this study did not use a representative demographic sample of visitors and residents of Nanjing. Future studies should use more representative samples to increase the reliability of the results. Second, the data collection was conducted only in the city of Nanjing, so the estimated model may be more valid and relevant for this city than for others. Future studies should be conducted among residents and visitors in other cities or countries. Third, as discussed earlier in Sections 5.1 and 5.2, the effects of sound and serial scenes on atmosphere vary between residents and visitors and across income levels. To understand the reasons behind such variations, future studies need to investigate the target groups' preferences regarding the acoustic atmosphere and motional bodily experiences, and reveal the divergences between the current spatial qualities and the preferred ones. Fourth, since the luminous and acoustic environment is variable and situational, people's perception of lighting and sound can vary throughout the day and on different occasions. Future studies need to take into account more factors that influence changes in lighting and sound, e.g., time of day, season and special events.

Conclusion
This article, from a phenomenological perspective, introduces and stresses the vital role of multi-sensory experiences in perceiving the overall impression of a space, i.e., atmosphere, and in shaping the affective city image and the behavioural intention. It offers a broader view that sees the architecture or urban area as an amalgam of built spectacles that attract visual attention and places that make sense to people through bodily haptic, audible, and visual experience in motion. These two perspectives on architecture and urban spaces are inseparable and should be jointly considered in city image formulation and communication. For practitioners, in the design, management and promotion of architecture and urban spaces, the sensory dimensions generating the atmosphere should be holistically considered when strategizing for city imaging and re-imaging.

Declaration of Competing Interest
None.

Note: The factor scores come from the CFA model. CR and AVE have been calculated with the factor scores.