1 Introduction

Cultural consumption has long captured the interest of scholars in different fields, from cultural economics to urban studies to sociology. Starting from the end of the 1990s, and from the UK, cultural institutions have begun to be considered as fundamental components of public policies and programs aimed at reducing social exclusion (Sandell, 1998), creating a policy framework in which culture was officially recognised as an agent for good both at the individual and at the societal level (Belfiore, 2002). Research showed that cultural consumption plays an important role in shaping collective identities and contributes to the implementation of social norms (Hutter, 1996), as well as in shaping personal and social cohesion (Matarasso, 1997). Cultural consumption helps in defining values and behavioural structures (Crociata et al., 2015; Quaglione et al., 2017) and provides benefits on several domains such as employment, education, crime and health (SEU, 2001).

In this context, studying cultural consumption and understanding its determinants acquire relevance given its potential as a tool for both individual and societal well-being, contributing to more inclusive societies (Ferraro et al., 2019). A fundamental initial stage involves investigating the distribution of cultural preferences and their associations with distinct social groups (Katz-Gerro, 2004). An extensive strand of literature explains the relevance of the main determinants of cultural consumption (Alexander, 2003; Bennett et al., 2009; Katz-Gerro, 2002). Amongst these determinants, human capital and income levels have been found empirically to have positive effects on cultural consumption levels (Katz-Gerro, 2002).

Although much has already been said about socio-economic and demographic determinants of cultural consumption, its spatial dimension has been taken less into consideration (Moldavanova et al., 2021). Space has been found to operate as an opportunity structure, with the location where people live influencing levels of cultural consumption through its socio-economic characteristics (Brook, 2016). At the same time, considering that many cultural policy frameworks are based on location-driven rationales, arguing that by increasing people’s proximity with cultural institutions their consumption level will increase, the effects played by space need not only to be investigated through a compositional or contextual dimension, but also through a direct one. Therefore, this paper aims at incorporating academic findings on the factors influencing cultural consumption with a focus on spatial distance and spatial dependence. Specifically, we seek to inquire about the extent to which spatial distance plays a role on neighbourhood’s levels of cultural consumption. Our main research question is: What is the impact of distance from the cultural offer on aggregate levels of cultural consumption? To analyse this issue, we move from the intuition of using human mobility data towards cultural institutions as a proxy for individual cultural consumption levels, moving away from the necessity of having access to survey data to research cultural consumption phenomena. The present paper is based on an original dataset that includes high-frequency location-based data (HFLB), commonly known as GPS data, on a sample of French individuals moving from their homes to visit cultural institutions. Specifically, data are available for six French cities that are Paris, Lyon, Nantes, Rennes, Marseille and Nice for the year 2017.

The paper’s contribution lies both in its methodological innovations and in the new knowledge it delivers on the spatial dimensions of cultural consumption. In terms of method, we introduce three main novelties. First, we show that non-traditional data on human mobility make it possible to overcome the well-known lack of data on cultural phenomena, as well as the potential over- or underestimation biases derived from the use of survey data (Duranti et al., 2005), introducing a data collection method which is easily replicable and scalable in different territorial settings. Second, we are able to investigate the effect of explanatory variables on the intensity of cultural consumption, complementing previous literature which has mostly focussed on probit and logit regression models. Third, we control for spatial dependence by fitting a Spatial Lag model, to reduce the positive and significant spatial autocorrelation in the OLS model’s residuals, showing the territorial importance of the traditional socio-economic determinants of cultural consumption. In terms of findings, our results show that the effects of the socio-economic variables traditionally regarded as significant determinants of cultural consumption propagate across space with a distance-decaying effect and that proximity with the cultural offer, although significant, has a lower effect than other explanatory variables (such as income and education). Furthermore, we find that the relationship between income and level of cultural consumption is nonlinear (inverted U-shape), explained by the fact that in the area of analysis high-income individuals cluster in sub-urban peripheral areas, showing how spatial patterns influence cultural consumption not only directly (proximity) but also through the mediation of other explanatory variables.

The paper is organised as follows. First, we present an extensive literature review on cultural consumption. Second, we introduce the data collected for the analysis and the methods used to compute variables from non-traditional data sources. We then proceed in explaining the research methods, introducing the criteria used to assess spatial dependence in the analysis. Finally, results are presented and discussed, with a particular focus on their implications in terms of cultural policies.

2 Literature review

In cultural economics, demand-side analyses have for long captured the interest of scholars. Given the particular nature of cultural goods, modelling cultural consumption is a difficult task, with many authors suggesting the need to open cultural economics to contaminations from other disciplines (Crociata, 2009). Traditional economic theory is based upon the principle of rational consumers, according to which the marginal utility obtained from a given good diminishes when consumption increases. According to this principle, neoclassical theories of cultural consumption (Becker & Stigler, 1977) consider cultural tastes as immutable throughout time, modelled through a set of exogenous variables. At the same time, other scholars consider cultural goods to be extraneous from this phenomenon, as their consumption is characterised by positive virtuous cycles in which consumption triggers mechanisms of addiction towards this particular typology of goods (Benhamou, 2001). Cultural goods are characterised by a strong information asymmetry, making it impossible to evaluate them before the act of consumption (Trimarchi, 1993). Cultural consumption is a three steps process (Throsby, 1994): (i) immediate satisfaction derived from the act of consumption; (ii) accumulation of knowledge; (iii) accumulation of knowledge and experience affecting future consumption. It is therefore characterised by what has been described as a form of learning-by-consuming (McCain, 1995), where consumption behaviours depend on the amount of past cognitive and perceptive information accumulated by consumers, and tastes are therefore endogenous.

Inside research on cultural consumption, significant efforts have been directed at understanding its determining factors. Research has for long investigated the link between cultural consumption and social stratifications, mostly regarding economic wealth, education, gender and age (Alexander, 2003; Bennett et al., 2009; Katz-Gerro, 2002). At the individual level, empirical evidence suggests that education and household income are consistent determinants of levels of cultural consumption, remaining stable predictors across countries with different cultural policies (Falk & Katz-Gerro, 2015). Human capital, measured as the education level, is found to be positively correlated to the probability of participating in overall cultural activities (Hallmann et al., 2016; Heikkilä & Lindblom, 2023; Muñiz et al., 2011), as well as to the propensity of consuming specific categories of cultural products such as the music one (Favaro & Frateschi, 2007; Prieto-Rodríguez & Fernández-Blanco, 2000). At the same time, the marginal effect played by higher levels of education varies between different typologies of cultural activities, being more relevant for highbrow cultural products (Suarez-Fernandez et al., 2019). Similar strong and consistent relations are found for income measures, suggesting that consumption of cultural goods is higher for higher income individuals (Chen & Tang, 2021; Courty & Zhang, 2018; Ringstad & Løyland, 2006), is price elastic (Lévy-Garboua & Montmarquette, 1996), and that such effect is not the same for all categories of cultural goods (Ringstad & Løyland, 2011). Research has also investigated the effects played by external factors on individual levels of cultural consumption, stressing the positive effect played by the regional cultural infrastructure (Rössel & Weingartner, 2016) or the city’s level of specialisation in the creative economy (Rodríguez-Puello & Iturra, 2022).

Until recently, cultural consumption has mostly been explained through a set of socio-economic indicators (“Appendix 1”—Table 8). Household income, education, gender and age are consistently found to be positively correlated with higher levels of cultural consumption. Research has also focussed on empirically explaining the effect of other variables, such as social status (Chen & Tang, 2021) and price (Ringstad & Løyland, 2006) on individual levels of cultural consumption.

Contextual variables have been explored as well, finding positive correlations between levels of cultural spending and infrastructure and individuals’ cultural consumption levels (Rodríguez-Puello & Iturra, 2022). Generally speaking, there seems to be a theoretical and empirical consensus on the effect played by a set of determinants on patterns of cultural consumption, creating groups of more privileged individuals (e.g., higher income and more educated) that show higher consumption levels of cultural goods (Courty & Zhang, 2018). It is therefore evident the way in which the unequal distribution of several socio-economic factors of social stratification amongst the population not only determines, but also influences patterns of inequality in the observed habits of cultural consumption. In this context, investigating the distribution of cultural preferences and their associations with distinct social groups acquires both research and policy relevance (Katz-Gerro, 2004).

Although much is known on the effect played by socio-economic individual characteristics on cultural consumption habits, there is a dearth of studies on the effects played by spatial dimensions of inequality (Moldavanova et al., 2021), namely on studies that take into account space and the fact that individuals reside in different places (Widdop & Cutts, 2012). Spatial dependence has been found not only to play a role in determining a city's public spending on cultural activities (Getzner, 2021), but also to operate as an opportunity structure for cultural participation through institutional and geographical effects (Brook, 2016). The geographical location in which individuals reside significantly influences their life experiences. Consequently, it becomes apparent that neighbourhood effect theories do not undermine or supplant sociological theories concerning social stratification; instead, they delve into the spatial dimension, investigating the interplay between social phenomena and spatial factors, such as access to amenities and services (ibid). Accounting for spatial dynamics, together with socio-economic ones, allows to better understand the power of clustering and agglomeration of individuals and how such phenomena risks exacerbating patterns of inequalities in which place is not a background element but a fundamental determinant (Cunningham & Savage, 2015).

As Evans and Foord (2008) suggest, many of the current cultural policy frameworks act upon the rationale according to which cultural institutions should be provided based on proximity and transport availability. In recent times, urban planners have shown a significant interest in ensuring the fair distribution of public facilities, operating under the assumption that the location of such amenities correlates with various measures of accessibility (Tahmasbi et al., 2019; Taleai et al., 2014). Scholars have defined urban equity as the extent to which public facilities are distributed evenly across different areas, corresponding to the spatial variations in the need for these services (Omer, 2006). Recent research has started to examine the adverse effects stemming from cultural policy models that have been implemented (Moldavanova et al., 2021; Sacco & Crociata, 2012). The focus has been on addressing concerns related to the equitable accessibility of cultural amenities, aiming to reduce disparities amongst neighbourhoods (Houston & Ong, 2013) and identifying spatial inequality patterns, including the core-periphery one (Moldavanova et al., 2021).

In this context, understanding how space, and spatial distance, determine levels of cultural consumption becomes pivotal for the well-being of urban societies and for the formulation of urban cultural policies aimed at reducing patterns of geographical inequality, which have long been neglected by policy makers (Brook, 2016). Under the rationale that considers cultural consumption as a tool to reduce social exclusion, understanding how space interacts with traditional determinants of cultural consumption becomes imperative for the formulation of equitable cultural policies.

Research in cultural economics, more specifically on determinants of cultural consumption, has to face the well-known lack of data on cultural phenomena. Secondary data on the cultural offer, mostly in terms of the availability of cultural institutions, have been used to explore supply-driven determinants of cultural consumption phenomena (Getzner, 2021; Rössel & Weingartner, 2016). From the demand side, since secondary data with suburban geographic granularity are not available, scholars mostly rely on survey data capturing individual participation in cultural activities (Brook, 2016; Falk & Katz-Gerro, 2015; Getzner, 2020). Recent research has shown the relevance of using big data and machine learning techniques for the development of evidence-based cultural policies, for example by being able to predict indexes of cultural demand according to the local cultural supply (Lee et al., 2022). This paper moves from the intuition of using non-traditional data to measure individual levels of cultural consumption, both overcoming the previously mentioned lack of data and achieving an increased level of spatial granularity that allows studying intra-urban patterns of cultural consumption. Human mobility information, coded into GPS data, by providing evidence on visiting habits and distances (Dong et al., 2020), allows researchers to investigate segregation patterns in a given area by unveiling details that cannot be captured through traditional data (Biazzo et al., 2019) and to understand its spatial distribution at a very granular level (Alessandretti et al., 2020). Furthermore, by cross-referencing mobility data with location and semantic information of a territory’s points of interest (POIs) collected from OpenStreetMap (OSM), it is possible to categorise destinations as either cultural institutions or not. 
Academic interest in Volunteered Geographic Information (VGI) has recently grown due to its perceived potential in adding value to the study of urban life and place representation (Ballatore & De Sabbata, 2020). In this context, OSM data, although subject to significant concerns about their reliability and consistency for the study of spatial consumption inequalities (Rabiei-Dastjerdi et al., 2020), are considered to be quantitatively rich as well as positionally and semantically accurate, with previous research showing their accuracy for cultural statistics (Balducci, 2019). Using GPS and OSM data, we are able to measure human mobility patterns from the residency location of individual users towards cultural institutions, using them as a proxy of cultural participation.

3 Data

The data used for this analysis come from an original dataset provided by Cuebiq, a private human mobility data provider, covering six French cities during a one-year time span (2017). For each record, we have an anonymous ID of the user, their latitude and longitude as well as a timestamp describing the local time and date. All data are collected from users who opted to share anonymised data through a privacy-compliant process. The six cities of the case study are Paris (Île-de-France), Lyon (Auvergne-Rhône-Alpes), Nantes (Pays de la Loire), Rennes (Bretagne), Marseille and Nice (Provence-Alpes-Côte d'Azur). To run the analysis, all individual data are averaged at the most granular spatial area defined by the French national statistics institute (INSEE), IRIS zones, which represent the basic spatial component for the dissemination of sub-municipal data for all municipalities with over 10,000 inhabitants and most of the municipalities between 5000 and 10,000 inhabitants (Table 1).

Table 1 Dataset dimensions

For each user, the dataset contains an anonymous ID as well as the latitude, longitude and timestamp of all recorded positions. Visits are assigned according to Hariharan and Toyama (2004), clustering groups of registered positions within a diameter of \(D_s\) = 200 m for at least a given time \(T_s\) = 600 s through the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. DBSCAN groups data points based on their density in a given neighbourhood, without the need to specify the number of clusters beforehand, and effectively handles clusters of arbitrary shapes. Home locations are assigned as the position where the user spent the most time between 9 p.m. and 8 a.m. across all nights of the week. Only users with at least 10 valid stops registered are retained. Stops are then assigned to points of interest (POIs) using OpenStreetMap (OSM) data, and cultural institutions are manually categorised according to OSM tags (see Footnote 1), following the classification presented in Table 2. The classification adopted derives from an original re-adaptation of existing taxonomies on CCIs and sites of cultural production (DISCE, 2022; Domenech et al., 2011), to fit the analysis of cultural institutions and sites of cultural consumption. For the purpose of this analysis, no distinction is made amongst different categories of cultural institutions, and Table 2 is presented with the aim of showing the OSM tags used to categorise a given POI as a cultural institution.
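The stop and home assignment described above can be illustrated with a minimal sketch. It is not the pipeline used in the paper: it replaces DBSCAN with a simple sequential stay-point rule (consecutive positions within a 200 m diameter lasting at least 600 s), and it assumes timestamps are expressed in seconds of local time. The function names, the `precision` rounding used to group night-time stops, and the rule that a stop counts as a night stop when it starts between 9 p.m. and 8 a.m. are all illustrative assumptions.

```python
from math import radians, sin, cos, asin, sqrt

D_S = 200.0  # maximum spatial diameter of a stop, in metres (value from the text)
T_S = 600.0  # minimum duration of a stop, in seconds (value from the text)


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def detect_stops(points, d_s=D_S, t_s=T_S):
    """points: time-ordered list of (timestamp_s, lat, lon) for one user.

    Returns a list of (start_s, end_s, mean_lat, mean_lon) stops.
    Simplified stay-point rule, used here in place of DBSCAN.
    """
    stops, i, n = [], 0, len(points)
    while i < n:
        j = i + 1
        # Extend the window while the new point stays within d_s of every point in it.
        while j < n and all(haversine_m(points[j][1:], p[1:]) <= d_s for p in points[i:j]):
            j += 1
        window = points[i:j]
        if window[-1][0] - window[0][0] >= t_s:
            lat = sum(p[1] for p in window) / len(window)
            lon = sum(p[2] for p in window) / len(window)
            stops.append((window[0][0], window[-1][0], lat, lon))
            i = j
        else:
            i += 1
    return stops


def home_location(stops, precision=3):
    """Rounded (lat, lon) cell with the longest total duration of night-time stops.

    Assumption: a stop is a night stop if it starts between 21:00 and 08:00 local time.
    """
    night = {}
    for start, end, lat, lon in stops:
        hour = (start % 86400) / 3600
        if hour >= 21 or hour < 8:
            key = (round(lat, precision), round(lon, precision))
            night[key] = night.get(key, 0.0) + (end - start)
    return max(night, key=night.get) if night else None


# Hypothetical trace: 15 minutes in one place shortly after midnight, then movement.
trace = [
    (0, 48.8566, 2.3522), (300, 48.8566, 2.3523), (600, 48.8567, 2.3522),
    (900, 48.8566, 2.3522), (1200, 48.9000, 2.4000), (1500, 48.9500, 2.4500),
]
stops = detect_stops(trace)
home = home_location(stops)
```

The pairwise diameter check is quadratic in the window length, which is acceptable for a sketch but would need a spatial index or DBSCAN proper on a full dataset.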

Table 2 Proposed classification of cultural institutions

With the aim of studying the socio-economic characteristics that determine cultural consumption patterns in a spatial setting, the following variables have been collected moving from the academic literature on the topic:

Frequency of cultural consumption: From the GPS dataset, for each IRIS zone, it is possible to compute the average frequency of cultural consumption (\(f_{iris}\)), or how many times on average each individual \(i\) residing in a given IRIS zone visited a POI classified as a cultural institution:

$$f_{{\text{iris}}} = \mathop \sum \limits_{i = 1}^U \left( {\mathop \sum \limits_{p = 1}^N f_{p, i} /S_i } \right)/U$$
(1)

where \(f_{p,i}\) represents the number of times the user \(i\) visited the cultural institution \(p\), \(N\) is the number of cultural institutions visited by the user \(i\), \(S_i\) is the number of total stops registered in the dataset for the user \(i\), and \(U\) is the number of users in the dataset residing in the given IRIS zone.

The normalisation over the total number of stops registered for each user, \(S_i\), means that the measure expresses the share of cultural trips over total trips rather than the absolute number of cultural trips registered for each user. This normalisation is deemed necessary to reduce the bias due to data coverage, which is not constant across users.
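As an illustration of Eq. 1, the following sketch computes \(f_{\text{iris}}\) from per-user visit counts. The input layout (a list of dictionaries with the keys `iris`, `cultural_visits` and `total_stops`) and the zone labels are hypothetical and chosen only for readability.

```python
from collections import defaultdict


def frequency_by_iris(users):
    """Average share of cultural stops per IRIS zone (Eq. 1).

    users: iterable of dicts with the (hypothetical) keys
        'iris'            -> code of the IRIS zone of the user's home location
        'cultural_visits' -> {poi_id: number of visits f_{p,i}}
        'total_stops'     -> S_i, all stops registered for the user
    """
    shares = defaultdict(list)
    for u in users:
        # Inner sum of Eq. 1, normalised by the user's total number of stops.
        share = sum(u["cultural_visits"].values()) / u["total_stops"]
        shares[u["iris"]].append(share)
    # Outer average over the U users residing in each zone.
    return {iris: sum(v) / len(v) for iris, v in shares.items()}


# Made-up example with two zones and three users.
example_users = [
    {"iris": "zone_A", "cultural_visits": {"museum": 2, "theatre": 1}, "total_stops": 30},
    {"iris": "zone_A", "cultural_visits": {}, "total_stops": 20},
    {"iris": "zone_B", "cultural_visits": {"library": 5}, "total_stops": 50},
]
f_iris = frequency_by_iris(example_users)
```

A user with no cultural visits still contributes a zero to the zone average, which is what keeps the measure from being inflated in zones where few residents visit cultural institutions.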

Spatial proximity with the cultural offer: The proximity indicator adopted (\(P_{{\text{iris}}}\)) is theoretically inspired by the 15-min city concept (Moreno et al., 2021). It measures the distance of the city’s cultural institutions from each IRIS zone’s centroid and is calculated as the average time needed to reach the 20 nearest cultural POIs:

$$P_{{\text{iris}}} = \mathop \sum \limits_{n = 1}^{20} \frac{{t_{{\text{iris}}, n} }}{20}$$
(2)

where \(t_{{\text{iris}}, n}\) represents the time needed to reach the \(n\)-th of the 20 nearest cultural institutions from the centroid of the given IRIS zone. Therefore, \(P_{{\text{iris}}}\) values are the same for all users \(i\) with the home location in the same IRIS zone. The travel time is estimated with the OSRM routing engine by computing optimal walking paths.
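Eq. 2 reduces to averaging the 20 smallest travel times from a centroid. The sketch below assumes that walking times from one IRIS centroid to all candidate cultural POIs have already been obtained from a routing engine; the function name and the input list are illustrative.

```python
def proximity(travel_times_s, k=20):
    """Mean travel time (in seconds) to the k nearest cultural POIs (Eq. 2).

    travel_times_s: walking times from one IRIS centroid to every candidate
    cultural POI, assumed to be precomputed with a routing engine.
    """
    nearest = sorted(travel_times_s)[:k]
    if len(nearest) < k:
        raise ValueError("fewer than k cultural POIs are reachable from this centroid")
    return sum(nearest) / k


# Made-up travel times: 30 POIs between 1 and 30 minutes away, in arbitrary order.
times = list(range(1800, 0, -60))
p_iris = proximity(times)
```

Higher values of the indicator mean that cultural institutions are further away, which matters when reading the sign of its coefficient.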

Quality of public transport: As a measure of the quality of public transport for each location, we adopt the velocity score (Biazzo et al., 2019). The metric is computed for each centroid of the IRIS zones and quantifies the average velocity needed to move away from the area using public transport aggregating information from all the isochrones starting from the chosen point. Isochrones are the group of locations that can be reached from a given point in a given amount of time and are calculated using GTFS static data (General Transit Feed Specification) and algorithms capable of finding the optimal path (in terms of time) using public transport. In general terms, the velocity score indicates the quality of public transport in a given location.

Spatial mappings of the three computed variables have been produced for the city of Paris (“Appendix 2”), as it is the city with the largest amount of data available. The analysis of the geographical distribution of the three variables reveals clear core-periphery patterns, particularly pronounced for the two variables describing access measures: proximity with the cultural offer (“Appendix 2”, Fig. 5) and the velocity score (“Appendix 2”, Fig. 6). The spatial distribution of the average frequency of cultural consumption (“Appendix 2”, Fig. 4), whilst showing on average higher values in core areas, has a less defined pattern, mostly because the underlying data capture real-world mobility phenomena and not theoretical computations.

The other explanatory variables used in the analysis are directly collected from secondary sources and computed as follows (Table 3).

Table 3 Variables description

As shown in Table 3, the data used for the analysis are collected from different sources with varying geometries. Spatial interpolation has therefore been performed to assign variables computed on a hexagonal tessellation grid to the corresponding IRIS zones, by weighting variables from the source dataset according to their overlap with the target dataset’s polygons and afterwards re-aggregating them over the target dataset’s geometries. Table 4 shows the descriptive statistics of the variables used in the analysis, after the removal of outlier values.
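The overlap-weighted interpolation can be sketched as follows. To keep the example self-contained, axis-aligned rectangles stand in for both the hexagonal cells and the IRIS polygons, and the variable is treated as an intensive one (an average, not a count); both are simplifying assumptions, and all names are illustrative.

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def interpolate(target, sources):
    """Area-weighted mean of source values over one target zone.

    target:  rectangle standing in for an IRIS polygon
    sources: list of (rectangle, value) pairs standing in for grid cells
    Returns None when no source cell overlaps the target.
    """
    weighted = [(overlap_area(target, rect), value) for rect, value in sources]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return None
    return sum(w * v for w, v in weighted) / total


# Made-up geometry: two cells each cover half of the target, a third lies outside it.
zone = (0.0, 0.0, 2.0, 1.0)
cells = [((0.0, 0.0, 1.0, 1.0), 10.0), ((1.0, 0.0, 3.0, 1.0), 20.0), ((5.0, 5.0, 6.0, 6.0), 99.0)]
value = interpolate(zone, cells)
```

With real polygons the same logic applies, with the rectangle overlap replaced by a polygon intersection area.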

Table 4 Statistical descriptions

4 Methods

As the aim of this paper is to investigate the extent to which spatial distance plays a role in explaining neighbourhood’s levels of cultural consumption, we measure our dependent variable as the average, for each IRIS zone, of the frequency of cultural consumption of all individuals that reside in the given IRIS zone.

Most of the academic literature on the topic uses either probit (Falk & Katz-Gerro, 2015) or logit (Brook, 2016) models, due to constraints derived from the use of survey data. Having access to a continuous dependent variable, we first explore the relationships between our measures through an OLS model. The model is specified as follows:

$$y_{{\text{iris}}} = \alpha + \mathop \sum \limits_{z = 1}^Z \beta_z x_z + \epsilon$$
(3)

where \(y_{{\text{iris}}}\) represents the level of cultural consumption measured for each IRIS zone, and \(Z\) the total number of covariates used for the analysis. Given the spatial nature of our data, which makes the use of an OLS model problematic, we advance the literature on the determinants of cultural consumption by fitting a spatial autoregressive model. First, a spatial weights matrix is built. The granularity of the geographical data means that socio-economic data are not available for several IRIS zones (mostly in core areas of urban centres, where their spatial extension is smaller), which creates several islands; a distance-based spatial weights matrix is therefore deemed more appropriate than a neighbour-based one. Spatial weights are built by supplementing a neighbours list with weights for four threshold distances: 2 km, 4 km, 8 km and 20 km. These values are chosen as they describe the geographical areas considered in the analysis: 2 km represents the minimum distance necessary to obtain an adequate number of neighbouring values; 4 km is an intermediate value between the neighbourhood and municipal size; 8 km is indicatively the average diameter of the municipalities of the three largest cities analysed (Paris, Marseille and Lyon); 20 km is indicatively the average diameter of the entire urban areas considered. Distance-based weights are built according to the inverse distance-squared weighting (IDW) method (Anselin, 2018), to control for the distance decay effect in accordance with Tobler’s first law of geography (Tobler, 1970). We adopt a distance-decay exponent α > 1 (α = 2; not to be confused with the intercept α of Eq. 3) to account for a faster rate of distance decay given the presence of spatial autocorrelation in our data. Distance weights are therefore specified as follows:

$$w_{ij} = 1/d_{ij}^\alpha \quad \alpha = 2$$
(4)
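A minimal sketch of Eq. 4 with a distance threshold is given below. It assumes centroid coordinates in a projected system expressed in metres and returns raw, unstandardised weights; the function name and the example coordinates are illustrative.

```python
from math import hypot


def idw_weights(coords, threshold_m, alpha=2.0):
    """Dense matrix of inverse-distance weights w_ij = 1 / d_ij**alpha (Eq. 4).

    coords:      list of (x, y) centroids, assumed to be in projected metres
    threshold_m: pairs further apart than this distance get a zero weight
    The diagonal is zero: a zone is not its own neighbour.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if 0 < d <= threshold_m:
                w[i][j] = 1.0 / d ** alpha
    return w


# Made-up centroids on a line, 1 km and 5 km from the first one.
centroids = [(0.0, 0.0), (1000.0, 0.0), (5000.0, 0.0)]
w_2km = idw_weights(centroids, threshold_m=2000.0)
w_4km = idw_weights(centroids, threshold_m=4000.0)
```

Raising the threshold from 2 km to 4 km brings the second and third centroids (4 km apart) into each other's neighbourhood, but with a weight sixteen times smaller than that of a pair 1 km apart.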

Spatial autocorrelation in the OLS model (Eq. 3) is checked through a Moran’s I test on the model’s residuals (Cliff & Ord, 1981). Global standardisation over-represents areas with high connectivity (core areas), whereas row-standardisation over-represents areas with low connectivity (peripheral areas) (Westerholt, 2022). To control for these distortions, we standardise the spatial weights matrix following the variance-stabilising technique suggested by Tiefelsdorf et al. (1999). Given the presence of positive spatial autocorrelation in the model’s residuals, several spatial regression models are specified, namely a Spatial Error model, a Spatial Lag model, a Spatial Durbin model, a Spatial Durbin Error model and a Spatial Autoregressive Combined model. Following a comparison between the Bayesian Information Criterion (BIC) values of the fitted models, a Spatial Lag model is built, specified as follows:

$$y_{{\text{iris}}} = \alpha + \rho Wy_I + \mathop \sum \limits_{z = 1}^Z \beta_z x_z + \epsilon$$
(5)

where \(y_I\) is the set of values of \(y\) from IRIS zones other than \(y_{{\text{iris}}}\), and \(W\) is the distance-based spatial weights matrix, whose entries are higher for \(y_I\) observations that are spatially closer to \(y_{{\text{iris}}}\). \(\rho\) is the spatial autoregressive coefficient.

5 Results

To check the significance and the strength and direction of the effects played by proximity over aggregate levels of cultural consumption, we run an OLS model (Eq. 3) on standardised variables. At the same time, considering the spatial nature of the analysis, and with the assumption that values of other IRIS zones have an impact in the determination of cultural consumption levels of neighbouring areas, we performed the Moran’s test for spatial autocorrelation on the residuals of the OLS Model using four distinct formulations of the distance-based spatial weights matrices.

Results show the existence of significant positive spatial autocorrelation, which decreases as the distance threshold used to build the spatial weights matrices increases. Since we can reject the null hypothesis of the Moran’s I test (no spatial autocorrelation), we conclude that including spatial dependence in our model allows us to remove spatial noise from our estimates of neighbourhood levels of cultural consumption.
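For reference, the Moran’s I statistic can be computed as in the sketch below. It only returns the value of the statistic for a vector of residuals and a weights matrix; the inference step of the test on regression residuals (expected value and variance under the null) is not reproduced, and the chain-shaped weights matrix is a made-up example.

```python
def morans_i(values, w):
    """Moran's I for a list of values and a dense spatial weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                  # centred values
    s0 = sum(sum(row) for row in w)                 # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(zi * zi for zi in z))


# Four zones on a line, each linked to its immediate neighbours (binary weights).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], chain)         # similar values are adjacent
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # dissimilar values are adjacent
```

A smooth gradient along the chain gives a positive value and an alternating pattern a negative one, which is the contrast the test exploits.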

The comparison of the BIC values for the different Spatial models’ specifications (Fig. 1) suggests that a Spatial Lag model represents the best fit for all but one of the spatial weights matrices. Nonetheless, for clarity and readability purposes, we run a Spatial Lag model for all four distance-based matrices (see Footnote 2). Results of the Spatial Lag model show a positive and significant coefficient for the spatial lag term \(\rho\) (rho), consistent with the positive spatial autocorrelation found through Moran’s I test, indicating that the spatial model is a better fit than the OLS one, which is also suggested by the AIC values of the four spatial models.

Fig. 1 Comparison of BIC values amongst different models’ specifications and with different spatial weights matrices. The panels show BIC values across different Spatial models’ specifications for the four distance thresholds adopted to build the spatial weights matrices
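The model comparison relies on the standard definition of the BIC, \(k \ln n - 2 \ln L\), where lower values indicate a better trade-off between fit and complexity. The sketch below applies it to made-up log-likelihoods for three placeholder models; none of the numbers are results from this paper.

```python
from math import log


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: (log-likelihood, number of estimated parameters).
n_obs = 2000
candidates = {
    "model_a": (-1520.0, 8),
    "model_b": (-1490.0, 9),
    "model_c": (-1488.0, 16),
}
scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
```

In this toy case the richest specification has the highest likelihood but is penalised for its additional parameters, so the intermediate model is selected.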

All Spatial Lag models have positive and significant spatial autoregressive coefficients \(\rho\) (rho). Results show that all the variables are significant in determining the level of neighbourhood cultural consumption, in line with previous research on the topic (“Appendix 1”, Table 8). The income variable initially showed an unexpected negative sign. This is explained by the inverted U-shaped curve obtained when plotting the variable against the level of cultural consumption (Fig. 3), and is corrected by adding the quadratic form of income to the model’s specification.

The negative sign initially obtained can be explained by the spatial distribution of income levels in French cities. A visual inspection of Fig. 2, which maps INSEE data on household income by IRIS zone, reveals the concentration of high-income individuals in sub-urban peripheral areas, creating neighbourhoods with high average income that are located in peripheral areas and therefore have lower access to cultural institutions. This geographical distribution creates an inverted U-shaped relationship between income and cultural consumption levels (Fig. 3): cultural consumption increases with income up to a certain cut-off value, after which it rapidly decreases.
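The cut-off value implied by a quadratic specification can be made explicit. Writing \(\beta_1\) and \(\beta_2\) for the coefficients of income and squared income (symbols introduced here for illustration only) and holding the other terms of the model fixed, the marginal effect of income vanishes at the turning point:

$$\frac{\partial y_{{\text{iris}}}}{\partial \,{\text{inc}}} = \beta_1 + 2\beta_2 \,{\text{inc}} = 0 \quad \Rightarrow \quad {\text{inc}}^{*} = - \frac{\beta_1}{2\beta_2}$$

An inverted U-shape corresponds to \(\beta_1 > 0\) and \(\beta_2 < 0\), so that consumption rises with income below \({\text{inc}}^{*}\) and falls above it. Since the models are run on standardised variables, \({\text{inc}}^{*}\) would be expressed in standardised units of income.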

Fig. 2 Spatial distribution of Income in Paris. Source: our elaboration on INSEE data. The Figure shows average households’ disposable income for each IRIS zone in Paris as elaborated by INSEE for 2017. The Figure shows that high income neighbourhoods tend to cluster in sub-urban peripheral areas

Fig. 3 Scatter plot of Income on Frequency of Cultural Consumption, Paris. Note: to increase the readability of the data visualisations, and to reduce the noise of the data at the users’ level, income values (x) have been rounded to create bins of values and y variables have been averaged over each rounded x

Although Fig. 2 provides the spatial distribution of Income exclusively for the city of Paris (chosen for the greater geographical dimension and the higher amount of available data, which make the visual analysis of the spatial distribution more intuitive), the geographical mapping of Income has been explored for all six cities, resulting in similar patterns. Figure 3 shows the scatter plot of Income over the Frequency of Cultural Consumption at the individual level for the city of Paris. The scatter plot includes a curve fitted with the Locally Weighted Scatterplot Smoothing (LOWESS) algorithm, a non-parametric regression method used where the data do not allow linear regression models to produce a good fit (Dobilas, 2020).

Results show that proximity to the cultural offer has a positive effect on the neighbourhood’s level of cultural consumption (Brook, 2016). The negative sign of the proximity coefficient derives from the fact that the variable captures the amount of time necessary to reach the closest cultural POIs, meaning that higher values correspond to lower proximity. At the same time, proximity’s coefficient has the lowest value of all explanatory variables.

Regarding the other explanatory variables, the average education level, used as a proxy of the neighbourhood’s average human capital, has a positive correlation with the level of cultural consumption (Courty & Zhang, 2018; Falk & Katz-Gerro, 2015; Favaro & Frateschi, 2007; Hallmann et al., 2016; Prieto-Rodríguez & Fernández-Blanco, 2000; Rössel & Weingartner, 2016). The same goes for the regional variables, namely the level of per capita cultural expenditure of the regional government and the regional density of cultural institutions, which show a positive correlation with the neighbourhood’s level of cultural consumption (Rodríguez-Puello & Iturra, 2022; Rössel & Weingartner, 2016; Widdop & Cutts, 2012). The quality of public transport has a positive and significant effect as well (Brook, 2016).

When analysing coefficients of a Spatial Lag model it needs to be taken into consideration that the effect of an explanatory variable is the result of two distinct components:

  • Direct effect: the average over all areas of the impact that a unit increase in the explanatory variable X in area j has over y in the same area j;

  • Indirect effect: captures the spatial spillovers and is the impact that a unit increase in the explanatory variable X in area j has over y in all other areas jointly.
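The two components above can be illustrated with a small numerical sketch. It assumes the standard Spatial Lag formulation y = ρWy + Xβ + ε with a row-standardised weight matrix W, in which the effect of a variable is summarised from the matrix (I − ρW)⁻¹β. The symbols ρ, W and β, the function names and all numerical values are assumptions introduced for illustration and are not estimates from the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spatial_multiplier(w, rho, terms=200):
    """Approximate (I - rho*W)^-1 by the series I + rho*W + rho^2*W^2 + ...

    The series converges for |rho| < 1 when W is row-standardised.
    """
    n = len(w)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(terms):
        power = [[rho * v for v in row] for row in mat_mul(power, w)]
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def effects(w, rho, beta):
    """Return (direct, indirect, total) average effects of one variable."""
    s = spatial_multiplier(w, rho)
    n = len(w)
    direct = beta * sum(s[i][i] for i in range(n)) / n   # own-area impact
    total = beta * sum(sum(row) for row in s) / n        # impact on all areas
    return direct, total - direct, total                 # spillover = total - direct


# Hypothetical example: three areas on a line, row-standardised weights
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
direct, indirect, total = effects(W, rho=0.3, beta=1.0)
print(direct, indirect, total)
```

In this toy setting the direct effect exceeds β because part of the impact feeds back to the originating area through its neighbours, while the indirect effect collects what reaches the other areas.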

Both direct and indirect effects of all the explanatory variables validate previous findings in terms of the direction of the effect. Once again, Income and Education have the strongest direct and indirect effects on the explanation of cultural consumption levels whilst the effect of proximity is amongst the lowest. It should be noted that almost 30% of the effects played by explanatory variables are spatial spillover effects (indirect ones), meaning that spatial dependence is present in the studied phenomenon.

6 Discussion

Results show that spatial distance matters in explaining cultural consumption patterns. It does not only matter as a determinant of cultural consumption aggregate neighbourhood levels, measured as the spatial proximity with the cultural offer, but also as the carrier through which other socio-economic variables influence cultural consumption patterns inside cities. By modelling cultural consumption in a spatial setting, we show that the effects played by those factors traditionally regarded as the major predictors of consumption levels are not isolated in space, but they propagate from one neighbourhood to the other with a distance-decaying effect. Furthermore, we show that spatial autocorrelation limits results obtained through a non-spatial regression model, showing that spatiality needs to be taken into account when studying cultural consumption patterns in urban settings.

Similarly to other studies adopting a spatial perspective on the analysis of cultural consumption determinants (Brook, 2016; Widdop & Cutts, 2012), we show that the expected effects played by the traditional socio-economic determinants hold both in the significance and in the direction of the effect. Regression results show that all explanatory variables have a significant effect in explaining aggregate cultural consumption levels in neighbourhoods, and that the effect is positive both for their direct and indirect effects. We show that human capital, proxied by the average level of education, has a positive effect in the explanation of cultural consumption levels, and that so do the regional per capita expenditure in culture and the regional density of the cultural infrastructure. At the same time, the relationship between income and level of cultural consumption takes the form of an inverted U-shaped curve. This is explained, for our specific case study of French cities, by the fact that high-income individuals cluster in sub-urban peripheral areas, creating rich neighbourhoods located far away from the cities’ cultural offer (Fig. 2). The graph (Fig. 3) shows the expected positive effect of income on cultural consumption levels up to a given cut-off value, after which the effect becomes negative. Such a result is interesting from several points of view. First of all, it shows that the relation between income and cultural consumption is nonlinear and that, for high-income neighbourhoods, the level of cultural consumption rapidly decreases. It also suggests that the shape of the curve describing the correlation between income and cultural consumption is highly influenced by spatial factors. For this specific case study of French cities, the inverted U-shaped curve results from the spatial clustering of high-income individuals in sub-urban peripheral areas.

The paper shows that spatial proximity has a significant effect in determining levels of cultural consumption, but also that the intensity of such effect is lower than that of other explanatory variables. Therefore, notwithstanding its role in explaining aggregate neighbourhood levels of cultural consumption (negative and significant coefficient), its effect is amongst the lowest of all explanatory variables, which carries significant policy implications. Considering that urban policy makers have the ability to condition availability and influence accessibility to cultural institutions (Katz-Gerro, 2004) and that the location of publicly provided cultural goods is amongst the most rapidly actionable levers to increase access, many cultural policy regimes act upon proximity and transport availability rationales (Evans & Foord, 2008) to equalise access to culture across cities. Results show that, although greater distance from cultural institutions reduces the levels of cultural consumption, socio-economic determinants have a much bigger impact on the studied phenomenon, arguing in favour of more structural policy approaches.

If we intend to use cultural consumption as a tool to make our urban societies more inclusive, understanding which factors create spatial inequalities in the way different urban locations consume the city’s cultural offer becomes pivotal to formulating efficient evidence-based cultural policies. We show that, in accordance with urban equity theories that assert the existence of a correlation between the location of public facilities and different measures of accessibility (Tahmasbi et al., 2019; Taleai et al., 2014), spatial distance has a negative impact on neighbourhoods’ levels of cultural consumption.

At the same time, we also show that socio-economic factors, mostly in terms of human capital and income levels, have a larger effect in the description of cultural consumption levels. This suggests that structural patterns of inequalities have a stronger influence on patterns of inequalities in cultural consumption, and argues in favour of urban cultural policies that integrate location-driven rationales with approaches aimed at equalising access for socio-economically disadvantaged targets.

7 Conclusions

This research is grounded in a theoretical framework that sees culture and cultural consumption as a powerful policy lever to reduce social exclusion in urban communities. It therefore aims at deepening the understanding of urban cultural consumption phenomena by including the spatial dimension in the analysis of cultural consumption determinants and by exploring the effect played by spatial proximity with cultural institutions over cultural consumption habits.

Previous research has argued for the importance of understanding the role that place and context play in describing differences in cultural consumption, both by showing that neighbourhood effects, particularly at a sub-municipal level of granularity, have a significant descriptive role (Widdop & Cutts, 2012) and by proving that spatial access measures have a positive effect on levels of cultural participation (Brook, 2016). Moving from such results, this paper introduces several methodological and operational novelties to the academic literature on the topic. First, it moves from the intuition of using human mobility data towards cultural institutions as a proxy of individual levels of cultural consumption, in order to overcome the traditional lack of data on cultural phenomena which has so far led to the almost exclusive use of survey data for analysing these phenomena. In this scenario, it is no coincidence that both Brook (2016) and Widdop and Cutts (2012) focus their analysis on the UK, one of the few countries for which cultural consumption data are available at sub-municipal level (via the Taking Part Survey). The operationalisation of human mobility data as a proxy for cultural consumption levels makes it possible, as shown in this paper, to replicate such analyses over different geographical areas without depending on the availability of large-scale survey data.

Second, thanks to the use of GPS data, we are able to analyse the effects played by several explanatory variables on the intensity of cultural consumption levels, integrating previous literature that mostly focussed on probit and logit regression models (Courty & Zhang, 2018; Falk & Katz-Gerro, 2015). Third, we control for spatial dependence in the analysis, showing the existence of positive and significant spatial autocorrelation in the models’ residuals (Table 5) and concluding that Spatial Lag models are a better fit. We therefore show that the traditional socio-economic determinants have a descriptive effect on the level of cultural consumption not only at the individual level but also at the territorial one.
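For readers unfamiliar with the statistic behind the spatial autocorrelation test, a minimal sketch of Moran's I is given below. It assumes the usual definition, I = (n / S0) · Σᵢⱼ wᵢⱼ zᵢ zⱼ / Σᵢ zᵢ², where z are deviations from the mean and S0 is the sum of all weights. The function name, the binary contiguity weights and the residual values are hypothetical, and the inference step (permutation or analytical p-value) is omitted.

```python
def morans_i(values, w):
    """Compute Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                  # deviations from the mean
    s0 = sum(sum(row) for row in w)                 # sum of all spatial weights
    cross = sum(w[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in z)
    return (n / s0) * cross / variance_sum


# Hypothetical example: four areas on a line, neighbours share a border
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)     # similar values sit together
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)   # neighbours always differ
print(clustered, alternating)
```

Positive values indicate that similar residuals cluster in space, which is the pattern that motivates moving from a non-spatial regression to a spatial specification; under the null of no spatial autocorrelation the expected value is −1/(n − 1).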

Table 5 Moran’s I test for spatial autocorrelation on Model 2 residuals

At the same time, although Table 6 shows the results of a Spatial Lag regression model for all the spatial weight matrices, it must be pointed out that Fig. 1 indicates a Spatial Autoregressive Combined (SAC) model as a better fit for the shortest spatial scale (W = 2 km) and that different spatial regression models imply different interpretations. Whilst a Spatial Lag model assumes spatial effects to be propagated over long distances, a Spatial Error model assumes the spatial dependence to be driven by unobserved characteristics that are specific to each location, implying therefore that the spatial effects are localised. The SAC model includes autocorrelation both in the dependent variable and in the error term. From the comparison of BIC values amongst different model specifications and with different spatial weight matrices (Fig. 1), it emerges that the SAC specification is the best fit at the smallest spatial scale (W = 2 km), suggesting a potential change in the nature of the studied phenomenon when explored at smaller territorial scales (Table 7).
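For reference, the three specifications compared here are commonly written as below. This is the textbook notation rather than the paper's own: y is the dependent variable, X the matrix of explanatory variables, W the spatial weight matrix, ρ and λ the spatial autoregressive parameters of the dependent variable and of the error term respectively, and ε an i.i.d. error. The BIC is given in its usual form, with k estimated parameters, n observations and maximised likelihood L̂; lower values indicate a better fit.

```latex
\begin{align}
  \text{Spatial Lag:}   \quad & y = \rho W y + X\beta + \varepsilon \\
  \text{Spatial Error:} \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon \\
  \text{SAC:}           \quad & y = \rho W y + X\beta + u, \qquad u = \lambda W u + \varepsilon \\
  \text{BIC:}           \quad & \mathrm{BIC} = k \ln n - 2 \ln \hat{L}
\end{align}
```

Setting λ = 0 in the SAC model recovers the Spatial Lag model and setting ρ = 0 recovers the Spatial Error model, which is why the SAC specification can be read as combining both forms of dependence.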

Table 6 Spatial Lag models’ results
Table 7 Effects of explanatory variables—W = 2 km

The research has several limitations, mostly derived from the structure of the data used for the analysis. The human mobility dataset used covers only a small portion of the entire population, resulting in a high degree of noise in the model’s dependent variable. This can be visually assessed from the spatial distribution of the frequency of cultural consumption (“Appendix 2”, Fig. 4), which shows less precise spatial patterns than other variables based on other data sources (“Appendix 2”, Figs. 5 and 6). In addition, currently available secondary data on socio-economic determinants become problematic at smaller spatial scales: due to the lack of significant amounts of information, INSEE does not release income and education data for many IRIS zones. This results in the loss of observations for smaller zones, which are mostly concentrated in core urban areas, and therefore biases the analysis in favour of peripheral neighbourhoods. Finally, by using anonymised human mobility data, we are not able to control for the effect played by factors such as gender and age on the studied phenomenon, nor to control for the price of the cultural goods consumed.

Notwithstanding its limitations, the analysis opens the doors to interesting future research on the topic. We demonstrate the possibility of studying cultural consumption phenomena through non-traditional data, overcoming the need to resort to survey data which is often expensive and time-consuming to collect. By introducing the existence of spatial dependence in urban cultural consumption phenomena, we argue for additional research focusing on the effect played by different cultural policies in equalising spatial inequality patterns. Research, thanks to the availability of GPS data on cultural mobility, could focus on predicting the effect of location decisions for new cultural institutions on the consumption of neighbouring residents, providing an important evidence-based tool for urban cultural policy-makers.

To conclude, we show that space matters in the analysis of urban cultural consumption phenomena, not only as one of its determinants (significance of the proximity variable in the regression models) but also as the agent that influences the effects played by other determinants through the existence of spatial dependence in the models used to describe cultural consumption patterns. At the same time, we argue that the effect played by proximity with cultural institutions is not enough to reduce cultural consumption inequalities and that cultural policies need to consider the complexity brought by the stratification and interplay of all variables describing cultural consumption.