Does the built environment shape commuting? The case of Lyon (France)

Is the built environment the most influential factor on travel behavior when compared to individual socioeconomic characteristics? This paper extends the empirical knowledge by providing and comparing quantitative estimates of these various effects on both commuting distance and mode choice in a European city context, using up-to-date and novel methodology. Eight indicators of built and social environment are identified in order to characterize clusters of residential locations, giving a rich view of the spatial and social diversity of locations. To disentangle the causal effects of residential self-selection and built environment, both sample selection and specific matching preprocessing ("coarsened exact matching", a novel approach in the field) are implemented. Regarding commuting distance, the true effect of built and social environment appears modest, with an increase in the range of 10-20%. It comes behind individual socioeconomic characteristics such as car availability and qualification. Regarding commuting mode choice, again the true effect of built and social environment is modest, with a nearly 20%pt increase in car share and around a 10%pt decrease in public transport share for the most prominent effects, and it comes behind car availability. These results suggest the primary importance of directly influencing car use, if not car ownership, in the European context, while trying to modify the built environment would provide only limited results.


INTRODUCTION
Urban sprawl is often viewed as an undesirable development type because of its various harmful consequences for air quality and climate change (through vehicle miles travelled), traffic congestion, physical health and obesity, and public service costs (Ewing and Hamidi, 2015). At least two decades of controversy on the respective advantages of compactness and sprawl have fostered numerous studies, particularly in the transport and land use literature. On the specific topic of climate change mitigation, on the basis of a literature review, Desjardins (2011) concluded that planning's contribution could only be modest. However, the debate is still lively, as shown by the controversy reopened in the Journal of the American Planning Association following the paper by Stevens (2017). The main issue at stake is the potential to moderate vehicular travel intensity in urban areas through changes in the built environment.
This paper analyzes and estimates the respective scales of the influence of built environment (BE) and socioeconomic characteristics (SEC) on travel behavior (TB) in a European context, the Lyon urban area. BE was originally defined through the famous 3Ds of built environment: Density, Diversity, and Design, initially coined by Cervero and Kockelman (1997) and since widely used in North American studies. 3D has grown to 5D with the addition of Destination accessibility and Distance to transit (Ewing and Cervero, 2001). We extend these dimensions to social ones, reflecting additional factors which affect residential location choices. Added to this is the critical concern of residential self-selection (RSS), which would mediate the causal relationship between BE and TB (Guan et al, 2019).
Due to data availability we focus our analysis on commuting behavior of working people. Our research questions are basically the following. Does built environment have the most influential effect on commuting behavior (commuting distance and mode choice) when compared to individual socioeconomic characteristics? How can these effects be computed?
Eight indicators of built and social environment have been identified in order to characterize residential locations: they give a richer view of spatial and social diversity of locations than a unique indicator such as distance from the agglomeration centre or density. RSS is tackled through socio-economic variables attached to the individual and related to the most relevant attributes for residential location choice. RSS is controlled through two approaches, the sample selection approach for commuting distance and the matching approach for mode choice. Regarding matching preprocessing, the use of "coarsened exact matching" is novel in the transport and land use field. We extend the empirical knowledge by providing and comparing quantitative estimates of the various effects, whether BE or individual socioeconomic characteristics, on both commuting distance and mode choice in a European spatial context.
In the first section we give an overview of the literature. The second section sets out the data elaboration and the statistical methods. The third section shows the results while the fourth one discusses them and the fifth one concludes.

LITERATURE REVIEW
The literature on the relationships between built environment and travel behavior is "vast but messy" (Handy, 2017). Moreover, the literature points at the important issue of residential self-selection. Finally, the methodological aspects of this issue are reviewed.

The relationships between built environment and travel behavior
Based on a synthesis of 50 empirical studies Ewing and Cervero (2001) found that trip lengths are primarily a function of the BE and secondarily of socioeconomics (SEC). Mode choices depend on both but probably more on SEC. Pursuing with a meta-analysis of more than 200 studies Ewing and Cervero (2010) found that vehicle-miles of travel (VMT) is most strongly related to destination accessibility, that walking is related to land use diversity, intersection density and the number of destinations within walking distance. Bus and train use are related to proximity to transit and street network design. Population and job densities are only weakly associated with travel behavior (TB) when the other 4Ds are controlled. Both reviews included studies comparing various areas, whether metropolitan areas, districts or neighborhoods in U.S. regions.
By pooling household travel and BE data from 15 various regions in the USA (regional household travel surveys),  found that socioeconomic influence is the strongest even if BE and transit service also influence household travel decisions. Among the 5D the strongest ones are diversity, design and destination accessibility, while the weakest is density.
In a more recent meta-analysis (based upon a meta-regression of 37 studies) focusing on VMT and controlling for self-selection (see below), Stevens (2017) found that compact development (mainly density and distance to downtown) does make people drive less but concluded that the impact is "fairly small". These conclusions reopened the controversy on the effects of BE on travel (see the comments in the same and following issues of JAPA).
However, the previous results are mostly based upon empirical case studies in the USA. On the European side, characterized by denser cities and higher walking and public transport use in general, the literature is more recent. Based on longitudinal analyses of the German nation-wide travel survey, focusing on mode choice, Scheiner (2010) concludes that the increase in car use is mostly explained by the increase in car ownership while BE appears to have an impact on car use for car owners. In The Netherlands, focusing on commuting, Susilo and Maat (2007) show with longitudinal analyses of the Dutch national travel survey that commuting distance (whether by car or rail) is barely influenced by urban forms and travel accessibility but more by job locations in line with individual profiles. In the UK Aditjandra et al (2012) applied structural equation modeling to a panel of respondents who reported residential relocation in the metropolitan area of Tyne and Wear. They found that neighborhood characteristics do influence TB after controlling for residential self-selection. With cross-sectional data on commuting behavior in the Lisbon metropolitan area (Portugal), controlling for self-selection, de Abreu e Silva (2014) found that attitudinal variables significantly influence TB but do not cancel the effects of land-use patterns on this behavior. In Dublin (Ireland) Humphreys and Ahern (2019) observed residential self-selection but not to such an extent that it is more important than land-use factors in determining modal-split characteristics.
Empirical studies on the link between BE and TB are lacking in France, except for two. Pouyanne (2005) explored the interactions between urban forms and travel patterns inside the metropolitan area of Bordeaux. Aguilera (2005) analyzed the increase in commuting distances in the three metropolitan areas of Paris, Lyon and Marseille. However, both studies were conducted at an aggregate zonal level inside the urban areas, and not at the individual level. The debate is lively: it is generally admitted that density is a strong determinant of mobility patterns and that making cities (again) compact will help to reduce car dependence. This position is of course criticized by some authors (see e.g. Charmes, 2010) who point out that the issue at stake is not continuous sprawl but rather the scattering of locations.

The issue of residential self selection
An important issue arises in the causal mechanism linking BE and TB, namely residential self-selection (RSS). RSS hypothesizes that residents may choose a BE that is consistent with their attitudes (AT) and preferences regarding travel modes or land use configuration (for instance moving to a dense and diversified neighborhood in order to be able to walk to various amenities): this would bias the link evidenced between BE and TB and, according to the majority of the literature, overestimate the influence of BE.
Empirical findings regarding RSS are documented through 38 studies (most of them from North America, five from UK, Germany and The Netherlands) in Cao et al (2009) and some of the recent studies previously quoted also address this issue. Cao et al noticed that the effect of BE is statistically significant after taking account of RSS but that RSS's influence is rarely estimated. Mokhtarian and Cao (2008) reviewed the methodologies used in these studies and recommended usage of longitudinal methods. Mokhtarian and van Herick (2016) reviewed ten studies (all comparing locally neighborhoods in various U.S. regions) quantifying the RSS effect and evidencing a great variation of values across studies. Cao and Chatman (2016) underlined that travel preference is only one of the factors influencing residential choices and that the resulting effect of BE and RSS on TB may come from various influences. These include elasticities of travel to BE, the structure of preferences in the population, choice factors other than BE (i.e. social environment or housing type) and the supply of different development types. This explains why the influence of BE on TB can be over or underestimated. In their review of the recent literature on RSS Guan et al (2019) discussed the issue of attitudes measurement. They underlined that beyond travel attitudes and preferences a multitude of subjective factors may contribute to RSS, including life style, values, norms and so on. Moreover, it is not clear whether attitudes are exogenous or endogenous (i.e. influenced by both BE and TB) since there are probably feedback effects from BE and TB to AT.
Moreover, the RSS issue based on travel attitudes is controversial. For instance Ettema and Nieuwenhuis (2017) found in a study on households who relocated recently in three Dutch transit-oriented development locations near The Hague that travel attitude "is insufficient to fully reflect self-selection process". Even Wang and Lin (2019) found no evidence of RSS in their case study in Beijing. In the two Norwegian metropolitan areas of Oslo and Stavanger Wolday et al (2019) found no strong empirical basis in support of controlling for travel-based residential self-selection. For Naess (2014) the RSS issue based on travel attitudes is like a "tempest in a teapot". Following the discussion of this paper by van Wee and Boarnet (2014) it can be concluded that: a) RSS based on socio-economic variables should be distinguished from RSS based on attitudes; b) there is a consensus that the former is important to be included; c) the role of attitudes is not fully understood yet; and d) there is a need to estimate the respective influence of BE and RSS in TB.
Regarding RSS based on socio-economic variables, a recent survey of the literature on residential location choice models (Schirmer et al, 2014) shows that the most significant attributes of location are: a) regarding BE: built density, proximity to transport networks, open space and green area, land use mix (residential and commercial); b) regarding point of interests (POI): education, retail, recreation and sport, transportation facilities, urban amenities; c) for socioeconomic environment: population density, household types (size, working or not, families or not), origin or race, incomes, density of jobs, school quality; and d) for employment accessibility: generally a gravity measure.

Matching and sample selection approaches
When trying to disentangle the causal effect of BE and RSS a randomized experiment would be ideal. However, such kind of experiment is generally infeasible in social sciences and the scholar must generally rely on observational data (Winship and Morgan, 1999). These authors introduced the issue thanks to a conceptual framework distinguishing a "control" group (in our case it could be those living in the city centre) and a "treatment" group (those living in the suburb). In order to compare the outcome (for instance commuting distance) between the "treatment group" and the "control group", the bias resulting from a potential correlation between the outcome variable and the treatment assignment needs to be eliminated. Winship and Morgan reviewed and discussed strengths and weaknesses of various statistical and econometric methods available, that is, regression methods for cross-sectional data -including selection models, matching on propensity score and instrumental variables -and methods for longitudinal data. Mokhtarian and van Herick (2016) outlined and discussed the basics of propensity-score matching and sample selection approaches.
According to Ho et al (2007) preprocessing of data with matching has the interesting potential to remove some of the biases referred to above. By matching pairs of units that are (nearly) identical in observable covariates, one unit assigned to the treated group and the other to the control group, one mimics a randomized experiment. This method is said to reduce "imbalance" between the two groups. In our case for example observable covariates would include individual socioeconomic characteristics (SEC) such as age, gender and so on, and matching would imply finding pairs of individuals with similar SEC, one living in the city centre (control), the other in the suburb (treated). Among matching methods, propensity score matching (PSM) (Rosenbaum and Rubin, 1983) is commonly applied. PSM has been applied in several transport and land use studies (Cao, 2010; Cao and Fan, 2012; Cao and Schoner, 2014).
However, the matching approach assumes that all the selection is on the observables (Ho et al, 2007; Winship and Morgan, 1999), i.e. the probability of being assigned to the treatment condition is only a function of the observed variables. This assumption may not be valid if unobserved variables (such as attitudes or preferences) could be correlated simultaneously with the probability of choosing to live e.g. in the suburbs and the travel behaviour outcome. In this case the sample selection approach (Heckman, 1978) is more appropriate since it also controls for the unobservables. It combines a selection (probit) model with two outcome models (linear regressions), one for each group (control and treated). However, as recalled by Bushway et al (2007) some methodological pitfalls must be avoided in the application of this approach. The main one is the importance of "exclusion restrictions", which means that some variables must be included in the selection equation (they must have a power to predict selection) while they should not be in the outcome equation (and they should not have a strong power of prediction on the outcome). The sample selection model will be applied to a continuous outcome such as commuting distance. Regarding commuting mode choice the relevant outcome model is a discrete choice model (multinomial logit). However, there is no simple analog of the Heckman method for discrete choice models (Bushway et al, 2007). Thus for commuting mode choice the matching approach will be applied before the logit estimation.
However, most matching methods (including PSM) in real applications need repeated trials to obtain a better balance between the two groups because their application in a single use does not guarantee imbalance reduction and hence reduction in bias (Iacus et al, 2012). Coarsened Exact Matching (CEM) is a matching method which helps to avoid these pitfalls (for theoretical aspects see Iacus et al, 2011; for details about the advantages of CEM over many matching methods see Iacus et al, 2012). Measuring imbalance between the empirical distribution of pretreatment covariates of the treated and control group is done with an L1 distance (also called Manhattan distance). For that the covariates need to be discretized (for continuous variables) or recoded (for discrete variables). Through this coding variables' values are "coarsened" (i.e. grouped) so that they have the same meaning according to the researcher assumptions. This is the case for instance for age or household size. Some variables are already coarsened and taken as they are, like gender, or may be recoded like qualification level which is coarsened along main diploma stages. Hence in the following we apply CEM preprocessing.
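The coarsening and imbalance measurement described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper (which relies on the CEM routine in R): the age classes, the toy records and the function names `coarsen_age`, `l1_imbalance` and `cem_match` are all hypothetical, and the weighting step that CEM applies to matched strata is left out.

```python
from collections import Counter

def coarsen_age(age):
    # Hypothetical age classes; the cut points are illustrative only.
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50 and over"

def l1_imbalance(treated, control):
    """Half the L1 distance between the relative frequencies of strata.

    Each argument is a list of strata (tuples of coarsened covariate values).
    The result lies between 0 (identical distributions) and 1 (no overlap).
    """
    f_t, f_c = Counter(treated), Counter(control)
    n_t, n_c = len(treated), len(control)
    strata = set(f_t) | set(f_c)
    return 0.5 * sum(abs(f_t[s] / n_t - f_c[s] / n_c) for s in strata)

def cem_match(treated, control):
    """Keep only the units whose stratum contains both treated and control units."""
    common = set(treated) & set(control)
    return ([s for s in treated if s in common],
            [s for s in control if s in common])

# Toy records (age, gender), invented for illustration.
treated_raw = [(25, "F"), (41, "M"), (63, "M")]
control_raw = [(28, "F"), (35, "M"), (45, "M"), (29, "M")]
treated = [(coarsen_age(a), g) for a, g in treated_raw]
control = [(coarsen_age(a), g) for a, g in control_raw]

before = l1_imbalance(treated, control)
matched_t, matched_c = cem_match(treated, control)
after = l1_imbalance(matched_t, matched_c)
print(before, after)  # imbalance before and after pruning unmatched strata
```

In this toy case pruning the unmatched strata lowers the imbalance; in a full CEM application the remaining difference in stratum sizes would be handled by weights.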

METHODOLOGY
In order to implement the treatment/control methodology we need to define the basis for grouping. Since we are studying the influence of BE on TB it is justified to base the grouping on spatial areas. Thus our empirical strategy consists in:
• first, elaborating "treatment" and "control" groups: this will be done via clustering residential locations;
• second, selecting the variables at the individual level which capture the self-selection phenomenon and will be used with both sample selection and matching (CEM) approaches;
• third, estimating the models, computing the treatment (BE) effects and comparing them with individual socioeconomic effects.
First, we set out the elaboration of our data before detailing in a second subsection the statistical methods implemented.

Data collection and elaboration
Our study focuses on Lyon area, given the availability of data and our specific knowledge of this area acquired through various previous studies. Various delimitations exist in this area, as elaborated by the INSEE. The "urban unit" perimeter is based on the physical continuity of development (it aggregates communes with buildings less than 200 meters from each other), with 127 communes in the case of Lyon in 2020. The "urban area" perimeter is based on a functional approach since it aggregates to the urban unit the communes of which at least 40% of resident working people work in the urban area (these communes are the "periurban ring"). There are overall 498 communes in the Lyon "urban area". Our aim is to cut ourselves from either the "physical" or the "functional" perimeter by choosing a quadrilateral around the Lyon area centre, gathering communes both from the urban unit and the periurban ring. Thus our study area is a 50 x 30 km rectangle around the city of Lyon (France). It contains 123 "communes" (including the Lyon city with its nine "arrondissements" or districts distinguished), of which 20 belong to the periurban ring, with approximately 1.5 million inhabitants including around 750,000 working people (see Figure 1 hereafter). This map shows the main transport infrastructures (divided highways and semi-heavy public transport -subway and tramways), the administrative boundaries of the "communes" (or municipalities) and their density of inhabitants.

Data on commuting
Ideally we would have based our analysis on a mobility survey related to overall daily mobility of the whole population (both households and their individual members), as measured for instance in conventional household travel surveys (HTS). However, the sampling of the HTS in Lyon covers mainly the dense urban area and is sparse in the low-density outskirts. This is why we rely upon another survey focused on (work) commuting only, which has the advantage of a much higher sampling rate and more variability in the representation of various locations in the urban region. Therefore we use an extract from the French national census about the population holding a job (INSEE MOBPRO 2013). This institutional source has a high sampling rate (around 27%) and provides dozens of individual-level variables, including the usual home-work transportation mode.
Focusing the analysis on individual home-work mobility is specifically relevant regarding our research questions. Indeed work activity has some mandatory aspects such as activity scheduling and location which influence travel mode choice and distance travelled from home. These mandatory aspects do not apply to other activities such as shopping, personal matters or leisure, where scheduling and location are freer. This is why it is relevant to perform a separate analysis for commuting behavior. Moreover, as shown by Salon (2015) the influence of BE on VMT depends on travel purpose.
However, it should be noted that we only know some characteristics of the commute, that is to say the "usual" commuting mode and the municipalities of residence and workplace, but not the frequency (i.e. daily or not). Travel mode is distinguished as [Private car], [Public transportation], [Two-wheeled vehicle (bike or motorcycle, not distinguished till now in the census)], [Walking]. Distance between home and workplace is then measured between centroids of the respective communes (computed with IGN shape file GEOFLA) while the intra-commune distance is measured as $d_{ii} = 0.5 \sqrt{S_i}$ where $S_i$ is the surface area of the commune i.
Furthermore, this data source contains some socio-demographic variables related to the commuter (see below). Overall statistics on commuting in the sample (N=201,355 individuals) are given in Table 1.
Figure 1: Map of the Lyon study area

Covariates for clustering communes for treatment and control groups
The "commune" is the smallest administrative subdivision in France. Since most of our land use data are available at this spatial level, this will be the unit of analysis for grouping areas. Thus our measure of "treatment" is based on a typology of communes. Our aim is to elaborate a set of two or more groups of communes across the study area. When arranged by pairs these groups will play the role of "control" and "treatment" for the individuals living within these groups of communes.
Since our study area is heterogeneous from a built environment and social perspective it would be irrelevant to base this typology only on a distance to a city center. The characterization of locations is multidimensional. This is why we use clustering. The choice of variables for clustering results from a compromise between what is indicated by the literature and data availability in our study area (see Table 2). Some of these variables address the potential impact of BE on commuting behavior (Ewing and Cervero, 2001, 2010) and include the Ds as typically measured. Commuting distance is impacted by density and jobs accessibility (see "Destination accessibility" in Table 2) but also by the proximity to transport networks ("Distance to transit", i.e. subway or tramway, and "Distance to main roads") which may favor longer distance commutes through the speed they offer. Commuting mode may be influenced by these proximities to transport networks but also by density and the possibility to walk ("Design", i.e. "walkability" index). "Diversity" reflects the mix of land use through points of interest. To this we add two variables measuring the social environment of the location, one relating to population ethnic origin, the other to income (as pointed out in the literature review above; Schirmer et al, 2014). The eight variables are recapped in Table 2.
Table 2 (dimension: variable)
Density: population density of the commune, i.e. the ratio of population to commune area (in square kilometers).
Distance to transit: distance to transit network, i.e. the distance from the centre of the commune to the nearest heavy public transportation station (subway or tramway).
Distance to main roads: distance to major road network (divided highways), i.e. the distance from the centre of the commune to the nearest road interchange.
Design: "walkability" in the commune (derived from Frank et al, 2010), i.e. the maximal number of 3-way intersections within a floating radius of 1 kilometer in the commune.
Diversity: POI in each commune are available with the "Base Permanente des Equipements" (BPE, 2015). This database records the presence of facilities and services categorized in 27 types covering public services, banks and other commercial services, retail, education, health services and so on. We measure the intensity of POI as the number of the 27 categories for which at least one facility is present in the commune. Thus the variable may vary from 0 (no facility at all) to 27 (all 27 categories of facilities are present in the commune).
Destination accessibility: accessibility to jobs. According to Hansen's indicator of accessibility (1959) it is computed for each commune i as $A_i = \sum_k E_k / d_{ik}^2$ where $E_k$ is the number of jobs in commune k and $d_{ik}$ is the distance between the commune i and the commune k, with $d_{ii} = 0.5 \sqrt{S_i}$.
Social environment: percentage of immigrants in the commune (those who are born as foreigners outside France and living in France, whether still foreigners or naturalized French).
Social environment: median of the "equivalised disposable income" in the commune, i.e. the total income of a household, after tax and other deductions, that is available for spending or saving, divided by the number of household members each weighted according to the OECD equivalence scale (FiLoSoFi, 2015).
Note that by clustering communes of residential location of the individuals, the transit accessibility of the workplace (destination) seems to be ignored. However, by including the "Distance to transit" in the commune of residence, at the origin of the commute, the potential of using transit to access the workplace is dealt with.
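As an illustration of the jobs accessibility indicator, the following Python sketch computes a Hansen-type measure with a distance exponent of 2. It assumes that the intra-commune distance is half the square root of the commune area; the function names and the two-commune toy data are hypothetical and are not taken from the study data.

```python
import math

def intra_distance(area_km2):
    # Assumed internal distance of a commune: half the square root of its area.
    return 0.5 * math.sqrt(area_km2)

def hansen_accessibility(i, jobs, dist, areas):
    """Sum over communes k of jobs[k] / d_ik**2, using intra_distance for k == i."""
    total = 0.0
    for k in range(len(jobs)):
        d = intra_distance(areas[i]) if k == i else dist[i][k]
        total += jobs[k] / d ** 2
    return total

# Toy example with two communes (all values invented).
jobs = [1000, 400]                   # number of jobs in each commune
areas = [16.0, 4.0]                  # commune areas in square kilometers
dist = [[0.0, 10.0], [10.0, 0.0]]    # inter-commune distances in kilometers

access = [hansen_accessibility(i, jobs, dist, areas) for i in range(2)]
print(access)
```

The squared distance in the denominator means that jobs located in the commune of residence and its immediate neighbours dominate the indicator.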
Distances are first computed as Euclidean distances starting from the commune centre defined as the location of the town hall (source Geofla). Then a correction factor of 1.3 is applied to take account of average network design in urban areas. Descriptive statistics of the eight variables are given in Table 3. As shown in Table 3, given their dispersion some variables are transformed in order to approximate a normal distribution: percentage of immigrants, population density and jobs accessibility are transformed into logarithms; distances to transit and main roads, and walkability index are transformed into square root. Even after this transformation the level of correlations between them is high (see Appendix A): density is highly correlated with jobs accessibility, POI 27, walkability and distance to transit; jobs accessibility is highly correlated with POI 27, walkability and distance to transit. This justifies using clustering in order to synthesize these various BE indicators.
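The distance correction and the variable transformations described above could be sketched as follows in Python. The dictionary keys and the sample values are hypothetical; only the 1.3 correction factor and the choice of logarithm or square root for each variable come from the text. Note that the logarithm requires strictly positive values.

```python
import math

DETOUR_FACTOR = 1.3  # correction for average network design, as stated in the text

def network_distance(x1, y1, x2, y2):
    """Euclidean distance between two points, inflated by the detour factor."""
    return DETOUR_FACTOR * math.hypot(x2 - x1, y2 - y1)

def transform_indicators(c):
    """Apply the log and square-root transformations to one commune record."""
    return {
        "log_density": math.log(c["density"]),
        "log_immigrant_share": math.log(c["immigrant_share"]),
        "log_job_access": math.log(c["job_access"]),
        "sqrt_dist_transit": math.sqrt(c["dist_transit"]),
        "sqrt_dist_roads": math.sqrt(c["dist_roads"]),
        "sqrt_walkability": math.sqrt(c["walkability"]),
    }

# Invented commune record (coordinates in km for the distance example).
commune = {"density": 100.0, "immigrant_share": 8.0, "job_access": 250.0,
           "dist_transit": 4.0, "dist_roads": 9.0, "walkability": 25.0}
print(network_distance(0.0, 0.0, 3.0, 4.0))
print(transform_indicators(commune))
```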

Choice of covariates for individual self-selection of residential location
According to the previous literature survey of the most significant location attributes influencing residential location choice, we can identify the main dimensions of individual characteristics which could capture the preferences in location choice: these include life cycle features (age and family size), wealth through income level, social profile, immigrant or not and preference for car mobility.
As suggested in the literature review, coarsened exact matching (CEM) preprocessing is applied (Iacus et al, 2012): imbalance between the treated and control groups is measured with an L1 distance on covariates coarsened as described there (for instance age and household size are grouped into classes, gender is taken as it is, and qualification level is recoded along main diploma stages).
Again we have to make a compromise with the availability of variables in the MOBPRO database. We capture lifecycle stage through individual's age (discretized in classes) and household size. Income is unfortunately not available, so we approximate it through the housing tenure status (owner, social housing tenant, other tenant) and the qualification level (the highest diploma obtained by the individual). Person origin is measured through the status of immigrant or not. Preference for car mobility is captured through the car availability for the individual: we divide the number of cars available in the household by the number of working people and discretize this ratio (zero, less than one, one or more). We also have the variable "Living in a house" which could have been a proxy for preference for a house. However, this variable can be considered as partly a consequence of "treatment" and it should not be included in the matching process (Ho et al, 2007). Statistics over the whole sample are given in Table 4.
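The discretization of car availability can be written as a small Python function. This is a sketch: the function name and the category labels are hypothetical, and only the three classes (zero, less than one, one or more cars per working person) come from the text.

```python
def car_availability(n_cars, n_workers):
    """Class of the ratio of household cars to working people in the household."""
    if n_workers <= 0:
        raise ValueError("the household must contain at least one working person")
    ratio = n_cars / n_workers
    if ratio == 0:
        return "zero"
    if ratio < 1:
        return "less than one"
    return "one or more"

# Example households: (number of cars, number of working people)
for cars, workers in [(0, 2), (1, 2), (2, 2), (3, 1)]:
    print(cars, workers, car_availability(cars, workers))
```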
The same set of variables is used for the selection equation in the sample selection approach (i.e. the set of Z variables, see below).

Choice of covariates for outcome equations
A subset of these variables is selected for outcome equations, the same for commuting distance and mode choice. These are age, gender, housing tenure, qualification and car availability. They form the set of X variables in the outcome equations (see below). "Household size" and "Immigrant" individual status are included in the Z set and not the X set. These two variables are hypothesized to have a strong influence on residential area choice but not on commuting distance, nor on commute mode choice. Thus the condition of "exclusion restriction" is respected.

Statistical methods
In this section the methods for computation of treatment effects are presented, first with the sample selection approach for commuting distance, second with the matching approach for mode choice. Third, the computation of the influence of BE due to RSS is presented.

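Sample selection approach for commuting distance
The model is specified as

$$y_{1i} = X_i \beta_1 + \varepsilon_{1i}$$
$$y_{0i} = X_i \beta_0 + \varepsilon_{0i}$$
$$T_i^* = Z_i \gamma + u_i, \qquad T_i = 1 \text{ if } T_i^* > 0, \quad T_i = 0 \text{ otherwise}$$

The first two equations are outcome equations and $T_i^*$ is a latent variable which generates $T_i$, the observed treatment decision with two values, 1 (treated state) and 0 (untreated state).
The outcome equation may be rewritten

$$y_i = T_i \, y_{1i} + (1 - T_i) \, y_{0i}$$

The average treatment effect for a randomly chosen individual is then

$$ATE = \bar{X} (\beta_1 - \beta_0) \qquad (3)$$

where $\bar{X}$ is the vector of sample means of the X variables for the whole sample.
With the assumption of jointly normally distributed errors, the average treatment effect on the treated is

$$TT = \bar{X}_T (\beta_1 - \beta_0) + (\rho_1 \sigma_1 - \rho_0 \sigma_0) \frac{\phi(Z\gamma)}{\Phi(Z\gamma)} \qquad (4)$$

where $\sigma_1$ and $\sigma_0$ are variance parameters, $\rho_1$ and $\rho_0$ are correlation parameters, $\phi$ is the standard normal probability density function and $\Phi$ the standard normal cumulative distribution function.
In the sample selection approach ATE is the "true effect" of BE on any random individual, when self-selection is controlled (Mokhtarian and van Herick, 2016). TT is the total effect of BE on the treated group. According to equations (3) and (4) TT can be decomposed in two components: the first one is the ATE for treated individuals, which we denote $ATE_T$; the second one is the influence of self-selection, which we denote $SS$:

$$TT = ATE_T + SS, \qquad ATE_T = \bar{X}_T (\beta_1 - \beta_0), \qquad SS = (\rho_1 \sigma_1 - \rho_0 \sigma_0) \frac{\phi(Z\gamma)}{\Phi(Z\gamma)}$$

where $\bar{X}_T$ is the vector of sample means of the X variables for the treated group only.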
This model, also referred to as the "mover-stayer" model, is estimated in two steps with Limdep 11 (Econometric Software Inc.): first the selection (probit) model, to estimate the coefficients of Z, and then the outcome equations by linear regressions. Full information maximum likelihood estimation was discarded because of convergence issues. The two-step method is more robust (Bushway et al, 2007), although standard errors are large.
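Once the two steps have been estimated, the treatment effects follow directly from the coefficients. The sketch below is only an illustration, not the Limdep procedure used here: it assumes the usual switching-regression expressions, takes already-estimated parameters as inputs, and averages the ratio of the normal density to the normal distribution function over the treated individuals (an assumption about how the self-selection term is evaluated). All function and argument names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows):
    """Column means of a list of equal-length covariate rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def treatment_effects(X_all, X_treated, Z_treated, beta1, beta0, gamma,
                      rho1_sigma1, rho0_sigma0):
    """Return (ATE, ATEt, self-selection term, TT) from estimated parameters."""
    diff = [b1 - b0 for b1, b0 in zip(beta1, beta0)]
    ate = dot(mean_vector(X_all), diff)        # whole-sample means of X
    ate_t = dot(mean_vector(X_treated), diff)  # treated-group means of X
    # phi(Z*gamma) / Phi(Z*gamma), averaged over treated individuals (assumption).
    ratios = [
        _STD_NORMAL.pdf(dot(z, gamma)) / _STD_NORMAL.cdf(dot(z, gamma))
        for z in Z_treated
    ]
    selection = (rho1_sigma1 - rho0_sigma0) * sum(ratios) / len(ratios)
    return ate, ate_t, selection, ate_t + selection


# Toy inputs, purely illustrative (not estimates from this study).
ate, ate_t, selection, tt = treatment_effects(
    X_all=[[1, 0], [1, 1]], X_treated=[[1, 1]], Z_treated=[[1, 0]],
    beta1=[2.0, 0.5], beta0=[1.5, 0.25], gamma=[0.0, 0.0],
    rho1_sigma1=0.3, rho0_sigma0=0.1,
)
print(ate, ate_t, selection, tt)
```

By construction TT equals ATEt plus the self-selection term, so a negative term (a "reverse" influence of self-selection) makes TT smaller than ATEt.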
The effect of BE can be compared to the effects of the individual situation, which is measured here by socioeconomic characteristics (SEC, selected previously). The marginal effects of these covariates are computed as follows. For a categorical covariate each category (except an arbitrary base category) is binary. Since the dependent variable (distance) is coded in logarithm, the marginal effect of a category (e.g. being female) compared to the base case (here being male) can be computed as
$$\frac{y_1 - y_0}{y_0} = e^{\beta} - 1$$
where $y_1$ and $y_0$ are the values of the dependent variable when the category is 1 (resp. zero) and $\beta$ is the estimated coefficient of the category.
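As a small illustration of this computation, the sketch below assumes the usual log-linear relation between a dummy coefficient and the relative change in the dependent variable, i.e. exp(beta) - 1. The function names are hypothetical and the numbers are only examples.

```python
import math


def relative_effect(beta):
    """Relative change (y1 - y0) / y0 implied by a dummy coefficient
    when the dependent variable is in logarithm."""
    return math.exp(beta) - 1.0


def coefficient_for(relative_change):
    """Inverse mapping: the coefficient that yields a given relative change."""
    return math.log(1.0 + relative_change)


# A coefficient of ln(2) corresponds to a doubling of the distance (+100%).
print(relative_effect(math.log(2)))
# A +27% effect corresponds to a coefficient of about 0.239.
print(coefficient_for(0.27))
```

For small coefficients the relative effect is close to the coefficient itself, but the gap widens quickly: a coefficient of 0.7 already implies an increase of roughly 100%.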

Matching approach and logit for mode choice
Following Ho et al (2007) the average treatment effect (ATE) is the expected difference in outcome measured over the whole population
$$ATE = \frac{1}{n}\sum_{i=1}^{n} E\left[\,Y_i(1) - Y_i(0) \mid X_i\,\right] \qquad (1)$$
where $i$ denotes the $n$ individuals, $Y_i$ is the potential outcome (e.g. car use) and $X_i$ is the set of individual covariates.
Another quantity of interest is the "average treatment of the treated" (TT) measured on matched individuals in the treatment group, which is expressed as
$$TT = \frac{1}{\sum_{i=1}^{n} T_i}\sum_{i=1}^{n} T_i\,E\left[\,Y_i(1) - Y_i(0) \mid X_i\,\right]$$
where $T_i = 1$ for treated individuals, zero otherwise.
In the matching approach ATE is the total observed influence of BE when self-selection is not controlled for. TT is the "true effect" of BE on the treated group, that is to say when self-selection is controlled for.
Regarding commuting mode choice the relevant outcome model is a discrete choice model (multinomial logit). The CEM matching approach implemented in R (R Core Team, 2019) is applied. The logit estimations are performed with Nlogit 6 (Econometric Software Inc.).
Regarding commute mode choice, the effects of individual SEC can also be estimated. Logit models of commute mode choice are estimated for the treatment group of each of the six pairs of control/treatment in the matching approach. Only individual characteristics are available (with no alternative-specific attributes such as time, cost and so on). They do not vary across alternatives, thus we make them alternative specific. We also include commuting distance (in logarithm) in order to control for its effect on mode choice.
In order to compute the marginal effects of the various dummy covariates on modal shares (e.g. setting gender as male and then female and comparing the results) we use the simulation capability of Nlogit (see Hensher et al, 2015). The changes in percent share come from simulations computing the predicted choice probability times the number of observations in the simulated sample (they are reported only when at least one mode-specific coefficient of the covariate is significant at the 95% level).
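The simulation logic can be sketched as follows. This is not the Nlogit routine itself, only a minimal illustration assuming a multinomial logit with alternative-specific coefficients on individual covariates: predicted probabilities are averaged over the sample with the dummy set to 0 and then to 1 for everyone, and the difference is expressed in percentage points. The mode labels, covariate names and coefficient values are hypothetical.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit probabilities from a dict {mode: utility}."""
    top = max(utilities.values())  # subtract the max for numerical stability
    expu = {mode: math.exp(u - top) for mode, u in utilities.items()}
    total = sum(expu.values())
    return {mode: e / total for mode, e in expu.items()}


def utilities(person, coefs):
    """Alternative-specific linear utilities; a mode with no coefficients is the base (utility 0)."""
    return {mode: sum(b * person[name] for name, b in mode_coefs.items())
            for mode, mode_coefs in coefs.items()}


def predicted_shares(sample, coefs):
    """Average of the predicted probabilities over the sample."""
    totals = {mode: 0.0 for mode in coefs}
    for person in sample:
        for mode, p in choice_probabilities(utilities(person, coefs)).items():
            totals[mode] += p
    return {mode: t / len(sample) for mode, t in totals.items()}


def share_change(sample, coefs, dummy):
    """Percentage-point change in shares when `dummy` goes from 0 to 1 for everyone."""
    off = predicted_shares([{**p, dummy: 0.0} for p in sample], coefs)
    on = predicted_shares([{**p, dummy: 1.0} for p in sample], coefs)
    return {mode: 100.0 * (on[mode] - off[mode]) for mode in coefs}


# Hypothetical coefficients and sample, for illustration only.
COEFS = {
    "car": {"const": 0.5, "cmore1": 2.0, "log_dist": 0.3},
    "public_transport": {"const": 0.2, "cmore1": -0.5, "log_dist": 0.2},
    "walking": {},  # base alternative
    "other": {"const": -1.0},
}
SAMPLE = [
    {"const": 1.0, "cmore1": 0.0, "log_dist": 1.0},
    {"const": 1.0, "cmore1": 1.0, "log_dist": 2.0},
]
change = share_change(SAMPLE, COEFS, "cmore1")
print(change)
```

Because shares always sum to 100%, the percentage-point changes across modes sum to zero: any gain for one mode is taken from the others.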

Computing the influence of BE due to self-selection
Some authors (e.g. Mokhtarian and van Herick, 2016) have explicitly computed a quantity named "built environment proportion" (BEP), defined as the ratio of the true BE effect (when RSS is controlled for) to the total effect of BE. While the numerator is ATE in the sample selection approach, the restriction of the denominator to TT is inappropriate according to Mokhtarian and van Herick (2016). Inverting the labeling of control and treatment would produce different results. It seems more natural to include in the denominator not only TT but also TUT, that is a quantity similar to TT computed on the untreated individuals, in order to obtain the total effect. However, as shown by van Herick and Mokhtarian (2019) this BEP measured in the sample selection approach is equal to 1.0 (footnote 3). Moreover, different methods for measuring the BEP on the same set of data yield substantially different results (van Herick and Mokhtarian, 2020).
Despite this limitation we compute this ratio for commuting distance, and also the influence of BE due to self-selection, that is to say (1 − BEP) × 100, in order to compare our results with those found in the literature (basically those listed in Mokhtarian and van Herick, 2016). Regarding the matching approach, the BEP is computed as the ratio of TT to ATE, following again Mokhtarian and van Herick (2016).
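The two quantities reduce to simple arithmetic, sketched below with hypothetical function names. The true effect is ATE in the sample selection approach and TT in the matching approach, and the total effect is TT and ATE respectively. The figures in the example are the ones quoted later in the discussion (Cao, 2009, and the bounds of our own ATE/TT ratio).

```python
def built_environment_proportion(true_effect, total_effect):
    """BEP: ratio of the true BE effect to the total BE effect."""
    if total_effect == 0:
        raise ValueError("total effect must be non-zero")
    return true_effect / total_effect


def self_selection_share(bep):
    """Influence of BE due to self-selection, in percent: (1 - BEP) * 100."""
    return (1.0 - bep) * 100.0


# Cao (2009): ATE = 25.8 and TT = 33.8, i.e. a ratio of about 0.76.
bep_cao = built_environment_proportion(25.8, 33.8)
print(round(bep_cao, 3), round(self_selection_share(bep_cao), 1))

# Ratios of 0.88 and 0.36 correspond to self-selection shares of 12% and 64%.
print(round(self_selection_share(0.88)), round(self_selection_share(0.36)))
```

A ratio above 1 (true effect larger than total effect) gives a negative share, which is how a reverse influence of self-selection shows up in this measure.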

RESULTS
We first show the typology of communes, followed by the effects computed on commuting distance with the sample selection approach and then on commute mode choice with the matching approach.

Typology of communes
The sample consists of 123 communes, the commune of Lyon being divided into its nine districts ("arrondissements"). The eight variables (see Table 3, some of them transformed) are centered and scaled. A hierarchical clustering (not shown here) indicates a relevant number of clusters around four. We then perform a k-means clustering based on the algorithm of Hartigan and Wong (1979) implemented in R. Clustering is repeated with five random starts with a number of clusters going from three to five. Finally a four-cluster typology is retained with a ratio of between sum of squares to total sum of squares of 65.5% and sizes of 23, 23, 42 and 35 observations (communes). Summary statistics per cluster are given in Table 5 while detailed statistics are given in Appendix B.
Notes to Table 5: * statistics per commune averaged over the cluster; ** for ease of interpretation of the clusters regarding income, jobs accessibility and walkability index, the respective figures are coded in base 100, 100 being the maximum mean observed in one of the clusters.
These clusters are described in Appendix B and shown on the Lyon area map in Figure 2. We give a "nickname" to each cluster, summing up its profile: "Urban centre", "Suburb", "Periurban" and "Deep periurban" respectively. Roughly speaking the communes of the three outer clusters ("Suburb", "Periurban" and then "Deep periurban") are located at successively greater distances from the city centre, a monocentric feature which was expected. However, while the communes of cluster 1 "Urban centre" are contiguous, this is not the case for the three other clusters. This reflects the polycentric feature of the conurbation, with a spatial and social diversity of communes regarding the eight covariates at stake in the clustering.
Table 6 shows the distribution of individuals according to their residence in the four clusters of communes. Since there are four clusters the pairing of clusters in control and treatment groups yields six pairs. The control group is systematically the cluster with the highest level of average density in the pair, the other being the treatment group.
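The clustering step described above (scaling, several random starts, ratio of between to total sum of squares) can be illustrated with the sketch below. It is not the R implementation used here: it swaps in plain Lloyd iterations instead of the Hartigan and Wong (1979) algorithm, scales with the population standard deviation, assumes that no variable is constant, and runs on a toy data set. All names are hypothetical.

```python
import random


def standardize(rows):
    """Centre and scale each column (mean 0, population standard deviation 1)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Mean point of a non-empty list of points."""
    return [sum(c) / len(points) for c in zip(*points)]


def lloyd(rows, k, rng, n_iter=100):
    """One k-means run from random initial centres; returns (labels, within SS)."""
    centres = rng.sample(rows, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centres[j]))
                      for r in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the previous centre if a cluster empties
                centres[j] = centroid(members)
    within = sum(sq_dist(r, centres[lab]) for r, lab in zip(rows, labels))
    return labels, within


def kmeans(rows, k, n_starts=5, seed=0):
    """Best of n_starts runs, with the between/total sum-of-squares ratio."""
    rng = random.Random(seed)
    labels, within = min((lloyd(rows, k, rng) for _ in range(n_starts)),
                         key=lambda run: run[1])
    grand = centroid(rows)
    total = sum(sq_dist(r, grand) for r in rows)
    return labels, 1.0 - within / total  # between SS = total SS - within SS


# Toy data: two well-separated groups of three points.
toy = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]]
labels, ratio = kmeans(standardize(toy), k=2)
print(labels, ratio)
```

Repeating the run from several random starts and keeping the solution with the smallest within-cluster sum of squares guards against poor local optima, which is the role of the five random starts mentioned above.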

Figure 2: Map of the four clusters of communes
In the following the treatment effect is analyzed, first for commuting distance then for commute mode choice.

Commuting distance with the sample selection approach
Table 7 gives the number of individuals and the mean commuting distance (one way) in each cluster. The detailed results for the six series of sample selection models, involving for each pair of clusters a binomial probit model (selection) and two linear regression models (one for the treated and one for the control group), are given in Appendix C. Table 8 below recapitulates the various treatment effects which are computed from model results. ATE, which is the true BE effect, is an estimate of the increase in daily commuting distance (one way) for a randomly selected commuter. For instance (see 4th column of Table 8), for pair 1-2 a commuter is expected to increase his or her one-way commuting distance by 1.73 km on average when living in cluster 2 ("Suburb") compared to living in cluster 1 ("Urban centre").
TT is an estimate of the increase of daily commuting distance for a commuter living in an outer cluster (treatment) when compared with a commuter living in the inner cluster (control), all other covariates being constant. It aggregates the true effect of BE and the effect of residential self selection. TT varies between 1.42 km (pair 2-3) and 5.83 km (pair 1-4).
ATEt is the true effect of BE on the treated. It is mostly similar to ATE, but computed on individuals belonging to the treated group. The ratios of ATEt to mean distance in the treated group show that the influence of BE on commuting distance for those individuals amounts to about 10-20% of this distance.
The outlier results of ATE and TT for pair 3-4 lead us to assess the robustness of the estimation to varying sets of covariates. One of the two "exclusion restriction" covariates is "Immigrant", which is significant in the (probit) selection equation in the initial estimation. If we exclude this variable from the estimation we obtain a huge value for ATE expressed in km (the constant of the least squares regression model for treated cases is around 16, where the dependent distance variable is coded in logarithm). The same kind of variation occurs, as a test, for pair 1-2, where both "exclusion restriction" covariates are significant in the initial probit model: for instance, excluding the "Immigrant" covariate produces an ATE of 4.76 km and a TT of 2.03 km. We come back to this point in the discussion section.
The marginal effects of individual SEC are computed and can be compared to the BE effect for the treated groups (ATEt). They are shown in Table 9 and illustrated in Figure 3.

Commute mode choice with the matching approach
Table 10 gives the shares of commuting travel modes in each cluster. As explained previously the matching approach is applied before the analysis of the "treatment effect" on commuting mode choice. The CEM algorithm implemented in R is used (Iacus et al, 2009). This algorithm performs exact matching of individuals on the six individual covariates selected above (household size, age, housing tenure, car availability, immigrant status and qualification, see Table 4), by first sorting all the individuals into strata, each of which has identical values for the six covariates. Then individuals from one group who cannot be matched with anyone from the other group are discarded. Since we have enough data we opt for a k-to-k solution, which gives the same number of treated and control units within each stratum. Table 11 below shows the result of k-to-k matching for each of the six pairs. The percentage of matched individuals in each treatment group is close to 100%.
Table 12 gives for each pair of clusters the number of matched observations (equal for the control and the treatment group given the k-to-k matching) and the shares of commuting travel modes in each group of matched individuals. Then it gives the TT (the difference in shares of the four travel modes, in percentage points, between the two matched groups), which is the true BE effect; the ATE, i.e. the observed difference in travel mode shares between the two initial groups (all individuals, including those not matched); the ratio of TT to ATE (which is the BEP); and the influence of BE due to RSS.
Besides the treatment effect in the treated group, the effects of individual SEC covariates can also be estimated. The results of the six logit mode choice models and the marginal covariate effects are given in Appendix D. They are represented as six bar charts in Figure 4.
For instance, the first bar chart (top left) shows for Pair 1-2 the change in percentage points for each of the four mode shares induced by a change in dummy covariates. A change in car equipment level "Cmore1" from zero to one (there is at least one car per working person in the household) induces a 52%pt increase of car share in commuting and a 39%pt (resp. 8%pt) decrease in public transport (resp. walking) share. These levels of impact of car equipment are estimated independently of other covariates and can be compared to the "treatment of the treated" effect (TT, already shown in Table 12), i.e. the BE effect for those living in cluster 2 ("Suburb"), when compared to those living in cluster 1 ("Urban centre").
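The k-to-k exact matching and the resulting TT on mode shares can be sketched as follows. This is not the R implementation of CEM: it assumes covariates that are already categorical (so no coarsening step is shown) and, as a simplification, keeps the first k units of each group in a stratum where a real implementation may prune at random or by distance. Field names and data are hypothetical.

```python
from collections import defaultdict


def k_to_k_match(individuals, covariates):
    """Exact matching on `covariates`, keeping equal numbers of treated
    and control units within each stratum."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for person in individuals:
        key = tuple(person[c] for c in covariates)
        group = "treated" if person["treated"] else "control"
        strata[key][group].append(person)
    matched = {"treated": [], "control": []}
    for groups in strata.values():
        k = min(len(groups["treated"]), len(groups["control"]))
        # Simplification: keep the first k units of each group in the stratum.
        matched["treated"] += groups["treated"][:k]
        matched["control"] += groups["control"][:k]
    return matched


def mode_shares(people, modes):
    """Mode shares in percent."""
    n = len(people)
    return {m: 100.0 * sum(p["mode"] == m for p in people) / n for m in modes}


def treatment_on_treated(matched, modes):
    """TT per mode: matched treated share minus matched control share (%pt)."""
    t = mode_shares(matched["treated"], modes)
    c = mode_shares(matched["control"], modes)
    return {m: t[m] - c[m] for m in modes}


# Toy data: only the stratum (car=1, age="30-44") contains both groups.
people = [
    {"treated": False, "car": 1, "age": "30-44", "mode": "public_transport"},
    {"treated": False, "car": 1, "age": "30-44", "mode": "car"},
    {"treated": True, "car": 1, "age": "30-44", "mode": "car"},
    {"treated": True, "car": 0, "age": "30-44", "mode": "walking"},
    {"treated": False, "car": 0, "age": "45-59", "mode": "walking"},
    {"treated": True, "car": 1, "age": "30-44", "mode": "car"},
]
matched = k_to_k_match(people, ["car", "age"])
tt = treatment_on_treated(matched, ["car", "public_transport", "walking"])
print(len(matched["treated"]), len(matched["control"]), tt)
```

Individuals whose stratum contains no unit of the other group are discarded, so the matched groups are balanced on every matching covariate by construction and the remaining difference in shares is attributed to the treatment.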

DISCUSSION
The results are discussed first for commuting distance, then for mode choice. They are then put in perspective with the empirical literature. Finally, the limitations of this study are discussed.

Commuting distance
Regarding commuting distance, with the sample selection approach, the true BE effect measured by the ATE appears meaningful (see Table 8), especially when comparing "Urban centre" to "Suburb" (an increase of 31%), "Periurban" (40%) or "Deep periurban" (37%), setting aside the outlier result of pair 3-4 (see below). The true BE effect on the treated (ATEt), i.e. on those living in outer clusters, is also meaningful but lower, and remains modest since the increase of the mean distance in the treated group ranges from 12% to 22%. ATE is lower than TT in general, evidencing some self-selection. The influence of BE due to RSS varies widely: it goes from 12% for the pair comparing "Periurban" to "Suburb" to 64% for the pair comparing "Deep periurban" to "Urban centre" (excluding the outlier results of pair 3-4, see below). However, we find a case of reverse influence of self-selection, that is for the pair "Suburb" - "Periurban" (Pair 2-3 in Table 8). This can be related to the concept of residential neighborhood type dissonance, where a person is considered dissonant if his or her current residential neighborhood does not coincide with his or her preferences regarding land use or travel (Schwanen and Mokhtarian, 2005; for a recent discussion of this issue see Guan et al, 2019). However, as pointed out by Cao and Chatman (2016), travel preference is only one of the factors influencing residential choices, which could explain the lower overall effect of BE (including RSS) when compared to its true effect.
When compared with individual socioeconomic characteristics, the true effect of BE on the treated (ATEt, see Table 9 and Figure 3) never comes first, but after qualification level and car availability (at least one car in the household). When the treated cluster is 2 "Suburb" or 3 "Periurban" the effect of car availability (one or more per commuter) amounts to a 27%-36% increase of commuting distance. It is much higher when treated cluster 4 "Deep periurban" is considered, with a range of 80-100% (that is, up to a doubling of commuting distance).
A limitation regarding the application of the sample selection approach is shown through the outlier results of ATE and TT for the pair "Periurban" - "Deep periurban" (Pair 3-4 in Table 8). We assessed the robustness of the estimation to varying sets of covariates and found large variations in the estimation of the treatment effects. This point was also raised by Zhou and Kockelman (2008). Such results may indicate limits in the specification of the model, where other more relevant variables might have been included (depending on their availability). Overall one cannot exclude the potential for "omitted variables". This could also explain why there are such large variations of treatment effects measured in the empirical literature.

Commute mode choice
First, the largest TT effects (the BE true effects in the matching approach) are observed for the three pairs where cluster 1 ("Urban centre") is the control group (see Table 12). For these pairs the BE produces an increase of car share between 17.1 and 19.0 %pt, a decrease of public transport share between 9.2 and 10.9 %pt and a decrease of walking share of around 5 %pt. This is meaningful, though not large. Moreover, regarding again these three pairs, the influence of BE due to RSS is also meaningful: from 19 to 45% for walking, 27 to 53% for car and 50 to 62% for public transport.
Regarding the three outer clusters, that is to say clusters 2 ("Suburb"), 3 ("Periurban") and 4 ("Deep periurban"), the variations in modal shares between them are much smaller, and thus the TT-BE effect is markedly lower.
When comparing the TT-BE effect with individual socioeconomic characteristics, the overall picture is that for each pair of control/treatment groups the impact of car availability "CMore1" (at least one car per working person in the household) is the highest of all effects (see Figure 4), with a range of 19%-52%pt. The level of car availability "CLess1" (non-zero but less than one car per working person) comes second for pairs 1-2, 2-3 and 2-4, with an overall range of 6%-20%pt. The next highest effect is the TT-BE effect. Then come the socioeconomic covariates, with gender and age categories as the most prominent effects.
Moreover, for the last three pairs (2-3, 2-4 and 3-4) comparing "Suburb", "Periurban" and "Deep periurban", while car availability remains the highest effect, the TT-BE effect comes this time after the effects of individual socioeconomic covariates (such as gender or age).
Overall, the BE has a meaningful effect in favoring car share to the detriment of public transport and walking, when moving from an urban centre characterised by density of jobs and high supply of public transport. However, car availability effect is in the forefront, and far above the BE effect when considering full car availability. The BE effect comes second for the three pairs comparing "Urban centre" to the three outer clusters.

From the point of view of the empirical literature
Our results are in line with Stevens' (2017) conclusion since we find a moderate impact of BE on commuting distance (10-20%). Specifically regarding distance, our results are opposite to those of Ewing and Cervero (2001) since we show that SEC influence commuting distance more than BE does.
Regarding mode choice, our results are partly in line with Scheiner (2010) in the case of Germany, who shows the critical effect of car equipment whatever the distances travelled but also the strong effect of BE, reducing car use in cities opposed to suburbs and rural areas.
Regarding the relative scales of treatment effects, two studies using the sample selection approach can be referred to. Zhou and Kockelman (2008), when comparing Suburb/Rural (treated) to CBD/Urban (control) in Austin (Texas), obtained 58% for the ratio of ATE to TT. From Cao (2009), comparing four traditional (control) and four suburban neighborhoods (treatment) in Northern California, the ratio ATE/TT (25.8/33.8) can be computed as 76%. Our ratio ATE/TT (which can be inferred from Table 8) goes from 36% to 88% (excluding the outlier result of pair 3-4 and the reverse result for pair 2-3). Thus our results are broadly in line with the two previous studies in a US context.
Finally, regarding the influence of BE due to RSS, our range of values between 12% and 64% computed for commuting distance with the sample selection approach is comparable to the range of 10 to 42% reported in the review by Mokhtarian and van Herick (2016).

Limitations of this study
A first limitation is that using car availability as an exogenous variable in a model linking BE and TB may induce a simultaneity bias (a particular form of endogeneity bias as identified by Mokhtarian and Cao, 2008). Indeed, car ownership plays a mediating role between BE and car use, as shown by Van Acker and Witlox (2010), Cao et al (2007) or Ding et al (2017) through structural equation modeling. BE and SEC influence car ownership, and thus indirectly travel choices, while BE also influences travel choices directly. However, both our matching and sample selection approaches include car availability in the selection covariates and thus control for the mediating role of car ownership. In this way we believe we have minimized the risk of this bias.
A second limitation is inherent to the available data: travel attitudes are not considered, because such data are not available in our study. However, by implementing the sample selection approach such "unobservables" are controlled for. Regarding mode choice the matching approach reduces the biases from self-selection, although not as thoroughly as the sample selection approach. Yet preference for car commuting is partially captured in the matching through the car equipment variable.
Third, the effect of each built attribute cannot be analysed separately, due to correlation between most of them. This reflects the current spatial structure of the city and its past urban policy (e.g. transit infrastructure is supplied where density is high). With clustering we believe we have overcome this issue. Moreover, the clustering we propose is an improvement over studies which only consider the distance to the city centre.
Fourth, using only commuting travel data may limit the scope of the results since the BE can influence other trips, e.g. shopping or leisure trips.
A further limitation is that the clustering of communes obtained with covariates measured at the level of the commune is subject to the Modifiable Areal Unit Problem (MAUP; Openshaw, 1984). This issue is exacerbated by the fact that data are so far available only at the commune level, which is an especially coarse division of geographic space. Associated with the MAUP are both a scale effect, given the surface areas of the communes (which also affects the measurement of home-work distance), and a zone effect which depends on the boundaries of communes. Alternative aggregations, whether at finer levels or with different boundaries, would probably give different outcomes in clustering. Such an alternative was not possible in our case study given the dependency on administrative data provided at the commune level. This issue is also raised by Guan et al (2019), for instance regarding the possibility that people may value regional accessibility (relevant for job access in our application to commuting) more than neighbourhood-level accessibility in residential choice. This issue is left open. However, the application of our methodology to other urban areas would help to test the robustness of the results regarding the relative effects of BE and individual socioeconomic characteristics in various spatial contexts.

CONCLUSION
The links between built and social environment on the one hand and commuting behavior on the other hand have been explored with a control of residential self selection, thanks to publicly available data. This makes this study reproducible in other areas with similar data availability.
Regarding commuting distance, the true effect of built and social environment appears modest, with an increase in the range of 10-20% of this distance when comparing e.g. "urban centre" and "periurban". This comes behind some individual socioeconomic characteristics such as car availability and qualification. Car availability produces an increase of commuting distance in the range of 27%-36% when moving to suburb or periurban and even as high as 100% in the deep periurban. Regarding commuting mode choice, again the true effect of built and social environment is modest, with a nearly 20%pt increase of car share and around 10%pt decrease of public transport share for the most prominent effects. It comes behind car availability, especially when there is at least one car per working person in the household, with an increase in car share ranging from 20%pt to 50%pt. However, these results are valid subject to the limitations raised previously, i.e. measurement issues and MAUP.
When it comes to policy implications, in the European context, these results suggest the importance of directly influencing car use, if not car ownership, while trying to modify the built environment would provide only limited results, at least from the point of view of commuting behavior. The policy measures are known: parking policy, carpooling with reserved high-occupancy lanes, restriction of road capacity devoted to car traffic, or even road or congestion pricing. In cities where housing is expensive and the property market rather seized up, trying to bring workers' homes closer to their job locations would appear rather futile.
One perspective of this study would be to expand the analysis to other European urban areas with comparable available data.

Description of the clusters
Cluster 1 ("Urban centre") groups together 23 communes with a high population density (6,700 inhab./km² on average), including the two main communes (the nine districts of Lyon and Villeurbanne) and the largest adjoining communes. The immigrant level is the highest (16% compared to an overall average of 8%). The median disposable income is 76% of the maximum (observed in cluster 3). These communes benefit from the maximum jobs accessibility and on average nearly the whole set of equipments (24 out of the 27 POI categories). They have the best walkability (maximum index) and their center is close to main road interchanges (less than 2 km on average) and to the heavy transit network (3.1 km on average with a minimum of 570 m).
Cluster 2 ("Suburb") groups together 23 communes with one fifth of the average density of cluster 1 and includes average sized communes located mostly in the second ring of suburbs. The immigrant level is also above the average (11%) and the median disposable income is 81% of the maximum. These communes have a markedly lower jobs accessibility (23% of the maximum) but a good set of equipments (19 on average). They have a good walkability (72% of the maximum) and their center is close to main road interchanges (2 km) but remote from heavy transit network (nearly 14 km on average).
Cluster 3 ("Periurban") groups together 42 communes with low density (one tenth of cluster 1) and includes average sized communes beyond the second ring of suburbs. The immigrant level is lower than the average (6%) and the median disposable income is at the maximum. These communes have a low jobs accessibility (16% of the maximum) and a moderate set of equipments (15 on average). They have a moderate walkability (49% of the maximum) and their center is close to main road interchanges (about 3 km) but very remote from heavy transit network (more than 16 km on average).
Cluster 4 ("Deep periurban") groups together 35 communes with very low density (one thirtieth of cluster 1) and includes mainly small communes in the peri-urban area. The immigrant level is the lowest (4%) and median disposable income is near the maximum. These communes have a very low accessibility to jobs (6% of the maximum) and a poor set of equipments (10 on average). They have a poor walkability (26% of the maximum) and their center is reasonably close to main road interchanges (about 4 km) but very remote from the heavy transit network (more than 26 km on average).