Optimising measurement of health-related characteristics of the built environment: Comparing data collected by foot-based street audits, virtual street audits and routine secondary data sources

The role of the neighbourhood environment in influencing health behaviours continues to be an important topic in public health research and policy. Foot-based street audits, virtual street audits and secondary data sources are widespread data collection methods used to objectively measure the built environment in environment-health association studies. We compared these three methods using data collected in a nationally representative epidemiological study in 17 British towns to inform future development of research tools. There was good agreement between foot-based and virtual audit tools. Foot-based audits were superior for fine detail features. Secondary data sources measured very different aspects of the local environment that could be used to derive a range of environmental measures if validated properly. Future built environment research should design studies a priori using multiple approaches and varied data sources in order to best capture features that operate on different health behaviours at varying spatial scales.


Introduction
The role of the neighbourhood built environment in influencing diet, physical activity and health outcomes across the life course has received considerable attention in public health research and policy (Calogiuri and Chroni, 2014;Caspi et al., 2012;Charreire et al., 2010, 2014;Christian et al., 2015;de Vet et al., 2011;Ding and Gebel, 2012;Ding et al., 2011;Dunton et al., 2009;Goodwin et al., 2013;King, 2015;Saelens and Handy, 2008;World Health Organization, 2010). The built environment has been defined as the physical environment constructed by human activity (Saelens and Handy, 2008), and built environment-health research ideally aims to capture and understand the impact of both contextual (i.e. nature of the area such as access to services) and compositional factors (i.e. nature of residents reflecting the collective social functioning of an area) on people's health behaviours (Cummins et al., 2007;Macintyre, 2007;Macintyre et al., 2002).
Potentially relevant built environment factors have been studied using objective measures involving the use of primary and secondary spatial data (Thornton et al., 2011) and there is an extensive body of research using a range of different data collection approaches (Brownson et al., 2009;Charreire et al., 2010;Krenn et al., 2011;Schaefer-McDaniel et al., 2010). Much research in this area was initially driven by the availability of secondary data (Macintyre et al., 2002). Such routine data combined with Geographical Information Systems (GIS) can be used to construct environmental measures, including density, spatial availability and walkability indices, and to undertake spatial analysis and modelling to examine the impact of the neighbourhood environment on people's health behaviours and outcomes (Caspi et al., 2012;Leslie et al., 2007). However, routinely available spatial data have well-recognised limitations, including problems with the use of administrative boundaries to define neighbourhoods and the limited types of environmental exposures that can be investigated (Cummins et al., 2005;Lucan, 2015). Secondary data sources may also give rise to issues of specificity (i.e. the proportion of shops that are correctly identified as being specific types for an analysis e.g. retail food outlets (Fleischhacker et al., 2013)) and misclassification of environmental exposures. When possible, neighbourhood audits should be conducted to confirm the validity of routine data sources (Cummins and Macintyre, 2009;Fleischhacker et al., 2013;Lucan et al., 2013;Pliakas et al., 2014). Systematic neighbourhood audits using foot-based or virtual street audit tools, such as Google Street View (GSV) and Bing Maps (or Microsoft Virtual Earth), have been used to collect primary data on factors theoretically relevant but not available in existing routine data (Brownson et al., 2009;Shareck et al., 2012;Wu et al., 2014).
The majority of tools available, especially using GSV, have been developed for specific North American towns or for the UK (Griew et al., 2013;Wu et al., 2014). Neighbourhood audits involve direct on-foot or virtual observations by trained observers who use checklists to observe and rate physical and social attributes of neighbourhoods. The geographic unit of recorded observation in audits is the face block (e.g. the block segment on one side of a street (Clarke et al., 2010;Sampson and Raudenbush, 1999)) or street segment. Street segment measures are typically developed in GIS by dividing the street network within the study area into road sections termed 'links' (Bethlehem et al., 2014;Griew et al., 2013) or by generating intersection to intersection segments (Badland et al., 2010). Some current audit instruments can be found at http://activelivingresearch.org/. Designing systematic audit tools that have measurement validity, reliability and specificity relevant to both health outcomes of interest and the context of a study has been identified as an important area of methodological research (Shareck et al., 2012;Zenk et al., 2007).
Using GSV to conduct virtual street audits is easy, cheap and safe, and it is also transparent because it is available to the general public, public health and planning researchers and practitioners. By contrast, systematic foot-based street audits are relatively expensive and time consuming (Badland et al., 2010;Ben-Joseph et al., 2013;Wu et al., 2014). The growing use of GSV audits to capture exposures relevant to physical activity and food environments has prompted some studies to conduct comparisons with foot-based audits (King, 2015). However, the majority of these comparisons have focused on a limited number of environmental dimensions.
Detailed data collected by direct observation can produce valuable information for those who can act on the findings, such as urban and transport planners and policy makers (Brownson et al., 2009). Most foot-based audit studies have focused on single risk factors, e.g. physical activity (Lee et al., 2005;Pikora et al., 2002) or diet (Saelens et al., 2007) or occasionally a combination of the two (Bethlehem et al., 2014), despite the complex, multifactorial aetiology of cardiovascular and other chronic diseases. To date there are no tools simultaneously capturing dimensions of the built environment that may be relevant for influencing multiple health behaviours (e.g. diet, physical activity and alcohol intake) that together contribute to improving complex population health outcomes such as obesity.
This paper aims to explore objective measurement approaches for health related aspects of the built environment by comparing built environment data captured by secondary data sources, foot-based and GSV audits. To our knowledge, no other study has simultaneously compared primary data from foot-based audits with remote-sensing virtual street audits and secondary data sources. This methodological comparison aims to enable researchers to make better informed decisions in the design and analysis of large scale epidemiological studies of the effect of the built environment on health outcomes.

Audit tools
We developed a new instrument, the 'Older People's Environments and CVD Risk' (OPECR) tool, initially as a foot-based audit tool to capture detailed features of the local environment particularly relevant to older people's health behaviours. OPECR was designed as a data collection pro-forma document (i.e. paper form) to collect geographical data relevant to older people's behaviours by direct observation of local neighbourhoods. The tool consists of 100 indicators, including density measures (i.e. density of food shops and alcohol outlets), price and availability of selected food, alcohol and tobacco products, measures of "walkability" of the environment for older people (e.g. connectivity of streets, road speed, traffic volume, quality of pavements and pedestrian crossings), transport accessibility and connectivity (e.g. bus stops and routes) and land use mix. The audit tool is available as supplementary material (Appendix S1). The comparisons presented here are nested within a wider study of the association between aspects of the neighbourhood environment and physical activity, dietary behaviours and cardiovascular disease risk in older adults in 20 UK towns.
The OPECR tool was modified to assess neighbourhood environments remotely using GSV to allow for the comparison of the two techniques. Only minimal adaptations of the street-audit tool were required as it was still possible to assess the majority of environmental features virtually. Information on prices in shops, traffic volume and litter were removed, whilst variables to capture the quality and date of the GSV image were added.

Primary data collection
Fieldworkers were recruited to conduct foot-based audits in 20 towns across the UK (17 in England and 3 in Scotland) that were included in two national cohort studies, the British Regional Heart Study (BRHS) (Walker et al., 2004) and the British Women's Heart and Health Study (BWHHS) (Lawlor et al., 2003). Fieldworkers were fully trained in the use of the OPECR tool and supervisors conducted frequent field visits in each town to ensure data collection quality. A street segment was the unit of data collection and was defined as the length of a road that does not change in name or distinctly in character. The start and end point of the segment were recorded and these were used as reference points for the GSV audit. The lower layer super output area (LSOA) was used to draw maps of the audited areas using Google Maps.
We piloted the audit tool in two towns, Bristol and Guildford, in September-October 2009. Fieldworkers were asked to conduct concurrent independent repeat audits of specific segments in order to assess inter-rater reliability. Data collection for the foot-based audits for the remaining towns took place between October 2011 and September 2014. Fieldworkers worked in pairs and recorded all relevant aspects of the OPECR tool for both sides of the segment. All street segments were audited in all LSOAs in study towns where study subjects lived. Foot-based audit data were entered into an Access database before being exported into Stata 14.1 (StataCorp LP, 2015).
The GSV audit was conducted in a subset of two study towns (Ipswich and Newcastle-under-Lyme) chosen from the original 20 because they were towns of similar population size located in different English geographical regions with different deprivation rankings (Index of Multiple Deprivation 2010) to increase the variability of the environmental data. In 2012, Ipswich, located in east England, had a population of 134,500 and was ranked 87 out of 326 English local authorities in terms of area deprivation, while Newcastle-under-Lyme, located in north west England, had a population of 124,000 and was ranked 152 in terms of area deprivation (Department for Communities and Local Government, 2013). Foot-based audits in Ipswich were conducted between October 2011 and December 2011 and in Newcastle-under-Lyme between November 2013 and December 2013. GSV audits in Newcastle-under-Lyme were conducted between May 2014 and August 2014 and in Ipswich between December 2013 and May 2014. In Ipswich, the majority of GSV imagery was uploaded seven months after the foot-based audits. The time difference between foot-based audits and GSV imagery upload in Newcastle-under-Lyme was between 1 and 4 years.
Two fieldworkers, independent of those collecting foot-based audit data, were trained in the data collection procedure using GSV. The GSV fieldworkers were unable to access foot-based audit data so that GSV data collection was completely independent. Street segments in GSV were defined by matching the start and end points of segments audited in the foot-based audits. The fieldworkers virtually 'walked through' the street segments, first recording one side and then the other side of the segment. GSV audit data were entered into an Access database before being exported into Stata 14.1 (StataCorp LP, 2015).

Secondary data collection
Selected relevant measures of the physical environment at the LSOA level in the 17 English towns were generated from secondary data sources using ArcGIS 10.3 (ESRI, 2014). Seven environmental variables were examined that were treated as a priori directly comparable and proxy variables for features collected by foot-based audits. These related to freely available secondary data sources on transport, access to services, green space and land use mix. Transport related variables included annual average daily flow (AADF) for all motor vehicles and counts of bus stops in 2014. An AADF is the average over a full year of the number of vehicles passing a point in the road network each day. An AADF is measured in major roads (Motorway and A-class roads) and minor roads (B-roads, C-roads and unclassified roads). The raw manual counts are collected by trained enumerators over a period of one hour (Department for Transport, 2015). Data on the road network (motorways, A-roads and total road network) were obtained from Digimap Meridian 2 National (Digimap Ordnance Survey, 2015). Data on road traffic injuries for 2014 were obtained from the Department for Transport. Location of bus stops and stations was available through the National Public Transport Access Nodes database (Department for Transport, 2014). Location of General Practices and dental surgeries (in 2006), and pharmacies (in 2004) was available through the Neighbourhood Statistics website (Office for National Statistics, 2015). Data on the percentage of area that is public green space in LSOA was available through the Generalised Land Use Database Statistics for England (in 2005) (Office for National Statistics, 2015). Using the same database we constructed a spatial entropy score (SENS) using four different land types; residential, non-residential, green space and other. The SENS is calculated using the formula

$$\mathrm{SENS} = -\frac{\sum_{i=1}^{N} P_i \ln P_i}{\ln N}$$

where $P_i$ is the proportion of each land type $i$ in the LSOA and $N$ is the number of land types.
The SENS is a measure of evenness and ranges from 0 to 1 with higher values representing a more heterogeneous and evenly mixed land-use per LSOA (Pliakas et al., 2014).
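As an illustration only, the entropy score can be sketched in a few lines of Python. The sketch assumes the usual normalised Shannon entropy form (entropy of the land-type proportions divided by the logarithm of the number of land types), which matches the stated 0 to 1 range. The function name and the example proportions are hypothetical and are not taken from the study data.

```python
import math


def spatial_entropy_score(proportions):
    """Normalised Shannon entropy of land-type proportions.

    Assumed form: -sum(P_i * ln(P_i)) / ln(N), so that 0 means a single
    land type and 1 means a perfectly even mix of all N land types.
    """
    n = len(proportions)
    if n < 2:
        raise ValueError("need at least two land types")
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    # Rescale so the shares sum to 1 even if the inputs are percentages.
    shares = [p / total for p in proportions]
    # Land types with a zero share contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    return entropy / math.log(n)


# Hypothetical LSOAs: residential, non-residential, green space, other.
even = spatial_entropy_score([0.25, 0.25, 0.25, 0.25])
single = spatial_entropy_score([1.0, 0.0, 0.0, 0.0])
mixed = spatial_entropy_score([0.60, 0.10, 0.25, 0.05])
```

An evenly mixed LSOA scores 1, an LSOA with a single land type scores 0, and intermediate mixes fall in between.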

Statistical methods
Inter-rater reliability (IRR) was assessed for both the foot-based and GSV audit tools. Agreement was assessed for two independent observers for each segment covered by the analysis. The kappa statistic was used to assess inter-rater agreement for categorical variables (Armstrong et al., 1992;Fleiss, 1981;Landis and Koch, 1977) and the intraclass correlation coefficient (ICC) from a one-way random effects model (Shrout and Fleiss, 1979) was calculated to assess agreement for continuous variables including counts. Bias-corrected confidence intervals for the kappa statistic were calculated using 1000 bootstrap samples. Categories for agreement level suggested by Landis and Koch (1977) were used to describe kappa statistics (0.80-1.00, almost perfect; 0.60-0.79, substantial; 0.40-0.59, moderate; 0.20-0.39, fair; and 0.00-0.19, poor) and values suggested by Cicchetti (1994) were used for ICC (0.75-1.00, excellent; 0.60-0.74, good; 0.40-0.59, fair; and 0.00-0.39, poor). Agreement and IRR were assessed for all individual and composite (e.g. shops and services or any amenities) audit variables in their original scale and in the format in which they were collected.
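The agreement statistics described above can be sketched as follows. This is a minimal Python illustration, not the Stata routines used in the study: it assumes two raters, computes Cohen's kappa and the single-measure one-way random effects ICC, and applies the cut-offs quoted in the text. Function names and the example ratings are hypothetical, negative values are labelled "poor" by assumption, and the bootstrap confidence intervals are not reproduced.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same segments."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the two raters' marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when ratings show no variation")
    return (observed - expected) / (1 - expected)


def icc_oneway(ratings):
    """Single-measure ICC from a one-way random effects model.

    `ratings` holds one row per segment with one value per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def landis_koch_label(kappa):
    """Agreement category for kappa using the cut-offs quoted in the text."""
    for cutoff, label in [(0.80, "almost perfect"), (0.60, "substantial"),
                          (0.40, "moderate"), (0.20, "fair")]:
        if kappa >= cutoff:
            return label
    return "poor"  # assumption: negative values are also labelled poor


def cicchetti_label(icc):
    """Agreement category for ICC using the cut-offs quoted in the text."""
    for cutoff, label in [(0.75, "excellent"), (0.60, "good"), (0.40, "fair")]:
        if icc >= cutoff:
            return label
    return "poor"  # assumption: negative values are also labelled poor


# Hypothetical categorical item (1 = feature present) for 12 segments.
auditor_1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
auditor_2 = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(auditor_1, auditor_2)

# Hypothetical counts of shops per segment recorded by two auditors.
shop_counts = [(2, 2), (0, 1), (5, 5), (3, 4), (1, 1), (7, 6)]
icc = icc_oneway(shop_counts)

kappa_label = landis_koch_label(kappa)
icc_label = cicchetti_label(icc)
```

Kappa is undefined when neither rater's scores vary, which is why the sketch raises an error in that case rather than returning a value.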
In this paper, we hypothesize that foot-based audits are theoretically likely to represent the most accurate method of capturing fine detail local environment features required for health behaviour studies whilst acknowledging that this may not act as a 'gold standard' method (Gasevic et al., 2011). We have therefore examined criterion validity (e.g. the level of agreement between the physical (criterion) and virtual (test) measures) using the ICC as done in previous studies (Badland et al., 2010). We report Pearson correlation coefficients for continuous variables in Appendix S2. With a real 'gold standard' the Pearson correlation coefficient provides an indication of the direct relationship with power loss, efficiency and bias from using the error-prone (test) measure (Checkoway et al., 2004). A walkability index was constructed for both the foot-based and GSV audits using latent class analysis (LCA). The following 10 built environment variables were used: pavement quality, lowered curbs, barriers on pavement, pavement width, pedestrian traffic, road use, road connectivity, traffic calming measures, lamp posts and road crossings (see Appendix S3 and Hawkesworth et al., 2015). For all variables, the comparison of the GSV audits with foot-based audits was conducted at the segment level. Correlations and agreement could not be computed for health promotion adverts because there was no variation in GSV ratings.
Segment level data from foot-based audits were aggregated at the LSOA level to allow analysis and comparison to secondary data. Comparison of data collected by secondary data sources and foot-based audits was assessed using the ICC from one-way random effects models using the cut-offs described above. We attempted to find and best define routinely available secondary data that reflected data captured by the foot-based audits. By definition, it was not possible to always compare like with like but we aimed to assess how well different built environment descriptors were correlated from the different data sources. Total traffic in foot-based audits was counted for a 5 min interval during each segment audit and was estimated for 1 h in order to provide a more consistent scale when comparing traffic volume between the foot-based audits and secondary data. The predominant land use category was used to estimate the proportion of segments that were open green areas and to generate an entropy score at LSOA level in the foot-based audits. For this paper, the dataset of 17 English towns was used to compare selected environmental variables between primary and secondary data sources at the LSOA level because some secondary neighbourhood level data are collected differently in Scotland compared to England. All analyses were conducted in Stata 14.1 (StataCorp LP, 2015).
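The aggregation steps described above might look like the following Python sketch. It is an assumption-laden illustration: the field names, LSOA labels and example values are hypothetical, and averaging the hourly traffic estimates across segments is an assumed choice, since the text does not state how traffic was summarised within an LSOA.

```python
from collections import defaultdict


def hourly_traffic(count, interval_min=5):
    """Scale a traffic count taken over `interval_min` minutes to one hour."""
    return count * 60 / interval_min


def aggregate_to_lsoa(segments):
    """Summarise segment-level audit records for each LSOA."""
    grouped = defaultdict(list)
    for seg in segments:
        grouped[seg["lsoa"]].append(seg)

    summary = {}
    for lsoa, segs in grouped.items():
        n = len(segs)
        summary[lsoa] = {
            "segments": n,
            # Counts are summed across the segments in the LSOA.
            "bus_stops": sum(s["bus_stops"] for s in segs),
            # Assumption: traffic is averaged across segments.
            "mean_hourly_traffic": sum(
                hourly_traffic(s["traffic_5min"]) for s in segs
            ) / n,
            # Share of segments whose predominant land use is green space.
            "prop_green": sum(s["land_use"] == "green" for s in segs) / n,
        }
    return summary


# Hypothetical audit records (not study data).
records = [
    {"lsoa": "A", "bus_stops": 2, "traffic_5min": 10, "land_use": "residential"},
    {"lsoa": "A", "bus_stops": 0, "traffic_5min": 4, "land_use": "green"},
    {"lsoa": "B", "bus_stops": 1, "traffic_5min": 25, "land_use": "residential"},
]
lsoa_summary = aggregate_to_lsoa(records)
```

The resulting per-LSOA values are what would then be compared with the corresponding secondary data measure.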

Results
Summary statistics for selected environmental variables are presented in Table 1. Complete descriptions for the remaining environmental variables are available in Appendix S4. Environmental data were collected for 820 LSOAs, where participants of the BRHS and BWHHS cohort studies live, in 17 English towns through foot-based audits and secondary data sources. A total of 1,396 segments covering 87 LSOAs in two English towns (Ipswich and Newcastle-under-Lyme) were audited through foot-based and GSV audits.
Below we report on the IRR of foot-based and GSV audit tools and comparisons between foot-based audits, GSV audits and routinely collected secondary data sources.

Inter-rater reliability of foot-based audits
The analysis of IRR for the foot-based audits included 174 repeat segments in 29 LSOAs in the study pilot towns of Guildford and Bristol.
Five different fieldworkers, working in rotating pairs, conducted repeat segments as part of the reliability study. Overall there was extremely good agreement between different foot-based auditors on all the variables included on the audit tool (Table 2). Some of the more subjective variables such as the quality of pavements or the adequacy of lowered curbs showed less agreement than more objective variables such as road connectivity. Counts of amenities, shops and services showed excellent agreement. The poorest agreement was shown for counts of health promotion adverts, which were rarely seen (Tables 1, 2). In contrast, counts of adverts for unhealthy products (food, alcohol or cigarettes) showed excellent agreement between observers (Table 2).

Inter-rater reliability of GSV audits
The analysis of IRR for the GSV audits included 104 randomly selected segments (13% of total segments) in 43 LSOAs in Ipswich. Two fieldworkers collected repeated segment information as part of the reliability study. The IRR for the GSV audit tool showed similar results to that of the foot-based audits (Table 2). More objective items, such as transport-related items and counts of shops and services, had better agreement than more subjective items (i.e. items requiring judgement by the observer), such as availability of green space. Unsurprisingly, in the GSV audits, agreement between fieldworkers for counts of adverts and for aesthetic measures was fair or poor, possibly due to the limited ability to view such items through GSV. Similarly, for built environment items better agreement was observed in more objective road network related items (e.g. connectivity) but agreement was lower for more subjective items, or finer detail items, such as lowered curbs and slope that may only be easily assessed by audits done on the ground. Land use items showed moderate to substantial agreement (Table 2).

Comparison of foot-based and GSV audit tools
The analysis of agreement showed that transport-related items (e.g. road crossings and bus stops) and counts of shops and services had the highest ICC values (Table 3). More specifically, ICC was 0.92 (95% CI 0.92-0.93) for bus stops, 0.76 (95% CI 0.74-0.78) for road crossings and 0.88 (95% CI 0.87-0.89) for shops and services. Lower agreement was observed for the majority of the more subjective items or those items that are harder to measure in GSV, such as access to green space (ICC=0.33, 95% CI 0.29-0.38) and footpaths (ICC=0.37, 95% CI 0.32-0.41), counts of adverts (ICC=0.18, 95% CI 0.13-0.24), built environment items (except road connectivity; kappa=0.80, 95% CI 0.77-0.83) and aesthetics, particularly security measures (kappa=0.19, 95% CI 0.15-0.25) and graffiti (kappa=0.04, 95% CI 0.00-0.08). From the land use field, predominant land use showed substantial agreement (kappa=0.63, 95% CI 0.59-0.67) whereas secondary land use showed fair agreement (kappa=0.33, 95% CI 0.29-0.38). The class-specific latent class indicator probabilities for the foot-based and GSV LCAs for the walkability index are given in Appendix S3. The characteristics of the latent classes in the foot-based audit and GSV LCAs were very similar, with class 1 in the foot-based audit LCA equivalent to class 2 in the GSV LCA; class 2 in the foot-based audit LCA equivalent to class 3 in the GSV LCA; and class 3 in the foot-based audit LCA equivalent to class 1 in the GSV LCA. The 3-class model was therefore chosen as most appropriate in both GSV and foot-based LCAs. The poor agreement in the walkability index may reflect the fair to moderate agreement observed for the built environment variables that were used in the LCA to construct the index (Table 3).

Comparison of foot-based audit with routinely available secondary data
We compared nine variables related to transport, services and land use between our foot-based audit data and secondary data measures. Overall, there was fair or good agreement for items that we were able to directly compare (Table 4). This included counts of bus stops (ICC=0.56, 95% CI 0.51-0.60) and medical services (ICC=0.79, 95% CI 0.77-0.82) that were measured using very similar approaches in the two methods. However, variables for traffic volume, proportion of green space and land use mix were generated or available in ways that may not be directly comparable between the two methods and produced very poor agreement (Table 4).

Resources required: comparison of time and estimated costs in collecting foot-based audit, GSV audit and routinely available secondary data
We estimated that foot-based audits took, on average, 20 days per town to complete and an additional five days per town were spent for data entry. GSV audits took, on average, 12 days per town to complete. Time taken per audited segment was considerably higher in foot-based compared to GSV audits (9.5 vs 6.4 min per segment). Foot-based fieldworkers audited, on average, 15 segments per pair/day whereas the GSV fieldworkers audited 55 segments per person/day. Staff travel time and expenses to audit areas meant considerably higher costs for foot-based audits. Foot-based audits in the two towns were done in approximately the same season and took the same time to complete. Overall, there were no quality issues with GSV imagery. The identification, use and processing of secondary data sources required considerably less time and covered 820 LSOAs across the 17 English towns. However, only a small proportion of audited items were generated using these freely available secondary data sources as only a small number of variables were comparable between foot-based and these sources (Table 5).

Main findings
To our knowledge this is the first study to compare three different methods of objectively measuring features of the built environment used to examine the association between neighbourhood and health. This study demonstrates a good agreement between foot-based and GSV audit tools used to collect primary data on a range of built environment domains, but highlights the fact that secondary and primary data sources are often measuring very different aspects of the environment. Both direct observations (either on foot or virtually through GSV) and routinely available secondary data sources have a role to play in characterising the environment in studies of the impact of the built environment on health (Fleischhacker et al., 2013;Krenn et al., 2011;Schaefer-McDaniel et al., 2010) but this study highlights the importance of understanding how well the approaches assess the different environmental domains. Secondary data are often used as a proxy measure of aspects of the environment such as the 'walkability' of an area assessed by street connectivity and residential density (Leslie et al., 2007). In contrast, direct observations can provide much more detailed and context relevant environmental data required to understand the complexity of the associations between people and place (Feng et al., 2010;Sallis et al., 2006;Van Cauwenberg et al., 2011).
The results of the IRR analysis demonstrated higher values in the foot-based compared to the GSV audits. This may be partly explained by the larger variation between segments in foot-based audits, and IRR analysis undertaken in two towns compared to one town for GSV audits. The comparison of GSV audit against foot-based audit demonstrated a good to excellent or substantial agreement for almost all of the more objective items of the tool. These include transport-related items, shops and services, and the road connectivity item from the built environment section. There was less concordance for more subjective assessments, such as pavement quality. Our study findings are consistent with findings from a recent systematic review on using remote geospatial tools to characterise the neighbourhood environment. Virtual street audits are promoted as a resource-efficient alternative to foot-based assessments (Fleischhacker et al., 2013;Wilson et al., 2012) and the results of our study mostly confirm this, although some of the difference in the time taken per audited segment between the foot-based and GSV audits may be explained by the absence of data collection for shop prices and traffic volume during the GSV audit. Other studies have also shown that GSV audits are less time consuming (Bethlehem et al., 2014;Charreire et al., 2014) or found no time difference (Wu et al., 2014).
Comparisons of secondary data sources to foot-based audit data demonstrated mixed results in this study. There was fair to excellent agreement for items that were measured using similar scales (e.g. counts of bus stops). The use of different methodologies or scales used to collect data may explain the lack of agreement or presence of negative ICC scores for some items (e.g. traffic volume). Such negative ICC values are not theoretically possible but could be observed in estimates when the observed variation between units (i.e. LSOAs) is even less than you would expect given the differences within LSOAs (Giraudeau, 1996). We found that identifying and processing secondary data sources that are publicly accessible were less time consuming than foot-based and GSV audits, but these could only be used for a very limited number of environmental measures we wanted to study. Previous built environment studies comparing secondary sources with foot-based audits were usually undertaken to assess the validity and/or reliability of secondary data sources (Evenson and Wen, 2013;Fleischhacker et al., 2013). We could have used other secondary data sources, however use of such sources usually involves requesting data from local government (Burgoine and Harrison, 2013;Cummins and Macintyre, 2009;Lake et al., 2012), licence agreements (Burgoine and Harrison, 2013) or commercial business listing companies (Evenson and Wen, 2013;Lake et al., 2012;Liese et al., 2013;Paquet et al., 2008). Such approaches require additional, sometimes considerable, time and financial resources for data acquisition, data cleaning and processing (Evenson and Wen, 2013).

Table 4
Comparison of the foot-based audits vs secondary data at LSOA level using Intraclass Correlation Coefficients (n=820, unless otherwise stated). Columns: Variable; Individual ICC; 95% CIs; Agreement. a Only LSOAs with at least one count point that links the AADFs to the road network.

Table 5
Comparison of time and potential costs for managing, collecting and cleaning data using the three methods to objectively assess the neighbourhood environment.
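To illustrate how a negative ICC estimate can arise from a one-way random effects model, as noted for traffic volume, the following Python sketch uses hypothetical LSOA values in which the two sources are on very different scales. It assumes the standard one-way mean-squares estimator; the function names and all numbers are illustrative and are not study data.

```python
def oneway_mean_squares(pairs):
    """Return (between-unit, within-unit) mean squares for rows of measurements."""
    n, k = len(pairs), len(pairs[0])
    grand = sum(sum(p) for p in pairs) / (n * k)
    means = [sum(p) / k for p in pairs]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for p, m in zip(pairs, means) for x in p
    ) / (n * (k - 1))
    return ms_between, ms_within


def icc_from_mean_squares(ms_between, ms_within, k):
    """Single-measure one-way ICC; negative when ms_between < ms_within."""
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical LSOAs: (audit-based hourly traffic, secondary-data traffic value).
# The two sources are on different scales, so values within an LSOA differ
# far more than LSOA means differ from each other.
different_scales = [(10, 200), (12, 180), (30, 150), (8, 220)]
msb_diff, msw_diff = oneway_mean_squares(different_scales)
icc_diff = icc_from_mean_squares(msb_diff, msw_diff, k=2)

# The same kind of comparison when both sources share a scale.
same_scale = [(10, 11), (12, 12), (30, 28), (8, 9)]
msb_same, msw_same = oneway_mean_squares(same_scale)
icc_same = icc_from_mean_squares(msb_same, msw_same, k=2)
```

In the first case the within-LSOA mean square exceeds the between-LSOA mean square, so the estimate is negative; in the second the estimate is close to 1.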

Methodological issues
A number of methodological issues arise when considering use of foot-based and/or virtual street audits and secondary data sources including the quality and completeness of data collection, data aggregation, documentation and management, auditing times and research costs, and temporality (Brownson et al., 2009;Charreire et al., 2014). Selection of one data collection method over another should depend on the needs of the study, the research questions and costs and benefits related to the use of each method (Curtis et al., 2013;Gravlee et al., 2006;Shareck et al., 2012). Foot-based audits require training and monitoring of staff to ensure data quality both during collection and data entry (Brownson et al., 2009). The same applies for virtual street audits in GSV. High ICC scores for an item require adequate operational definition and proficient observers (Shareck et al., 2012;Zenk et al., 2007). Good data management protocols may reduce the risk of missing data in foot-based audits. However, data completeness and quality is a known limitation for GSV as coverage may not be geographically complete, images may be obscured and visual inspection of the area is limited to images provided on a particular date, and therefore out of the control of observers. Data documentation and detailed item definitions in a tool are important to enable replication, as there is heterogeneity in defining environmental items between tools (Brownson et al., 2009;Charreire et al., 2014). GIS-based measures produced from secondary data sources may require substantial time to clean, manage and analyze especially if data come from less routinely collected sources (Brownson et al., 2009). An important consideration is validity of secondary data sources as inaccuracies in these data may obscure the relationship between neighbourhood and health (Cummins and Macintyre, 2009).
Equally important, the use of different spatial scales to aggregate segment-level data may make comparisons of secondary data sources with foot-based audits and comparison with other studies problematic, as neighbourhood definition may vary considerably. This also raises the issue of measurement error, scale and specificity of environmental items included in the audit tool (Brownson et al., 2009;Fleischhacker et al., 2013;Pliakas et al., 2014;Shareck et al., 2012). For example, we were able to directly compare variables expressed as counts on a continuous scale (e.g. count of bus stops and shops) when these were available in this format in both the foot-based audits and in secondary data sources. However, categorical variables in foot-based audits (e.g. land use variables) were converted to a proportion of segments having a specific characteristic (i.e. green space) within a LSOA. This measure depends on how segments were defined, number of segments audited in the LSOA and the area of the LSOA. It is therefore likely to differ from the equivalent measure from secondary data sources, where land use items are usually expressed as a percentage of the area of the LSOA, even if each segment measure is accurate. Thus we could not directly compare secondary data sources to foot-based audits for some items. Finally, the use of a tool designed to concurrently measure a wide range of environmental dimensions relevant to multiple health outcomes, for example studies attempting to capture the 'obesogenic environment', may have limited utility if information is missing on specific features of the environment that are theoretically linked to specific health outcomes (Shareck et al., 2012).
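The scale mismatch described above for land use can be made concrete with a small Python sketch. All values are hypothetical: an LSOA in which a large park accounts for much of the area but borders only a few of the audited segments. The function names are illustrative and do not correspond to any routine used in the study.

```python
def green_share_by_segments(segment_land_uses):
    """Proportion of audited segments whose predominant land use is green space."""
    return sum(u == "green" for u in segment_land_uses) / len(segment_land_uses)


def green_share_by_area(green_area_ha, lsoa_area_ha):
    """Proportion of the LSOA's area classified as green space."""
    return green_area_ha / lsoa_area_ha


# Hypothetical LSOA: 20 audited segments, 2 of which run alongside a park
# that covers 40 of the LSOA's 100 hectares.
segments = ["green"] * 2 + ["residential"] * 18
by_segments = green_share_by_segments(segments)
by_area = green_share_by_area(40.0, 100.0)
```

Both figures can be accurate for the same LSOA while differing fourfold, which is one way poor agreement can arise without error in either source.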
To increase accuracy of reporting, and decrease problems with matching segments between foot-based and GSV audits, it is crucial to identify discrepancies in the start and end points of the segments (e.g. road names recorded on the ground may differ from the way roads are recorded in GSV). An important consideration for GSV audits is temporality due to the date imagery was uploaded, which might cause problems for some of the items audited. In our study, one study town (Ipswich) had a short time gap between GSV and foot-based audits (7 months), but the gap was much longer, between 1 and 4 years, for the other study town (Newcastle-under-Lyme). Daily variation, for example in pedestrian traffic or adverts, may also explain low agreement scores between the field-based and GSV audits (Wu et al., 2014). Subjective items have been found to be consistently less reliable than more objective items (Wilson et al., 2012); however, another plausible explanation is that such items may be easily missed in virtual audits due to insufficient resolution in GSV (Wu et al., 2014). This was particularly problematic when trying to distinguish different types of shops or identify parking restrictions.

Future research directions
Our paper contributes to the development of methods for measuring elements of the built environment in future public health research. Such studies using a wide range of data sources and methods to assess links between neighbourhood environments and multiple health behaviours will help develop tools that include more refined measurement items and inform how best to allocate research resources (Brownson et al., 2009).
This study sets out some of the advantages (and disadvantages) of using a single primary data collection tool that can be used for both foot-based and GSV audits across multiple environmental domains. A single data collection audit tool may not always be the most useful or feasible method of capturing multiple potential exposures operating at different spatial scales. Future built environment research should consider the need to design studies a priori using multiple methodological approaches and data sources to capture features that operate on health behaviours at different spatial scales. Engagement and participation of people living in or using a neighbourhood should be sought, as this is important in the development, or adaptation, of audit tools with the inclusion of items that may be particularly relevant for specific population groups or contexts (Brownson et al., 2009). Studies of community perceptions of environmental features, or of public-environment interactions such as mapping activity spaces for different population groups (Milton et al., 2015), should also be included where possible to capture, and contrast, different influences on health behaviours.
Capacity building is an important component of built environment research, as data collection through observation or virtual audits requires considerable investment in staff time and training (Brownson et al., 2009). Recruiting fieldworkers locally can achieve cost savings. Familiarity with GSV is essential to reduce the time taken to undertake audits remotely (Badland et al., 2010). As foot-based audits may be superior to other methods when collecting fine detail environmental features, such as path quality or in-store measures, mobile and web applications, tablets and smartphones that are linked to a central repository or database can be used to substantially reduce the time needed for data collection and entry (Aanensen et al., 2009; Brownson et al., 2009; Gravlee et al., 2006). However, the use of this technology should be weighed against the cost of purchasing such equipment and software, and its feasibility in the field should be tested, especially in low-income settings or areas of high crime where such devices may pose a safety risk to fieldworkers (Brownson et al., 2009; Charreire et al., 2014). Remote audits may be a good solution where safety is of concern.
Collaboration with local governmental departments, including public health, planning, and transport, may provide valuable sources of routine secondary data that are not usually used (Brownson et al., 2009), although additional resources may be required to process the data for research use as they will be collected in differing ways. Improving the availability of spatial data from governmental departments (e.g. licensed premises data) (Local Government Association, 2013) in a usable format would greatly reduce costs and time. More recently, crowdsourcing has been combined with GSV to assess street-level characteristics of the built environment (Hara et al., 2013, 2014). Although not yet available, GSV may in the future provide the ability to view imagery of a place from multiple dates, which could offer the potential to remotely assess environmental change longitudinally at a finer spatial scale. This could present an opportunity to evaluate complex natural experiments such as area regeneration or urban design interventions (Wilson et al., 2012). Finally, the use of GSV audit tools may be particularly valuable in large national or international projects which include multiple study towns or countries in order to assess diverse environments cost-effectively (Badland et al., 2010). However, ethical considerations should be addressed adequately in order to protect individuals' privacy, as researchers are increasingly raising questions about the use of online tools, such as GSV, that indirectly identify where study participants live.

Conclusion
There is increasing interest in the potential role of the built environment in influencing health behaviours such as physical activity, but understanding these complex relationships requires accurate methods for objectively measuring relevant environmental features. Our study again highlights that primary and secondary data sources often measure very different aspects of the environment. There is often ambiguity in the literature about what secondary data can usefully capture in studies addressing the effects of place on health, and this comparison helps to highlight the different hypotheses and scales that can be studied by these complementary methodologies. GSV is a reliable tool, particularly for more objective physical measures of the built environment. We found that foot-based audits are superior when collecting data on fine detail environmental features. Secondary data sources can be used to derive a range of environmental measures, if validated properly, and are useful when collected on a routine basis to avoid any temporality issues. Studies of the association between the built environment and health or health behaviours need to ensure that they are employing the most relevant and optimal methods for their research question, whether this be direct observation, secondary data or surveys of individuals' neighbourhood perceptions. A single method may not always be the best approach for capturing multiple potential exposures operating at different spatial scales in complex health systems, such as 'obesogenic environments'. Future built environment research should consider the need to design studies a priori using multiple approaches and varied data sources in order to capture features that operate on health behaviours at different scales and to provide a more complete understanding of the influence of the local environment on health.

Contributorship statement
KL, BA, JPC, CG and RM conceived the study. KL and SH designed and managed the environmental data collection with input from CG, KN and TP. TP generated the environmental variables from secondary data sources. TP conducted the data analysis with input from BA and RS. TP wrote the first draft of the manuscript with KL, SH and PW. JPC is Co-Director of the British Women's Heart and Health Study. RM is Co-Investigator of the British Regional Heart Study. All authors contributed to the scientific content of the manuscript and approved the final version for publication.