Deriving Low-Cost, Dwelling-Level Statistics for Exploring Urban Sustainability: Income, Land Surface Temperature, Environmental Attitudes and Swimming Pool Ownership

Improving urban sustainability requires an understanding of the determinants of resource consumption. The determinants of such consumption are poorly understood despite more than 40 years of investigation. Detailed exploration requires data at the dwelling scale. Such data are usually difficult or expensive to collect. This work derived dwelling-level statistics for each of ~ 200,000 dwellings in the city of Canberra, Australia. Swimming pool locations and size were derived from satellite imagery (a); household wealth was estimated from unimproved land value (b); summer daytime micro-climate temperature was estimated from thermal satellite imagery (c) and environmental attitudes were estimated from the interpolated percentage of Greens party votes from polling station election results (d). All four variables were correlated with residential water consumption. This work demonstrates how these explanatory variables can be derived from publicly available datasets and at low cost. It also shows their value in understanding the determinants of household water consumption.


INTRODUCTION
Water is one of the most important natural resources. It is a scarce resource without substitute for human life. Historically, water played a crucial role in the location, function and growth of cities [1]. Despite technological improvements in the collection, transport and processing of water, it remains an important consideration for any city's development and continued prosperity. Many cities are facing water shortages [2]. 'Water stress' affects one quarter of cities worldwide [3]. Beyond the essential life-preserving requirements, water consumption as a function of economic, social and environmental variables remains largely unexplained despite 40 years of investigation [4].
The efficiency of water use is increasingly important around the world. While consumption of potable water for drinking and cooking, which are essential to sustain life, is expected to remain essentially constant on a per capita basis, increased prosperity may lead to an increase in non-essential uses, such as gardening and swimming pools, as well as 'recreational' consumption, such as the increasing frequency of personal 'hygiene' uses [5]. In most cities the approach to water consumption has been to devise systems to supply water for a 'once only use', whether the water is ingested, used for pleasure or used as the medium to transport waste (including human body wastes). Increasing demand has previously been met by increasing the catchment areas around the city and the size of storage reservoirs [6]. Such demand-response 'solutions' are further challenged by changes in weather patterns due to human-induced climate change, which may reduce rainfall in some areas and reduce the supply currently available to some cities [7]. The combination of a limited supply of water and increasing consumption, due to population increases and improved living standards, necessitates an understanding of the patterns of use in urban environments [8]. Such an understanding is vital for policy makers who wish to improve water use efficiency and the sustainability of cities.
Prior to the 1960s, water consumption was thought to be constant with changes in price [4]. With increased environmental concern for scarce resources during the 1970s, research began to investigate the determinants of consumption. Water began to be regarded as an economic good. It was believed that the quantity demanded would fall if prices were increased. This belief supported a large volume of research focusing on measuring the price elasticity of residential water demand (for a recent review see [4]). The large volume of research in this area has resulted largely from a failure to 'explain' consumption according to economic models of 'demand'. Despite this failure, pricing mechanisms are widely used in Australia to reduce consumption [9].

Explanatory variables
Worthington and Hoffmann [10] reviewed 36 papers on the price elasticity of water demand and listed approximately 20 other variables believed to be important in explaining the observed variation. Although the papers reviewed covered many different cities around the world and populations with different cultural values, there were common themes in the variables that the researchers believed were relevant. Estimates of personal or household income, the number of occupants per dwelling, city-wide average rainfall and temperature were used as explanatory variables in 75, 56, 61 and 50% of the papers, respectively.
Physical attributes of dwellings are commonly thought to influence water consumption [11]. In a survey of 523 households, Domene and Saurí [12] found that swimming pool ownership played a significant role in explaining variations in water consumption. Wentz and Gober [13] used aggregate water records at the census tract level and found that swimming pools were the second most important factor in explaining variation in water consumption. Household size was the most important. Mayer et al. [14] found that, on average, dwellings in Arizona with swimming pools used more than twice as much water outdoors as those without. Yet swimming pool size and location may not be recorded by planning authorities, or such data may not be easily accessible to the researcher. However, their size and location can be estimated from satellite imagery. For large-scale analysis (e.g. at the city level), such imagery may be available at low cost.
Satellite imagery has been used to identify swimming pools in other fields of research. Tien et al. [15] used satellite imagery to identify swimming pool locations as a water source for fighting fires in emergency events. McFeeters [16] used the Normalized Difference Water Index (NDWI) derived from multi-spectral satellite imagery to identify swimming pools as sources of mosquito breeding. In that study, 78.4% of swimming pools were identified correctly. The method required a near-infrared spectral band, which is currently only available at some cost. This contrasts with the work of Tien et al. [15], who used publicly available Red/Green/Blue (RGB) satellite imagery.
Aside from physical aspects of dwellings, local environmental factors are also commonly used as explanatory variables. Rainfall [17] and ambient temperature [18] are the most common of these. Climate data is commonly reported only at the city level. Sountharajah et al. [19] used city-wide average quarterly rainfall and temperature to explore patterns in water consumption in Sydney, Australia. Temperature, however, may vary by several degrees Celsius within a city [20]. An increase in urban tree cover, for example, may have a significant local cooling effect [21]. Outdoor temperature measurements at each dwelling have not previously been used to explore water consumption, even though it has been possible to measure such variations from satellite images since the early 1980s, with the Thematic Mapper (TM) sensor on the Landsat 4 satellite.
Household attitudes and behaviours are believed to affect water consumption but are difficult or expensive to measure. Ajzen and Fishbein's [22] 'Theory of Reasoned Action' suggests that one's attitude may be a predictor of corresponding behaviour. Some studies suggest a strong relationship between attitudes and water consumption, but others found no such relationship [23]. Domene and Saurí [12] used an index (0-6) to rate occupant behaviour towards conservation practices and found a positive correlation to water consumption. Similarly, Tonn and White [24] found that households with favourable attitudes toward conservation used less energy. In a survey study of 56 households in New Jersey (USA), where 70% of energy is consumed by air conditioning (cooling), Seligman et al. [25] found that attitudinal factors accounted for 55% of the variance in summer electricity consumption. In a survey of 120 households in the UK, Brandon and Lewis [26] found that households with positive environmental attitudes were more likely to change their energy (gas and electricity) consumption subsequent to feedback on the financial or environmental costs of their consumption. On the other hand, in a study in the Netherlands, Verhallen and Van Raaij [27] found that household attitudes (survey-derived) explained only 2% of behaviour and that attitudes explained very little of the variation in household energy use.
Kahn [28] found that Green Party registration in California, measured at the Census tract level, was positively correlated with higher public transport use, more hybrid vehicle purchases and lower gasoline consumption. Similarly, Brounen and Kok [29] found a positive correlation between the percentage of Green Party votes, measured at the city level, and the adoption of housing energy efficiency certificates in the Netherlands. In Australia, support for the Greens political party may be a proxy for community environmentalism and may help explain the spatial variation or heterogeneity of water consumption.
Wealth was the most common explanatory variable listed in the review by Worthington and Hoffmann [10] but is difficult or expensive to collect at the household level. While income is recorded at the individual dwelling level during the National Census, for privacy reasons it is not publicly released at this level of detail. For example, the work of Sountharajah et al. [19] on water consumption and water tank adoption in Sydney, Australia, used National Census data aggregated to the Local Government Area (LGA) level (approximately 100,000 people). However, unimproved land value for each dwelling is freely available. This may be used instead of income as a non-identifying proxy for wealth measured at the dwelling level of detail.

Modelling determinants of water consumption
Numerous mathematical models have been used to explore the determinants of water consumption. These include Ordinary Least Squares (OLS) [30], Maximum Likelihood (ML) [31], Instrumental Variable (IV) [32], Generalised Least Squares (GLS) [33] and Generalised Method of Moments (GMM) [34], among others. Typically 30-70% of the observed variation in consumption is 'explained' by the modelled parameters. While these models usually focus on price, almost all also attempted to explain the variance in consumption by other variables such as income, temperature or rainfall.

Scale
Exploration of the determinants of residential sustainability requires fine-scaled data, which is usually difficult or expensive to collect. The largest study in Australia undertaken at the household scale (end-use survey) was by Beal et al. [35]. They analysed patterns in residential water consumption through household surveys of 252 homes in South East Queensland, as part of an AUD 50 million project focusing on water security and water recycling. The high cost and small sample size resulted from a lack of household-specific data (explanatory variables). Phone surveys typically cost AUD 10 per response [36]. On-ground site inspections, for example to measure variation in temperature across a city, may cost much more.
In Australia, measurements of the physical attributes of dwellings (e.g. paved area, garden area or roof colour) or of the environmental or social attitudes of households are rarely publicly available. Such data may be collected by household surveys, but is expensive to collect and may not be published in detail because of privacy concerns. National Census data aggregated over approximately 200 dwellings are typically used to explore patterns in consumption but cannot provide useful data about individual households. Although detailed water consumption data at the individual dwelling scale is increasingly available through automated meter readings, similarly detailed explanatory variables are often not. Rigorous exploration of the factors affecting resource consumption requires explanatory variables measured at the level of the individual dwelling, rather than in aggregate.

Research question
The present work aims to derive explanatory variables believed to be important in understanding patterns in residential water consumption at the household scale of detail and without expensive surveys. To achieve this aim, dwelling-level estimates of swimming pool ownership, household wealth, land surface temperature and environmental attitudes were derived for every dwelling (~200,000) in the city of Canberra, Australia. Single-variable linear regressions were used to test the relationship between metered water consumption and these four explanatory variables.
The Methods section of this paper details the derivation of explanatory variables and any results or discussion pertinent to their derivation. The Results section of the paper is focused on testing the relationship between these derived variables and metered residential water consumption. The Discussion section, similarly, is focused on how all four variables contribute to an understanding of the determinants of water consumption and on their value in assessing other aspects of urban sustainability such as energy consumption.

METHODS
Single-variable linear regression was used to explore the relationship between water consumption and four variables that were believed to affect that consumption. Swimming pool area, wealth, micro-climate and environmental attitude were all estimated at the level of the individual property block, for every residential block in Canberra, Australia (~200,000 dwellings). Values were aggregated to the suburb level to match the aggregation level at which residential water consumption was publicly available. The metered residential water consumption values used were those reported by Troy et al. [37]. Outliers were identified using a two-sided Grubbs' test [38] and were removed. The aggregated sample size was 52 suburbs.
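As an illustration of the regression step, the following is a minimal sketch of a single-variable ordinary least squares fit that returns the slope, intercept and R². The function name and the five example data points are hypothetical and are not the Canberra suburb data; the Grubbs' outlier test is not included.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical suburb-level values, for illustration only:
# mean pool area per dwelling (m²) and water use (kL per dwelling per year).
pool_area = [0.0, 0.5, 1.0, 1.5, 2.0]
water_use = [200.0, 230.0, 240.0, 280.0, 300.0]
slope, intercept, r_squared = simple_linear_regression(pool_area, water_use)
```

With one point per suburb, the same function could be applied in turn to each of the four explanatory variables.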

Swimming pools
Swimming pool location and size were estimated using satellite imagery. Each image was 640 × 640 pixels and had a ground resolution of approximately 30 cm, with 15,201 images covering the Canberra urban area. Sample images were obtained from Google Maps™. The details of the image pre-processing (pansharpening) used to convert the multi-band satellite images (panchromatic) to RGB images were not available to the researcher. All pixels of 53 images were manually classified as either swimming pool or non-pool. A random sample of 50% of the pixels was used as training examples for a Support Vector Machine (SVM) linear classification learning algorithm [39] and the remainder was used for classification verification. For the SVM, a constant of 10 was used for the regularization term in the Lagrange formulation and a value of 0.1 was used for the free parameter of the Gaussian Radial Basis Function (RBF). Figure 1 shows the SVM classification of all pixels used in the training sample. The pixels identified as swimming pools are shown in blue. Clusters of similar blue pixels represent individual swimming pools.
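To make the role of the RBF parameter concrete, the sketch below evaluates a Gaussian RBF kernel between two RGB pixels. It assumes that the 'free parameter' of 0.1 is the multiplier gamma in exp(-gamma × squared distance) and that RGB values are scaled to the range 0-1; neither assumption is confirmed by the text, and the function name and pixel values are hypothetical.

```python
import math


def rbf_kernel(pixel_a, pixel_b, gamma=0.1):
    """Gaussian RBF similarity between two RGB pixels.

    gamma=0.1 is an assumed reading of the 'free parameter' in the text.
    Identical pixels give 1.0; the value falls towards 0 as colours diverge.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(pixel_a, pixel_b))
    return math.exp(-gamma * squared_distance)


# Hypothetical RGB values scaled to 0-1.
pool_blue = (0.20, 0.60, 0.80)
lawn_green = (0.30, 0.55, 0.25)
same = rbf_kernel(pool_blue, pool_blue)
different = rbf_kernel(pool_blue, lawn_green)
```

An SVM with this kernel classifies a pixel by a weighted sum of such similarities to the training pixels, which is why pixels close in colour to 'greenish' training pools can be labelled as pools.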
The SVM misidentified as swimming pools some areas that had a similar RGB colour to swimming pools (false positives). Many swimming pools in the training dataset were 'greenish' in colour. This increased the number of other 'greenish' pixels falsely classified as pools. The relatively small area of analysis (~550 km² or 6.2 × 10⁹ pixels) allowed manual verification. Pixels falsely identified as swimming pools were removed. Publicly owned land was removed to reduce the study area and to reduce false positive pool identification. A more stringent classification was not used because it may have resulted in some pools not being identified (false negatives). All pixels were spatially joined to a cadastre to provide a unique identifier (residential property ID) for each possible pool pixel. Misclassified pixels were often shadow areas, cars, trampolines, tarpaulins, polycarbonate roofing material or other small blue coloured objects. The pixels were grouped by property. Each 'identified' pool was manually inspected and 'non-pools' were removed. The surface area of each pool was estimated as the sum of the pixels classified as swimming pools on each property. Total pool surface area in each suburb was estimated as the sum of 'pool' classified pixels. Manual inspection of the full dataset identified 19.7% of all pixels as falsely classified as swimming pools. Swimming pools not correctly identified (false negatives) were estimated by manual inspection of 150 randomly selected images. Pools not correctly identified were often white in colour (empty), green in colour (not maintained or lined with green materials) or grey in colour (most likely due to image processing by the data custodian). In total, 26% of swimming pools were not correctly identified. This is broadly comparable with McFeeters [16], whose method failed to identify 21.6% of pools. The main advantage of the approach in this work is the use of publicly available and low cost RGB imagery rather than costly multispectral data as in McFeeters [16].
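The conversion from classified pixels to pool surface area can be sketched as follows. The sketch assumes square pixels of exactly 0.3 m (the text gives approximately 30 cm) and takes as input one property ID per pool-classified pixel, as would result from the spatial join to the cadastre; the function name and property IDs are hypothetical.

```python
from collections import Counter

PIXEL_SIZE_M = 0.3  # assumed exact; the text states approximately 30 cm


def pool_area_by_property(pool_pixel_property_ids, pixel_size_m=PIXEL_SIZE_M):
    """Sum pool-classified pixels per property and convert to square metres."""
    pixel_area_m2 = pixel_size_m ** 2
    counts = Counter(pool_pixel_property_ids)
    return {prop_id: n * pixel_area_m2 for prop_id, n in counts.items()}


# Hypothetical input: 400 pool pixels on one block and 250 on another.
pixel_ids = ["BLOCK-1"] * 400 + ["BLOCK-2"] * 250
areas = pool_area_by_property(pixel_ids)
```

Under these assumptions, 400 pixels correspond to a pool of about 36 m². Suburb totals follow by summing the per-property areas.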

Household wealth
The 'wealth' of households was estimated from the Unimproved Land Value (ULV) of the land on which the dwelling was located. The unimproved land value was estimated annually for each property by the territory government and used for government charges. In multi-dwelling complexes the unimproved land value was pro-rated per dwelling. Figure 3 shows a histogram of the ULVs, pro-rated per dwelling, for the city of Canberra, Australia. Generally, higher values are found closer to one of the city centres. Lower values generally correspond with apartments.

Local environment dwelling temperatures
Thermal satellite data was used to estimate spatial variation in summer daytime temperature across the city of Canberra. A Landsat 7 satellite image for Canberra was converted to surface temperature measurements using the method described by Coll et al. [40]. A daytime image of thermal band 6 was used. The image had a spatial resolution of ~60 m at ground level. The daytime summer thermal image was from 3 February 2000 (Figure 4). Older areas, in the centre of town, were generally cooler, while newly built areas, in the north of the town, were much hotter. The environment temperature at individual dwellings was estimated by the nearest pixel value. Average suburb temperatures were estimated as the mean dwelling value across each suburb.
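The nearest-pixel assignment and suburb averaging can be sketched as below. The raster layout (rows stored north to south, origin at the top-left corner, projected coordinates in metres), the function names and the example values are assumptions for illustration and do not describe the actual processing chain.

```python
from collections import defaultdict


def nearest_pixel_value(grid, origin_x, origin_y, pixel_size, x, y):
    """Return the value of the raster cell that contains point (x, y).

    grid[row][col] is stored north to south and (origin_x, origin_y) is the
    top-left corner. The containing cell is also the cell whose centre is
    nearest to the point.
    """
    col = int((x - origin_x) // pixel_size)
    row = int((origin_y - y) // pixel_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("point lies outside the raster")
    return grid[row][col]


def suburb_mean_temperature(dwellings, grid, origin_x, origin_y, pixel_size):
    """Average the per-dwelling temperatures within each suburb."""
    temps = defaultdict(list)
    for x, y, suburb in dwellings:
        temps[suburb].append(
            nearest_pixel_value(grid, origin_x, origin_y, pixel_size, x, y)
        )
    return {suburb: sum(v) / len(v) for suburb, v in temps.items()}


# Hypothetical 2 x 2 raster of 60 m pixels (°C) and three dwellings.
grid = [[28.0, 30.0],
        [29.0, 31.0]]
dwellings = [(10.0, 110.0, "Suburb A"),
             (70.0, 110.0, "Suburb A"),
             (70.0, 50.0, "Suburb B")]
means = suburb_mean_temperature(dwellings, grid, 0.0, 120.0, 60.0)
```

Because the pixels are ~60 m wide, neighbouring dwellings will often share the same temperature value.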

Environmental attitude
There was reason to believe that attitudes should influence behaviours and therefore the water consumption of households [28]. The environmental beliefs of households were estimated from 2010 Federal House of Representatives election polling station results. Voter turnout was 94.63% [41]. Note that voting in Australian elections is compulsory: voters are required to register and are fined if they do not register, or if they do not vote without an acceptable excuse such as ill health. Four political parties had candidates for the election. The Australian Greens party received 18.0% of the first preference primary vote. A linear spatial interpolation based on the closest 4 polling stations was used to assign a 'percentage green vote' to each dwelling. There was generally one polling station per suburb. Average suburb values were estimated as the mean dwelling value within each suburb (Figure 5).
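The text does not specify the exact interpolation scheme. As one possible reading, the sketch below assigns each dwelling a distance-weighted average of the Greens vote at its four closest polling stations, with weights proportional to 1/distance. This inverse-distance weighting is an assumption standing in for the 'linear spatial interpolation' described above, and the function name, coordinates and vote percentages are hypothetical.

```python
import math


def interpolate_green_vote(dwelling_xy, stations, k=4):
    """Inverse-distance weighted mean of the k nearest polling stations.

    stations is a list of (x, y, percent_green_vote) tuples.
    This weighting scheme is an assumed stand-in, not the paper's method.
    """
    by_distance = sorted(
        (math.dist(dwelling_xy, (sx, sy)), pct) for sx, sy, pct in stations
    )
    nearest = by_distance[:k]
    if nearest[0][0] == 0.0:
        # The dwelling coincides with a polling station.
        return nearest[0][1]
    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * pct for w, (_, pct) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical stations: four equidistant from the origin and one far away.
stations = [(1.0, 0.0, 10.0), (-1.0, 0.0, 20.0),
            (0.0, 1.0, 30.0), (0.0, -1.0, 40.0),
            (100.0, 100.0, 90.0)]
central_estimate = interpolate_green_vote((0.0, 0.0), stations)
```

In this example the distant fifth station is ignored and the four equidistant stations contribute equally.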

RESULTS
Single-variable linear regressions were used to test the relationship between metered water consumption and estimates of swimming pool ownership, household wealth, land surface temperature and environmental attitudes across the city of Canberra, Australia.

Swimming pools
Identification of swimming pools from satellite imagery revealed 6,133 swimming pools in Canberra. The surface area of swimming pools in each suburb was positively correlated with water consumption (p < 0.001, R² = 0.265). Suburbs with an average of 2 m² of swimming pool surface area per dwelling consumed 100 kL per dwelling per year (kL dwel⁻¹ yr⁻¹) more than suburbs with no significant pool surface area (Figure 6 Panel A). The suburb of Red Hill had the highest number of swimming pools, with 151 identified, or 14% of all dwellings. There were 10 suburbs where less than 1% of the dwellings had swimming pools.

Household wealth
Household wealth, as estimated by ULV, was positively correlated with water consumption (p < 0.01, R² = 0.15). Suburbs with an average ULV per dwelling of AUD 190,000 had an estimated consumption of 180 kL dwel⁻¹ yr⁻¹. This increased to 400 kL dwel⁻¹ yr⁻¹ in suburbs with a ULV of AUD 800,000 per dwelling (Figure 6 Panel B).
The 'City centre' suburb was removed from the analysis because residential land could not be separated from commercial land in mixed use developments, which were the dominant form of development in the area. Commonly, wealth is modelled on a logarithmic scale to account for its highly skewed distribution. Taking the natural log of the ULV did not improve the correlation (p < 0.05, R² = 0.09).
Mean ULV per block and per square metre were also investigated as estimates of household wealth, but both were found to be poor approximations. Mean ULV per block over-estimated wealth in suburbs with higher density developments. High density developments typically occupy a single collectively owned block. Estimating wealth based on block value alone, without considering the number of dwellings or households on each block, therefore over-estimates the wealth of suburbs with many high density developments. Mean ULV per square metre was found to under-estimate wealth in affluent areas with very large blocks. Many affluent areas had block sizes greater than 1,000 m². The land value per square metre was relatively low but the land purchase price would exceed AUD 1,000,000, which is prohibitively expensive for poorer households.
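The difference between the three candidate wealth measures can be illustrated with a short sketch. The block values, areas and dwelling counts below are hypothetical and chosen only to show the direction of the bias described above; the function name is likewise an assumption.

```python
def wealth_proxies(block_ulv_aud, block_area_m2, n_dwellings):
    """Return the three candidate wealth measures for one block."""
    return {
        "per_block": block_ulv_aud,
        "per_dwelling": block_ulv_aud / n_dwellings,
        "per_m2": block_ulv_aud / block_area_m2,
    }


# Hypothetical apartment complex: one collectively owned block, 60 dwellings.
apartments = wealth_proxies(3_000_000, 2_000, 60)
# Hypothetical large block in an affluent area: one dwelling on 1,500 m².
large_block = wealth_proxies(1_200_000, 1_500, 1)
```

In this example the apartment block has the higher value per block and per square metre, yet the lower value per dwelling, which is why the value pro-rated per dwelling was preferred as the wealth estimate.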

Local environment dwelling temperatures
The consumption of water for garden areas lowers land surface temperature through evaporation. As expected, the local daytime summer temperature was negatively correlated with water consumption (p < 0.05, R² = 0.06). Suburbs that consumed 290 kL dwel⁻¹ yr⁻¹ had an average surface temperature of 28.1 °C. In suburbs that consumed an average of only 190 kL dwel⁻¹ yr⁻¹, the surface temperature was 31.4 °C (Figure 6 Panel C).

Environmental attitude
Increased concern for the environment was negatively correlated with water consumption (p < 0.01, R² = 0.11). While suburbs with a 12% vote for the Australian Greens had an estimated annual average consumption of 285 kL dwel⁻¹ yr⁻¹, this dropped to 170 kL dwel⁻¹ yr⁻¹ in suburbs with a 30% Greens party vote (Figure 6 Panel D).
The percentage of Liberal Party (conservative) votes showed the opposite relationship. The Liberal Party vote was positively correlated with water consumption (p < 0.001, R² = 0.28).

DISCUSSION
This work has demonstrated that dwelling-level estimates of swimming pool ownership, household wealth, land surface temperature and environmental attitudes can be derived for every dwelling (~200,000) in the city of Canberra, Australia. Furthermore, these estimates were derived from low cost and publicly available datasets.
All the variables were found to be correlated with publicly available water consumption data. Three of the variables most commonly thought to explain consumption [10] were shown to be easily derived at the household level from publicly available data. This work contributes to the understanding of the determinants of urban water consumption and will be of use to those who wish to improve the sustainability of cities.

Swimming pools
The use of satellite imagery to identify swimming pools provides both their location and an estimate of their size, from which water use due to evaporation may be estimated. While this work used suburb-aggregated water consumption (because publicly available data was only available at this resolution), the method described may be used to estimate water consumption at each dwelling. Water loss due to evaporation can be estimated from local meteorological data. The average annual potential pan evaporation in Canberra is 1,400 mm [42]. Pan evaporation is multiplied by 0.7 for large water bodies [43]. Water loss due to evaporation in Canberra is therefore approximately 1 kL per year for each m² of swimming pool surface area. This is similar to the results reported by Dawson [44]. However, pan evaporation may underestimate the water consumed by swimming pools, because rain water may not be captured by 'full' pools as the additional water would overflow. Pool covers may reduce the expected consumption of water by reducing evaporation.
Even where pool location data is easily accessible, for example from a database, interrogation of satellite imagery may still be valuable. Satellite imagery may be used to estimate pool volume, or to check whether the pool is full, empty or has been removed, and the time at which this occurred. Typically in Australia, access to government-held swimming pool location data is restricted for privacy reasons. This work has shown that pool location is already in the public domain and that satellite image analysis can be used to automate collection on the scale of a whole city. The relative ease of collection and improved measurement of pool use should encourage the use of such data. It may allow improved analysis of the determinants of urban water consumption and sustainability, particularly when matched with dwelling-level water consumption data.
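The evaporation estimate is a unit conversion and can be checked with a short sketch: 1,400 mm of pan evaporation multiplied by 0.7 gives 980 mm, or 0.98 m of water depth, and 1 m of depth over 1 m² of surface is 1 m³, that is, 1 kL. The function name and the 32 m² example pool are hypothetical.

```python
def pool_evaporation_kl_per_m2(pan_evaporation_mm=1400.0, pan_coefficient=0.7):
    """Annual evaporative loss per square metre of pool surface, in kL.

    1 m of water depth over 1 m² is 1 m³ = 1 kL, so the depth in metres
    equals the loss in kL per m².
    """
    depth_m = pan_evaporation_mm * pan_coefficient / 1000.0
    return depth_m


loss_per_m2 = pool_evaporation_kl_per_m2()
# Hypothetical pool of 32 m² surface area.
annual_loss_kl = 32.0 * loss_per_m2
```

The result, 0.98 kL per m² per year, matches the rounded figure of 1 kL per m² quoted in the text. The same conversion applied to a per-dwelling pool area would give a dwelling-level estimate.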

Household wealth
Household wealth, as estimated from unimproved land valuation, was found to be correlated with water consumption. Unimproved land value had a large effect on water consumption: a four-fold increase in ULV was associated with a doubling of water consumption. The review of water demand modelling by Worthington and Hoffmann [10] showed that 75% of the literature they reviewed used wealth (as measured by income) as an explanatory variable. This work estimated wealth for each household at a lower cost than surveys that attempt to estimate wealth based on household income. Estimating wealth for each household across a city would be prohibitively expensive using a survey. It may also be difficult to use a survey to estimate household wealth because of privacy concerns and non-responses.
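The 'four-fold increase, doubling' summary can be checked against the two fitted values reported in the Results (180 kL dwel⁻¹ yr⁻¹ at AUD 190,000 and 400 kL dwel⁻¹ yr⁻¹ at AUD 800,000). The sketch below is simple arithmetic on those two points; the function name is hypothetical and the derived slope is an illustration, not a reported regression coefficient.

```python
def compare_endpoints(ulv_low, use_low, ulv_high, use_high):
    """Ratios and implied linear slope between two (ULV, consumption) points."""
    return {
        "ulv_ratio": ulv_high / ulv_low,
        "consumption_ratio": use_high / use_low,
        # Change in kL per dwelling per year for each AUD 100,000 of ULV.
        "kl_per_100k_aud": (use_high - use_low) / (ulv_high - ulv_low) * 100_000,
    }


summary = compare_endpoints(190_000, 180.0, 800_000, 400.0)
```

The ULV ratio is about 4.2 and the consumption ratio about 2.2, consistent with the rounded statement above. The implied slope is roughly 36 kL dwel⁻¹ yr⁻¹ per AUD 100,000 of unimproved land value.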

Local environment dwelling temperatures
The low explanatory power of the correlation between land surface temperature and water consumption (R² = 0.06) may be because garden watering was not a major portion of total water consumption. It may also be because, in the older suburbs, the large trees that provide shading and local cooling are part of the public realm and are rarely watered by households.
Micro-climate data at the household level of detail can be derived easily from publicly available satellite imagery, as was found in this work. Aside from explaining patterns in urban water consumption, it may also be useful in exploring other aspects of urban sustainability such as household energy consumption. The 3.4 °C variation between suburbs would be expected to have a significant impact on the cooling or heating energy needed to achieve similar thermal comfort. Huang et al. [21] found that a 25% increase in urban tree cover in Sacramento could reduce the annual energy required for cooling by 40%.
The use of satellite imagery also allows the creation of up-to-date datasets. The Landsat 8 satellite, for example, images the entire earth every 16 days. Such imagery can be useful in longitudinal studies. For example, longitudinal temperature profiles could be gathered for every dwelling in a city or country and used to explore the variation in energy or water consumption over several years.

Environmental attitude
Areas of the city with a higher percentage of Greens votes were found to have much lower water consumption. This work adds to the field of urban sustainability by demonstrating the apparent relationship between voting results and water consumption. The results of this work suggest that voting preference may be a predictor of environmental behaviour. This finding is similar to the work of Kahn [28], who found a relationship between Green ideology and gasoline consumption and between Green ideology and public transport use.

Further work
This work has shown that computer algorithms may be used to identify features of the city and to create a database of these features so that they can be used to explore the determinants of urban sustainability. In some cases, these datasets may exist but have restricted access for 'privacy' reasons; in other cases, such as for dwelling land surface temperature, they may not yet exist. It is hoped that this work encourages the creation of such databases.
The strength of using polling results to estimate environmental attitudes is that it removes survey biases and provides an estimate at low cost. Rather than using aggregated data, such as that released from the National Census, this work provided estimates at the individual dwelling scale. The data collection methods outlined in this work are inexpensive and may be useful for exploring other aspects of urban sustainability aside from water consumption, such as energy consumption.
Other variables identified by Worthington and Hoffmann [10] that were believed to be important in understanding patterns in residential water consumption were rainfall, garden area and the number of dwelling occupants. Garden area may be estimated from satellite imagery in a similar manner to the identification of swimming pools in this work. Estimating the number of dwelling occupants without invasive surveys would also be valuable in understanding patterns of water (or other resource) consumption. The current best practice is to estimate occupancy rates based on National Census estimates of population density [45].
This work is a first step in deriving household-level explanatory variables important in understanding patterns in residential water consumption. Further work should combine these explanatory variables with metered household water consumption to understand the relative contribution of these explanatory variables and to test for correlations between variables.

CONCLUSIONS
This work outlines four methods for estimating variables suitable for exploring the consumption of water at the household scale of detail. All the variables were found to be correlated with publicly available water consumption data aggregated to the suburb level.
Satellite imagery was used to identify and locate swimming pools across the city of Canberra, Australia. The area of swimming pools in each suburb was positively correlated with water consumption.
Thermal satellite imagery was used to estimate the summer daytime temperatures around individual dwellings. Local summer daytime temperature was found to be negatively correlated with water consumption. That is, areas of high water consumption had lower ambient temperatures, consistent with cooling from evapotranspiration.
Household wealth was estimated from unimproved land values and found to be positively correlated with water consumption. This is a low cost method of estimating 'wealth' for all dwellings in Canberra, compared with, for example, household income surveys.
Environmental attitudes were estimated from election polling station results. Areas with a higher proportion of Greens party votes had lower water consumption. The work indicates that election results for the Australian Greens party may reflect environmental attitudes.
Improving urban sustainability requires an understanding of the determinants of resource consumption. This work has derived dwelling-level statistics of income, land surface temperature, environmental attitudes and swimming pool ownership for the city of Canberra and shown that all are correlated with residential water consumption. This work contributes to the understanding of the determinants of urban water consumption and will be of use to those who wish to improve the sustainability of cities.

Figure 1. RGB classification pixels used to train the SVM (blue points indicate pixels identified as swimming pools)

Figure 4. Land surface temperature across the urban area of Canberra, Australia

Figure 5. Interpolated percentage of Green Party vote across Canberra, Australia (polling stations shown by black dots)

Figure 6. Water consumption as a function of swimming pool area (Panel A), unimproved land value (Panel B), summer daytime temperature (Panel C) and percentage of Australian Greens party voting (Panel D) (each suburb is represented by a grey dot)

Talent, M. Deriving Low-Cost, Dwelling-Level Statistics for ... Year 2020 Volume 8, Issue 1, pp 56-70