From yards to cities: a simple and generalizable probabilistic framework for upscaling outdoor water conservation behavior

Outdoor watering of lawns accounts for about half of single-family residential potable water demand in the arid southwest United States. Consequently, many water utilities in the region offer customers cash rebates to replace lawns with drought tolerant landscaping. Here we present a parcel-scale analysis of water savings achieved by a ‘cash-for-grass’ program offered to 60 000 homes in Southern California. The probability a resident will participate in the program, and the lawn area they replace with drought tolerant landscaping, both increase with a home’s outdoor area. The participation probability is also higher if a home is occupied by its owner. From these results we derive and test a simple and generalizable probabilistic framework for upscaling water conservation behavior at the parcel-scale to overall water savings at the city- or water provider-scale, accounting for the probability distribution of parcel outdoor areas across a utility’s service area, climate, cultural drivers of landscape choices, conservation behavior, equity concerns, and financial incentives.


Introduction
Climate change and population growth threaten the balance of water supply and demand in many urban regions around the world [1–6]. A dramatic case in point is the urban water stress brought on by the California drought of 2011 to 2016, the most severe drought in the southwest United States over the past 1200 years [7]. In January 2014, California's Governor Jerry Brown issued the first of a series of emergency proclamations to address the statewide drought, and California's roughly 400 urban water agencies responded with a number of short- and long-term water conservation programs [8]. Because irrigation of lawns accounts for roughly half of residential water demand in a typical California home [9,10], many water agencies focused on reducing residential outdoor water use [8]. In general, utilities can encourage conservation through [11]: (1) direct positive financial incentives such as rebates; (2) direct negative financial incentives such as fines; (3) indirect financial incentives such as tiered pricing; (4) public education campaigns; and (5) sanctions, bans, or norming. In this study we focus on an example of the first approach; namely, a 'cash-for-grass' lawn replacement rebate program.
Cash-for-grass programs are a popular approach for incentivizing lawn replacement. In these programs, water agencies offer customers a rebate for replacing irrigated grass in their yards with drought tolerant landscaping [6,12–14]. Even with cash incentives, however, social barriers (such as the preference for lawns, requirements for an initial expenditure outlay, and neighborhood norms and covenants) limit participation [15,16]. In their recent analysis of the $350 million cash-for-grass rebate program implemented by the Metropolitan Water District in Southern California, Pincetl et al [16] called for more research into factors that influence residential participation in these programs, including 'building density, lot sizes, and other characteristics'.
To address this knowledge gap, we carried out a parcel-scale analysis of a cash-for-grass program implemented by the Irvine Ranch Water District (IRWD) in Orange County, California. IRWD's rebate program, which began in late 2010, pays residential customers a fixed unit rebate (dollars per area) to replace lawns with drought-tolerant outdoor landscaping. The unit rebate paid by IRWD changed over time, from $1.50 per square foot (1 October 2010 through 1 June 2014) to $2 per square foot (1 June 2014 to 31 March 2017), except for a roughly three-week period (1-19 May 2015) when it was temporarily increased to $3 per square foot. Over our study period (October 2010 through March 2017), a total of 1559 single-family residential (SFR) parcels, or 2.6% of the approximately 60 000 SFR parcels in IRWD's service area, participated in the program. The program replaced approximately 130 000 m² of lawn area with drought tolerant landscaping, for an annual water savings of between 130 and 222 megaliters (ML), assuming a unit reduction in water use of between 1002 and 1711 l m⁻² yr⁻¹ [17,18]. IRWD's service area is divided into 77 villages, each of which has its own architectural theme (reflecting the region's master-planned heritage and development history) and clearly defined edges [19]. In this letter we examine how the outdoor area and owner occupancy status of individual parcels in IRWD's service area (data that are readily available from the local tax assessor's office) influence the probability that a resident will participate in the rebate program. We then demonstrate how this information can be upscaled, directly linking outdoor water conservation behavior at the parcel scale to overall water savings achieved at the water provider or city scale.

Definitions of SFR parcel and rebate participation
For the purposes of this study, an SFR is defined as a parcel with a residential detached dwelling and an IRWD water meter account and associated service point ID (SPID). SFR parcels were classified as rebate 'participants' provided: (1) a rebate application was filed within our study period (1 October 2010 to 31 March 2017); (2) the applicant passed an onsite inspection by IRWD personnel (to verify that lawn was replaced with drought tolerant landscaping); and (3) the applicant received a rebate check from IRWD following the inspection. Because rebates were typically processed within 6 months of the initial application, the status of all rebate applications was evaluated as of November 2017, eight months after our study window closed. SFR parcels were classified as rebate 'non-participants' if they failed any of the above criteria.

Parcel-scale features
For each SFR parcel in IRWD's service area we compiled tax assessor information referenced by Assessor Parcel Number (APN), including outdoor area (which was calculated as the difference between the parcel's lot area and building area) and owner occupancy (APN and SPID were matched from IRWD records).

Data curation
Parcels were removed from our analysis if any of the following applied: (1) '0' was listed for the parcel's lot size or year built; (2) the parcel did not have an associated SPID, indicating that there was no IRWD water connection; or (3) manual inspection revealed that the parcel in question was associated with a park or other non-residential green space.
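The curation rules and the outdoor-area definition can be expressed as a short filter. The sketch below is illustrative only: the `Parcel` record, its field names, and the example values are hypothetical and do not reflect the layout of the assessor or IRWD datasets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parcel:
    """Hypothetical parcel record; field names are illustrative only."""
    apn: str                  # Assessor Parcel Number
    lot_area_m2: float        # lot area from tax assessor data
    building_area_m2: float   # building area from tax assessor data
    year_built: int
    spid: Optional[str]       # service point ID; None means no IRWD connection
    is_green_space: bool      # flagged during manual inspection


def keep_parcel(p: Parcel) -> bool:
    """Apply the three exclusion rules; a parcel is dropped if any one applies."""
    if p.lot_area_m2 == 0 or p.year_built == 0:   # rule (1): '0' placeholder
        return False
    if p.spid is None:                            # rule (2): no water connection
        return False
    if p.is_green_space:                          # rule (3): park or green space
        return False
    return True


def outdoor_area_m2(p: Parcel) -> float:
    """Outdoor area is the lot area minus the building area."""
    return p.lot_area_m2 - p.building_area_m2


# Made-up example records, one per exclusion rule plus one valid parcel.
parcels = [
    Parcel("A1", 500.0, 180.0, 1995, "S1", False),
    Parcel("A2", 0.0, 0.0, 1990, "S2", False),
    Parcel("A3", 450.0, 150.0, 2001, None, False),
    Parcel("A4", 900.0, 0.0, 1980, "S4", True),
]
curated = [p for p in parcels if keep_parcel(p)]
areas = [outdoor_area_m2(p) for p in curated]
```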

Classification and regression trees (CART)
We used the machine-learning algorithm CART (R-PART in R Software) [20] to evaluate whether an SFR's outdoor area and/or owner occupancy status could discriminate between participants and non-participants (see section 1 of SM (stacks.iop.org/ERL/15/054010/mmedia) for details).

Participation probability and 95% CIs
The probability that a randomly chosen resident will participate in the rebate program, or 'participation probability' $p$, was estimated as the proportion of rebate participants in any sample of $N$ SFR parcels, $p = \frac{1}{N}\sum_{i=1}^{N} X_i$, where $X_i$ is the random variable for participation ($X_i = 1$) or non-participation ($X_i = 0$) and the index $i$ represents a particular SFR parcel. The corresponding 95% confidence intervals were calculated from the formula given in [21].
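As a rough illustration, the estimate can be computed as follows. The confidence-interval formula from [21] is not reproduced in this text, so the sketch assumes the standard normal-approximation (Wald) interval, p ± 1.96 √(p(1 − p)/N); that choice is an assumption, although it is consistent with the ±0.002 reported for the full sample.

```python
import math


def participation_probability(x):
    """Return (p, half_width) for a list of 0/1 participation indicators.

    The half-width uses the Wald (normal-approximation) 95% interval,
    which is an assumption here, not necessarily the formula of [21].
    """
    n = len(x)
    p = sum(x) / n
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / n)
    return p, half_width


# Counts reported in the text: 1366 participants among 46 915 enrolled parcels.
indicators = [1] * 1366 + [0] * (46915 - 1366)
p, hw = participation_probability(indicators)
print(f"p = {p:.3f} ± {hw:.3f}")
```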

Study statistics
Of the approximately 60 000 SFR parcels in IRWD's service area, 46 915 were enrolled in our study based on the data curation procedure outlined in section 2. Of these 46 915 SFR parcels, 1366 participated in IRWD's lawn rebate program for an overall participation probability of 2.9% (p = 0.029 ± 0.002).

Classification and regression tree (CART)
A forest of 33 decision trees was generated by pairing the 1366 SFR participants with an equal number of randomly chosen non-participants (see section 1 of SM). For this analysis, we adopted an SFR's decision to participate in the rebate program (='1') or not (='0') as the dependent variable, and the SFR's outdoor area and owner occupancy status as the two independent variables. Of the 33 trees generated, 30 (or about 90%) split the dataset according to whether the SFR is owner-occupied or not (first decision point) followed by whether the parcel's outdoor area is greater than 168 m² or not (second decision point, figure 1(a)). The same two variables appear in reverse order in the three remaining trees. Across all 33 trees, misclassification rates estimated for the top two decision points ranged from 39% to 42%. These misclassification rates are reasonable, considering we restricted tree depth to just two decisions made on two parcel-scale variables; i.e. owner occupancy and outdoor area [20]. Consistent with the CART results and across all enrolled SFR parcels (N = 46 915), the participation probability is three times higher if an SFR is owner occupied (p = 0.033 ± 0.002) than if it is not owner occupied (p = 0.011 ± 0.002) (figure 1(b)). When owner occupied SFRs are further divided according to their outdoor area, the participation probability is nearly 4% if the outdoor area is greater than 168 m² (p = 0.037 ± 0.002) compared to less than 2% if the outdoor area is smaller than this threshold (p = 0.015 ± 0.003) (figure 1(b)). Thus, participation in the rebate program is highest for SFRs that are owner occupied and have outdoor areas >168 m².

Participation probability
To explore the functional relationship between participation probability and outdoor area, we sorted all enrolled owner-occupied SFR parcels (N = 38 255) by outdoor area, assigned the parcels into 11 equal-sized bins, and then calculated for each bin the participation probability and median outdoor area. For outdoor area <400 m², the participation probability is strongly correlated (R² = 0.91) with median outdoor area, increasing by 1.2 percentage points for every 100 m² increase in outdoor area; the participation probability stabilizes at a final value of about 4.5% for parcels with outdoor area >400 m² (blue points and lines, figure 2(a)). For non-owner occupied SFR parcels (N = 8660), the participation probability also increases with outdoor area, but the correlation is weaker (R² = 0.67), the slope is reduced (0.39 percentage point increase in participation probability for every 100 m² increase in outdoor area), and the final probability is lower (about 1.5% for parcels with outdoor area >400 m²) (red points and lines, figure 2(a)). In summary, program participation increases monotonically with median outdoor area, but the magnitude of the response (and strength of the correlation) is particularly striking for owner occupied SFRs.
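The sort-and-bin procedure can be sketched as follows. The function name, the tuple layout, and the toy data are hypothetical; only the procedure (sort by outdoor area, split into equal-sized bins, compute the participation probability and median outdoor area per bin) follows the description above.

```python
from statistics import median


def bin_participation(parcels, n_bins=11):
    """Sort parcels by outdoor area and summarize equal-sized bins.

    parcels: list of (outdoor_area_m2, participated) tuples, participated in {0, 1}.
    Returns one (median_outdoor_area, participation_probability) pair per bin.
    """
    if len(parcels) < n_bins:
        raise ValueError("need at least one parcel per bin")
    ordered = sorted(parcels, key=lambda t: t[0])
    n = len(ordered)
    summary = []
    for k in range(n_bins):
        # Integer boundaries keep bin sizes within one parcel of each other.
        chunk = ordered[k * n // n_bins:(k + 1) * n // n_bins]
        areas = [area for area, _ in chunk]
        prob = sum(flag for _, flag in chunk) / len(chunk)
        summary.append((median(areas), prob))
    return summary


# Toy data (not IRWD data): 22 parcels, only the two largest participate.
toy = [(10.0 * k, 1 if k > 20 else 0) for k in range(1, 23)]
bins = bin_participation(toy, n_bins=11)
```

A straight-line fit to the binned points below the plateau would then give the initial slope of the participation probability curve.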
Once a resident decides to participate in the rebate program, the lawn area they replace also depends on their parcel's outdoor area. This conclusion was reached by sorting and binning all enrolled participants in IRWD's rebate program (N = 1366) by outdoor area, and then calculating, for each bin, median values of the outdoor area and of the lawn area replaced. For parcels with outdoor areas <600 m², the median lawn area replaced increases linearly with median outdoor area (blue and red filled circles and lines, figure 2(b)). In contrast to the participation probability, however, this linear relationship is not altered by owner occupancy status. Also note the substantial parcel-to-parcel scatter around these linear relationships (blue and red dots in figure 2(b)).
Figure 2. (a) Participation probability increases monotonically with median outdoor area, but the initial slope and maximum value depend on whether the home is owner occupied (blue filled circles and lines) or not (red filled circles and lines). (b) The median lawn area replaced also increases with outdoor area, but there is little difference between owner occupied (blue filled circles and lines) and non-owner occupied (red filled circles and lines) homes; note the considerable parcel-to-parcel scatter (blue and red dots correspond to owner- and non-owner-occupied parcels, respectively). (c) Probability density histograms of outdoor area for owner (blue filled circles) and non-owner (red filled circles) occupied SFRs closely follow a single log-normal probability density function (PDF, black curve). The participation probability curves from (a) are superimposed on this graph (blue and red lines correspond to SFRs that are owner or non-owner occupied, respectively). (d) Model simulations of total water savings were carried out for the three participation probability curves with the initial slopes indicated. The PDF of outdoor areas from (c) is superimposed on this graph (see main text for details).

Size distribution of outdoor area
The results above reveal a strong association between outdoor area and both the probability a resident will participate in the rebate program and the lawn area they replace with drought tolerant landscaping. How are parcel outdoor areas distributed across IRWD's service area? Probability density histograms generated from the outdoor areas of owner- and non-owner-occupied SFR parcels (blue and red points, figure 2(c)) closely follow a single log-normal probability density function (PDF, black curve in the figure, details in section 2 of SM). There is substantial overlap between outdoor areas most commonly present in IRWD's service area (i.e. outdoor areas with the highest probability density, solid black curve in figure 2(c)) and outdoor areas with the highest participation probability (blue and red curves in the figure). However, the highest participation probabilities are skewed toward parcels with the largest outdoor areas (and highest household incomes, see figure S1), consistent with previous reports that rebate programs are utilized disproportionately by wealthier residents [15,22].
This result raises the question: could the rebate structure be altered to incentivize the participation of residents with lower incomes? In the context of figure 2(c), this would entail 'flattening' the participation probability curve, for example by decreasing its slope and increasing its intercept. Under the fixed unit rebate adopted by IRWD, rebate payouts increase linearly with the lawn area replaced, up to a maximum of $3000. This rebate structure may incentivize the participation of residents with large yards (and higher household incomes), consistent with the results presented in figures 2(c) and S1. To entice the participation of households with smaller yards, the utility could consider transitioning from a fixed unit rebate to a fixed cash payment to participating residents that is, within reason, independent of the lawn area they replace. On the other hand, enrolling many more small lawns in the program will necessarily reduce the average lawn area replaced per rebate, possibly leading to a net reduction in water savings overall. To clarify such tradeoffs, it would be helpful to have a modeling tool that can relate, for these various 'what if' scenarios, how changes in outdoor water conservation behavior at the parcel scale translate to total water savings achieved at the water provider or city scale.

Probabilistic framework for outdoor water savings
From the results in figure 2 we can derive, for any incremental change in outdoor area (from $a$ to $a + \Delta a$, units of square meters), the incremental water savings $\Delta W$ (units of liters per year) accrued by implementing a cash-for-grass rebate program. The incremental water savings $\Delta W$ is equal to the product of the unit water savings associated with replacing lawn with drought tolerant landscaping ($w''$, units of liters per square meter per year), the probability a randomly chosen resident will participate ($p(a)$, unitless), the average lawn area replaced with drought tolerant landscaping ($\ell(a)$, units of square meters), and the number-distribution of outdoor areas across the utility's service area $n(a)$ (units of inverse square meters), where the product $n(a)\,\Delta a$ represents the number of SFR parcels with outdoor areas in the incremental range $a$ to $a + \Delta a$: $\Delta W = w''\, p(a)\, \ell(a)\, n(a)\, \Delta a$. The unit water savings, $w''$, captures the influence of local climate [23], cultural preferences for outdoor plants [13,24–26], and water use behavior [13] on the water savings realized when a unit area of lawn is replaced with drought tolerant landscaping. For its service area, IRWD adopts a value of $w''$ = 1711 liters per square meter per year. Taking the limit $\Delta a \to 0$ and integrating, we arrive at the following simple formula for estimating total water savings achieved by a cash-for-grass program at the city scale:
$$W = w'' \int_{a_{\min}}^{a_{\max}} p(u)\, \ell(u)\, n(u)\, \mathrm{d}u \tag{1}$$
The variable $u$ is a dummy integration variable and the limits of integration $a_{\min}$ and $a_{\max}$ (units of square meters) represent the range of outdoor areas of interest.
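The integral in equation (1), total savings as the unit water savings times the integral of participation probability, lawn area replaced, and number-distribution over outdoor area, is straightforward to evaluate numerically. The sketch below uses the trapezoidal rule. The functional forms and parameter values marked as hypothetical are placeholders chosen for illustration; the fitted expressions are given in section 2 of the SM and are not reproduced here.

```python
import math


def lognormal_pdf(a, mu, sigma):
    """Log-normal probability density of outdoor area a (m^2)."""
    z = (math.log(a) - mu) / sigma
    return math.exp(-0.5 * z * z) / (a * sigma * math.sqrt(2.0 * math.pi))


def total_water_savings(w_unit, p, ell, n, a_min, a_max, steps=20000):
    """Trapezoidal estimate of w'' * integral of p(u) * ell(u) * n(u) du (liters/yr)."""
    h = (a_max - a_min) / steps

    def integrand(u):
        return p(u) * ell(u) * n(u)

    total = 0.5 * (integrand(a_min) + integrand(a_max))
    for i in range(1, steps):
        total += integrand(a_min + i * h)
    return w_unit * total * h


# --- Hypothetical inputs (placeholders, not the fitted SM expressions) ---
W_UNIT = 1711.0                     # l m^-2 yr^-1, value adopted by IRWD
N_PARCELS = 38255                   # enrolled owner-occupied SFR parcels
MU, SIGMA = math.log(200.0), 0.6    # hypothetical log-normal parameters


def p_owner(a):
    """Hypothetical: linear rise from the origin, capped at a plateau of 4.5%."""
    return min(0.045, 1.22e-4 * a)


def lawn_replaced(a):
    """Hypothetical: 30% of outdoor area, linear up to 600 m^2."""
    return 0.3 * min(a, 600.0)


def n_dist(a):
    """Number-distribution: parcel count times the probability density."""
    return N_PARCELS * lognormal_pdf(a, MU, SIGMA)


savings_ml = total_water_savings(W_UNIT, p_owner, lawn_replaced, n_dist, 1.0, 5000.0) / 1e6
print(f"Illustrative savings: {savings_ml:.0f} ML per year")
```

Because the placeholder inputs are not the fitted expressions, the printed value should not be compared with the savings reported for IRWD.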
Substituting our empirical expressions for $p(a)$, $\ell(a)$, and $n(a)$ into equation (1) and integrating (see section 2 of the SM for details), equation (1) predicts that owner and non-owner occupied SFR parcels in the IRWD service area should yield a total water savings of 134 and 9.8 ML per year, respectively. These model predictions are within 22% of the actual water savings achieved by IRWD's program, calculated by summing up the lawn area replaced during the study period and multiplying by $w''$ = 1711 liters per square meter per year (163 and 12 ML per year for owner and non-owner occupied parcels, respectively).
With equation (1) we can now answer the question: how would the rebate program's overall water savings change if the participation probability curve were 'flattened out', for example by transitioning to a fixed cash rebate? To simulate this scenario, we decreased the initial slope of the participation probability curve while holding the average service area participation probability constant at 3.3% (consistent with the average participation probability reported for the enrolled owner occupied SFR parcels in figure 1(b), details in section 3 of SM). Surprisingly, the model predicts very little change in water savings (from 133 to 123 ML per year) as the initial slope is reduced from the value inferred from IRWD's dataset ($m_p$ = 1.22 × 10⁻⁴ m⁻²) to a completely flat line ($m_p$ = 0 m⁻²). The reason, evident in figure 2(d), is that a small reduction in the participation of SFRs with large outdoor areas is balanced by an increase in the participation of much more numerous SFRs with small outdoor areas; recall, the number-distribution of outdoor areas in IRWD's service area is positively skewed by virtue of being log-normally distributed [21]. Thus, at least in this case, there is no inherent trade-off between encouraging the participation of households with a diverse range of incomes and water saving goals. It remains to be seen, however, if a change in the rebate structure alone (e.g. from a fixed unit rebate to a fixed cash rebate) can alter the shape of the participation probability curve. Even if this were possible, other factors might make such an approach impractical; e.g. administrative costs associated with vastly more rebate inspections.
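One way to set up such a mean-preserving 'flattening' experiment is sketched below. It is a simplified stand-in for the procedure in section 3 of the SM, which is not reproduced here. The sketch assumes a purely linear participation curve with no plateau, whose intercept is chosen so that the parcel-weighted mean participation stays at 3.3%. The log-normal parameters, the lawn-area function, and the integration range are hypothetical, so the printed savings are illustrative and are not expected to match the values reported above.

```python
import math

# Hypothetical inputs; only W_UNIT, N_PARCELS, P_MEAN and the slope come from the text.
MU, SIGMA = math.log(200.0), 0.6   # hypothetical log-normal parameters
A_MIN, A_MAX = 1.0, 1500.0         # hypothetical range of outdoor areas (m^2)
W_UNIT = 1711.0                    # l m^-2 yr^-1
N_PARCELS = 38255                  # enrolled owner-occupied SFR parcels
P_MEAN = 0.033                     # mean participation held constant
STEPS = 20000


def pdf(a):
    """Log-normal probability density of outdoor area."""
    z = (math.log(a) - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (a * SIGMA * math.sqrt(2.0 * math.pi))


def integrate(f):
    """Trapezoidal rule on [A_MIN, A_MAX]."""
    h = (A_MAX - A_MIN) / STEPS
    total = 0.5 * (f(A_MIN) + f(A_MAX))
    for i in range(1, STEPS):
        total += f(A_MIN + i * h)
    return total * h


def lawn_replaced(a):
    """Hypothetical: 30% of outdoor area, linear up to 600 m^2."""
    return 0.3 * min(a, 600.0)


MASS = integrate(pdf)
MEAN_AREA = integrate(lambda a: a * pdf(a)) / MASS


def participation(slope):
    """Linear curve whose intercept keeps the parcel-weighted mean at P_MEAN."""
    intercept = P_MEAN - slope * MEAN_AREA
    return lambda a: intercept + slope * a


def mean_participation(p):
    return integrate(lambda a: p(a) * pdf(a)) / MASS


def savings_ml(p):
    liters = W_UNIT * N_PARCELS * integrate(lambda a: p(a) * lawn_replaced(a) * pdf(a))
    return liters / 1e6


for slope in (1.22e-4, 0.0):
    curve = participation(slope)
    print(f"slope={slope:.2e} m^-2  mean p={mean_participation(curve):.4f}  "
          f"savings={savings_ml(curve):.0f} ML/yr")
```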
Research is presently underway to extend equation (1) to address additional factors known to influence the success of water conservation programs, including temporal variability (e.g. associated with news coverage of drought) [27,28], demand hardening [29], and neighborhood adoption effects [15,30].
The upscaling approach developed and tested here may also prove useful for estimating changes in hydrological budgets (e.g. evapotranspiration [31]) at the water provider or city-scale, associated with the distributed adoption of drought tolerant landscaping.