Optimal sizing of rainwater harvesting systems for domestic water usages: A systematic literature review

Rainwater harvesting systems (RWHS) are increasing in popularity because of their ability to alleviate water pressure on centralized systems, minimize or delay rainfall runoff, and fit relatively easily into both centralized and decentralized infrastructure organizations. Adequately sizing RWHS is critical to optimizing their operation because under-sizing results in systems that are unable to provide a sufficient, reliable source of water while oversizing increases the capital costs incurred with limited marginal benefits and poses potential water quality risks. In this paper, we conduct a systematic literature review to assess the state of the art in the field of optimization of domestic rainwater harvesting systems. Sizing of storage is identified as the most important objective of optimization, yet sizing for cost is the most frequently implemented outcome of optimization. Optimizing for a local maximum is often favored over simulation-based optimization methods that produce global maxima. To derive more realistic sizing estimates, future optimization studies will have to take into account greater variation in water demands as well as various climate change scenarios, especially given that rainfall frequency and quantity are critical design variables of a rainwater harvesting system.


Introduction
Urbanization and shrinking cities are having an impact on infrastructure, particularly aging water infrastructure. At the center of decentralized water infrastructure lie rainwater harvesting systems (RWHS). The vital importance of RWHS is the effect they have on the three water networks (potable, stormwater, and wastewater) in terms of decreasing water demand on the potable water network, decreasing stormwater runoff, and, if coupled with greywater recycling systems, decreasing the quantity of wastewater generated by using water multiple times before discharge (Ghisi and Ferreira, 2007).
Tanks are the costliest individual component of the system since they account for 30 % of the whole-of-life costs (Gurung et al., 2012). As a result, capital costs make up the majority (80 %-82 %) of the lifecycle costs (Stewart, 2011). Simulations have shown that installed tanks can be oversized with respect to demand (Ward et al., 2010), and thus to optimize lifecycle costs, care should be taken to correctly size the system to decrease the cost associated with an oversized tank and to avoid increasing water age (Wales, 2006). In fact, modeling tools have been developed to simplify the evaluation and design of RWHS with a specific focus on the task of storage sizing. Different types of models include:
• Empirical relationship methods (e.g., Ghisi, 2010; Palla et al., 2011), where empirical relationships are used to describe the sizing of rainwater tanks. Parameters used typically include rainfall, water demand, and roof area.
• Stochastic parametric and non-parametric methods (e.g., Basinger et al., 2010; Cowden et al., 2008; Guo and Baetz, 2007), which use stochastic techniques to simulate important parameters in tank design, for which data is missing or incomplete.
• Continuous mass balance simulation of the tank inflow and outflow (e.g., Campisano and Modica, 2012; Fewkes, 2000; Imteaz et al., 2011a; Liaw and Tsai, 2004; Mitchell, 2007; Sample and Liu, 2014), where mass balances typically represent the inflow, outflow, and losses of the tank in order to characterize the tank size. The models may use different time scales and algorithmic models (yield before spillage and yield after spillage) to estimate tank sizes (Jenkins and Pearson, 1978).
The purpose of this systematic literature review (SLR) is to define what is typically being optimized in the literature with respect to RWHS, the methods used, limitations of existing studies, and implications for practice. In this SLR, we focus on articles related to optimizing the variables related to RWHS design that directly impact the tank size.
It is also worth noting, from a credibility standpoint, that while storage size is a significant determinant of system cost, other moderating variables could result in cost changes. For example, incorporating a treatment system for potable use may only be feasible above a certain system capacity. Thus, cost functions for smaller sizes would have an advantage if this factor was considered, mainly for the primary purpose of optimization. The nature of optimization is to find the best or most effective use of resources. Hence, with regards to RWHS, optimizing a RWHS goes beyond the sizing of the tank and could involve other objectives. This research will take the intent of optimal sizing into account in the SLR.

Methodology
We performed an SLR on the optimal sizing of RWHS in order to get a clear understanding of how these analyses are implemented. We chose to use the SLR as our main method for gathering and processing information because: a) it closely follows scientific methods, b) it limits bias with the general goal of producing a methodical synopsis of the research in a particular field of study, and c) it identifies research or knowledge gaps and areas for future studies (Petticrew and Roberts, 2008). An SLR is needed here in order to get an accurate picture of existing approaches for optimally designing RWHS to uncover opportunities for future research and development. We adopted the Cochrane method for conducting the SLR, supplemented by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to ensure consistent and complete presentation of methods (Higgins and Green, 2011; Moher et al., 2015). The Cochrane method allows researchers to ground their outcomes on the results of studies that meet specific quality criteria, since the most dependable studies will offer the best proof for making decisions about a variety of topics, which minimizes the effect of bias across different sections of the review.
The first step of a meta-analysis using this method is defining the research questions related to the research subject; hence, we defined the following questions that needed to be answered by the SLR:
Research Question 1: What variable(s) is being optimized to size RWHS in the literature?
Research Question 2: What methods are being used in the literature for the optimization process?
Research Question 3: What are the limitations of the current optimization analyses and how can they be overcome in future work?
We searched for publications in the following databases: Engineering Village (Compendex, 1884-present and Inspec, 1898-present), Web of Science (core collection, 1900-present), Scitech Premium (Proquest, 1946-present) and Scopus (1800s-present). RWHS are generally defined as systems harvesting rainwater from rooftops with the purpose of providing water for domestic usage (potable and non-potable). The first step was to define the relevant keywords in order to find pertinent publications related to the topic of research. A preliminary analysis of some of the related literature revealed that "rainwater" was the most commonly used term to describe RWHS. Hence, our first search term was "rainwater". The terms "optimal" and "optimum" were also commonly used in the relevant literature. Hence, our search string ended up being (rainwater) AND (optimal) and its variations (optimum), (optimize), (maximize), (maximum), (minimize) and (minimum). The search terms were searched for in the title, abstract, and keywords of existing publications in the databases. We did not limit the categories of the search areas given that this field is multi-disciplinary by nature. We only selected journal articles dating from the year 2000 (at the start of the previous decade) published in English. Identical publications found using different databases were excluded. The screening process was as follows: we read the titles first, the abstracts next and the complete texts last, and at each stage of the process we discarded the works unrelated to the defined area of research and the works which did not state sizing as a main objective. Journal articles in other languages were excluded, as well as a nominal number of articles that we did not have access to through our university libraries.
After the selection process, the following information was compiled:
• Year of publication.
• Author-specified keywords used.
• Country of publication.
• Optimization purpose as stated in the objectives section. If the objectives section was missing, we extracted the optimization purpose from the introduction. We excluded works where optimization or sizing was not listed in the objectives of the paper.
• Key parameters that characterized the optimization being described (RWHS design variables, simulation methods, and optimization decision variables).
We performed an examination of the data collected and compiled our conclusions in the following sections.

Analysis
The review was performed in March 2019, then updated in September 2019. We found 2695 journal articles matching the search criteria we used:
• Engineering Village (Compendex and Inspec databases): 476 relevant articles were found in both databases.
• Web of Science: 795 relevant articles were found after the search in the Web of Science database.
• Scitech Proquest (main database): 652 relevant articles were found after a search of the Scitech Proquest main database.
• Scopus: 772 articles were found after a search in the Scopus database.
After the thorough screening process previously described, we were left with 45 directly relevant articles based on PRISMA as shown in Fig. 1.
Although these works were excluded from the analysis at the abstract phase of the screening process (Fig. 1), we evaluated them to make sure that the findings were not significantly different than the works that were included in the review and that we did not miss valuable insights that would have otherwise been overlooked based on the previously explained search criteria.
Fig. 2 summarizes the distribution of the journal articles across the multiple databases, meaning how many of the relevant publications were found in each database, namely:
• Engineering Village: 476 journal articles found, 35 articles remaining following screening.
• Web of Science: 795 journal articles found, 40 articles remaining following screening.
Some articles were found on multiple databases while others were only listed on one. Ultimately, the greatest number of journal articles meeting all criteria for inclusion was found in the Scopus database, followed by Web of Science, then Scitech Proquest and lastly Engineering Village, as shown in Fig. 2.
Most of the publications regarding optimally sizing RWHS were found in "Resources, Conservation and Recycling", followed by the "Journal of Hydrology" and "Journal of Cleaner Production".The distribution of the publications is summarized in Fig. 3. Fig. 4 shows the distribution of journal articles by location of study and year of publication.The country with the most publications related to the optimizing of domestic rooftop rainwater harvesting is the USA (7), followed by Australia and Taiwan (6).
We analyzed the author-supplied keyword strings used in the selected publications. Overall, there were 147 keyword strings specified, except for the two oldest publications (Jenkins, 2007; Liaw and Tsai, 2004) which did not specify keywords. Fig. 5 shows the most frequently used keyword strings in the selected articles; rainwater harvesting was the most frequently used. The word "optimization" appears three times as a keyword out of the 45 articles.
To get a better insight into the use of the keywords, we analyzed the frequency of the actual words used, rather than the strings that were found originally. The term "rainwater" is the most frequently used, followed by "harvesting", "water" and "tank". We illustrated the occurrence of the keywords and their frequency with the help of a word cloud, as shown in Fig. 6 where the font size indicates the word frequency (Heimerl et al., 2014).
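The step from keyword strings to individual word frequencies can be sketched as follows. This is a minimal illustration in Python, assuming simple lowercase whitespace tokenization; the function name `word_frequencies` and the sample keyword strings are hypothetical and are not the data collected in this review.

```python
from collections import Counter


def word_frequencies(keyword_strings):
    """Count individual words across author-supplied keyword strings."""
    counts = Counter()
    for keyword in keyword_strings:
        # Lowercase and split on whitespace so that "Rainwater harvesting"
        # adds one count each to "rainwater" and "harvesting".
        counts.update(keyword.lower().split())
    return counts


# Hypothetical keyword strings, for illustration only.
sample_keywords = [
    "Rainwater harvesting",
    "Rainwater tank",
    "Water savings",
    "Optimization",
    "Tank sizing",
]

frequencies = word_frequencies(sample_keywords)
for word, count in frequencies.most_common(3):
    print(word, count)
```

The resulting counts are the kind of input a word cloud needs, with font size scaled to each word's count.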
Given that the driving purpose of the SLR was to address the research questions described in the methodology section, the next section presents the results of analyzing the actual content of the papers and a discussion of those results.

Results and discussion
This section is organized in two parts: the first addresses questions 1 and 2, covering the variables and methods used for size optimization of RWHS, while the second addresses question 3, discussing those methods and the recommendations for future research.
What variables are being optimized with regards to sizing RWHS in the literature?
The results of the analysis of the relevant articles show that the general approach to RWHS size optimization can be summed up as follows: optimizing the size of the tank while optimizing one or more variables related to the design of RWHS. Several variables were optimized in the relevant works, as shown in Table 1. In the following section, we list the optimization variables associated with the relevant works and discuss in detail how these variables were optimized.

Fig. 1. Flowchart of systematic literature review using PRISMA (Moher et al., 2015).

Cost
Twelve articles in the final data set optimize the costs associated with RWHS design, as shown in Table 1. Those costs are expressed as shown in Table 2. The most used parameter in cost optimization is cost of water from centralized treatment, which would be displaced by the RWHS. The table headings are the cost elements, investment variables, and investment metrics used to optimize RWHS' costs:
• Capital costs: costs associated with the tank, pumps and pipes (when included in the cost).
• Maintenance costs: costs associated with the required maintenance of the system over its lifetime.
• Operation costs: costs associated with running the system such as the power needed for the pumps and the disinfection.
• Water costs: costs associated with the town water supplied or the cost of the water saved by using the RWHS.
• Environmental costs: costs associated with any runoff from the site (runoff from the RWHS tank or drainage).
• Inflation rates: measure at which the average price of a product increases over time.
• Discount rates: percent change of prices from one year to the next.
• Rebates: amount paid by way of reduction, return, or refund on what has already been paid.
• Payback period/Return on investment: amount of time required to break even.
• Benefit-cost ratio: relationship between the cost of the project and its benefits expressed in monetary value.
• Net-present value: life cycle costing tool which decides the values of future investment.
It is interesting to note that in all the cost optimization analyses, the capital costs of the RWHS are always taken into consideration because a) the optimization function's output is the size of the tank and b) capital costs make up the majority of the costs (Stewart, 2011). The second most used metric in cost optimization is water costs. This "water costs" metric is equally important in most cases because as water prices increase, the value of RWHS increases. The payback period or Return on Investment analysis was the most used financial method to determine the economic feasibility of the optimized sizing while the benefit-cost ratio and net-present value methods were used second most. Jenkins (2007) included the environmental costs (e.g., stormwater fees) associated with using RWHS while Khastagir and Jayasuriya (2011) included in the analysis the rebates offered by the Victorian government in Melbourne, Australia to make RWHS more affordable.
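The three investment metrics listed above can be illustrated with a short calculation. The following is a minimal sketch in Python using the conventional textbook formulas for simple payback, net present value and benefit-cost ratio, assuming constant annual benefits and costs; the function names and all monetary inputs are hypothetical and do not come from any of the reviewed studies.

```python
def present_value(annual_amount, discount_rate, years):
    """Present value of a constant amount received at the end of each year."""
    return sum(
        annual_amount / (1.0 + discount_rate) ** t for t in range(1, years + 1)
    )


def net_present_value(capital_cost, annual_benefit, annual_cost, discount_rate, years):
    """Discounted net annual benefits minus the upfront capital cost."""
    net = annual_benefit - annual_cost
    return present_value(net, discount_rate, years) - capital_cost


def benefit_cost_ratio(capital_cost, annual_benefit, annual_cost, discount_rate, years):
    """Present value of benefits divided by present value of all costs."""
    pv_benefits = present_value(annual_benefit, discount_rate, years)
    pv_costs = capital_cost + present_value(annual_cost, discount_rate, years)
    return pv_benefits / pv_costs


def simple_payback_period(capital_cost, annual_benefit, annual_cost):
    """Years needed to break even without discounting; None if never."""
    net = annual_benefit - annual_cost
    if net <= 0:
        return None
    return capital_cost / net


# Hypothetical inputs: tank and installation cost, value of town water
# displaced per year, and yearly operation and maintenance cost.
capital_cost = 2000.0
annual_benefit = 250.0
annual_cost = 50.0

print(simple_payback_period(capital_cost, annual_benefit, annual_cost))
print(net_present_value(capital_cost, annual_benefit, annual_cost, 0.05, 20))
print(benefit_cost_ratio(capital_cost, annual_benefit, annual_cost, 0.05, 20))
```

In a sizing study, the capital cost and the annual benefit would both depend on the tank size, so these metrics would be evaluated for each candidate size.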

Fig. 6. Word cloud of the author-supplied keywords.

Table 1
Optimizing variables employed in literature related to the design of RWHS.

Table 2
Cost elements, investment variables and investment metrics used in cost optimization for RWHS.

Reliability
As shown in Table 1, eleven papers in the final data set focused on optimizing the system reliability as a function of the tank size. Across these articles, reliability was defined in two distinct ways:
• Volumetric reliability or water-saving efficiency, which is the total rainwater supplied divided by the demand for that water (Imteaz et al., 2012; Islam et al., 2010; Liaw and Tsai, 2004; Ndiritu et al., 2017; Nnaji et al., 2017).
• Time-based reliability, which is the fraction of time that demand is fully met (Cowden et al., 2008; Karim et al., 2015; Khan et al., 2017; Khastagir and Jayasuriya, 2011; Koumoura et al., 2018; Lawrence and Lopes, 2016).
The advantages of using the volumetric reliability are:
• Less restrictive: it takes into account the fraction of the time when demand is partially met.
• Less influenced by the computational time step: the volumetric reliability can be used with sub-daily, daily, weekly, monthly and yearly time-steps.
• Less influenced by the system's characteristics: rainfall data can be missing or unavailable for the desired simulation period.
The advantages of using the time-based reliability are as follows:
• Clearer understanding of the inter-annual rainfall variability.
• Better description of the system's failure: the system fails when it is unable to meet all demand.
The volumetric reliability indicator is most commonly used when the output is a measure of the water saving efficiency while the time reliability indicator can describe the fraction of time, over the analysis period, when the demand will be fully met. If the ultimate purpose of the analysis is to maximize the volume supplied by rainfall, the volumetric reliability is more representative of the system. If the purpose is to design a system that can maximize the amount of time when full water demand is met, then the time reliability is the better factor.
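The difference between the two definitions can be made concrete with a small calculation. The sketch below, in Python, computes both indicators from per-time-step series of rainwater supplied and water demanded; the function names, the numerical tolerance, and the five-day sample series are assumptions chosen for illustration only.

```python
def volumetric_reliability(supplied, demand):
    """Total rainwater supplied divided by total demand over the period."""
    total_demand = sum(demand)
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    return sum(supplied) / total_demand


def time_based_reliability(supplied, demand, tolerance=1e-9):
    """Fraction of time steps in which demand is fully met."""
    if not demand:
        raise ValueError("demand series must not be empty")
    steps_met = sum(1 for s, d in zip(supplied, demand) if s >= d - tolerance)
    return steps_met / len(demand)


# Hypothetical five-day series in liters per day: demand is fully met on
# three days, partially met on one day, and not met at all on one day.
demand = [100.0, 100.0, 100.0, 100.0, 100.0]
supplied = [100.0, 100.0, 60.0, 0.0, 100.0]

print(volumetric_reliability(supplied, demand))
print(time_based_reliability(supplied, demand))
```

For this sample, the partially met day raises the volumetric reliability but does not count toward the time-based reliability, which is why the volumetric indicator is described as less restrictive.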

Effectiveness/performance
As shown in Table 1, seven articles in the final data set optimized the effectiveness/performance of a RWHS to determine the size of the tank. A large-scale analysis for sizing for effectiveness or performance of a RWHS depends on the author-specified indicators chosen in the analyzed works, as listed in Table 3.
The rainwater utilization indicator, used by Cheng and Liao (2009), is the result of a principal component analysis, which is a statistical technique that uses an orthogonal transformation (linear transformation which preserves a symmetric inner product) to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components. In this case, the authors used observations of the demand (annual demand divided by the collection area and the average annual rainfall) and storage (the storage capacity divided by the collection area and the average annual rainfall) fractions for their analysis. Additional indicators include:
• The water-saving efficiency is the volumetric reliability, defined in the previous section.
• The overflow ratio is the fraction of rainfall that is not utilized.
• The detention time is the length of time water is retained in the tank.
• The rainwater use efficiency is the proportion of rainwater actually used.
• The demand-area ratio is the demand per unit area.
• The deficit rate is the percentage of the demand not met.
Auguste and de Gouvello (2009) developed three indicators pertaining to a reliability curve (percentage of days where different water demands are met) to assess the size of the optimized tank from the town water supplier's point of view. Cheng and Liao (2009) developed a rainwater utilization indicator that can be used to analyze regional rainfall characteristics, and to come up with representative variables and weights (which indicate the interrelationship of the variables of a rainwater harvesting system that can be revised to amend the parameters for the optimal system). Those scores can then be compared to the water saving potential of different RWHS, which can lead to an optimized storage design. Muklada et al. (2016) developed the water saving efficiency and rainwater use efficiency indicators to optimize the performance of the system. Lopes et al. (2017) used the demand-area ratio and the deficit rate indicator in order to optimize the size of the storage tank for a combination of demands and roof areas. Palla et al. (2011, 2012) developed demand fraction and storage fraction indicators in order to assess the performance of the RWHS and find the optimum tank size. Vialle et al. (2011) used the water saving efficiency as an indicator of the performance of the system.
It is interesting to note that all the authors used two or more indicators to assess the performance of the RWHS and size the tank, as opposed to the previous section where only the reliability (volumetric or time-based or both) was used for that purpose.
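Some of these indicators reduce to simple ratios, which the following Python sketch illustrates. It assumes that "divided by the collection area and the average annual rainfall" means division by their product (the annual volume of rain falling on the roof), with rainfall given in millimeters and converted to meters; the function names and input values are hypothetical.

```python
def annual_roof_inflow_m3(roof_area_m2, annual_rainfall_mm):
    """Volume of rain falling on the collection area in one year."""
    return roof_area_m2 * annual_rainfall_mm / 1000.0


def demand_fraction(annual_demand_m3, roof_area_m2, annual_rainfall_mm):
    """Annual demand relative to the annual rainfall volume on the roof."""
    return annual_demand_m3 / annual_roof_inflow_m3(roof_area_m2, annual_rainfall_mm)


def storage_fraction(tank_capacity_m3, roof_area_m2, annual_rainfall_mm):
    """Storage capacity relative to the annual rainfall volume on the roof."""
    return tank_capacity_m3 / annual_roof_inflow_m3(roof_area_m2, annual_rainfall_mm)


def deficit_rate(supplied_m3, demand_m3):
    """Percentage of the demand that is not met by the system."""
    return 100.0 * (1.0 - supplied_m3 / demand_m3)


# Hypothetical household: 100 m2 roof, 800 mm of rain per year,
# 60 m3 annual demand, 4 m3 tank, 45 m3 of demand met by rainwater.
print(demand_fraction(60.0, 100.0, 800.0))
print(storage_fraction(4.0, 100.0, 800.0))
print(deficit_rate(45.0, 60.0))
```

Because both fractions are dimensionless, results obtained for one roof area and rainfall regime can be compared with those obtained for another.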

Meeting water demands
As shown in Table 1, five articles in the final data set optimized the size of the tank to meet water demands. The water demands, as specified by the authors, are as follows:
• Seo et al. (2012) introduced variability in daily water demand for four homogeneous and four heterogeneous users and analyzed the impacts of that variability on the individual rain barrel sizes when those barrels are connected (physical and non-physical connections) to the four users. The output is a comparison of the sizes of the barrels before and after connecting them.
• Fernandes et al. (2015) designed a system that could optimize the tank size to satisfy low (non-potable) water demands (such as cleaning cars and washing pavements). Low or non-potable water applications are typical when capacity largely exceeds demand.
• Londra et al. (2015) optimized the size of the tank in order to meet a certain fraction of the total water demands: 30, 40, and 50 % of the total water demands for households.
• Rostad et al. (2016) sized tanks to meet the water demand for toilet flushing in four major cities in the US in residential and mixed residential neighborhoods given typical urban household characteristics (roof area, estimated number of residents). The authors track how increasing water demand affects the reliability of the system as well as rainfall runoff.
• Fonseca et al. (2017) developed a web-interface decision support system (DSS) to optimize tank sizing using inputs pertaining to water needs from users. The output of the application is maximum tank sizes and annual efficiency values as well as a probability of non-exceedance in order to establish conditions for wet, mean and dry years. High non-exceedance values for a particular tank size are more conservative estimates of the estimated efficiency.
It is interesting to note that, in contrast to the previous section, the reliability indicator is being used here to track the performance of the system rather than it being the main design parameter.

Table 3
Effectiveness/performance indicators used.

Study | Effectiveness/performance indicators
Auguste and de Gouvello (2009) | Reliability indicators: fraction of days when demand is 100 % met, less than 10 % met and daily water-saving efficiency
Cheng and Liao (2009) | Rainwater utilization indicator
Palla et al. (2011) | Water-saving efficiency, overflow ratio, detention time
Vialle et al. (2011) | System efficiency, water-saving efficiency
Palla et al. (2012) | Water-saving efficiency, median value of detention time
Muklada et al. (2016) | Water-saving efficiency, rainwater use efficiency
Lopes et al. (2017) | Demand-area ratio, deficit rate

Roof area
As shown in Table 1, three articles in the final data set focused on designing the system with an emphasis on the optimal roof area as follows:
• Hashim et al. (2013) proposed a model that determines optimal roof areas and tank sizes for a large RWHS.
• Rowe (2011) suggested increasing the roof areas of houses in Bermuda in order to meet the existing storage capacity available.
• Wallace and Bailey (2015) recommended increasing both the available catchment areas and storage volumes in order to meet water demands during dry periods for Micronesian communities.
Two of these articles describe island communities (Rowe, 2011; Wallace and Bailey, 2015), where conventional thinking would focus on increasing the tank size in order to meet more water demands. However, Rowe (2011) found that a) many existing water tanks were oversized in Bermuda, hence, either overfilled or underfilled and b) that the optimum capacity of tanks is 0.37 m³ per 1 m² of catchment area. Wallace and Bailey (2015) recommend increasing the rainwater catchment areas because of unused storage available that can then be used to sustain water demands during drought periods. In the third article, Hashim et al. (2013) optimized the rainwater catchment area to sustain a large rainwater harvesting system (communal RWHS).

Water savings
As shown in Table 1, two articles in the final data set focused on optimizing the tank size to save on the use of centrally-treated municipal water as follows:
• Imteaz et al. (2011b) optimized the size of two large existing tanks with the optimization criteria being total overflow losses (≈ 0) and water saved (= constant value).
• Tsihrintzis and Baltas (2014) optimized the tank size to not use public water, allowing additional water to overflow, with tanks sized to provide adequate supply throughout the year.

Other variable optimization
As shown in Table 1, five articles in the final data set focused on the following variables or system characteristics to size the tank:
• Bocanegra-Martínez et al. (2014) optimized the system to minimize the fresh water use and its total cost.
• Sample and Liu (2014) optimized the system for the dual purpose of meeting water needs and providing runoff capture.
• Allen and Haarhoff (2015) optimized the design of the system for constant water demand, i.e., for daily consumption.
• Seo et al. (2015) proposed a rainwater harvesting sharing scheme whereby the individual storage would be reduced.
The aforementioned works analyzed the variables used for optimally sizing the RWH tank. As reported in Table 1, the authors have mostly optimized using the cost and reliability of the system as the main decision variables. The following section looks at the optimization process and the methods used.

What methods are being used in the literature for the optimization process?
The following section looks at the methods and variables used for the sizing of the tank in the 45 relevant studies, as well as the optimization methods used.

Methods and variables used for the sizing of the tank
Of the 45 relevant papers that look at storage sizing for RWHS, we extracted the following data points: the resolution with which rainfall data are incorporated in the model, the approach to simulating the level of water in the tank at any point in time, and the rate and resolution with which demand is modeled. Mass balances typically represent the inflow, outflow, and losses of the tank in order to capture water levels in the tank and calculate the optimal tank size. The model may use different time scales and algorithmic models such as yield before spillage (YBS) and yield after spillage (YAS) to estimate tank sizes (Jenkins and Pearson, 1978) as well as parametric methods such as the storage-reliability-yield (SRY). The results of the data points extracted from our units of analysis are shown in Table 4.
As shown in the first data column in Table 4, rainfall is represented in most studies using historical data, which does not explicitly take into account potential large changes that could occur quickly due to climate change. In fact, in one study in Australia, the authors found that climate change will adversely impact residential RWHS by reducing water savings and reducing reliabilities (Haque et al., 2016). Adding more storage with minimal increase in the total cost of ownership or even redistributing rainwater could help manage the effects of climate change on RWH. Running or verifying the analysis on wet and dry years using sensitivity analysis can better inform about the performance of a RWHS under a climate change scenario.
The second data column in Table 4 shows that the most used tank sizing method to model the performance of a RWHS is the water mass balance method, proposed by Jenkins and Pearson (1978). In fact, 51 % of the simulation modeling is done using the mass balance method, followed by 29 % using the YAS, 11 % using the YBS and 7 % using both YAS and YBS methods. The YAS release rule is more conservative than the YBS rule in terms of output (Fewkes and Butler, 2000). According to Rostad et al. (2016) and Mitchell et al. (2008), the mass balance approach strikes a balance between the outputs of both release methods.
As for the third data column in Table 4, the variability in water demand is not typically accounted for because most works consider daily average water demand except for two works where a lognormal distribution is used to reflect the daily variability in water demand. Eight of the studies use average daily values for water demand but vary those averages based on weekdays/weekends, humid/dry weather, and monthly changes in water demands. This gap could be managed by conducting a sensitivity analysis on the water demand or by varying the water demand.
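The two release rules can be sketched as follows. This is a minimal illustration in Python of the YAS and YBS operating rules in their commonly published form, where the yield is taken either before or after the inflow of the current time step is added to storage; the function name `simulate_tank`, the per-step inflow and demand lists, and the sample values are assumptions made for illustration and are not taken from any of the reviewed studies.

```python
def simulate_tank(inflow, demand, capacity, rule="YAS", initial_volume=0.0):
    """Return the yield supplied at each time step under the given rule."""
    volume = initial_volume
    yields = []
    for q, d in zip(inflow, demand):
        if rule == "YAS":
            # Yield after spillage: demand is drawn from the water stored at
            # the end of the previous step, before the new inflow is usable.
            y = min(d, volume)
            volume = min(volume + q - y, capacity - y)
        elif rule == "YBS":
            # Yield before spillage: the new inflow is available to meet
            # demand before any excess spills.
            y = min(d, volume + q)
            volume = min(volume + q - y, capacity)
        else:
            raise ValueError("rule must be 'YAS' or 'YBS'")
        yields.append(y)
    return yields


# Hypothetical four-step series (m3 per step) and a 2 m3 tank.
inflow = [3.0, 0.0, 5.0, 0.0]
demand = [1.0, 1.0, 1.0, 1.0]

yas_yield = simulate_tank(inflow, demand, capacity=2.0, rule="YAS")
ybs_yield = simulate_tank(inflow, demand, capacity=2.0, rule="YBS")
print(sum(yas_yield), sum(ybs_yield))
```

With this sample series, the YAS rule supplies less water than the YBS rule, which is consistent with YAS being the more conservative of the two.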
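One way to introduce daily variability in demand is to sample it from a lognormal distribution with a chosen mean and coefficient of variation. The sketch below, in Python, uses the standard relationships between those two quantities and the lognormal parameters; the function names, the mean of 150 L/day and the coefficient of variation of 0.3 are hypothetical values, not parameters reported by the reviewed works.

```python
import math
import random


def lognormal_parameters(mean, cv):
    """Convert a target mean and coefficient of variation to (mu, sigma)."""
    sigma_squared = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - sigma_squared / 2.0
    return mu, math.sqrt(sigma_squared)


def sample_daily_demand(mean, cv, days, seed=0):
    """Draw one lognormally distributed demand value per day."""
    rng = random.Random(seed)  # fixed seed so the series is reproducible
    mu, sigma = lognormal_parameters(mean, cv)
    return [rng.lognormvariate(mu, sigma) for _ in range(days)]


# Hypothetical household demand: 150 L/day on average, CV of 0.3.
daily_demand = sample_daily_demand(mean=150.0, cv=0.3, days=365)
print(sum(daily_demand) / len(daily_demand))
```

Such a series could replace the constant daily average in a tank simulation, and rerunning it with different seeds or coefficients of variation is one simple form of the sensitivity analysis suggested above.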
It is noteworthy that the columns are sorted by year, and there have been no easily observable trends in the literature regarding these various approaches.

Optimization methods
Optimization is the process of choosing the best solution out of a set of multiple outputs. Hence, the optimal solution is the one with the highest expected utility (Gigerenzer and Selten, 2002). For any given real-world problem, an optimization problem can usually be formulated in a generic form as follows:
\[
\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \le b_i, \quad i = 1, \dots, m
\end{aligned}
\tag{1}
\]
where $x$ is the optimization variable, $f_0$ the objective function, and $b_i$ the bounds of the constraints $f_i$, i.e., the firm requirements that limit the possible choices. A solution of the optimization problem (1) corresponds to a choice that has minimum cost (or maximum utility), from all available choices (Boyd and Vandenberghe, 2004). The optimization approaches used in existing RWHS studies are all based on single-objective or multi-objective optimization. The sizing studies evaluated in the SLR deal with the decision-making related to appropriate sizing of the system while maximizing/minimizing one or more variables related to the design of a RWHS.
Based on the review of the optimization methods of the selected works, the RWHS sizing optimization articles are divided into two primary decision-making styles: simulation-based optimization and satisficing (which is a combination of satisfy and suffice (Chun, 2015)). Simulation-based optimization problems are formulated in terms of a defined objective function that a) is based on mathematical proofs and b) has an extreme solution or an optimal solution. In contrast, satisficing problems, as proposed by Simon, have moderate goals where optimality may be difficult to implement because of the presence of uncertainty or ambiguity (Simon, 1959; Stirling and Goodrich, 1999). With this approach, one keeps on looking for an optimal outcome until an acceptable solution is found according to a standard chosen by the user (Stirling, 2003). According to Byron (1998), satisficing represents a stopping instruction that can decrease the search time for other, better options as defined by the user. For example, in the case of RWHS, when one variable is pre-defined by the author (e.g., finding the optimal size for a defined volumetric reliability), then the solution to the problem becomes a local solution rather than a global solution as defined by the simulation-based optimization problem. The advantages and disadvantages of both optimization methods are as follows:
• Mathematical optimization can find the absolute optimal solution whereas satisficing finds a local optimal solution based on the decision maker's preference (Wierzbicki, 1982).
• In some situations, uncertainty and complexity can inhibit the search for an optimal solution, making it reasonable to stop when finding a functioning one (Stirling and Goodrich, 1999).
• Optimization requires having all the relevant facts, which is nearly impossible to comply with (Stirling, 2003).
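The contrast between the two decision-making styles can be illustrated with a minimal sketch. It assumes a simple daily mass balance, a synthetic rainfall series, and hypothetical prices and candidate tank sizes; none of these values come from the reviewed studies. The satisficing routine stops at the first size that meets a user-chosen volumetric reliability, whereas the optimization routine evaluates every candidate against a cost function.

```python
def volumetric_reliability(tank_l, rain_mm, roof_m2, demand_l, runoff_coeff=0.8):
    """Daily mass balance: fill, spill the excess, then supply the demand."""
    stored = supplied = 0.0
    for rain in rain_mm:
        inflow = rain * roof_m2 * runoff_coeff   # 1 mm on 1 m2 yields 1 L
        stored = min(stored + inflow, tank_l)    # overflow is lost
        used = min(stored, demand_l)
        stored -= used
        supplied += used
    return supplied / (demand_l * len(rain_mm))


def satisfice(sizes, reliability_of, target):
    """Return the smallest size that meets the target, then stop searching."""
    for size in sorted(sizes):
        if reliability_of(size) >= target:
            return size
    return None


def optimise(sizes, cost_of):
    """Evaluate every candidate and return the one with the lowest cost."""
    return min(sizes, key=cost_of)


# Hypothetical inputs, for illustration only.
RAIN = [0, 0, 12, 0, 5, 0, 0, 30, 0, 0] * 36   # mm/day over 360 days
ROOF, DEMAND = 100.0, 300.0                    # m2 of roof, L/day
SIZES = [500, 1000, 2000, 3000, 5000]          # candidate tank volumes, L
TANK_PRICE, MAINS_PRICE = 0.30, 0.004          # cost per L stored, per L purchased


def reliability_of(size):
    return volumetric_reliability(size, RAIN, ROOF, DEMAND)


def cost_of(size):
    unmet_l = (1.0 - reliability_of(size)) * DEMAND * len(RAIN)
    return TANK_PRICE * size + MAINS_PRICE * unmet_l


satisficing_size = satisfice(SIZES, reliability_of, target=0.80)
optimal_size = optimise(SIZES, cost_of)
```

The two routines need not return the same size: the satisficing answer depends on the pre-defined reliability target, while the optimum depends on the assumed cost function.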
Table 4
Results of the extraction of data points from our units of analysis.

The main methods for simulation-based optimization can be classified as follows (Carson and Maria, 1997):
• Gradient-based search methods: these methods estimate the response function gradient to assess the shape of the objective function and employ deterministic mathematical programming techniques. Gradient estimation approaches include finite differences, likelihood ratios, perturbation analysis and frequency domain methods.
• Stochastic optimization: this method locates a local optimum of an objective function whose values are not known analytically but can be estimated or measured.
• Response surface methodology (RSM): this method includes fitting a series of regression models to the output variable of a simulation model and optimizing the resulting regression function.
• Heuristic methods: these methods represent the field of direct search methods (requiring only function values) and mix exploration with exploitation, resulting in efficient global strategies. These methods include genetic algorithms, evolutionary strategies, simulated annealing, tabu search and Nelder and Mead's simplex search.
• Asynchronous teams: this method is a process that involves multiple problem solving strategies that can cooperate in tandem.
• Statistical methods: these methods involve the use of statistics to solve optimization problems, for example importance sampling methods, ranking and selection, and multiple comparisons with the best.
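As an illustration of the heuristic family in the list above, the following is a minimal sketch of simulated annealing applied to a one-dimensional sizing problem. The cost curve `toy_cost`, the bounds, the cooling schedule and the step size are all hypothetical choices made for this example and are not taken from any of the reviewed works.

```python
import math
import random


def simulated_annealing(f, lo, hi, start, steps=2000, t0=50.0, seed=0):
    """Minimise f on [lo, hi] using only function values (direct search)."""
    rng = random.Random(seed)
    x, fx = start, f(start)
    best, fbest = x, fx
    for k in range(steps):
        temperature = t0 * (1.0 - k / steps) + 1e-9       # linear cooling
        candidate = x + rng.gauss(0.0, 0.1 * (hi - lo))   # exploration move
        candidate = min(hi, max(lo, candidate))           # keep within bounds
        fc = f(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops (exploitation takes over).
        if fc <= fx or rng.random() < math.exp((fx - fc) / temperature):
            x, fx = candidate, fc
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest


def toy_cost(volume_l):
    """Hypothetical multimodal cost curve as a function of tank volume."""
    return (0.3 * volume_l
            + 2000.0 * math.exp(-volume_l / 800.0)
            + 40.0 * math.sin(volume_l / 150.0))


START = 2500.0
best_v, best_cost = simulated_annealing(toy_cost, 0.0, 5000.0, START)
```

Because worse moves are occasionally accepted, the search can leave a local minimum of the rippled cost curve, which is the property that distinguishes such heuristics from a purely local search. The routine offers no guarantee of reaching the global optimum in a finite number of steps.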
The selected works are classified as simulation-based optimization problems or satisficing problems according to whether the optimization method used follows the definition of simulation-based optimization. A work is classified as simulation-based optimization when it meets the following criteria:
• The optimization problem is formulated in a mathematical form as shown in (1).
• The problem solving method can be clearly attributed to one of the methods specified in Carson and Maria (1997), presented in the previous list.
Table 5 presents the results of classifying the studies based on the optimization methods used.
The works that use a simulation-based optimization approach (Chiu et al., 2009; Hashim et al., 2013; Muklada et al., 2016; Ndiritu et al., 2017; Nnaji et al., 2017; Okoye et al., 2015; Sample and Liu, 2014) have a defined objective function, one or multiple decision variables (depending on the output) and a collection of constraints that bound the function, as shown in Table 6.
This section presented an analysis of the methods used to optimize RWHS. As reported in Table 4 and Table 5, a few works used simulation-based optimization and most works used a satisficing approach to optimization. The following section looks at the limitations of the current optimization processes and how to manage them in future works.

4.3. What are the limitations of the current optimization analyses and how can these be overcome in future works?
In decision making theory, Beyth-Marom et al. (1991) postulate that an output is optimal when the process is optimized as well, i.e., when the decision maker is able to practice the following steps: (a) list relevant action alternatives; (b) identify possible consequences of those alternatives; (c) assess the probability of each alternative occurring; (d) establish the relative value or utility of each alternative; and (e) integrate those values and utilities to find the most attractive course of action. Having a well-defined mathematical objective function bounded by one or multiple constraints, or following a well-defined optimization method, appears to constitute a methodical optimization process, especially when a multi-objective optimization process is required (Bocanegra-Martínez et al., 2014; Chiu et al., 2009; Hashim et al., 2013; Ndiritu et al., 2017; Okoye et al., 2015; Sample and Liu, 2014). In a review of simulation-based optimization methods related to building performance, Nguyen et al. (2014) identified three distinct phases in the simulation-based optimization process: pre-processing, running the optimization and post-processing. The major tasks of the three phases are as follows:
• Pre-processing: this phase's main objective is to formulate the optimization problem, to set the constraints and to identify the variables.
• Running the optimization: the main objectives of this phase are monitoring the optimal solution, controlling the termination criteria and detecting any errors.
• Post-processing: the results are analyzed and presented during the post-processing phase.
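The three phases can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the objective is a toy convex cost curve, the bounds and stopping rules are hypothetical, and the search is a simple step-halving pattern search (a local method), chosen for brevity rather than because any reviewed study used it.

```python
import math


def preprocess():
    """Phase 1: formulate the objective, the variable bounds and stopping rules."""
    return {
        "objective": lambda v: 0.25 * v + 1.5e6 / (v + 500.0),  # toy cost curve
        "bounds": (200.0, 10000.0),   # feasible tank volumes, L
        "start": 1000.0,
        "step": 500.0,
        "tol": 1.0,                   # stop once the step falls below 1 L
        "max_iter": 200,              # hard cap on iterations
    }


def run(problem):
    """Phase 2: iterate, monitor the incumbent, check termination and errors."""
    f = problem["objective"]
    lo, hi = problem["bounds"]
    x, step = problem["start"], problem["step"]
    history = [(x, f(x))]
    for _ in range(problem["max_iter"]):
        if step < problem["tol"]:
            break                                    # termination criterion
        neighbours = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = min(neighbours, key=f, default=x)
        if not math.isfinite(f(best)):
            raise ValueError("objective returned a non-finite value")
        if f(best) < f(x):
            x = best                                 # accept the improvement
            history.append((x, f(x)))
        else:
            step /= 2.0                              # refine the search
    return history


def postprocess(history):
    """Phase 3: summarise the run for analysis and presentation."""
    best_x, best_f = history[-1]
    return {"best_x": best_x, "best_f": best_f, "improvements": len(history) - 1}


summary = postprocess(run(preprocess()))
```

For this particular convex curve the analytical minimum lies near 1,949 L, so the run can be checked against it. For a multimodal objective the same routine would only return a local optimum.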
The RWHS sizing simulation-based optimization works are presented in the same manner as described by Nguyen et al. (2014). The satisficing works are also based on the same structure with three distinct phases, using iterative methods which output a local optimum rather than a global one. As Nguyen et al. (2014) found with regard to optimization and building performance analysis, it is often difficult to verify whether a global optimum is achievable by optimization. The same applies to the optimization of RWHSs for several reasons:
a) The uncertainty of water demands: in most of the optimization works, water demand was represented as a single average value, which is not realistic because demand profiles vary between outwardly similar households in comparable locations as a result of differences in socio-environmental factors. In a recent peak water demand study conducted over multiple years and in multiple locations across the US for single and multi-family dwellings, the researchers found that the average water use was 60.1 gpcd (gallons per capita per day) and that almost 98 % of homes registered leaks. Interestingly, leakage represented almost 17 % of the average daily water use (Buchberger et al., 2015). Toilets had the highest use in terms of gpcd. The tally showed that residential water use tends to be higher on weekends than on weekdays. In its latest water use report (for the year 2015), USGS estimated the average domestic water consumption (indoor and outdoor use) per capita per county; the values range from as low as 2 gpcd up to 1,429 gpcd, with a national average of 87.4 gpcd (USGS, 2017).
What is sorely lacking in water research across most water-centric disciplines is access to usage data, which would in turn reduce the stochasticity inherent to water demand modeling. City and town managers are aware of the privacy concerns associated with releasing water metering data because of inadequate cyber security measures surrounding the usage of those devices (McDaniel and McLaughlin, 2009). Smart water metering, intelligent infrastructure and the Internet of Things (IoT) (Saad et al., 2019) are bound to decrease this unpredictability as the digital security surrounding the usage of metering devices increases. Indeed, the use of big data and machine learning (Chen et al., 2019) will increase the understanding we currently have of water demands, expanding in turn the granularity of the variables, which will be conducive to better performing RWHSs.
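The spread in the demand figures quoted above can be made concrete with a short calculation. The per-capita values are those cited in the text; the 5,000 L tank and the four-person household are hypothetical and serve only to show how strongly the assumed demand changes what a given volume of storage can deliver.

```python
GAL_TO_L = 3.785411784   # litres per US gallon

MEAN_GPCD = 60.1         # average use quoted from Buchberger et al. (2015)
LEAK_SHARE = 0.17        # "almost 17 %" of daily use, same source
USGS_GPCD = {            # county-level figures quoted from USGS (2017)
    "lowest county": 2.0,
    "national average": 87.4,
    "highest county": 1429.0,
}

# Leakage implied by the two figures above, in gallons per capita per day.
leak_gpcd = MEAN_GPCD * LEAK_SHARE


def days_of_supply(tank_l, gpcd, occupants):
    """Days a full tank lasts with no inflow at a constant per-capita demand."""
    daily_demand_l = gpcd * GAL_TO_L * occupants
    return tank_l / daily_demand_l


TANK_L, OCCUPANTS = 5000.0, 4   # hypothetical tank and household
supply_days = {
    label: days_of_supply(TANK_L, gpcd, OCCUPANTS)
    for label, gpcd in USGS_GPCD.items()
}
```

Under these assumptions the implied leakage is roughly 10 gpcd, and the same full tank covers a few days at the national average but only a fraction of a day at the highest county value, which is why a single average demand can misrepresent system performance.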
b) The uncertainty of future rainfall patterns: all of the optimization works considered in this review have based their rainfall analysis on historical data or synthetic data (based on historical rainfall) spanning up to 113 years (Jenkins, 2007). The optimal tank size could in effect be optimal for the time of the design; however, RWHS have a lifecycle ranging between 20 and 40 years (and in some analyses up to 60 years). Climate change is expected to impact rain patterns quite significantly over this time period, which could make optimization analyses that are based on historical rainfall data less valuable for future planning (Haque et al., 2016; Meehl et al., 2007).
In fact, in a study of the impacts of climate change on RWH, the authors found that adjustments to tank size, catchment area and water demand rates will be needed in order for RWHS to be sustainable (Zhang et al., 2019). Using representative rainfall years (wet, dry and average) as extreme cases to test the system, on top of the historical data, can decrease the uncertainty associated with changing rainfall patterns, but such years may or may not capture the types of changes we may see in a changing climate.
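One simple way of selecting such representative years is sketched below. The selection rule (lowest, highest and closest-to-mean annual total) and the annual totals themselves are assumptions made for illustration; the reviewed studies may define wet, dry and average years differently.

```python
from statistics import mean


def representative_years(annual_totals_mm):
    """Pick the driest, wettest and closest-to-mean years from annual totals."""
    years = list(annual_totals_mm)
    average_total = mean(annual_totals_mm.values())
    return {
        "dry": min(years, key=annual_totals_mm.get),
        "average": min(years, key=lambda y: abs(annual_totals_mm[y] - average_total)),
        "wet": max(years, key=annual_totals_mm.get),
    }


# Hypothetical annual rainfall totals in mm, keyed by year.
TOTALS = {2001: 800.0, 2002: 1200.0, 2003: 950.0, 2004: 600.0, 2005: 1000.0}
picked = representative_years(TOTALS)
```

The daily series of the three selected years can then be run through the sizing model as stress tests. As noted above, they remain drawn from the historical record and do not by themselves represent a changed climate.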
One way to reduce the stochasticity inherent in predicting future rainfall patterns is to develop better approaches to predicting weather at the local scale in the context of global change. Future works should also consider the possibility of increasing available storage, such as communal water spaces, in order to utilize excess rainfall as well as store available rainfall for times of drought.
c) Lack of grounding in practice: most of the research on RWHS does not necessarily factor in real-world conditions when modeling the performance of the systems. For example, most of the optimization studies considered in this review output a range of sizes or a specific size that the authors consider optimal without taking into account the fact that tanks come in discrete sizes. One solution could be the use of modular rainwater harvesting systems, which can be built to hold an unlimited amount of rainwater and can fit anywhere. Another example is that the simulated models also do not take into account the fact that roof technologies are changing in ways that may make our assumptions about yield less accurate. Hotter temperatures on metal roofs (which are becoming more common in residential construction in many areas) mean more evaporation during the first part of a rain event while the roof cools, an effect that is not really accounted for in the models. The use of intelligent sensors could help predict upcoming weather patterns and prime the roofs accordingly.

[Table 6, partial]
Study | Decision variables
… | Total costs, purchased water
Sample and Liu (2014) | Net benefits (water supply and runoff capture)
Okoye et al. (2015) | Cost of purchased water, cost of RWHS
Ndiritu et al. (2017) | Yield, reliability, storage
Nnaji et al. (2017) | Reliability

The cutoff volume (the minimum amount of water kept in the tank to prevent the system from running dry) and the freeboard volume (the section of the tank into which rainwater overflows) are not necessarily included in the models, although both affect the ultimate tank size. The use of discrete tank sizes is a more realistic approach to the simulation process.
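The effect of these practical allowances can be sketched as follows. The example assumes that the freeboard is expressed as a fraction of the gross tank volume and uses a hypothetical catalogue of commercial sizes; both are illustrative assumptions rather than conventions drawn from the reviewed studies.

```python
from bisect import bisect_left


def gross_volume(active_l, cutoff_l, freeboard_frac):
    """Gross volume needed so that the usable (active) storage is preserved."""
    if not 0.0 <= freeboard_frac < 1.0:
        raise ValueError("freeboard_frac must be in [0, 1)")
    return (active_l + cutoff_l) / (1.0 - freeboard_frac)


def pick_commercial_size(required_l, catalogue_l):
    """Smallest catalogue size that covers the requirement, or None."""
    sizes = sorted(catalogue_l)
    index = bisect_left(sizes, required_l)
    return sizes[index] if index < len(sizes) else None


# Hypothetical catalogue and model output, in litres.
CATALOGUE = [1000, 2500, 5000, 10000]
required = gross_volume(active_l=3200.0, cutoff_l=150.0, freeboard_frac=0.05)
chosen = pick_commercial_size(required, CATALOGUE)
```

In this example a modeled optimum of 3,200 L of active storage grows to roughly 3,526 L once the allowances are added, and the purchasable tank is the 5,000 L unit, so the installed size differs markedly from the continuous optimum.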
In a broader sense, the future of RWH systems will combine a traditional modeling approach with the information collected by the IoT, which is not readily available at present. Currently, researchers rely on rainfall and water usage as the primary inputs for RWH modeling. In the future, inputs such as land cover changes, modular construction, and even future building usage, on top of the current inputs, will help expand our understanding of RWHS models, transcend the current socio-economic spectrum and exploit local weather patterns over the RWHS lifecycle. Instead of using past data to model today's water usage, researchers will be able to model for tomorrow's water usage, today.

Conclusion
In this paper, we conducted a systematic literature review of works pertaining to optimal sizing of rainwater harvesting systems for domestic water usages. After the screening process, 45 works were relevant based on our search criteria. The most common optimized variable with regard to sizing a rainwater harvesting system was the cost of the system, followed by the reliability of the system and the effectiveness/performance of the system. Most works used historical rainfall and average water demands as input to their systems, while the most used sizing method was the water mass balance method. Seven works used simulation-based optimization methods to find the global optimum while the rest used satisficing approaches to find local optima in terms of sizing.
Simulation-based optimizations provide the closest means, in terms of process and output, of finding globally optimal solutions, whereas satisficing decision-making is generally calibrated according to the opportunity cost of delay and the computational cost of considering more options and collecting more data. All optimization publications rely on historical rainfall data to make a decision on the size of RWHS, but truly optimal sizes that span the lifecycle of the system will have to take into account the changing rainfall patterns. The uncertainty of water demands and future rainfall patterns, and the lack of grounding in practice, are all gaps in the current research. The combined use of smart water meters, intelligent infrastructure and the IoT will provide a better understanding of water needs. More research on climate change at the local level will reduce the stochasticity inherent in future rainfall patterns. Moreover, taking into account more real-world conditions (with the use of smart sensors) can increase the precision of the output of the simulations and hence improve the optimality of the sizing of rainwater harvesting systems.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 2. Total and relevant numbers of publications across the databases.

Fig. 3. Summary of the distribution of the relevant publications among different journals.

a) List relevant action alternatives; b) Identify possible consequences of those alternatives; c) Assess the probability of each alternative occurring; d) Establish the relative value or utility of each alternative; and e) Integrate those values and utilities to find the most attractive course of action.

Table 5
Results of the optimization methods used in the selected works.

Table 6
Decision variables used in the simulation-based optimization articles.