Pitfalls in international benchmarking of energy intensity across wastewater treatment utilities

The collection, treatment and disposal of wastewater is estimated to consume more than 2% of the world's electrical energy, whilst some wastewater treatment plants (WWTPs) can account for over 20% of electrical consumption within municipalities. To identify areas for improvement in wastewater treatment, international benchmarking of energy (electrical) intensity was conducted using the indicator kWh/m3, with a quality control requiring secondary treatment or better for ≥95% of treated volume. The core sample included 321 companies from 31 countries; to analyse regional differences, an external sample of 11 countries, drawn from various studies of WWTPs, was also used in places. The sample displayed a weak negative size effect on energy intensity, although a Kruskal-Wallis analysis showed a significant difference between the size groups (p-value of 0.015), suggesting that as companies get larger, they consume less electricity per cubic metre of wastewater treated. This relationship was not completely linear, as mid-to-large companies (10,001-100,000 customers) had the largest average consumption of 0.99 kWh/m3. In the regional analysis, EU states had the largest average energy intensity at 1.18 kWh/m3, which appeared to be a result of the region's higher wastewater effluent standards. This was supported by Denmark having the second-largest average consumption of any country (1.35 kWh/m3), since it has some of the strictest effluent standards in the world. Along with energy intensity, the associated greenhouse gas (GHG) emissions were calculated, enabling the targeting of regions for improvement in response to climate change. Poland had the highest carbon footprint (0.91 kgCO2e/m3), arising from an energy intensity of 0.89 kWh/m3; conversely, a clean electricity grid can effectively mitigate wastewater treatment inefficiencies, exemplified by Norway, which emits just 0.013 kgCO2e per cubic metre treated despite consuming 0.60 kWh/m3.
Finally, limitations of the available data and the analysis were highlighted, from which it is advised that influent vs. effluent data and net, as opposed to gross, energy data be used in future analyses. The large international sample size, energy data with a quality control, GHG analysis, and specific benchmarking recommendations give this study a novelty which could be of use to water industry operators, benchmarking organisations, and regulators.


Introduction
The collection, treatment and disposal of wastewater is a significant consumer of energy, with estimates suggesting that more than 2% of the world's electrical energy is used for water supply and wastewater treatment (Plappally & Lienhard, 2012; Olsson, 2015). The EU (2017) states that energy requirements in wastewater treatment plants (WWTPs) account for more than 1% of consumption in Europe, whilst Means (2004) and Kenway et al. (2019) report that WWTPs can account for over 20% of electrical consumption within municipalities. Reducing the energy consumption of wastewater management is integral to efficient resource use within a circular economy and to reducing greenhouse gas (GHG) emissions. This task is made more difficult because WWTP electricity demand within developed countries is expected to increase by over 20% in the next 15 years as controls on wastewater become more stringent (Wang et al., 2012; Hao et al., 2015), with the same trend expected in developing countries as wastewater quality becomes a greater priority (Lopes et al., 2020). The importance of improving the sustainability of wastewater treatment is highlighted by its inclusion in the United Nations Sustainable Development Goal 6 (2021a), which seeks to secure safe drinking water and sanitation, focussing on the sustainable management of wastewater, water resources and ecosystems.

[...] screening, and grit removal and grinding. During this stage, pumping is the only significant energy consumer, at 0.002-0.042 kWh/m3, depending on the structure and location of the sewer system. Primary treatment involves separation in circular settling tanks with mechanical scrapers, using very little electricity (4.3×10⁻⁵ to 7.1×10⁻⁵ kWh/m3).
The secondary treatment stage is responsible for a significant proportion of total electrical consumption, and the aeration system is the process that consumes the most electricity (between 0.18 and 0.8 kWh/m3), accounting for 45%-75% of total plant energy consumption (Longo et al., 2016; Gandiglio et al., 2017). Longo et al. (2016) further comment that between 8.4×10⁻³ and 0.012 kWh/m3 is used by mechanical scrapers in gravity settling to separate sludge. Secondary sludge recirculation requires more pumping, consuming an additional 0.047 to 0.01 kWh/m3, whilst mixing for anoxic reactors ranges between 0.053 and 0.12 kWh/m3. Tertiary treatment further increases electricity consumption, to a degree that depends on the technology. Tertiary filtration consumes from 7.4×10⁻³ to 2.7×10⁻³ kWh/m3, UV disinfection uses between 0.045 and 0.11 kWh/m3, and mechanical dosing of chemicals (e.g., chlorinated reagents, aluminium or iron salts) expends 9.0×10⁻³ to 0.015 kWh/m3. Finally, the processing of sludge throughout the different stages can represent considerable energy consumption; for example, aerobic sludge stabilisation, the most energy-consuming procedure within sludge treatment, can use between 0.024 and 0.53 kWh/m3.

Efficiency improvements at plant and company level could reduce the energy demand of wastewater treatment. Various methods could improve overall system intensity, including process-energy reduction and energy recovery from waste, which can be taken so far that WWTPs become energy neutral or even energy positive (Maktabifard et al., 2018). An effective way to improve efficiency is the use of control engineering techniques [...] the Electric Power Research Institute estimating that 10-20% of energy savings can be achieved this way (Copeland and Carter, 2014).
Approximately 50% of the total energy consumption of a WWTP can be provided by biogas from anaerobic digestion (Hao et al., 2015), with sludge pre-treatments enhancing the biomethane yield further. This is also possible by altering fuel cells and optimising thermal conditions (Gandiglio et al., 2017). Furthermore, re-using the nitrogen and phosphorus from WWTPs for crop fertilisation can offset the considerable energy consumption of producing synthetic fertilisers (Danuta, 2018).

A valuable tool for improving wastewater energy intensity amongst water companies is benchmarking. By utilising key performance indicators, it is possible to find the optimal performers and evaluate companies against similar entities or standardised values (Krampe [...]

This study had several objectives: 1) to explore the energy intensity of wastewater treatment on an international scale with the most up-to-date data available and an effluent quality control to ensure credible comparison, an exploration not conducted at this scale previously; 2) to investigate reasons for varying performance, including regional, legislative, and size differences; 3) to assess the carbon impacts of wastewater treatment energy intensity relative to each country, which has not been conducted hitherto; 4) to evaluate areas for improvement in international benchmarking practices. The international scope of the study helps address many of the knowledge gaps highlighted earlier, and the novelty of the work can be of use to the water industry, benchmarking organisations, energy efficiency analysts, and regulators, by providing recent results of wastewater energy intensity and associated carbon from many countries across the world, along with suggestions on improving future data collection, reporting and analysis.

Data description
The core indicator used was kWh/m3 of wastewater treated, kWh being gross electricity consumed. Since the level of wastewater treatment affects energy consumption (see Section 1), a control on water quality was deemed necessary. The available data offered limited possibilities; however, a requirement that ≥95% of wastewater volume receive secondary treatment or better was incorporated. Secondary treatment can vary in the processes undertaken and thus in energy consumed, e.g., there can be considerable energetic differences between conventional activated sludge and granular activated sludge (Bengtsson et al., 2018), as well as between many of the processes outlined in Section 1; however, without more detailed data, using secondary treatment or better as the quality control was the best option.

The main source of data was the International Benchmarking Network for Water and Sanitation Utilities (IBNET, 2021) database; this was supplemented by company reports and other national benchmarking schemes, which collectively covered Greece, Italy, Spain, Sweden, Canada, United States, UK, Australia, New Zealand, Denmark, and Netherlands. The sample years were 2014-18, with only one year of data required for a company to be included, to maximise the sample size. It is possible that, by using one entry within the five-year range, an abnormal year of heavy rainfall and increased wastewater treatment could be used; however, the indicator kWh/m3 should negate this. Companies with multiple data points throughout those years had their values averaged. Extra data from the IBNET database were utilised to conduct part of the analysis, comparing the energy intensity of primary-only treatment (>95% of total volume treated) with the core sample data. These extra primary treatment data covered 29 companies from nine countries; the comparison with the core sample was undertaken with only the same nine countries for the fairest results.
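The sample construction described above (a 2014-18 window, a ≥95% secondary-or-better quality control, and averaging of multi-year entries) can be sketched in a few lines of Python. This is a minimal illustration only: the record layout, the function name `core_sample` and the example utilities are assumptions made for the sketch, not the structure of the IBNET database or the study's actual processing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (company, year, kWh/m3, share of volume treated
# to secondary level or better). Values are illustrative, not study data.
records = [
    ("Utility A", 2014, 0.80, 0.97),
    ("Utility A", 2016, 0.90, 0.98),
    ("Utility B", 2015, 0.50, 0.90),  # fails the 95% quality control
    ("Utility C", 2018, 1.20, 1.00),
    ("Utility C", 2019, 1.00, 1.00),  # outside the 2014-18 window
]

def core_sample(rows, first_year=2014, last_year=2018, min_share=0.95):
    """Return the mean kWh/m3 per company over its qualifying years."""
    by_company = defaultdict(list)
    for company, year, kwh_per_m3, share in rows:
        if first_year <= year <= last_year and share >= min_share:
            by_company[company].append(kwh_per_m3)
    # A single qualifying year is enough for a company to be included.
    return {company: mean(values) for company, values in by_company.items()}

intensities = core_sample(records)
```

In this toy input, Utility B drops out because of the quality control, Utility C keeps only its 2018 entry, and Utility A's two years are averaged.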
External data from journal articles were used in Section 3.3 to enable a better understanding of regional differences, covering Portugal, Germany, Finland, Brazil, Mexico, India, South Korea, China, Japan, Singapore, and South Africa. These external data did not have the same treatment quality controls as the core data and were based largely on samples of WWTPs, not companies, and were therefore not incorporated into the core sample. Summary statistics for the sample are available in Table 1, with a full data table and [...]

When evaluating regional differences in energy intensity (Section 3.1.2), wastewater effluent standards are presented (Table 3) to ascertain the reason behind regional variation, which [...]

Spearman's rank correlation coefficient
To assess the relationship between a) the size of companies and their energy intensity, and b) the percentage of tertiary treatment received in each country and energy intensity, in Section 3.1, Spearman's rank correlation coefficient ($r_s$) was utilised. This non-parametric approach was chosen because the sample is non-normally distributed, and it has the advantage of being relatively insensitive to outliers. $r_s$ is calculated according to the following:

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between ranks for each variable data pair and $n$ is the number of data pairs. When $r_s = 1$ the data pairs have a perfect positive correlation ($d_i = 0$) and when $r_s = -1$, the pairs have a perfect negative correlation.
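As an illustration of the rank-difference calculation, the following Python sketch computes Spearman's coefficient from the sum of squared rank differences. It is a minimal sketch rather than the study's implementation: the function names and the four example utilities are hypothetical, and the simple formula is exact only when there are no tied values (tied values are given average ranks here, which makes the result approximate).

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            result[i] = average_rank
        pos = end + 1
    return result

def spearman_rs(x, y):
    """Spearman's coefficient as 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical utilities: population served and kWh/m3 (not study data).
population = [500, 5_000, 50_000, 2_000_000]
kwh_per_m3 = [1.1, 0.7, 0.9, 0.5]
rs = spearman_rs(population, kwh_per_m3)
```

For these four hypothetical utilities the rank differences are -3, 0, 0 and 3, so the sum of squared differences is 18 and the coefficient is 1 - 108/60 = -0.8, a strong negative association in this toy example.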

Kruskal-Wallis test
To test whether there was a significant energy intensity difference between the size groups in Section 3.1, a Kruskal-Wallis test was used. This non-parametric approach was chosen, as the energy intensity data did not follow a particular distribution. The $H$ statistic is calculated with:

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1)$$

where $N$ is the sum of sample sizes for all groups, $k$ is the number of groups, $R_j$ is the sum of the ranks in the $j$th sample, and $n_j$ is the size of the $j$th sample. To determine whether the medians of the groups differ, the $H$ value is compared to the critical chi-square value at an alpha level of 0.05 in this instance (degrees of freedom = 3). If the critical chi-square value is < the $H$ statistic, there is a significant difference between the groups, whereas if the chi-square value is ≥ $H$, there is not enough evidence to suggest that the medians are unequal. The limitation of this approach is that the specific groups that differ from one another are not identified; however, for the purposes for which the K-W test is used in this study, this is an accepted condition.

The international sample utilised here is displayed in Figure 1, with each company's energy intensity plotted against its size, measured in population served. The range of data (0.04 to 3.11 kWh/m3 and 500-15,000,000 in population served) meant that outliers and a non-normal distribution could affect inferences from the analysis. To mitigate this, Spearman's rank was utilised, and size categorisation was undertaken to group similar-sized companies together, the results of which are in Table 2 with their associated mean electricity intensity.
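The Kruskal-Wallis comparison of size groups described in this section can be sketched as follows. This is a minimal sketch under stated assumptions: the group values are invented for illustration, the function names are hypothetical, and no tie correction is applied, so it matches the basic rank-sum formula rather than any particular statistics package. The critical value of 7.815 is the standard chi-square table entry for 3 degrees of freedom at an alpha level of 0.05.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    first_and_count = {}
    for position, value in enumerate(sorted(values), start=1):
        first, count = first_and_count.get(value, (position, 0))
        first_and_count[value] = (first, count + 1)
    return [first_and_count[v][0] + (first_and_count[v][1] - 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1), without tie correction."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    pooled_ranks = average_ranks(pooled)
    weighted_rank_sums, start = 0.0, 0
    for group in groups:
        rank_sum = sum(pooled_ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)

# Chi-square critical value for 3 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF3 = 7.815

# Hypothetical kWh/m3 values for four size groups (not the study's data).
size_groups = [
    [0.95, 1.10, 0.88],
    [1.05, 0.99, 1.20],
    [0.70, 0.82, 0.76],
    [0.60, 0.55, 0.65],
]
h_statistic = kruskal_wallis_h(size_groups)
significant = h_statistic > CHI2_CRITICAL_DF3
```

With these invented groups the rank sums are 26, 31, 15 and 6, giving H of about 9.67, which exceeds 7.815. As noted in the text, such a result indicates that at least one group differs but does not say which.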

The whole sample has an rs value of -0.108, suggesting that, as companies get larger, they consume less electricity per cubic metre of wastewater treated; however, it is a weak relationship with a non-significant p-value. A Kruskal-Wallis test revealed a significant difference between the four applicable groups (p-value of 0.015), implying that utility size does influence energy intensity, which concurs with much of the literature (Venkatesh et al., 2014; Young, 2015). Furthermore, the group of companies serving over 1,000,000 people had a slightly lower average kWh/m3 compared to the rest of the sample, with the rs value showing a weak negative relationship to a significant degree (p-value of 0.024), supporting inferences that larger companies have slightly lower energy intensity. This appears to be a non-linear relationship, since the highest average energy intensity is from the 10,001-100,000 group, which, along with the 100,001-1,000,000 group, shows a very weak positive relationship, whilst the smallest applicable category of 1,001-10,000 shows a very weak negative result. These results indicate that the companies at the extremes of the size spectrum are not necessarily handicapped in their pursuit of efficiency, and therefore should actively seek to learn from the top performers, [...]

[...] factor often heavily linked with energy intensity is the level of treatment the wastewater receives (as discussed in Section 1), which is at least partially dependent on regulatory standards that differ from region to region. The data used ensured that at least 95% of the wastewater from each company received at least secondary treatment.
This was an important effluent quality control, as data collected (available in the Supplementary Information) showed that companies treating ≥95% of wastewater to only a primary level consumed 0.36 kWh/m3, compared to 0.76 kWh/m3 for companies treating ≥95% of wastewater to at least a secondary level in the same countries. Even within secondary wastewater treatment, though, there can be variation in the technologies utilised and therefore differing levels of energy consumption; for example, aeration can be conducted with turbines, diffusers and, in some cases, not at all (Guerrini et al., 2017). Having a quality control in the data was important; however, without more granular data on how much of that wastewater was treated to a tertiary extent, relationships within the results could be misrepresented. As Figure 2 shows, secondary treatment or better actually represents mostly tertiary treatment in many EU member states. Spearman's rank correlation coefficient was calculated using the tertiary treatment percentage data from Figure 2 and the matching countries in the energy intensity sample collected. The relationship was positive but non-significant for all valid data (rs 0.36, p-value 0.2) and when using countries in the energy data sample that had over 15% of population represented in the data (rs 0.49, p-value 0.33). Although the results did not show that tertiary treatment caused significant increases in energy consumption, more tertiary treatment will clearly increase [...]

Regional differences
To assess regional variances and further investigate the effect of wastewater effluent quality standards on energy consumption, companies were grouped based on their legislation and the United Nations (2021b) Sustainable Development Goal regional groupings. A selection of countries and their summarised wastewater parameters is presented in Table 3 [...] (Preisner et al., 2020). In countries that were formerly part of the Soviet Union, a materially different method is in place, which is based on the assumption that the level of wastewater treatment must ensure the normative water quality in the control cross-sections of individual water bodies (Neverova-Dziopak, 2018). This means the maximum allowable load discharged from each WWTP is defined based on the category of the receiving water, its specific characteristics, and the construction of the wastewater outlet. These different approaches exemplify the difficulty in directly comparing regions; however, the major effluent maximum standards give a reasonable guide, provided the distinct contexts are kept in mind.

Table 4 shows that the EU companies had the largest average energy intensity at 1.18 kWh/m3, whilst all other regions averaged much lower, ranging between 0.58 and 0.64 kWh/m3, apart from Russia and the former states of the Soviet Union, which averaged 0.82 kWh/m3. The EU Urban Waste Water Treatment Directive (UWWTD) is widely recognised as having some of the strictest effluent standards in the world (Morris et al., 2018), so it was anticipated that those countries would have a higher energy intensity, because higher levels of treatment require more energy (Capodaglio and Olsson, 2020). Despite this, it is still a little surprising that the EU average is so high compared to the others, considering many EU countries utilise some of the most efficient treatment techniques and technologies (United Nations, 2017; Preisner et al., 2020), such as those discussed in Section 1.
It is expected, then, that as regions with lower effluent standards improve to levels similar to those of advanced economies, their energy consumption will increase too.

[...] that a 1% increase of inflows from industry will decrease energy efficiency by 28%. If the sample has areas that treat high volumes of industrial effluent, then they would have performed poorly in this analysis.

The regional and global perspective could look very different depending on the data available. [...] 1.12 kWh/m3 based on different studies. The disparity between these results is likely due to differences in the context of the various data. Some may be temporally divergent or have representativeness issues, where a few WWTPs may represent a company, a few companies may represent a country, and a few countries may represent a whole region. Table 4, for example, shows how Central and South America, North America, and Sub-Saharan Africa have very few countries within them, and those countries only have one company representing them, although this can happen when a quality control (≥ secondary treatment for ≥95% of volume) reduces sample size. Having representativeness issues is not ideal; however, the practice is carried out by international benchmarking organisations such as the EU Benchmarking Co-operation (2020) when more data are unavailable. In addition, there may be biases in reporting, where companies that are already performing well or actively trying to improve are more likely to share their wastewater energy data, whereas poorer performers may not disclose the data or may simply lack the means to collect it, thus undermining benchmarking efforts. Although there are potential issues around the sampling parameters, data representativeness, and potential reporting biases, this is a common theme when attempting to collect sufficient data for comparison (Singh et al., 2012).
The results presented here, however, are the best current indication of reality, which is discussed further in Section 3.1.4.

Country-level analysis
To further evaluate possible influences on energy intensity and the practicality of the data, the scope was narrowed to country-level analysis. The global coverage of the dataset was patchy despite extensive efforts to collect wide-ranging data; therefore, some partially mismatching data, in terms of company-level and known WWTP-level data, were used from other studies to further inspect differences in electrical intensity between countries (Figure 3). Due to the expansive sample, many countries and companies that have not been evaluated previously are included in this study.

The colours represent regional separation. [...] WWTPs; therefore, it is probable the countries are not being fully represented due to limited sample size, as discussed in the previous section. There is also the major influencing factor of the disparity of wastewater effluent quality within the sample, as examined above, especially considering the external data could not be filtered by secondary treatment or better as the main sample was. These five countries with the lowest energy intensities have some of the lowest wastewater quality requirements in the sample, as Table 3, [...]

[...] wastewater treatment, then, depends on influent and effluent water quality, treatment technologies, effluent quality standards and compliance with those standards, and electricity fuel mix. To reduce GHG emissions, companies require a reduction in energy consumption, in addition to possible self-generation of renewable energy. To reduce energy consumption, benchmarking and modelling followed by learning from best practice and incorporating applicable processes (some were outlined in Section 1) can be beneficial (Mannina et al., 2016), although the importance of investing in new and innovative technologies should not be underestimated either.
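The dependence of the carbon footprint on the electricity fuel mix can be illustrated with the figures this paper reports for Poland and Norway. The sketch below assumes that the footprint is simply energy intensity multiplied by a grid emission factor; the calculation actually used in the study is not shown in this section, so the function names and the derived "implied" factors are illustrative assumptions rather than reported values.

```python
def carbon_footprint(kwh_per_m3, grid_kgco2e_per_kwh):
    """Footprint in kgCO2e/m3, assuming a simple product of intensity and grid factor."""
    return kwh_per_m3 * grid_kgco2e_per_kwh

def implied_grid_factor(kgco2e_per_m3, kwh_per_m3):
    """Grid emission factor (kgCO2e/kWh) implied by a reported footprint and intensity."""
    return kgco2e_per_m3 / kwh_per_m3

# Reported in this paper: (energy intensity in kWh/m3, footprint in kgCO2e/m3).
reported = {
    "Poland": (0.89, 0.91),
    "Norway": (0.60, 0.013),
}

# Back-calculated factors; these are derived here, not quoted from the study.
implied = {
    country: implied_grid_factor(footprint, intensity)
    for country, (intensity, footprint) in reported.items()
}
```

Under this assumption the implied factors are roughly 1.02 kgCO2e/kWh for Poland and 0.022 kgCO2e/kWh for Norway, which shows how a difference of about a third in energy intensity can sit alongside a difference of roughly seventy-fold in footprint.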

Learning from limitations
Results presented in this study offer the best view of the state of international wastewater energy intensity with currently available data; however, as the sections above have discussed, there are avenues for improving future analysis and reporting, which is particularly pertinent to water managers and analysts. Foremost, there is a need for more data; this sample included 31 countries and 321 companies in the core sample, before expanding it to 42 countries with more sporadic WWTP data from individual studies. Chini and Stillwell (2017) also call for more availability and transparency in water utility data in their study of the United States water sector, highlighting that the only means of acquiring data is through open record requests of individual utilities. Even following data requests from over 200 utilities, only 61% responded. Sato et al. (2013) further emphasise the need for global, regional, and country-level data, illustrating that only 55 countries have data available on wastewater production, treatment and reuse, with 57 countries having no information available at all. Whilst the study is somewhat dated now, these themes are clearly still valid. A lack of data not only makes it difficult to effectively evaluate energy intensity and conduct benchmarking, but also causes problems of representativeness. With only a limited number of companies reporting their data, biases can arise within the sample. For example, perhaps only the best performers who already partake in benchmarking and external analyses make their data publicly available (Denrell, 2005). In combination with generally limited coverage within areas, a lack of representation causes analyses to miss the full picture, therefore reducing the quality of recommendations and real-world improvements.
More detailed and granular data, alongside additional data, are paramount for enhanced assessments of wastewater treatment in the future. A subject at the core of the results in this study is the difference between net and gross energy consumption in reporting. Net energy consumption would enable more meaningful sustainability outcomes, as energy production and strain on the electricity grid are encompassed, both of which are integral elements for modern WWTPs. Additionally, compliance rates with wastewater effluent standards would enhance the accuracy of analysis, as currently regions with similar standards are grouped together although their compliance rates may differ greatly. These extra and more detailed data would also enable the inclusion of explanatory factor analysis to improve understanding of how exogenous influences can be managed to enhance efficiency. Currently, data scarcity and the factors already influencing results, such as those mentioned above, mean that explanatory factor analysis would not offer value. Finally, this study used wastewater treated to at least secondary level, but more detail on which level of treatment has been used, and what volume it was applied to, would enable a better understanding of the current state of wastewater treatment in many regions. For the best understanding of treatment levels, key pollutant removal data or influent vs. effluent data would be required. An alternative unified metric to kWh/m3 that incorporates both energy and a quality aspect would be best for optimum intensity benchmarking. An example is energy per unit of organic load removed (kWh/CODremoved), which is a simple performance indicator that conveys meaningful information. This has been used in other studies (Patziger, 2017) and [...] analyses.
Although there is more demand for quality indicators to be ubiquitous in measuring and reporting, and there are differing approaches to including quality within energy efficiency assessments, it is important that utilities, regulators, and academics unify their metrics, to ease comparisons and analysis and, ultimately, to facilitate learning and improvement.
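To illustrate why a quality-aware indicator such as kWh per unit of COD removed can change a comparison, the following Python sketch computes it from influent and effluent concentrations. The plant figures are hypothetical and the function name is an assumption made for the sketch; the only conversion relied upon is that 1 mg/L equals 1 g/m3.

```python
def kwh_per_kg_cod_removed(energy_kwh, volume_m3, cod_in_mg_l, cod_out_mg_l):
    """Energy per kg of COD removed, from influent and effluent COD in mg/L."""
    # mg/L is equivalent to g/m3, so dividing by 1000 converts the load to kg.
    removed_kg = volume_m3 * (cod_in_mg_l - cod_out_mg_l) / 1000.0
    if removed_kg <= 0:
        raise ValueError("No COD removed; the indicator is undefined")
    return energy_kwh / removed_kg

# Two hypothetical plants with identical volumetric intensity (0.5 kWh/m3).
volume_m3 = 1000.0
energy_kwh = 0.5 * volume_m3

strong_influent = kwh_per_kg_cod_removed(energy_kwh, volume_m3, 600.0, 60.0)
weak_influent = kwh_per_kg_cod_removed(energy_kwh, volume_m3, 300.0, 60.0)
```

Both hypothetical plants report 0.5 kWh/m3, yet the plant with the stronger influent uses about 0.93 kWh per kg of COD removed against about 2.08 for the other, a distinction the volumetric indicator alone cannot show.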

Conclusions
The objectives of this study were to investigate the international energy intensity of wastewater treatment, explore variances in performance, evaluate the carbon impact of the energy consumption, and assess how to improve international benchmarking practices. The global average electricity consumption for wastewater treatment was 0.89 kWh/m3. Larger companies serving over 1 million customers display slightly lower specific consumption, of 0.78 kWh/m3. When viewing regional groupings, EU companies had the highest average energy intensity at 1.18 kWh/m3, with three EU countries standing out: the Netherlands (1.06 kWh/m3), Belgium (1.14 kWh/m3), and Denmark (1.35 kWh/m3). Countries with the lowest energy intensity ranged from Brazil, through India and South Korea, to South Africa (averaging 0.24 kWh/m3). This appeared to be a symptom of the energy data being gross consumption and of a disparity between wastewater quality standards, since energy production at WWTPs was not captured and the lowest energy consumers had some of the worst standards, and vice versa. It is expected that as regions with lower effluent standards improve to levels similar to those of advanced economies, their energy consumption will increase too. The influence of energy consumption on GHG emissions was diverse, owing to interaction with widely differing emission intensities of grid electricity; Poland had the highest carbon footprint with 0.91 kgCO2e/m3, whilst Norway emitted just 0.013 kgCO2e per cubic metre treated, despite consuming 0.60 kWh/m3, showing the importance of energy intensity on particular infrastructures. Although this study provided some valuable quantifiable results, the conclusions stemming from the limitations of carrying out the benchmarking exercise are just as crucial.
There is a lack of quantity, quality, and granularity in existing global wastewater data, making it difficult to fully analyse the impact of wastewater treatment and potential paths to improve it. A lack of data generally leads to a lack of representativeness of certain regions, skewing comparisons with limited sample sizes. The two changes that would have the most significant impact on future analyses are the availability of influent vs. effluent quality data and net energy consumption data, which would increase the accuracy of studies, circumventing varying legislative effluent standards and compliance rates. The large international sample size, energy data with a quality control, GHG analysis, and specific benchmarking recommendations provide novel results which could be of use to water industry operators, benchmarking organisations, energy efficiency analysts, and regulators.