Introduction

Solid waste management (SWM) constitutes a challenging and prevalent issue for authorities due to health problems wrought by improper waste management and the various related environmental issues, such as odour pollution, air pollution, and soil degradation (Ferronato et al. 2021). In developed countries with growing populations and economies, there has been a steady increase in the level of waste generation among both urban and rural populations (Kamarehie et al. 2020; Llanquileo-Melgarejo and Molinos-Senante 2021). This increase has made researchers highly concerned about the sustainable management of municipal solid waste (MSW) (Xiao et al. 2020). In response to this issue, many cities, including Adelaide, San Francisco, and Vancouver, have adopted a zero-waste strategy (ZWS) as part of their waste-management system (Ayeleru et al. 2018; Zaman 2015). ZWS is a concept aimed at resource recovery and the conservation of scarce natural resources by diverting waste away from permanent disposal in landfills. This strategy can be achieved by minimizing, composting, recycling, and reusing waste and by modifying the ways in which people and businesses use limited resources; additionally, it urges businesses to design products so that waste can be eliminated from manufacturing processes (Lombardi and Bailey 2015; Moazzem et al. 2021). In addition, ZWS supports a circular economy (Kurniawan et al. 2021), which can be defined as an economic structure aimed at the reduction of waste and the consistent recycling of energy (Cocker and Graham 2019). To construct a closed-loop structure, circular systems encourage reuse, sharing, maintenance, refurbishment, remanufacturing, and recycling to minimize the use of resource inputs and limit deforestation, carbon emissions, and the generation of waste (Pecorini et al. 2020; Rathore and Sarmah 2020).
Yard waste management, a key component of SWM, is likewise receiving increasing attention from cities in developed countries pursuing a ZWS. Yard waste accounted for 11–27% of total MSW in urban centres across Australia, Canada, and the USA (Lee et al. 2020; Vu et al. 2019).

The accurate prediction of MSW plays a significant role in the development of a sustainable and efficient waste management system (Ferronato et al. 2020; Singh and Satija 2016). If MSW predictions are inaccurate, waste management programs and facilities can be designed or operated inefficiently (Vu et al. 2019). Predicting MSW is challenging, as generation depends on a wide array of factors. Population, per capita gross domestic product (GDP), household income, urbanization, standard-of-living parameters, geographical location, climate, and local environmental laws all influence waste generation (Boumanchar et al. 2019; Wang et al. 2018). Moreover, the scarcity of reliable data constitutes a major challenge for the implementation of waste management (Mohammadi et al. 2019).

Compared to other industrialized nations, Canada produced more MSW per capita and diverted less waste from landfills (Pan et al. 2019; Richter et al. 2018). This study uses the City of Winnipeg (CoW) as its study area because yard waste accounted for a mean of 25% of the total MSW reported in the CoW (CoW-a 2020a) from 2007 to 2018. On 19 October 2011, the Winnipeg City Council approved a holistic waste management strategy intended to raise the city-wide waste-diversion rate to 50% or higher, a proposal first made for the public sector by the City Council on 23 June 2010 (CoW-a 2011a). This strategy, which aimed to boost the recycling of household garbage, was formally implemented in early October 2012. The principles of this plan are in line with the concept of a ZWS. To support the CoW’s MSW-management plan, this study aims to predict quarterly yard waste generation (YWG) through the accurate prediction of MSW generation up to the target year of 2025, using only the twelve historical MSW data points available per quarter from 2007 to 2018. Some studies used a prediction horizon of five years (Abbasi and El Hanandeh 2016; Liu et al. 2021), while others used ten years (Ghinea et al. 2016; Mushtaq et al. 2020a, b). As there is no specific guideline, this study used a seven-year forecasting period for planning purposes. The term yard waste refers to decomposable waste materials such as grass, leaves, and small tree branches that have been trimmed during a growth phase.

Furthermore, the independent factors related to MSW generation require accurate forecasting, which largely depends on the actual amount of MSW; this constitutes the motivation behind this study. Additionally, an accurate MSW generation forecast is required to quantify the potential amount of yard waste for the city and to assess the energy that can be recovered from MSW. To increase the prediction accuracy, unlike previous waste-prediction studies (Al-Salem et al. 2018; Ayeleru et al. 2018), this study considers socio-economic and climatic factors together in MSW generation prediction. At the same time, a correlation analysis identifies the key independent factors relevant to MSW generation in the city.

There are many data-intensive models aimed at MSW prediction, e.g. artificial neural networks (ANN) (Ali and Ahmad 2019; Kontokosta et al. 2018), support vector machines (Meza et al. 2019; Niska and Serkkola 2018), and multiple linear regression (MLR) (Abdulredha et al. 2018; Golbaz et al. 2019). However, because of the high variability, treated here as uncertainty (Singh 2019; Tsai et al. 2020), and the insufficient amount of historical data associated with MSW management and forecasting, grey forecasting models are more appropriate than data-intensive approaches, as they are more effective at limiting forecasting error (Duman et al. 2019; Wang et al. 2018). As a result, this study uses the grey model, which is appropriate for forecasting future data with less prediction error (Liu et al. 2011). The grey model is a well-established forecasting model for predicting grey-type data and dealing with uncertainty (Zhang et al. 2019). For example, Ren et al. (2013) and Ren (2018) used a grey model to predict the yield of biohydrogen under scarce-data conditions and reported that it outperformed the ANN model and the support vector machine.

Most prediction methods based on time series involve moving averages and exponential smoothing (Gooijer and Hyndman 2006), neural networks (Tealab 2018), or grey models (Kayacan et al. 2010; Liu et al. 2011). The applicability of exponential smoothing and the moving-average method is limited to linear time-series data. Neural networks perform well on both linear and nonlinear time-series data. However, to achieve high accuracy, most neural networks require large quantities of training data. In contrast, the grey model can be applied to both linear and nonlinear data with uncertainty and does not require as large a sample for an accurate prediction (Liu et al. 2016).

There are two prominent types of prediction models commonly used to analyse time-series data in grey theory: GM (1, 1) for a single time-varying factor and GM (1, N) for multiple time-varying factors (Kayacan et al. 2010). Grey forecasting includes sequence forecasting, calamity forecasting, prediction of seasonal calamities, topological forecasting, and systematic forecasting (Lü and Lu 2012). One of the most significant features of grey theory is the use of the accumulated generation operation (AGO) to minimize data randomness (Zeng et al. 2020). The AGO approach efficiently reduces noise by transforming a random, non-negative time series into a monotonically increasing sequence, in which systematic regularity can be identified rapidly (Liu et al. 2016). Given the simplicity of the grey model and its potential for the prediction of time-series data, many researchers have begun to employ this model. It has been successfully used to study climate (Dengiz et al. 2019), energy (Li and Zhang 2019; Lu 2019), healthcare (Rahman et al. 2019), industrial technology and safety (Lü and Lu 2012), and petroleum exploration (Wang and Song 2019), among many other subjects. In this study, the results obtained from the grey models are compared with the results obtained from a simple regression model, as the latter has been applied in various waste prediction studies (Abdulredha et al. 2018; Golbaz et al. 2019).

The outcome of this study will support the efforts of urban planners, engineers, legislators, and researchers—especially those in the CoW—in sustainable planning for yard waste management in terms of budgeting, resource allocation, and estimating energy generation.

The rest of the paper is organized as follows: Section 2 gives the related materials and methods. Section 3 presents the results and discussions. Finally, Section 4 concludes this paper.

Materials and Methods

Three climatic factors with two conditions each, namely temperature (average/maximum), humidity (average/maximum), and wind speed (average/maximum), and eight socio-economic factors (population, number of households, labour force, employment, household income, unemployment, income per employee, and GDP) were considered as the independent factors for each quarter (winter, spring, summer, and fall) during the study period, as MSW generation largely depends on socio-economic and climatic factors (Kannangara et al. 2018; Vu et al. 2019). Quarterly waste data were used instead of annual waste data to improve prediction accuracy, as MSW generation is not consistent throughout the year. Sections 2.2 and 2.3 discuss the available data on the independent factors and detail each of these factors, respectively. Key independent variables were selected through correlation analysis with respect to the target variable, MSW tonnage. Since the time length and sample size of the historical data on the independent factors and the target variable were not identical, the independent variables’ data were restricted to the same time length and sample size as the target variable, MSW generation.

The independent factors related to the target variable were used to develop the GM (1, N) model, with each individual factor modelled using both the GM (1, 1) model and single regression analysis (SRA). The grey models were constructed using MATLAB (v. 2017a). The accuracy of the models was then determined through five common statistical indices, namely mean square error (MSE), mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2), and one additional metric, the normalized RMSE (NRMSE). The more accurate single-factor model was used to forecast the independent factors, and the GM (1, N) model was applied to predict MSW generation for the CoW. These values were then used to estimate the city’s YWG. To calculate the amount of YWG, the percentage of YWG out of total MSW was measured using historical data. The future percentage of YWG was estimated for the target period using the GM (1, 1) model. Quarterly YWG was then evaluated by multiplying this percentage by the predicted quarterly MSW. Fig. 1 illustrates the methodology followed in this study.
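
As a minimal illustration of the final step described above, the sketch below multiplies a forecast YWG share by a forecast quarterly MSW tonnage. The function name and the input numbers are hypothetical placeholders chosen for illustration; they are not values from this study.

```python
def estimate_ywg(msw_tonnes, ywg_share_percent):
    """Quarterly YWG = forecast YWG share (%) x forecast quarterly MSW."""
    if not 0.0 <= ywg_share_percent <= 100.0:
        raise ValueError("share must be a percentage between 0 and 100")
    return msw_tonnes * ywg_share_percent / 100.0


# Hypothetical inputs for a single quarter (illustrative only).
print(estimate_ywg(80000.0, 25.0))
```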

Fig. 1 Framework for YWG prediction from an estimated amount of MSW for the CoW

Study area

The CoW is located in the south-central part of Manitoba, Canada, where the Red River meets the Assiniboine River, as shown in Electronic Supplementary Material (ESM) (Fig. S1). It comprises Manitoba’s provincial capital as well as its surrounding municipalities, towns, and cities. Due to its flat topography, clay soils, and heavy snowfall, the CoW is subject to yearly flooding (CoW-b 2011b). The city covers an area of 464.08 km² with a population density of 1,430 persons per km² (Statistics Canada 2011).

The CoW’s position in the Canadian Prairies gives it a continental climate with very cold winters. Summers have a mean high temperature of 25.9°C and a mean low of 13.5°C (Weather Atlas 2020). Winter is the coldest and driest season of the year, with temperatures ranging from −21.4°C to −11.3°C in January (Weather Atlas 2020). Winnipeg is a regional economic hub. According to a report on the city’s economic development, it has one of the most diversified economies in the world, with large commercial (15.2%), manufacturing (9.8%), education (7.7%), and healthcare and community support (15.2%) sectors (Economic Development Winnipeg 2016). The CoW had an average population of 719,269 in 2007, and this is expected to rise to 809,800 by 2023 (CoW 2019; Statistics Canada 2016). From 2002 to 2015, the number of households in the CoW increased from around 249,000 to around 291,900, an increase of 17% (CoW-b 2018b).

Waste data availability

No quarterly historical data on yard waste are available for the CoW; only annual yard waste data exist. For MSW, however, both quarterly and annual historical data are available. The descriptive statistics of the historical MSW data and the selected factors between 2007 and 2018 are presented in Table 1, and the quarterly historical data are displayed in ESM S1 (Table S1.1, Table S1.2, Table S1.3, and Table S1.4). Table 1 reports five descriptive statistics, namely the mean, standard deviation (St. Dev.), minimum (min.), maximum (max.), and coefficient of variation (CV), of the target variable and the independent factors for the four quarters. According to the table, the mean and the St. Dev. of MSW generation in Q2 and Q3 are higher than those in Q1 and Q4. Among the independent factors, the mean and the St. Dev. of population, income, and GDP are nearly the same across all four quarters, increasing only slightly from the first quarter to the fourth quarter. The mean and the St. Dev. for the number of households are constant throughout each year. Since the climatic factors vary by season, their changes are random across the four quarters. The means of temperature and wind speed are highest in the third quarter, and the mean of humidity is highest in the second quarter. As seen in Table 1, the CV values of MSW are similar in Q2, Q3, and Q4, but are lower than the CV of MSW in Q1. Throughout the four quarters, the CVs of the socio-economic factors are more uniform than those of the climatic factors. Among the climatic factors, humidity has a more uniform CV than wind speed and temperature.

Table 1 Descriptive statistics of factors selected for the modelling of MSW generation

The socio-economic data were collected from various open data sources related to the CoW (Conference Board of Canada 2020; CoW-a 2018a, b, 2016, 2019; Economic Development Winnipeg 2016, 2019; Statistics Canada 2011, 2016), and the climatic data were collected from Weatherstats Winnipeg (Weatherstats 2020). The weather data were then transformed from monthly to quarterly values on two levels: average and maximum. The minimum conditions for temperature, wind speed, and humidity have a negligible effect on YWG; as a result, they were not considered.
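
The monthly-to-quarterly transformation described above could be sketched as follows. This is a minimal illustration only: the grouping of calendar months into quarters (January to March as the first quarter, and so on) and the function name `to_quarterly` are assumptions, since the text does not state which months form each quarter.

```python
from statistics import mean


def to_quarterly(monthly_values):
    """Collapse 12 monthly readings into four (average, maximum) pairs.

    Assumes the list is ordered January..December and that each quarter
    consists of three consecutive calendar months (an assumption, not
    something stated in the source).
    """
    if len(monthly_values) != 12:
        raise ValueError("expected exactly 12 monthly values")
    quarters = []
    for start in range(0, 12, 3):
        block = monthly_values[start:start + 3]
        quarters.append((mean(block), max(block)))
    return quarters
```

Each returned pair holds the quarterly average and the quarterly maximum of one monthly series, matching the two levels mentioned in the text.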

In the CoW, yard waste was collected in two ways: depot and self-haul collection, and curbside collection. The total amount of yard waste corresponds to the sum of these collections. Figure 2 illustrates the annual yard waste information for the city from 2007 to 2018. This figure indicates that YWG is gradually increasing over time. The figure also indicates that the CoW started yard waste collection in 2007, extended it in 2010, and began to give it more attention in 2013. Additionally, yard waste prediction is important for quantifying the amount of YWG beforehand, so that energy recovery can be assessed and both the ZWS and efficient MSW management can be supported.

Fig. 2 Annual yard waste collection data for the CoW (CoW 2020b)

Socio-economic parameters selection

The key criteria for choosing socio-economic factors were statistical significance and availability at the municipal level. After examining previous studies (Ayeleru et al. 2018; Kumar and Samadder 2017), this study initially considered eight socio-economic factors and six climatic factors (Vu et al. 2019) related to MSW generation, as listed in Table 2. Ultimately, four major socio-economic factors (population, number of households, household income, and GDP) and four sets of three major climatic factors (one set per quarter) were selected for the study. To make these selections, correlation analysis was performed between the target variable, MSW, and the initial 14 independent factors. Correlations among the factors themselves were also examined to limit multicollinearity between the inputs (Daoud 2018). Because it inflates standard errors, multicollinearity has statistical consequences, such as making an independent variable appear statistically insignificant. It also makes the model difficult to interpret and introduces an overfitting issue. To avoid this difficulty, highly related sets of parameters with absolute correlation coefficients greater than 0.95 were excluded (Abdoli et al. 2011). The correlation matrix for the first quarter is displayed in Table 2. The correlation matrices for the other three quarters are presented in ESM S3 (Table S3.1, Table S3.2, and Table S3.3).
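
The screening idea above can be sketched in a few lines. The Pearson coefficient is computed from its definition, and the greedy rule in `screen_factors` (rank factors by their absolute correlation with the target, then drop any factor whose absolute correlation with an already-kept factor exceeds the threshold) is a hypothetical choice made for illustration; the source states only the 0.95 cut-off, not how one factor of a highly correlated pair was chosen.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for a constant series


def screen_factors(factors, target, threshold=0.95):
    """Greedy multicollinearity screen (hypothetical rule, see lead-in).

    factors: dict mapping a factor name to its list of observations.
    Returns the names kept, strongest correlation with the target first.
    """
    ranked = sorted(
        factors,
        key=lambda name: abs(pearson(factors[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(factors[name], factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```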

Table 2 Correlation matrix of dependent and independent factors for the first quarter (Q1)

These coefficients indicate the existence of collinearity between the independent factors and also represent a measure of the linear association between two variables. The Pearson correlation coefficient, R, is used to determine the linear correlation between two data sets. It is calculated as the ratio of the covariance of two variables to the product of their standard deviations. As a result, R is effectively a normalized measurement of covariance, with the result always lying between −1 and 1. To validate the calculated values of R, they were compared with the critical Pearson correlation coefficient, RCrit (Sousa et al. 2007). If the absolute value of a correlation coefficient exceeds the critical Pearson value, then the correlation is statistically significant. RCrit was calculated using the following equation:

$${R}_{crit}=\frac{t_{crit}}{\sqrt{DF+{t}_{crit}^2}}$$
(1)

where DF stands for degrees of freedom, i.e. the number of independent values that can vary in a statistical analysis without violating any constraints. For a sample of n observations and k grouped variables, the DF can be calculated as DF = n − k. The term tCrit refers to the ‘cut-off point’ on the t-distribution, which can be found in a t-distribution table. In this study, the total number of observations was 12 for each quarter and the number of grouped variables was two; hence, the degrees of freedom were 12 − 2 = 10. At the 0.05 level of significance (two-tailed test), tCrit was 2.228, resulting in RCrit equal to 0.576.
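
The arithmetic behind this value can be checked with a short sketch of Eq. (1). The function name `critical_r` is illustrative, and the critical t value is taken directly from the text rather than computed from the t-distribution.

```python
import math


def critical_r(t_crit, dof):
    """Eq. (1): critical Pearson coefficient from a critical t value."""
    return t_crit / math.sqrt(dof + t_crit ** 2)


# Values quoted in the text: n = 12, k = 2, two-tailed test at the 0.05 level.
r_crit = critical_r(2.228, 12 - 2)
print(round(r_crit, 3))  # 0.576
```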

According to Table 2, the socio-economic factors are more strongly correlated with the target variable, MSW, than the climatic factors are. The following five socio-economic factors (labour force, employment, household income, unemployment, and income per employee) can be represented by a single factor, household income, which has a strong correlation with MSW. Climatic factors showed little significant effect on MSW generation, but they are more responsible for YWG. This study selected three climatic factors (average wind speed, average humidity, and maximum temperature) for the first quarter based on their correlations with MSW and among themselves. In the same way, a set of four socio-economic factors and three climatic factors was selected for each of the other three quarters.

Grey theory

This study utilizes the two basic grey models for individual factors and multivariable MSW prediction. The following subsections detail the models’ solution procedures.

GM (1, 1) Prediction Model

The basic principles and modelling mechanism of GM (1, 1) are as follows (Liu et al. 2016):

  • Step 1. For n samples, the original time sequence, X(0), is given as:

$${X}^{(0)}=\left[{x}^{(0)}(1),{x}^{(0)}(2),\dots, {x}^{(0)}(n)\right]\kern3.5em n\ge 4$$
(2)

To reduce the noise and disorderliness of the raw data, the AGO is applied to generate a new series, X(1) = [x(1)(1), x(1)(2), …, x(1)(n)], where x(1) is obtained as follows:

$${x}^{(1)}(k)=\sum_{i=1}^k{x}^{(0)}(i),\kern3em k=1,2,3,\dots, n$$
(3)

The generated mean sequence, Z(1) = [z(1)(2), z(1)(3), …, z(1)(n)], of X(1) is as follows:

$${z}^{(1)}(k)=\alpha {x}^{(1)}(k)+\left(1-\alpha \right){x}^{(1)}\left(k-1\right),\kern2em k=2,3,\dots, n$$
(4)

Here, α is called the positioned coefficient of the interval grey number. The value of α is generally set to 0.5 for the generation of the mean sequence, but it can be varied within the range [0, 1].

  • Step 2. The first-order grey differential equation can be constructed as follows:

$${x}^{(0)}(k)+a{z}^{(1)}(k)=b$$
(5)

The image (whitenization) equation of Eq. (5) is the following first-order differential equation:

$$\frac{d{x}^{(1)}}{dt}+a{x}^{(1)}=b$$
(6)

where a and b are called the development and control coefficients, respectively. These coefficients can be obtained using the least-squares estimation method as follows:

$$\hat{a}=\left[\begin{array}{c}a\\ {}b\end{array}\right]={\left[{B}^TB\right]}^{-1}{B}^T\ Y$$
(7)

where \(B=\left[\begin{array}{cc}-{z}^{(1)}(2)& 1\\ -{z}^{(1)}(3)& 1\\ \vdots & \vdots \\ -{z}^{(1)}(n)& 1\end{array}\right]\) and \(Y={\left[{x}^{(0)}(2),{x}^{(0)}(3),\dots ,{x}^{(0)}(n)\right]}^T\).

  • Step 3. The grey prediction equation can be described as follows:

$${\hat{x}}^{(1)}(k)=\left[{x}^{(0)}(1)-\frac{b}{a}\right]{e}^{-a\left(k-1\right)}+\frac{b}{a}$$
(8)

where \({\hat{x}}^{(1)}(k)\) indicates the prediction of x(1)(k) at time point k and the initial condition, x(1)(1) = x(0)(1). The inverse AGO (IAGO) sequence can be obtained as follows:

$${\hat{x}}^{(0)}(k)={\hat{x}}^{(1)}(k)-{\hat{x}}^{(1)}\left(k-1\right),\kern0.75em k=2,3,\dots, n$$
(9)
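
The three steps above can be sketched in plain Python as follows. This is an illustrative sketch rather than the MATLAB implementation used in the study; the function names `gm11_fit` and `gm11_predict` and the `horizon` parameter are assumptions introduced here. Because B has only two columns, the least-squares solution of Eq. (7) is written out through the 2 × 2 normal equations instead of a matrix library.

```python
import math


def gm11_fit(x0, alpha=0.5):
    """Estimate the GM(1,1) coefficients (a, b) from a raw sequence, Eqs. (3)-(7)."""
    if len(x0) < 4:
        raise ValueError("GM(1,1) needs at least four observations")
    # Eq. (3): accumulated generating operation (AGO).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Eq. (4): mean sequence z1(k) for k = 2..n.
    z1 = [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x1))]
    y = x0[1:]
    # Eq. (7): least squares for x0(k) = -a * z1(k) + b.
    m = len(z1)
    sz, sy = sum(z1), sum(y)
    szz = sum(z * z for z in z1)
    szy = sum(z * v for z, v in zip(z1, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_predict(x0, horizon=0, alpha=0.5):
    """Fitted values for k = 1..n plus `horizon` forecasts, Eqs. (8)-(9)."""
    a, b = gm11_fit(x0, alpha)  # assumes a != 0
    steps = len(x0) + horizon
    # Eq. (8): time response of the accumulated series.
    x1_hat = [(x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a
              for k in range(1, steps + 1)]
    # Eq. (9): inverse AGO; the first value reproduces x0(1).
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]
```

For example, `gm11_predict(series, horizon=7)` would return the fitted values followed by seven forecasts for a hypothetical `series` of at least four observations.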

GM (1, N) Prediction Model

The multivariable grey forecasting model, represented by GM (1, N), consists of a dependent variable sequence (system characteristic sequence) and (N-1) independent variable sequences (related factor sequences). The basic modelling methods of the GM (1, N) model (Ren 2018; Ren et al. 2013) are as follows:

  • Step 1. Assume that there are N variables denoted by Xi, where i = 1, 2, 3, …, N, and that each variable has n initial observations. Let the original sequence of each variable be \({X}_i^{(0)}\).

$${X}_i^{(0)}=\left[{X}_i^{(0)}(1),{X}_i^{(0)}(2),\dots, {X}_i^{(0)}(n)\right]\kern1em \left(i=1,2,3,\dots, N\right)$$
(10)

Applying the AGO to convert the original data into a new series, \({X}_i^{(1)}=\left[{X}_i^{(1)}(1),{X}_i^{(1)}(2),\dots, {X}_i^{(1)}(n)\right]\kern0.75em \left(i=1,2,\dots, N\right)\), where \({X}_i^{(1)}\) can be obtained as follows:

$${X}_i^{(1)}(k)=\sum_{j=1}^k{X}_i^{(0)}(j),\kern3em k=1,2,3,\dots, n$$
(11)

The generated mean sequence \({Z}_1^{(1)}=\left[{Z}_1^{(1)}(2),{Z}_1^{(1)}(3),\dots, {Z}_1^{(1)}(n)\right]\) of \({X}_1^{(1)}\) can be calculated as follows:

$${Z}_1^{(1)}(k)=\alpha {X}_1^{(1)}(k)+\left(1-\alpha \right){X}_1^{(1)}\left(k-1\right),\kern2em k=2,3,\dots, n$$
(12)
  • Step 2. The first-order grey differential equation can be constructed as follows:

$${X}_1^{(0)}(k)+a{Z}_1^{(1)}(k)=\sum_{i=2}^N{b}_i{X}_i^{(1)}(k)$$
(13)

The image equation of Eq. (13) is:

$$\frac{d{X}_1^{(1)}(k)}{dt}+a{X}_1^{(1)}(k)=\sum_{i=2}^N{b}_i{X}_i^{(1)}(k)$$
(14)

where the coefficients a and bi can be calculated as follows:

$$\mathrm{For},n\le N+1,$$
$$\hat{P}={\left(a,{b}_2,{b}_3,\dots, {b}_N\ \right)}^T={B}^{-1}\ Y$$
(15)
$$\mathrm{For},n>N+1,$$
$$\hat{P}={\left(a,{b}_2,{b}_3,\dots, {b}_N\ \right)}^T={\left[{B}^TB\right]}^{-1}{B}^T\ Y$$
(16)

where \(B=\left[\begin{array}{cccc}-{Z}_1^{(1)}(2)& {X}_2^{(1)}(2)& \dots & {X}_N^{(1)}(2)\\ -{Z}_1^{(1)}(3)& {X}_2^{(1)}(3)& \dots & {X}_N^{(1)}(3)\\ \vdots & \vdots & \ddots & \vdots \\ -{Z}_1^{(1)}(n)& {X}_2^{(1)}(n)& \dots & {X}_N^{(1)}(n)\end{array}\right]\) and \(Y={\left[{X}_1^{(0)}(2),{X}_1^{(0)}(3),\dots ,{X}_1^{(0)}(n)\right]}^T\).

  • Step 3. Using a and bi, the grey prediction equation can be expressed as follows:

$${\hat{X}}_1^{(1)}\left(k+1\right)=\left[{X}_1^{(1)}(1)-\frac{1}{a}\sum_{i=2}^N{b}_i{X}_i^{(1)}\left(k+1\right)\right]{e}^{- ak}+\frac{1}{a}\sum_{i=2}^N{b}_i{X}_i^{(1)}\left(k+1\right)$$
(17)

where \({\hat{X}}_1^{(1)}\left(k+1\right)\) indicates the prediction of \({X}_1^{(1)}\left(k+1\right)\) at time point k + 1 with the initial condition \({X}_1^{(1)}(1)={X}_1^{(0)}(1)\). The IAGO sequence can be obtained as follows:

$${\hat{X}}_1^{(0)}\left(k+1\right)={\hat{X}}_1^{(1)}\left(k+1\right)-{\hat{X}}_1^{(1)}(k),\kern0.75em k=1,2,\dots, n-1$$
(18)
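
A compact sketch of the GM (1, N) procedure is given below, again in plain Python and not as the study's MATLAB code. It covers only the least-squares case of Eq. (16), n > N + 1, which applies when twelve observations per quarter are combined with a handful of factors. The helper names (`ago`, `solve_linear`, `gm1n_fit`, `gm1n_predict`) are assumptions, and the normal equations are solved with a small Gaussian elimination so that no external library is needed.

```python
import math


def ago(seq):
    """Accumulated generating operation, Eq. (11)."""
    out, total = [], 0.0
    for value in seq:
        total += value
        out.append(total)
    return out


def solve_linear(A, y):
    """Solve A p = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):
        p[r] = (M[r][n] - sum(M[r][c] * p[c] for c in range(r + 1, n))) / M[r][r]
    return p


def gm1n_fit(target, factors, alpha=0.5):
    """Return (a, [b_2, ..., b_N]) by least squares, Eq. (16); assumes n > N + 1."""
    x1 = ago(target)
    f1 = [ago(f) for f in factors]
    # Rows of B for k = 2..n: [-Z1(k), X2(k), ..., XN(k)], with Z1 from Eq. (12).
    B = [[-(alpha * x1[k] + (1 - alpha) * x1[k - 1])] + [f[k] for f in f1]
         for k in range(1, len(target))]
    Y = target[1:]
    cols = len(B[0])
    BtB = [[sum(row[i] * row[j] for row in B) for j in range(cols)]
           for i in range(cols)]
    BtY = [sum(row[i] * v for row, v in zip(B, Y)) for i in range(cols)]
    p = solve_linear(BtB, BtY)
    return p[0], p[1:]


def gm1n_predict(target, factors, a, b):
    """Eqs. (17)-(18); each factor series must cover every period predicted."""
    f1 = [ago(f) for f in factors]
    steps = len(factors[0])
    x1_hat = [target[0]]
    for k in range(1, steps):
        drive = sum(bi * f[k] for bi, f in zip(b, f1)) / a
        x1_hat.append((target[0] - drive) * math.exp(-a * k) + drive)
    # Inverse AGO, Eq. (18); the first value reproduces the initial condition.
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]
```

To forecast beyond the historical period with this sketch, the factor series passed to `gm1n_predict` would be extended with their own forecasts (for instance from a GM (1, 1) model), mirroring the workflow described in the text.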

Metrics of Forecasting Error Analysis

Performance analysis is important for assessing the accuracy of forecasting models. The model’s predictive performance was assessed using five metrics that are common in studies of waste prediction (Kumar and Samadder 2017; Younes et al. 2015). These metrics—MAPE, MAE, MSE, RMSE, and R2—are specified below.

$$MAPE=\frac{1}{n}\times {\sum}_{i=1}^n\left|\frac{Y_i-{\hat{Y}}_i}{Y_i}\right|\times 100$$
(19)
$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^n\left|{Y}_i-{\hat{Y}}_i\right|$$
(20)
$$MSE=\frac{1}{n}\times {\sum}_{i=1}^n{\left({\hat{Y}}_i-{Y}_i\right)}^2$$
(21)
$$RMSE=\sqrt{\frac{1}{n}\times {\sum}_{i=1}^n{\left({\hat{Y}}_i-{Y}_i\right)}^2}$$
(22)
$${R}^2=1-\frac{\sum_{i=1}^n{\left({\hat{Y}}_i-{Y}_i\right)}^2}{\sum_{i=1}^n{\left({Y}_i-\overline{Y}\right)}^2}$$
(23)

where

n: Number of raw data points

Yi: The actual mass of MSW

\({\hat{Y}}_i\): Predicted mass of MSW

\(\overline{Y}\): The mean of actual MSW

In these formulations, the first four metrics indicate a better fit the closer they are to zero, while the last metric indicates a better fit the closer it is to one. Besides these five metrics, the NRMSE, obtained by normalizing the RMSE by the standard deviation of the observed data, is also used to evaluate and compare the prediction accuracy of the GM (1, N) model. Among the five error metrics, the MAPE is the most popular goodness-of-fit indicator for forecasting problems (Islam et al. 2021). Following the Lewis scale (Javed et al. 2020), as shown below, a MAPE of less than 20% is indicative of a reliable forecast. However, the forecasting accuracy of the prediction models used in this study is evaluated using all five aforementioned metrics simultaneously.

$$\mathrm{MAPE}\ \left(\%\right)=\left\{\begin{array}{ll}<10& \mathrm{Highly}\ \mathrm{accurate}\ \mathrm{forecast}\\ 10\sim 20& \mathrm{Good}\ \mathrm{forecast}\\ 20\sim 50& \mathrm{Reasonable}\ \mathrm{forecast}\\ >50& \mathrm{Inaccurate}\ \mathrm{forecast}\end{array}\right.$$
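
The metrics in Eqs. (19)-(23), the NRMSE, and the Lewis scale can be computed with a short sketch such as the one below. The function names are illustrative, and dividing the RMSE by the sample standard deviation of the observed data is an assumption; the text does not say whether the sample or the population standard deviation was used.

```python
import math
from statistics import mean, stdev


def forecast_errors(actual, predicted):
    """Return MAPE (%), MAE, MSE, RMSE, R2 and NRMSE for paired series."""
    n = len(actual)
    residuals = [p - y for y, p in zip(actual, predicted)]
    mape = 100.0 / n * sum(abs(r / y) for r, y in zip(residuals, actual))
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    y_bar = mean(actual)
    r2 = 1.0 - sum(r * r for r in residuals) / sum((y - y_bar) ** 2 for y in actual)
    nrmse = rmse / stdev(actual)  # assumption: sample standard deviation
    return {"MAPE": mape, "MAE": mae, "MSE": mse,
            "RMSE": rmse, "R2": r2, "NRMSE": nrmse}


def lewis_category(mape):
    """Map a MAPE value (in percent) onto the Lewis scale quoted above."""
    if mape < 10:
        return "Highly accurate forecast"
    if mape <= 20:
        return "Good forecast"
    if mape <= 50:
        return "Reasonable forecast"
    return "Inaccurate forecast"
```

Note that MAPE is undefined when an observed value is zero, and R2 as defined in Eq. (23) is undefined when all observed values are identical.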

Results and Discussion

Independent factors prediction

The details of the comparative analysis of the independent factors, including the simulation errors, are presented in ESM S2 (Table S2.1-Table S2.6). To avoid unnecessary repetition while still giving a clear picture of the forecasting accuracy of the GM (1, 1) model relative to SRA, Fig. 3 shows only the forecasting error for ‘population’. Three types of errors have been measured for each factor and each forecasting error metric: in-sample (9 data points), out-of-sample (3 data points), and overall (12 data points). According to Fig. 3 and the tables displayed in ESM S2, the GM (1, 1) model outperforms the SRA model across all of the factors.

Fig. 3 Forecasting error in prediction of the population for GM (1, 1) model and SRA [A-E]

For the climatic factors (humidity, temperature, and wind speed), the two models’ prediction accuracy is very close to that obtained for the socio-economic factors. In some quarters, the climatic factors have small R2 values for both predictive models. Additionally, in some quarters the out-of-sample error is higher than the in-sample error, and in some cases both errors are high. All of these phenomena are somewhat statistically insignificant, and one of the reasons behind these errors could be the bias and variance errors of the regression and prediction models. The bias-variance trade-off can be used to address these issues more accurately. However, examining the overall accuracy across the five metrics described in ESM S2, it is clear that the GM (1, 1) model generates a more accurate prediction for the climatic factors than the SRA model.

The prediction accuracy for the ‘household’ factor was the same across all four quarters, as the number of households in the CoW was constant throughout each year. This analysis supports the GM (1, 1) model’s applicability in the prediction of individual factors of MSW generation. Based on this error analysis, the GM (1, 1) model was selected to predict the independent factors over the next seven years, from 2019 to 2025; these predictions were then used to forecast MSW generation up to the target period. Finally, the predicted amount of MSW is used to estimate the YWG for the city up to the target period. This process will, theoretically, continue in five-year increments until the CoW’s master-plan target year, 2045. This stepwise prediction will help the city to examine its seasonal variation in waste generation. The historical data show that waste generation is not consistent throughout the year; as a result, quarterly MSW prediction is warranted. The forecasted results for each factor obtained from the GM (1, 1) model are shown in Table 3.

Table 3 Forecasted results for each independent factor for all four quarters

Table 3 indicates that the predicted values of the independent factors increase over time. Among the independent factors, the four selected socio-economic factors increase markedly, as their historical data followed an increasing trend. However, the changes in the climatic factors are reasonably small because of their lower variability in each quarter. These results for the independent factors constitute the input to the GM (1, N) model for MSW prediction up to the year 2025. After computing the MSW generation, the final prediction of YWG is derived from the predicted amount of MSW for the city.

MSW prediction

Since MSW prediction in this study is a multivariable problem with limited historical data covering a 12-year period, the multifactor MSW-forecasting model is solved using the GM (1, N) model, which is effective for working with limited data. The forecasted result is illustrated in Fig. 4, which shows a downward trend in annual MSW generation in the CoW. The decline is due to the decline in the historical data and the negative correlation between the independent factors and the target variable. Furthermore, CoW authorities have already undertaken efforts to enhance their waste management procedures, including a residential food waste collection pilot project, automated recycling collection, composting, and landfill monitoring. As a result of these efforts, the decline in MSW generation is significant, though small in the first quarter. After the first quarter, gardening work and a variety of outdoor activities begin. In the summer, more food waste is generated, which explains why waste generation differs between the first quarter and the remaining three quarters.

Fig. 4 Prediction of the MSW up to the year 2025

The accuracy of the simulation is reported in ESM S4 (Table S4.1). The table presents three types of error for the GM (1, N) model in each quarter: in-sample (10 data points), out-of-sample (2 data points), and overall (12 data points). The NRMSE metric is used to compare the quarterly predictions where the R² value is low and in the out-of-sample case, where R² is always equal to one because only two data points are involved.

According to Table S4.1, the MAPE and R² values indicate an acceptable result for the multivariable prediction of MSW generation using the GM (1, N) model. On the Lewis scale of prediction accuracy mentioned in Section 2.5, the in-sample and overall MAPE values for each quarter are below 10%, which indicates a highly accurate forecast. The out-of-sample MAPE values for Q1 and Q3 are greater than those for Q2 and Q4. The R² value for the first quarter is smaller than those for the other three quarters. However, the NRMSE values for Q1 in all three error categories indicate better prediction accuracy than in the other three quarters. The MAE, MSE, and RMSE, on the other hand, take relatively large values because of the different ranges of, and higher variation in, the historical data of the selected independent factors. Although the out-of-sample errors are higher than the in-sample and overall errors, the overall errors in Table S4.1 show that, even with only twelve historical data points, the GM (1, N) model can predict the amount of MSW generated in the city for each quarter with good precision. An accurate MSW prediction is required for a precise estimate of the YWG for the city up to the target period.
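The in-sample, out-of-sample, and overall error split discussed above can be illustrated with a short sketch. The helper names are hypothetical. Two assumptions are made because the text does not state them: NRMSE is normalised by the mean of the observed values (other normalisations, such as the range, are also common), and the accuracy bands are the commonly quoted Lewis thresholds of 10%, 20%, and 50%, taken to match those in Section 2.5.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

def nrmse(actual, predicted):
    """RMSE divided by the mean of the observed values (assumed normalisation)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return rmse / np.mean(actual)

def lewis_label(mape_percent):
    """Map a MAPE value to the commonly quoted Lewis accuracy bands (assumed thresholds)."""
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent < 20:
        return "good"
    if mape_percent < 50:
        return "reasonable"
    return "inaccurate"

def split_mape(actual, predicted, n_in=10):
    """MAPE for the first n_in points, the remaining points, and all points."""
    return {
        "in_sample": mape(actual[:n_in], predicted[:n_in]),
        "out_of_sample": mape(actual[n_in:], predicted[n_in:]),
        "overall": mape(actual, predicted),
    }
```

Because the out-of-sample set contains only two points, its error is much more sensitive to a single poor prediction than the in-sample or overall figures.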

Yard waste prediction

Figure 5(a) illustrates the pattern of YWG relative to total MSW in the collected data. Although the trend is not linear, yard waste is increasing over time, while total MSW is declining. This decline can be attributed to the start of the CoW’s master plan in 2012. The figure also shows that the rate of YWG, that is, its share of total MSW, is increasing over time.

Fig. 5 Rate of YWG to the total amount of MSW and quarterly predicted YWG for the CoW. (a) Rate of YWG to the total amount of MSW, (b) Illustration of the quarterly predicted YWG

The YWG predictions for each quarter are depicted in Fig. 5(b). The figure also shows the estimated percentage of YWG in the total amount of MSW up to the target period. While the figure shows an upward trend in the future percentage of YWG, the predicted quarterly YWG declines over the target period. This can be explained by Fig. 5(a), which indicates that, in the historical data, MSW generation is decreasing while YWG is increasing over time.
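A rising share alongside a falling quantity can look contradictory, so a small numeric sketch may help. It assumes that YWG is obtained by multiplying the forecasted MSW by an estimated YWG share; the text does not spell out the exact calculation, and the function name and all numbers below are hypothetical.

```python
def estimate_ywg(msw_forecast, ywg_share):
    """Estimate YWG as forecasted MSW times the estimated YWG share (assumed relation)."""
    if len(msw_forecast) != len(ywg_share):
        raise ValueError("msw_forecast and ywg_share must have the same length")
    return [m * s for m, s in zip(msw_forecast, ywg_share)]

# Hypothetical values for one quarter over three successive years (illustrative only).
msw = [1000.0, 920.0, 850.0]   # forecasted MSW, falling
share = [0.20, 0.21, 0.22]     # estimated YWG share of MSW, rising
ywg = estimate_ywg(msw, share)
print(ywg)                     # YWG falls even though its share rises
```

In this illustration the share grows by ten percent while MSW falls by fifteen percent, so the product declines, which is the pattern described for the second to fourth quarters.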

According to Fig. 5(b), the first quarter (Q1) shows a nearly constant YWG trend, as almost no yard activities are expected during the winter, which largely falls within this quarter. In addition, the predicted MSW generation for the first quarter is relatively constant throughout the target period. For the other three quarters, the predicted MSW generation falls sharply up to the target period, which is why the YWG also shows a declining trend. The predicted YWG obtained from the forecasted MSW generation indicates that, if the city maintains its current pace of waste management efforts up to the target period, it will reach this predicted level of waste generation.

Finally, this study is consistent with previous waste-prediction studies in which socio-economic factors are considered the key influential variables for MSW generation (Al-Salem et al. 2018; Ferronato et al. 2020; Kannangara et al. 2018). Additionally, since the primary objective of this study is to predict the YWG from the forecasted MSW, and climatic factors are closely related to YWG, this research also considers climatic factors as influential factors (Vu et al. 2019). Few studies in the literature consider socio-economic and climatic factors simultaneously. One of the significant aspects of this study is the concurrent use of seven independent factors to estimate the MSW generation and YWG for the CoW. Furthermore, recent literature on the COVID-19 pandemic suggests that Canadian waste generation characteristics have been affected (Richter et al. 2021a, b; Vu et al. 2021). A study focusing on the effects of COVID-19 on CoW waste management is therefore recommended.

Conclusions

In SWM systems, accurate prediction of MSW generation plays an important role in developing effective steps towards sustainable economic development through a ZWS. This study presents a grey theory-based YWG-prediction method that uses the predicted quarterly MSW for the CoW, with the aim of supporting the CoW’s master plan to achieve a ZWS by 2045. Accurate YWG predictions can simplify efforts to estimate energy recovery through composting. Unlike previous waste-prediction research, this study considered the potential influence of both socio-economic and climatic factors. It also used correlation analysis to identify the key influences among the considered factors. Additionally, it estimated the individual factors throughout the target period using the GM (1, 1) model, which proved to be superior to the SRA model and produced individual-factor predictions with MAPE values of 0.26–8.32% for the in-sample data. The GM (1, N) model was then used for the multivariable MSW prediction based on the socio-economic and climatic factors. This generated overall MAPE values of 5.64–7.54%, with suitable results for the other error metrics. The results of this study demonstrate that the grey models can reliably predict both MSW and YWG. The most significant advantage of the grey method is that it provides precise predictions even with a very limited number of samples. The GM (1, N) model can therefore simulate waste generation effectively, at low computational cost, when information is poor.

The findings of this study must be seen in light of its limitations. One potential limitation to the generalization of these results is the decline in MSW, which runs contrary to the trends in the independent factors, several of which showed negative correlations with MSW. This could have biased the prediction and produced undesirable outcomes for the target variable. In addition, this study applied the basic grey models directly to waste prediction without optimizing the design of their background parameters. A robust design of the grey models’ background parameters could generate more precise predictions. Finally, the current system of waste projection can be extended across Canada because the socio-economic parameter values can be established consistently from census data available to all municipalities in Canada.