Introduction

Climate change and global warming are among the most widely discussed issues in the global community because of the threat they pose to sustainable development (Destek and Sarkodie, 2019; Khalid et al., 2021). From the steam age, through the electrical age, to the current information age, human lifestyles and production have changed dramatically. Economies based on conventional small-scale peasant production (e.g., agriculture) have steadily been replaced by industrial economies, which have enhanced production efficiency and boosted real income (GDP) growth. At the same time, labor demand has increased in urban-centered regions with rapid GDP growth, and most rural inhabitants have moved to urban regions (Yang and Usman 2021). This suggests that real GDP growth drives urbanization. From this perspective, GDP growth and urbanization have become very significant factors of the modernization level that ultimately increase the overall level of environmental pollution (Liang and Yang 2019; Usman and Hammar 2021; Ahmad et al. 2021). Moreover, urbanization and industrialization have brought economic growth to the world but have also caused climate change, resulting in global warming. One of the major contributing factors to global warming is carbon dioxide (CO2) emission, along with other greenhouse gases (GHGs). Even though CO2 is a minor component of the atmosphere, it is very significant. When such gases accumulate in the atmosphere, they absorb the heat radiated from the earth’s surface and prevent it from escaping into space. As this radiation is trapped in the atmosphere, the temperature of the planet increases. This phenomenon is known as the greenhouse effect. The gases causing the greenhouse effect include CO2, methane (CH4), nitrous oxide (N2O), chlorofluorocarbons (CFCs), and water vapor. Carbon dioxide is a significant greenhouse (heat-trapping) gas. GHG emissions have many other side effects, such as air pollution, rising temperature, and atmospheric variations (Yang et al. 2021a; Usman et al. 2021a). Poor air quality poses high health risks to citizens, and these effects have direct consequences such as worsening health and environmental conditions. Measuring CO2 emissions is therefore an important factor in guiding climate mitigation policies.

Carbon dioxide (CO2) is a significant greenhouse gas and the major contributing factor to global warming. Due to human activities, the CO2 concentration in the atmosphere has increased by 47% over the past 170 years (NASA 2020), an increase that would have taken around 20,000 years to occur naturally (NASA 2020). The primary sources of CO2 emission are human activities such as the combustion of fossil fuels (non-renewable energy sources) and deforestation (Usman and Makhdum 2021). There are also natural sources, including volcanic eruptions and respiration. As per the Doha amendment of the Kyoto protocol in 2012, the target for maximum CO2 emission per capita for Bahrain was set to 20.96 metric tons for 2020. Almost 40% of globally emitted CO2 comes from the burning of fossil fuels to generate electricity (Qader 2009; Usman and Jahanger 2021). Rapid industrialization and urbanization are usually the driving factors for economic development (Intisar et al. 2020). However, the shift to industrialization causes excessive consumption of natural resources. It also increases energy demand, which is often met by the combustion of fossil fuels, and ultimately promotes atmospheric contamination. Energy consumption has been directly associated with CO2 emission by various researchers (Yang et al. 2020; Jahanger et al. 2021a; Usman et al. 2021b). Excessive exploitation of natural resources and combustion of fossil fuels and other non-renewable energy sources pose severe threats to the environment, such as water shortage, deforestation, and climate change (Dagar et al. 2021).

Forecasting models use historical data and trends to produce an estimate of the quantity being forecasted. The accuracy of such estimates, however, depends heavily on future human behavior and actions; for example, future climate change will depend on governmental commitments to climate-related policies and on their impact on other policies, such as those for manufacturing, energy consumption, and production. A number of factors may influence climate variations and environmental pollution. The increase in the global average temperature depends on the overall emission of GHGs. Therefore, human behavior and actions are responsible for climate change and the emission of GHGs. Non-renewable energy use and the related fossil fuel combustion are among the primary factors that release CO2 into the atmosphere. Several research studies have investigated the relationship between CO2 emission and energy consumption and several other indicators such as fossil fuels, oil usage, coal consumption, transportation, economic growth, financial development, health expenditures, trade openness, energy use, and electricity consumption (Yang and Usman 2021; Zhang et al. 2021; Khalid et al. 2021; Usman et al. 2021c; Jahanger et al. 2021a, b).

According to the BP Statistical Review of World Energy (BP 2020), carbon emissions from primary energy use increased by 0.5% in 2018, which is less than the average growth of 1.1% per year over the previous 10 years. However, owing to a large number of extreme weather days in 2019, the average growth in carbon emissions over 2018 and 2019 was higher than the 10-year average (BP 2020). The countries that produce the largest shares of CO2 are China (28%), the USA (15%), India (7%), Russia (5%), and Japan (3%) (UCSUSA 2020; Usman et al. 2020a). At first sight, CO2 emission appears to be highly correlated with the size of a country's economy, or gross domestic product (GDP) (Yang et al. 2021a): rich, high-income countries produce large amounts of CO2 emissions, while poor, low-income countries produce small amounts. However, this is not always the case; several developed countries, such as the UK, Spain, Norway, Japan, France, Poland, Australia, Italy, Denmark, and South Korea, have managed to reduce their CO2 emissions. At present, Bahrain ranks 6th globally in CO2 emission per capita and 3rd among the Gulf Cooperation Council (GCC) countries (Knoema 2020). Figures 1 and 2 illustrate the CO2 emission per capita of the GCC countries.

Fig. 1 Trend comparison of CO2 emissions per capita of GCC countries

Fig. 2 Periodic comparison of CO2 emissions per capita of GCC countries

Accurate measurement of CO2 emission has become a demanding task for several reasons. Reliable measurements are a prerequisite for accurate analysis and forecasting of CO2 emissions. However, emissions from the transportation sector, a significant source of CO2, are among the most challenging to measure accurately because multiple traffic parameters are involved. Therefore, most models and methods for measuring CO2 emission from various sectors provide approximate values, since 100% accuracy is difficult to achieve. Bahrain’s carbon dioxide emission was measured at 35.4 million tons in 2019, compared with 32.9 million tons in 2018. This is an increase of 7.68%, which is much higher than the average annual growth of CO2 emission globally. As per the Doha amendment of the Kyoto protocol (United Nations 1997), the baseline year for the countries was set to 1990, and the target for maximum CO2 emission per capita for Bahrain was set to 20.96 metric tons for 2020. As of 2019, the CO2 emission per capita is 21.64 metric tons. The step-wise trend comparison of CO2 emission per capita for all GCC countries is reported in Table 1.

Table 1 Periodic CO2 emission per capita of GCC countries

This study considers CO2 emissions data from 1933 to 2019. The data have been collected from multiple sources, including the Statistical Review of World Energy report (BP 2020) and “Our World in Data” (World Bank 2020). Figure 3 illustrates the CO2 emission data of Bahrain from 1933 to 2019. CO2 emission follows a pronounced upward trend in the late twentieth and early twenty-first centuries. The series levels off at around 1950 and then shows another sharp upward trend at around 1970. Several researchers have examined the relationship between CO2 emission and renewable energy (Jahanger et al. 2021b; Yang et al. 2021b; Usman et al. 2020b), and it has been observed that renewable energy reduces CO2 emissions and other environmental hazards. Based on these observations, it is possible to forecast CO2 emission using statistical and other forecasting methods. However, it remains a question of interest whether the overall trend of CO2 emission is natural or due to human-made interventions.

Fig. 3 CO2 emission of Bahrain (1933 to 2019)

However, the observations illustrate that, on average, there has been a decrease in CO2 emission in Bahrain during the last two decades. Nevertheless, as stated earlier, emissions increased by 7.68% in 2019 compared with 2018; therefore, Bahrain is highly unlikely to meet its current target. The environmental effect of the coronavirus (COVID-19) pandemic might play a crucial role here: overall CO2 emissions may fall globally, not only in Bahrain, and the country might yet meet its target. An estimate of the CO2 emission of Bahrain will play a vital role in planning various economic and industrial developments for the next decade. Carbon capture and storage might be a key strategy to mitigate CO2 emissions. However, it has several associated risks, including the probability of leakage; the measurement of economic, ecological, and social impacts; and the evaluation of environmental perturbation power. This study employs multiple forecasting methods to predict the CO2 emission of Bahrain for the next decade.

This research study is organized into five sections. Section 2 provides a brief literature review of CO2 emission forecasting using different methods. Section 3 describes the forecasting methods that are applied in this research. Section 4 presents and analyzes the results obtained, and Sect. 5 concludes the research and provides a few policy directions.

Literature review

CO2 emission analysis, planning, and the implementation of a variety of approaches for reducing emission levels have been important considerations for each of the participating countries since the Kyoto treaty (United Nations 1997; Usman et al. 2020c). The cost of energy-efficient policies that decrease CO2 emission is usually higher than that of strategies that do not consider carbon emission (Bye et al. 2018). Carbon dioxide has become a significant contributor to GHGs and plays an important role in increasing global warming and climate change. Therefore, carbon estimation becomes more significant. Researchers around the world attempt to understand and analyze the factors responsible for CO2 emission and strive to predict the amount of emission (Yang and Usman 2021; Jahanger et al. 2021b). Several methodologies have been employed by various researchers (Silva 2013) to predict CO2 emission and energy utilization. These methodologies include artificial neural networks (ANN), support vector machines (SVM), autoregressive integrated moving average (ARIMA) models, decision trees, Bayesian networks, and other statistical supervised machine learning algorithms. The neural network-based approach is one of the most commonly used approaches for forecasting time series such as CO2 emission. Moreover, the grey model method, computer-based simulations, linear regression, multiple linear regression, logistic functions, the adaptive neuro-fuzzy inference system, and Holt’s methods are approaches that scholars use very often in forecasting CO2 emissions. In addition, several other approaches take various factors into account while applying the forecasting method. These approaches employ complex mathematical and sophisticated statistical techniques for forecasting. This research, however, employs the Gaussian Process Regression Rational Quadratic Model, Holt’s method, and a nonlinear autoregressive neural network.

Gaussian process (GP)-based forecasting, especially in time series analysis, has been used for a very long time (Wiener 1964; Kolmogoroff 1941). GP predictions have been commonly employed in geostatistics (Journel and Huijbregts 1978; Matheron 1973). Applications of the Gaussian process to prediction can be seen in various areas of research, such as meteorology (Daley 1991), spatial prediction (Whittle 1963), and spatial statistics (Ripley 1981; Cressie 1993). Researchers also realized that the Gaussian process can be applied in the general regression context (Zhao et al. 2020). An early use of the GP in computer experiments (Sacks et al. 1989) discussed the optimization of parameters in the covariance function and the choice of input vectors that provide the most information. Furthermore, Williams and Rasmussen (1996) discussed the applications of GPs in machine learning and illustrated optimization methods for the parameters of the covariance function. Other researchers have used Gaussian Process Regression for forecasting in a variety of fields, such as rock fragmentation in surface mines, wind speed forecasting, dam displacement forecasting, stream flow forecasting, short-term photovoltaic power probability forecasting, and CO2 emissions forecasting. Accordingly, the Gaussian Process Regression Rational Quadratic Model algorithm is one of the methodologies used in this research to forecast the CO2 emission of Bahrain.

The origin of exponential smoothing can be traced back to World War II. During that period, Robert Brown developed a tracking model for fire control information (Gass and Harris 2000). This model was extended, and a general exponential smoothing method was developed (Brown 1959, 1963). Charles Holt developed a forecasting method similar to exponential smoothing in 1957, though it was published only much later (Holt 2004a, b). Winters applied Holt’s method to obtain empirical evidence, and the approach gained popularity as the Holt-Winters forecasting method (Winters 1960). All variations of this method that handle seasonality have been developed by extending Winters’ method (Winters 1960). There are also several non-seasonal variations of this method that have an additive trend (Holt 1957), a multiplicative trend (Pegels 1969), a damped additive trend (Gardner and McKenzie 1985), and a damped multiplicative trend (Taylor 2003). The exponential smoothing method, along with its several extensions, has been utilized in different domains, including CO2 emission, for estimating future trends and for prediction. Several researchers have used double exponential smoothing to develop forecasting models, including in the fields of environmental pollution and ozone formation. Choi et al. (2014) used a double exponential smoothing model to estimate the trend in CO2 emission for the US transportation sector. Another research study that provides an estimate of the CO2 emission of Bahrain uses several methods, including the Holt-Winters method and neural time series forecasting (Tudor 2016).

Neural network-based prediction models have gained a lot of popularity in the last couple of decades. There was an explosion of artificial intelligence techniques in the 1980s, and in the following decade, ANNs were widely used in time series forecasting. In recent years, the use of neural networks has increased multifold owing to a large increase in computational power. The neural network has been a reliable technique for prediction and forecasting in multiple domains. For instance, multilayer perceptrons, owing to their approximation property, were quickly adopted in time series forecasting in addition to being introduced as a technique for solving classification problems. Several other techniques are available for time series forecasting and analysis; most of them assume a linear relationship among the variables, while others are based on nonlinearity. Linearity-based methods do not perform well when the relationship among the variables is nonlinear, whereas neural network-based prediction models can perform well even in that case. A nonlinear autoregressive neural network for time series forecasting can handle the nonlinearity and has the potential to forecast with minimum prediction error (Benmouiza and Cheknane 2016; Hill et al. 1996). Neural networks are used in various domains such as machine translation (Shahnawaz and Mishra 2013; Bye et al. 2018), natural language processing (Khan et al. 2018), sentiment analysis (Astya 2017), and image processing (Bashir et al. 2017). Neural network forecasting models have been implemented in several fields, including CO2 emission forecasting and air pollution estimation (Gallo et al. 2014), forecasting the emission intensity of some of the top CO2 emitters (Acheampong and Boateng 2019), CO2 emission estimation (Zhao and Mao 2012; Sun and Huang 2020), predicting humidity and room temperature (Mustafaraj et al. 2011), and predicting energy consumption (Ruiz et al. 2016; Usman and Hammar 2021).

Methodology

Time series analysis and forecasting are significant challenges for researchers, and finding a forecasting method that predicts with minimum prediction error has been a major one. Several complex mathematical and sophisticated statistical forecasting techniques are available. This research paper implements the Gaussian Process Regression Rational Quadratic (GPR-RQ) model, the neural network time series nonlinear autoregressive (NNTS-NA) model, and Holt’s method. This section briefly describes how these methods work.

GPR-RQ model

A Gaussian process (GP) is a collection of random variables, any finite subset of which has a joint Gaussian distribution (Williams and Rasmussen 2006). GPs can be applied to Bayesian supervised learning. In supervised machine learning, regression is a technique concerned with predicting the values of continuous quantities. Gaussian Process Regression (GPR) models are nonparametric, kernel-based probabilistic models in which any finite collection of random variables has a multivariate Gaussian distribution (Williams and Rasmussen 2006; MacKay 1998). GPR tends to perform well on small datasets, which helps to mitigate the problem of micronumerosity. Over the past decade, the use of GPs has gained popularity in the machine learning community. Several types of kernel (covariance function) can be used with GPR, such as the rational quadratic, squared exponential, and Matern class covariances, and the kernel parameters are commonly tuned using the gradient of the marginal likelihood. Based on the experiments conducted, this study has found that the rational quadratic kernel performs relatively better than the other kernels for the prediction of CO2 emission. The algorithm for the Gaussian Process Regression Rational Quadratic Model is outlined as follows:

  • Step 1: Obtain the training dataset of n observations in Eq. 1 as follows:

    $$Dataset=\left\{\left({x}_{i},{y}_{i}\right)|i=1, 2, \dots , n\right\}$$
    (1)

    where \({x}_{i}\) is the input vector of the \(i\)-th observation and \({y}_{i}\) is the corresponding target (output) variable.

  • Step 2: Begin with the standard linear regression model of the following form with Gaussian noise in Eq. 2 as follows:

    $$\begin{array}{ccc}y=f\left(x\right)+\varepsilon ,& \mathrm{and}& f\left(x\right)={x}^{T}w\end{array}$$
    (2)

    where \(w\) is the weight vector (the parameters) of the linear regression model.

    Assume that the noise \(\varepsilon\) is independent and identically distributed (iid) Gaussian with zero mean and variance \({\sigma }_{n}^{2}\), that is, \(\varepsilon \sim N(0, {\sigma }_{n}^{2})\).

    The likelihood is given in Eq. 3 as follows:

    $$p\left(y|f\right)=N\left(y|f, {\sigma }_{n}^{2}I\right)$$
    (3)

    where \(y={[{y}_{1}, {y}_{2},\dots ,{y}_{n}]}^{T}\), \(f={[f({x}_{1}), f({x}_{2}),\dots ,f({x}_{n})]}^{T}\), and \(I\) is the identity matrix.

    The target (output) variable \({y}_{*}\) is predicted for a new input \({x}_{*}\) using the joint distribution in Eq. 4 (MacKay 1998; Bishop 2006):

    $$\left[\begin{array}{c}y\\ {y}_{*}\end{array}\right]=\left(\left[\begin{array}{c}f\\ {f}_{*}\end{array}\right]+\left[\begin{array}{c}\varepsilon \\ {\varepsilon }_{*}\end{array}\right]\right)\sim N\left(0, \left[\begin{array}{cc}{K}_{y}& {k}_{*}\\ {k}_{*}^{T}& {k}_{**}+{\sigma }_{n}^{2}\end{array}\right]\right)$$
    (4)

    where \({K}_{y}=K+ {\sigma }_{n}^{2}I\), \(K\) is the covariance matrix with entries \({K}_{ij}=k({x}_{i},{x}_{j})\), \({k}_{*}\) is the vector of covariances \(k({x}_{i},{x}_{*})\) between the training inputs and the new input, \({k}_{**}=k({x}_{*},{x}_{*})\), and \({f}_{*}=f({x}_{*})\) is the latent function value at the new input. Using the Gaussian conditioning rules (Bishop 2006), the predictive Gaussian distribution \(p\left({y}_{*}|y\right)\) has mean \(m\left({x}_{*}\right)={k}_{*}^{T}{K}_{y}^{-1}y\) and variance \({\sigma }^{2}\left({x}_{*}\right)={k}_{**}-{k}_{*}^{T}{K}_{y}^{-1}{k}_{*}+{\sigma }_{n}^{2}\). In practice, \({K}_{y}^{-1}\) can be computed using the Cholesky decomposition (Williams and Rasmussen 2006).

  • Step 3: Use the rational quadratic kernel.

    Using the rational quadratic, the kernel is computed in Eq. 5 as follows:

    $$k\left({x}_{i},{x}_{j}|\theta \right)={\sigma }_{f}^{2}{\left(1+\frac{{r}^{2}}{2\alpha {\sigma }_{l}^{2}}\right)}^{-\alpha }$$
    (5)

    where \(r\) is the Euclidean distance computed as \(r=\sqrt{{\left({x}_{i}-{x}_{j}\right)}^{T}({x}_{i}-{x}_{j})}\), \(\alpha\) is a positive scalar parameter, \({\sigma }_{f}\) is the signal standard deviation, and \({\sigma }_{l}\) is the characteristic length scale.
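
The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in this study: it assumes scalar inputs (for example, the year), a zero-mean prior, and fixed kernel parameters, and the function names (`rq_kernel`, `cholesky`, `chol_solve`, `gpr_predict`) and default parameter values are hypothetical. It evaluates the rational quadratic kernel of Eq. 5 and the predictive mean and variance stated after Eq. 4, using a Cholesky factorization of \({K}_{y}\) in place of an explicit inverse.

```python
import math

def rq_kernel(xi, xj, sigma_f=1.0, sigma_l=1.0, alpha=1.0):
    """Rational quadratic kernel of Eq. 5 for scalar inputs."""
    r2 = (xi - xj) ** 2
    return sigma_f ** 2 * (1.0 + r2 / (2.0 * alpha * sigma_l ** 2)) ** (-alpha)

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def chol_solve(low, b):
    """Solve (L L^T) z = b by forward and then backward substitution."""
    n = len(b)
    u = [0.0] * n
    for i in range(n):
        u[i] = (b[i] - sum(low[i][k] * u[k] for k in range(i))) / low[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (u[i] - sum(low[k][i] * z[k] for k in range(i + 1, n))) / low[i][i]
    return z

def gpr_predict(x_train, y_train, x_star, sigma_n=0.1, **kernel_params):
    """Predictive mean and variance of y_* at x_star (zero-mean GP prior)."""
    n = len(x_train)
    # K_y = K + sigma_n^2 I
    k_y = [[rq_kernel(x_train[i], x_train[j], **kernel_params)
            + (sigma_n ** 2 if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    k_star = [rq_kernel(x, x_star, **kernel_params) for x in x_train]
    k_ss = rq_kernel(x_star, x_star, **kernel_params)
    low = cholesky(k_y)
    mean = sum(k * a for k, a in zip(k_star, chol_solve(low, y_train)))
    v = chol_solve(low, k_star)
    var = k_ss - sum(k * w for k, w in zip(k_star, v)) + sigma_n ** 2
    return mean, var

# Toy data (hypothetical values, not the Bahrain series).
x_toy = [0.0, 1.0, 2.0, 3.0]
y_toy = [0.0, 0.8, 0.9, 0.1]
mean_star, var_star = gpr_predict(x_toy, y_toy, 1.5, sigma_n=0.01)
```

In practice the kernel parameters \({\sigma }_{f}\), \({\sigma }_{l}\), and \(\alpha\) would be estimated from the data rather than fixed as they are here.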

Neural network time series nonlinear autoregressive

CO2 emission is frequently subject to rapid transients and high variance; therefore, time series forecasting models have to cope with the nonlinearity of changes in the carbon emission level. According to Lapedes and Farber (1987), a time series can be modeled using the nonlinear autoregressive model presented in Eq. 6 as follows:

$$y\left(t\right)=h\left(y\left(t-1\right), y\left(t-2\right), \dots , y\left(t-d\right)\right)+\varepsilon \left(t\right)$$
(6)

In prediction, past values are used to predict future values. Nonlinear autoregressive time series neural network is implemented using a multilayer feed-forward network with feedback connection (Lapedes and Farber 1987; López et al. 2012). Figure 4 illustrates the topology of the nonlinear autoregressive neural network.

Fig. 4 Nonlinear autoregressive neural network structure

The model can be expressed in Eq. 7 as follows:

$$\widehat{y}\left(t\right)=h\left(y\left(t-1\right), y\left(t-2\right), \dots , y\left(t-d\right)\right)+\varepsilon \left(t\right)$$
(7)

where the function \(h\) is unknown, and training aims to approximate it by adjusting the weights and biases of the network. The term \(\varepsilon (t)\) denotes the error term, which is assumed to be a sequence of independent random variables with zero mean and finite, constant variance. The network uses feedback delays to learn from past values of the time series. The parameter \(d\) represents the number of delays and can be tuned for the desired accuracy by trial and error.
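
To make the role of the delay parameter \(d\) concrete, the following minimal sketch builds the lagged inputs and targets on which such a network would be trained. The helper name `make_lagged` and the toy series are hypothetical; the sketch shows only the data layout implied by Eq. 6, not the network itself.

```python
def make_lagged(series, d):
    """Return (inputs, targets) for a nonlinear autoregressive model.

    Each input row is [y(t-1), y(t-2), ..., y(t-d)] and the matching
    target is y(t), for every t that has d past values available.
    """
    inputs, targets = [], []
    for t in range(d, len(series)):
        inputs.append([series[t - k] for k in range(1, d + 1)])
        targets.append(series[t])
    return inputs, targets

# Toy series (hypothetical values) with d = 2 delays.
toy_inputs, toy_targets = make_lagged([1, 2, 3, 4, 5], d=2)
```

With \(d=2\), a series of \(n\) observations yields \(n-2\) training pairs, because the first two values have no complete set of lags.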

Several training algorithms can be used for training the neural network, such as Levenberg–Marquardt, Bayesian regularization, and scaled conjugate gradient. Levenberg–Marquardt is the most commonly used algorithm. The scaled conjugate gradient algorithm relies on gradient calculations, which makes it more memory efficient than the Jacobian calculations used by the Levenberg–Marquardt and Bayesian regularization training algorithms. However, Bayesian regularization often attains better solutions for small and noisy problems. The neural network used in this study is an open-loop network. Prediction using an open-loop network is often more efficient than prediction using a closed-loop network. The open-loop configuration allows the network to be fed with the correct (observed) feedback inputs even while it is being trained to produce the correct feedback outputs.
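
The difference between open-loop and closed-loop operation can be illustrated with a small sketch. It assumes that some trained one-step predictor is available as a plain function of the lag vector; here the stand-in `toy_predictor` (simple linear extrapolation) and the function names are hypothetical and merely take the place of a trained network.

```python
def open_loop_predictions(series, predictor, d):
    """One-step-ahead predictions: every lag fed in is an observed value."""
    return [predictor([series[t - k] for k in range(1, d + 1)])
            for t in range(d, len(series))]

def closed_loop_forecast(series, predictor, d, steps):
    """Multi-step forecast: each prediction is fed back as a future input."""
    history = list(series)
    forecasts = []
    for _ in range(steps):
        lags = [history[-k] for k in range(1, d + 1)]
        y_hat = predictor(lags)
        forecasts.append(y_hat)
        history.append(y_hat)  # feedback connection
    return forecasts

def toy_predictor(lags):
    """Stand-in for a trained network: y(t) = 2*y(t-1) - y(t-2)."""
    return 2 * lags[0] - lags[1]

observed = [1, 2, 3, 4]
one_step = open_loop_predictions(observed, toy_predictor, d=2)
multi_step = closed_loop_forecast(observed, toy_predictor, d=2, steps=3)
```

Forecasting beyond the last observation necessarily uses the closed-loop form, since no observed values are available to feed back.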

Holt’s method

Holt’s forecasting method uses an exponentially weighted moving average, which smooths out random fluctuations by placing decreasing weight on older data. Because of these properties, Holt’s method can perform well with a minimal amount of data (Holt 2004a, b). The new average value (the moving average) is estimated as the weighted average of the current value of the series and the average value of the last period. The ability to forecast from minimal data, combined with its flexibility, makes Holt’s method a good candidate for predicting CO2 emissions, since the available CO2 emissions data are limited and contain many fluctuations, as can be observed by simply plotting the data used for this research. The data may also exhibit trends of different kinds. There are three types of trend: linear, exponential, and damped. A linear trend means an equal absolute decrease or increase from one period to the next. An exponential trend means an equal relative decrease or increase from one period to the next. A damped trend combines the behavior of the exponential and linear trends.

The following three equations outline the method. The first equation is the forecasting equation, the second is the level equation (for value smoothing), and the third is the trend equation. The forecasting model uses the level and trend equations to address fluctuations and trends. The forecasting equation is defined in Eq. 8 as follows:

$${y}_{t+h|t}={l}_{t}+h{b}_{t}$$
(8)

where \({l}_{t}\) is the series level at time \(t\), \(h\) is the forecast step, and \({b}_{t}\) is the trend. The level \({l}_{t}\) is calculated through Eq. 9 as

$${l}_{t}={\alpha y}_{t}+\left(1-\alpha \right)\left({l}_{t-1}+{b}_{t-1}\right)$$
(9)

where \(\alpha\) is the smoothing parameter and \({b}_{t}\) will be estimated through Eq. 10 as follows:

$${b}_{t}={\beta (l}_{t}-{l}_{t-1})+\left(1-\beta \right){b}_{t-1}$$
(10)

where \(\beta\) is the trend smoothing parameter such that 0 < \(\beta\) < 1.
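
Eqs. 8 to 10 translate directly into a short recursion. The sketch below is illustrative only: the function name `holt_linear` is hypothetical, and the initialization (level set to the first observation, trend set to the first difference) is an assumption, since the equations above do not specify starting values.

```python
def holt_linear(series, alpha, beta, horizon):
    """Holt's linear trend method; returns forecasts for h = 1..horizon."""
    # Assumed initialization: l_0 = y_0 and b_0 = y_1 - y_0.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)    # Eq. 9
        trend = beta * (level - prev_level) + (1 - beta) * trend  # Eq. 10
    return [level + h * trend for h in range(1, horizon + 1)]     # Eq. 8

# Toy series (hypothetical values) with the smoothing parameters
# alpha = 0.9 and beta = 0.1.
toy_forecast = holt_linear([1, 2, 3, 4, 5], alpha=0.9, beta=0.1, horizon=3)
```

On a perfectly linear series the level tracks the data and the trend stays at the common difference, so the forecasts simply continue the line.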

Forecasts from the linear method exhibit a constant decreasing or increasing trend. Empirical evidence suggests that this tends to result in over-forecasting (Gardner and McKenzie 1985). Holt’s method with a damped trend addresses this; its forecasting equation is given in Eq. 11 as follows:

$${y}_{t+h|t}={l}_{t}+\left(\varphi +{\varphi }^{2}+\dots +{\varphi }^{h}\right){b}_{t}$$
(11)

where \({l}_{t}\) is the series level at time \(t\), \(h\) is the forecast step, \({b}_{t}\) is the trend, and \(\varphi\) is the damping parameter with 0 < \(\varphi\) < 1. The level \({l}_{t}\) and the trend \({b}_{t}\) are calculated through Eqs. 12 and 13, respectively, as

$${l}_{t}={\alpha y}_{t}+\left(1-\alpha \right)\left({l}_{t-1}+{\varphi b}_{t-1}\right)$$
(12)
$${b}_{t}={\beta (l}_{t}-{l}_{t-1})+\left(1-\beta \right){\varphi b}_{t-1}$$
(13)

where \(\alpha\) is the smoothing parameter and \(\beta\) is the trend smoothing parameter such that 0 < \(\beta\) < 1.
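
The damped variant in Eqs. 11 to 13 can be sketched in the same way. As before, the function name `holt_damped`, the initialization, and the value \(\varphi =0.8\) used in the example are assumptions made only for illustration.

```python
def holt_damped(series, alpha, beta, phi, horizon):
    """Holt's damped trend method; returns forecasts for h = 1..horizon."""
    # Assumed initialization: l_0 = y_0 and b_0 = y_1 - y_0.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)    # Eq. 12
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend  # Eq. 13
    forecasts, damp = [], 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h                        # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp * trend)  # Eq. 11
    return forecasts

# Toy series (hypothetical values); phi = 0.8 is an illustrative choice.
damped_forecast = holt_damped([1, 2, 3, 4, 5], alpha=0.9, beta=0.1,
                              phi=0.8, horizon=4)
```

Each additional forecast step adds \({\varphi }^{h}{b}_{t}\), so successive increments shrink by the factor \(\varphi\) and the forecast flattens out at long horizons instead of growing without bound.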

Implementation and analysis of results

This research study has used the CO2 emission data collected from the Global Change Data Lab’s project “Our World in Data” (World Bank 2020). This project has consolidated CO2 emission data from the Carbon Dioxide Information Analysis Center (CDIAC) and the Global Carbon Project. The data are open source and include the CO2 emission of Bahrain from 1933 to 2018. The data for 2019 were taken from the BP report (BP 2020). The amount of CO2 emission is given in million metric tons. This research study applies multiple methods to forecast the CO2 emission of Bahrain and compares the amounts forecasted by the different approaches. The methods discussed were implemented using a variety of open-source Python libraries. The GPR model was implemented using the rational quadratic kernel function with fivefold cross-validation. The neural network created for prediction has 11 hidden layers, and the number of delays was kept at 2. The network was trained with three different algorithms: Bayesian regularization, Levenberg–Marquardt, and scaled conjugate gradient. Using a trial and error approach with these algorithms and network configurations, the best accuracy was obtained with the Bayesian regularization algorithm. This experiment implemented multiple variants of Holt’s method: exponential smoothing trend, Holt’s linear trend, and adaptive damped trend. The “estimated” initialization method was used, with an initial value of 0.11, a smoothing level of 0.9, and a smoothing trend of 0.1.
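
As an illustration of what fivefold cross-validation involves, the sketch below splits the indices of the observations into five folds. It is a hypothetical helper, not the routine used in this study: the name `kfold_indices` is invented, the folds are taken as contiguous blocks without shuffling (the text does not state how the folds were formed), and the count of 87 assumes one observation per year from 1933 to 2019.

```python
def kfold_indices(n, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    # The first n % k folds receive one extra observation.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, validation
        start += size

# 87 annual observations (1933 to 2019 inclusive), five folds.
folds = list(kfold_indices(87, k=5))
```

Each fold is held out once for validation while the model is fitted on the remaining four, and the validation errors are then averaged.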

The following Fig. 5 illustrates the responses predicted by the Gaussian Process Regression Rational Quadratic (GPR-RQ) Model. As mentioned in the previous section, this research has employed the rational quadratic kernel for training the Gaussian Process Regression model. The horizontal axis represents the year, and the vertical axis represents the amount of CO2 emission.

Fig. 5 GPR-RQ model response plot

The above figure plots the actual and predicted amounts of CO2 emission. The diagonal line in Fig. 6 represents a perfect prediction, in which the predicted values are the same as the actual values. As observed from Fig. 6, almost all the points lie on or near this line. Training using the GPR-RQ model achieved a root mean square error of 1.0171.

Fig. 6 GPR-RQ model predicted vs. actual response plot

Figure 7 illustrates the training and responses of the trained neural network model. After trial and error over the initial weight initialization, the best accuracy among the three tested algorithms was obtained using the Bayesian regularization algorithm. The root mean square error obtained using the neural network method was 0.206.

Fig. 7 Prediction by neural network time series nonlinear autoregressive using Bayesian regularization

Figure 8 illustrates the training and responses of the forecasting model using exponential smoothing trend, Holt’s linear trend, and adaptive damped trend. The model was trained to observe the effects of the different trends on forecasting accuracy. Experiments were performed with several different values of the smoothing parameter and the trend smoothing parameter. Among the three trends, Holt’s linear trend outperformed the other two. The best accuracy was obtained using \(\alpha =0.9\) and \(\beta =0.1\) for Holt’s linear trend. The root mean square error obtained using Holt’s method was 1.4096.

Fig. 8 Prediction using Holt’s method

Table 2 summarizes the performance comparison of the CO2 emission prediction models. This research compares the performance of the applied forecasting methods by evaluating their accuracy in terms of root mean square error (RMSE). RMSE is always non-negative, and a smaller RMSE indicates higher accuracy. Because the predicted values are small (maximum around 35), even a small RMSE represents a sizeable relative error. As summarized, the accuracy of the neural network time series nonlinear autoregressive model is much higher than that of the Gaussian Process Regression model and Holt’s method. The neural network model has an RMSE of merely 0.206, while the GPR-RQ model has an RMSE of 1.0171 and Holt’s method has an RMSE of 1.4096; that is, the neural network’s RMSE is roughly 5 to 7 times smaller than those of the other methods. Therefore, it can be concluded that the neural network time series nonlinear autoregressive model has performed best among the tested methods for forecasting the CO2 emission of Bahrain. Table 3 presents the CO2 emission forecast until 2025, predicted using the neural network time series nonlinear autoregressive model.
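
For reference, the RMSE and the ratios between the reported values can be computed as in the following sketch. The `rmse` helper is a generic definition, not the evaluation code of this study; the three RMSE values are those reported above.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# RMSE values reported in this section.
reported = {"NNTS-NA": 0.206, "GPR-RQ": 1.0171, "Holt": 1.4096}

# How many times larger each RMSE is than that of the neural network.
ratios = {name: value / reported["NNTS-NA"] for name, value in reported.items()}
```

The ratios come out at about 4.9 for the GPR-RQ model and about 6.8 for Holt’s method.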

Table 2 Performance comparison for forecasting model
Table 3 CO2 emission data forecast till 2025 for Bahrain

Conclusion and policy direction

Among all the greenhouse gases, carbon dioxide (CO2) is the major contributor to global warming. Known as a heat-trapping gas, it traps radiation in the earth’s atmosphere that would otherwise have escaped had the atmosphere not been polluted with CO2 and other greenhouse gases. Its emission must therefore be curbed. The Kyoto Protocol, an international treaty that extends the United Nations climate change framework, was signed to restrain global warming. In 2012, under the Doha Amendment, the baseline year and the CO2 target were set for several countries, including Bahrain. However, according to the CO2 emission data for 2019, Bahrain is lagging behind its target. Bahrain should treat CO2 emission as a key factor when developing policies related to energy consumption and production. Carbon capture and storage can be considered as one strategy for CO2 emission mitigation. However, it comes with several associated risks and challenges, such as the probability of leakage; the measurement of economic, ecological, and social impacts; and the assessment of environmental perturbation strength. This research attempts to find an appropriate forecasting model for CO2 emission. Three methods were implemented and evaluated. The neural network model achieved an RMSE of 0.206, while the GPR-RQ model has an RMSE of 1.0171 and Holt’s method has an RMSE of 1.4096. It can therefore be concluded that the neural network time series nonlinear autoregressive model performed best for forecasting the CO2 emission of Bahrain.

The findings of this study indicate a solid interrelationship between the series. Following these findings, the government and policymakers of Bahrain should take the initiative to promote the power sector and introduce new strategies to reduce environmental pollution. Such plans and policies should increase the consumption and production of alternative, cleaner, and renewable energy sources, such as wind, hydropower, solar, and biomass, rather than focusing on the intensive combustion, production, and consumption of harmful non-renewable fossil fuels. In addition, the policymakers and government authorities of Bahrain can use this research to assess the status of the power sector and reduce the likelihood of several other ecological challenges. Most prominently, this research offers evidence that substantial improvement and reorganization of the energy sector are necessary to raise real economic growth while also improving environmental sustainability over time. The researchers are working toward investigating the impact of different sectors and their share in the overall CO2 emission of Bahrain, and toward predicting the sector-wise contribution.