A hybrid model for forecasting of particulate matter concentrations based on multiscale characterization and machine learning techniques

Accurate prediction of particulate matter (PM) using time series data is a challenging task. Recent advancements in sensor technology, computing devices, nonlinear computational tools, and machine learning (ML) approaches provide new opportunities for robust prediction of PM concentrations. In this study, we develop a hybrid model for forecasting PM10 and PM2.5 based on multiscale characterization and ML techniques. First, we use the empirical mode decomposition (EMD) algorithm for multiscale characterization of PM10 and PM2.5 by decomposing the original time series into numerous intrinsic mode functions (IMFs). Different individual ML algorithms such as random forest (RF), support vector regressor (SVR), k-nearest neighbors (kNN), feed forward neural network (FFNN), and AdaBoost are then used to develop EMD-ML models. The air quality time series data from Masfalah air station Makkah, Saudi Arabia are utilized for validating the EMD-ML models, and the results are compared with non-hybrid ML models. The PM (PM10 and PM2.5) concentration data of Dehli, India are also utilized for validating the EMD-ML models. The performance of each model is evaluated using root mean square error (RMSE) and mean absolute error (MAE). The average bias in the predictive model is estimated using mean bias error (MBE). The results reveal that the EMD-FFNN model provides the lowest error rate for both PM10 (RMSE = 12.25 and MAE = 7.43) and PM2.5 (RMSE = 4.81 and MAE = 3.02) using Misfalah, Makkah data, whereas the EMD-kNN model provides the lowest error rate for PM10 (RMSE = 20.56 and MAE = 12.87) and EMD-AdaBoost provides the lowest error rate for PM2.5 (RMSE = 15.29 and MAE = 9.45) using Dehli, India data. The findings also reveal that EMD-ML models can be effectively used to forecast PM mass concentrations and to develop rapid air quality warning systems.

algorithms; particulate matter

Introduction
Atmospheric pollution is continuously increasing due to natural phenomena (volcanic activity, desert storms, etc.) and intensive anthropogenic pollution-generating activities (vehicle exhaust, industrial activities, fossil fuel use for energy, etc.) [1][2][3]. Air pollution poses both short- and long-term health hazards. Irritation of the nose, eyes, and throat, allergic reactions, cough, and upper respiratory infections are examples of short-term effects of air pollution. Cardiovascular dysfunctions, respiratory tract infections, and cancer are some of the commonly cited long-term effects of air pollution [4][5][6]. These diseases are correlated with millions of deaths globally each year [7,8]. Approximately 7 million people die due to household and environmental air pollution, 94% of them in low- and middle-income countries [9]. The maximum burden of these deaths is observed in South East Asia (2.4 million) followed by the Western Pacific (2.2 million) [9].
The impact of particles within the human respiratory system and in the atmosphere is largely governed by their size and generally by their other physical properties. Their size may vary from nanometers to tens of micrometers. Based on their size, particles may be categorized as fine particles (PM2.5 having a diameter of 2.5 micrometers (μm) or less) and coarse particles (PM10 having a diameter between 2.5 μm and 10 μm). Fine particles may further be categorized into ultrafine/nuclei mode (with a diameter from 0.01 μm to 0.1 μm) and accumulation mode (diameter from 0.1 μm to 1.0 μm). PM2.5 is the most hazardous ambient air pollutant for human health [10]. High PM10 concentrations can cause premature death in older people with respiratory diseases and heart problems [11].
Forecasting air pollutants is an efficient way of protecting public health, as it provides an early warning against hazardous air pollutants [12]. Forecasting pollutant levels can help minimize adverse health effects by reducing exposure to these particles through timely alerts that allow the general public to take preventive measures. Atmospheric systems are inherently nonlinear, and pollutants are dynamically complex in nature [13], which makes the prediction of atmospheric pollutants a challenging task. Advances in digital electronics, computing, and sensor technologies have enabled accurate spatio-temporal monitoring and effective forecasting of atmospheric pollutants. Numerous techniques have been developed to forecast PM concentrations, such as time series analysis, artificial intelligence (AI), linear or nonlinear regression, and chemical transport models [14]. However, hybrid forecasting models are more accurate and robust than single forecasting models [14]. Chelani and Devotta [15] developed a hybrid model by combining the autoregressive integrated moving average model, which deals with linear patterns. The mass concentration time series of atmospheric pollutants are the outcome of complex natural and anthropogenic activities that evolve with time and operate on multiple time scales [13]. Shah et al. [13] proposed a hybrid forecasting model based on the multiscale characterization of reconstructed phase space and machine learning (ML) techniques for the prediction of PM2.5 and PM10. Huang et al. [16] proposed empirical mode decomposition (EMD) to address the non-stationary and nonlinear behaviors present in data, which has motivated practitioners and researchers to use it as an effective tool. EMD, which is based on statistical modeling, is another technique used for multiscale characterization and forecasting of nonlinear and nonstationary time series data [17][18][19][20][21][22][23][24][25][26].
In a study conducted at Xingtai in China, Zhu et al. [27] proposed two EMD based hybrid models (EMD-SVR-Hybrid and EMD-IMFs-Hybrid) to forecast air quality index (AQI) data. They compared the performance of the proposed models with forecasting models based separately on support vector regression (SVR), generalized regression neural network (GRNN), autoregressive integrated moving average models (ARIMA), EMD-GRNN, Wavelet-SVR, and Wavelet-GRNN. They found that the proposed hybrid models were superior and can be used for the forecasting of air pollution. In the study of [28], road traffic prediction was performed using an EMD based convolutional neural network (CNN) model. The results of that study show that the predictions of the EMD based CNN model are more accurate than those of the Lasso-BP, PCA-BP, and standard CNN models. Zhou et al. [29] developed a hybrid model (EEMD-GRNN) by utilizing ensemble EMD in combination with a regression neural network for the forecasting of PM2.5 in Xi'an, China. They compared the proposed model (EEMD-GRNN) with ARIMA, principal component regression (PCR), multiple linear regression (MLR), and GRNN and found that the performance of the EEMD-GRNN model was much better than that of the other models. Another study [30] proposed a novel hybrid decomposition and ensemble model by incorporating grey wolf optimizer (GWO), complementary ensemble EMD (CEEMD), and support vector regression (SVR). The authors compared the results of the proposed model with single AI models, hybrid decomposition ensemble models optimized using different algorithms, and hybrid decomposition ensemble models with different decomposition methods. They achieved high prediction accuracy for PM2.5 concentrations using the proposed model.
In this study, the EMD algorithm is combined with ML algorithms (random forest (RF), support vector regressor (SVR) with linear and radial kernels, k-nearest neighbors (kNN), feed forward neural network (FFNN), and AdaBoost) to develop EMD-ML models (EMD-RF, EMD-SVR-L, EMD-SVR-R, EMD-kNN, EMD-FFNN, and EMD-AdaBoost) that forecast the concentrations of two types of PM (PM10 and PM2.5). To evaluate and compare the algorithms, monthly PM concentrations (PM10 and PM2.5) have been predicted. In the EMD-ML models, EMD is employed to decompose the original PM time series into several intrinsic mode functions (IMFs). The Spearman correlation coefficient is then used to select the IMFs having a strong correlation with the original time series, and finally, ML algorithms are used to forecast monthly PM concentrations using the selected IMFs. Hourly averaged data from the Masfalah air quality monitoring station covering January 2014 to September 2015 and hourly averaged data from Dehli city, India covering January 2018 to December 2019 have been used. Single forecasting models using the RF, SVR-L, SVR-R, kNN, FFNN, and AdaBoost algorithms alone are also developed to forecast monthly PM (PM10 and PM2.5) concentrations at the Masfalah air quality monitoring station using input data of pollutants (CO, NO2, and CO2) and meteorological parameters (temperature (Temp), wind speed (WS), and relative humidity (RH)). The results indicate that the EMD-ML models outperform the single models. The EMD-FFNN model provides the lowest error rate for both PM10 (RMSE = 12.25 and MAE = 7.43) and PM2.5 (RMSE = 4.81 and MAE = 3.02) using Misfalah, Makkah data, whereas the EMD-kNN model provides the lowest error rate for PM10 (RMSE = 20.56 and MAE = 12.87) and EMD-AdaBoost provides the lowest error rate for PM2.5 (RMSE = 15.29 and MAE = 9.45) using Dehli, India data. Therefore, EMD-ML models can be used to forecast complex time series and to develop rapid air quality warning systems.
The rest of the paper is organized as follows: First, we describe in detail the datasets used in this study along with the EMD-ML models' flowchart and algorithm and other ML algorithms. Then the results of the study are presented and discussed followed by the conclusion section.

Data set
The datasets used in this study were collected from the Masfalah air quality monitoring station (AQMS111) and were previously used by researchers [31]. The monitoring station is situated in the Holy city of Makkah, Saudi Arabia. The Masfalah site was selected because it is very near to the Holy Mosque (Al-Haram), a very busy area surrounded by shops and residential houses. The road near the monitoring station is very busy, and its traffic emits almost all sorts of air pollutants. High levels of air pollutants pose a potential risk to local residents, workers, and visitors. Therefore, it is important to monitor air quality in this area and carry out air quality health risk assessment. Hourly data from January 2014 to September 2015, monitored using an Aeroqual AQM60 environmental station, are used in this study. The data include air pollutants (nitrogen dioxide (NO2) (μg/m3), carbon monoxide (CO) (mg/m3), and carbon dioxide (CO2) (ppm)), particulate matter (PM10 (μg/m3) and PM2.5 (μg/m3)), and meteorological parameters (temperature (Temp) (°C), wind speed (WS) (m/s), and relative humidity (RH) (%)).
Strict quality assurance and quality control (QA/QC) measures are taken to ensure data quality [31]. The QA measures comprise a selection of monitoring site, correct instrument deployment, instrument selection, design of sample system, and appropriate training of operators. QC is maintained by steps such as calibration of the instrument and its response, routine site visits, monitoring calibration gases, data review, data testing, and authorization.
Missing values and extreme pollutant cases (outliers) have been screened. According to [32], missing data can be handled by modeling the data as a distribution for its estimation, by deletion, or by imputation estimates. If the data contain fewer than 5% missing values, then any method can be used for the identification and correction of data [33]. The datasets used in this study contain fewer than 2% missing values, and the deletion method has been used for handling missing data. Outliers were identified by computing the z-score: data values lying more than 2 standard deviations from the mean were considered outliers. Each outlier was replaced with the mean value of the data for the corresponding month.
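The outlier screening described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `replace_outliers` is hypothetical, the input is assumed to be the readings of a single month, the z-score test is assumed to be two-sided, and the replacement value is assumed to be the plain mean of that month's readings.

```python
from statistics import mean, pstdev

def replace_outliers(month_values, threshold=2.0):
    """Replace readings whose |z-score| exceeds `threshold` with the month mean.

    Assumptions (not stated in the text): two-sided test, population standard
    deviation, and a fill value equal to the mean of all readings in the month.
    """
    mu = mean(month_values)
    sigma = pstdev(month_values)
    if sigma == 0:                      # constant series: nothing to flag
        return list(month_values)
    return [mu if abs(v - mu) / sigma > threshold else v for v in month_values]

# Hypothetical hourly PM readings with one extreme value.
readings = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 12.0, 11.0, 100.0]
cleaned = replace_outliers(readings)
```

Because a single extreme reading inflates both the mean and the standard deviation, a threshold of 2 is fairly aggressive on short series; on hourly data for a full month the effect of one reading is much smaller.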
The second set of data used in this study was obtained from an online source [34] and was collected in Dehli city, India. The datasets contain PM (PM10 and PM2.5) concentration data from January 2018 to December 2019 and are utilized for validating the EMD-ML models.

Empirical mode decomposition
Huang et al. [16] proposed the EMD method to decompose non-linear and non-stationary signals into various IMFs and a residual. Each IMF component of the original signal must satisfy two conditions. (a) The total number of zero-crossing and extrema must be equal or vary at most by one. (b) At all points, the envelope mean value defined by both local minima and local maxima must be zero. The steps involved in the EMD algorithm are as follows.
Step 1: Identify all local minima and maxima of the input time series data $x(t)$. By using cubic spline interpolation, generate the lower envelope $e_l(t)$ using the local minima and the upper envelope $e_u(t)$ using the local maxima.
Step 2: Compute the mean of the lower and upper envelopes, $m(t) = (e_l(t) + e_u(t))/2$.
Step 3: Compute the candidate IMF $h(t)$ by subtracting the envelope mean $m(t)$ from the original input time series data $x(t)$, i.e., $h(t) = x(t) - m(t)$. If $h(t)$ satisfies the above mentioned conditions of an IMF, $h(t)$ is considered as the $i$-th IMF and the residual $r(t)$ is substituted for the original time series data $x(t)$, where $r(t) = x(t) - h(t)$.
Step 4: If the candidate IMF $h(t)$ does not meet the above mentioned conditions of an IMF, replace the original input time series data $x(t)$ with $h(t)$.
Step 5: Repeat steps 1-4 until the residual $r(t)$ becomes a constant value or a monotonic function, or there is no more IMF to extract from the residual $r(t)$.
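The sifting procedure in the steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation used in the study: the envelopes are piecewise linear instead of cubic splines, a fixed number of sifting passes (`sift_iters`) stands in for checking the two IMF conditions, and the function names are hypothetical.

```python
import math

def local_extrema(x):
    """Return the indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def linear_envelope(knots, x):
    """Piecewise-linear envelope through (i, x[i]) for i in knots.

    Simplification: the paper's algorithm uses cubic spline interpolation.
    """
    n = len(x)
    # Hold the first and last knot values flat outside the knot range.
    env = [x[knots[0]] if t <= knots[0] else x[knots[-1]] for t in range(n)]
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env

def emd(x, max_imfs=10, sift_iters=10):
    """Decompose x into a list of IMF-like components and a residual."""
    residual = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:   # residual is close to monotonic
            break
        h = list(residual)
        for _ in range(sift_iters):              # Steps 1-4, repeated a fixed number of times
            maxima, minima = local_extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            upper = linear_envelope(maxima, h)
            lower = linear_envelope(minima, h)
            h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residual = [r - v for r, v in zip(residual, h)]
    return imfs, residual

# Synthetic example: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(0.5 * t) + 0.5 * math.sin(0.05 * t) + 0.01 * t for t in range(200)]
imfs, residual = emd(signal)
```

By construction, the extracted components and the residual sum back to the input series, which is the property that allows each IMF to be modeled separately from the others.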

Hybrid EMD-ML models
Hybrid EMD-ML models are developed by incorporating traditional EMD, correlated IMFs, and ML algorithms (RF, SVR-L, SVR-R, kNN, FFNN, and AdaBoost) for improved forecasting. For this purpose, the IMF components generated through EMD are screened using the Spearman correlation coefficient, and the selected IMFs are used to predict each original time series. The whole process of developing each EMD-ML model (EMD-RF, EMD-SVR-L, EMD-SVR-R, EMD-kNN, EMD-FFNN, and EMD-AdaBoost) is illustrated in Figure 2.
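The IMF selection step can be illustrated with a small Spearman correlation routine. This is a sketch under assumptions: the text does not report the cut-off that separates strong from weak correlation, so the `threshold` parameter below is hypothetical, as are the function names and the use of the absolute value of the coefficient.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5

def spearman(a, b):
    """Spearman rho = Pearson correlation of the rank-transformed series."""
    return pearson(ranks(a), ranks(b))

def select_imfs(imfs, original, threshold=0.5):
    """Indices of IMFs with |rho| >= threshold (threshold is an assumed value)."""
    return [i for i, imf in enumerate(imfs)
            if abs(spearman(imf, original)) >= threshold]

# Toy example: the first component tracks the series, the second does not.
original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
components = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]]
selected = select_imfs(components, original)
```

The selected indices would then determine which IMF series are passed to the ML algorithms as inputs.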

Learning algorithms
In this section, five learning algorithms used in this study are explained.

Feed-forward neural network (FFNN)
The artificial neural network (ANN) concept is based on a biological neural network of the human brain. The ANN is a computer model used to recognize relations or patterns among data [36]. Two main components of the ANN are a set of nodes and node links.
The feed forward neural network (FFNN) is the simplest form of ANN. In an FFNN, data flow in one direction only. The FFNN has multiple processing elements (neurons), which are linked to each other through weights. The FFNN comprises input, hidden, and output layer(s). The input parameters are passed to the input layer, and their aggregated weighted values are applied to the hidden layer neurons. The hidden layer(s) is the intermediate layer between the input and output layers and performs intermediate calculations. The aggregated weighted values computed at the hidden layer are applied to the output layer, which produces the final output. The output $Y$ obtained at the output layer is given as:
$$Y = \omega\left(\beta_0 + \sum_{j} \beta_j \, \varphi\left(\alpha_{j0} + \sum_{i} \alpha_{ji} x_i\right)\right) \quad (1)$$
where $(\beta_0, \beta_1, \ldots, \beta_j, \alpha_{10}, \ldots, \alpha_{ji})$ are the weight and bias parameters, $\varphi$ and $\omega$ represent the activation functions applied at the hidden layer and the output layer, respectively, and $x_i$ are the input values for each input neuron $i$. We used 100 neurons in the hidden layer with the logistic activation function to develop the FFNN model.
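The forward pass of a single-hidden-layer FFNN, as described above, can be sketched as follows. The function and argument names are hypothetical, and the identity output activation is an assumption suited to regression; only the logistic hidden activation is stated in the text.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def ffnn_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a one-hidden-layer network for a single input vector x.

    hidden_weights[j] holds the weights from all inputs to hidden neuron j.
    The output layer is linear (assumed identity activation).
    """
    hidden = [logistic(b + sum(w * xi for w, xi in zip(ws, x)))
              for ws, b in zip(hidden_weights, hidden_biases)]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# Two inputs, two hidden neurons (the study uses 100), arbitrary weights.
y = ffnn_forward([0.0, 0.0],
                 hidden_weights=[[0.3, -0.2], [1.0, 1.0]],
                 hidden_biases=[0.0, 0.0],
                 output_weights=[2.0, 4.0],
                 output_bias=1.0)
```

With all inputs and hidden biases at zero, every hidden neuron outputs 0.5, so the result depends only on the output weights and bias, which makes the example easy to check by hand.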

Adaptive boosting (AdaBoost)
Adaptive boosting (AdaBoost) is the first effective boosting algorithm, proposed by [37]. AdaBoost trains a sequence of weak learners and adaptively adjusts the sample weights after each one: it raises the weight of misclassified samples after training a weak learner so that these samples contribute more to the training set of the next weak learner. The AdaBoost predictions are made by weighted majority voting of the weak learners' outcomes. AdaBoost therefore works by increasing the diversity among the weak learners, which can enhance prediction performance.
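The reweighting step described above can be illustrated for the classic two-class case with labels in {-1, +1}. This is only a sketch of the classification form that the paragraph describes; forecasting PM concentrations requires a regression variant of AdaBoost, and the text does not specify which variant was used. The function name is hypothetical.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One discrete-AdaBoost round: learner weight and updated sample weights."""
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)        # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)      # vote weight of this learner
    # Misclassified samples (t * p == -1) are scaled up, the rest scaled down.
    new = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return alpha, [w / total for w in new]

# Five equally weighted samples; the weak learner gets the last one wrong.
alpha, updated = adaboost_round([0.2] * 5, [1, 1, -1, -1, 1], [1, 1, -1, -1, -1])
```

After the update, the misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.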

Random forest (RF)
Random forest (RF) is a type of ensemble learning algorithm, proposed by [38]. The RF algorithm depends on the classification and regression trees (CART) model. The aim of CART is to learn the relation between a dependent variable (Y) and a series of predictor variables (X). The RF algorithm is built on a multitude of decision trees, which are then aggregated into a forest. First, each tree is constructed according to the bagging method on a random sample of the observations. Second, a random collection of features is chosen to split the nodes of each tree in the forest (feature sampling). Finally, the trees are aggregated by averaging their results in order to use the model for prediction. In this study, 10 trees are used to construct the RF predictive model.

k-nearest neighbors (kNN)
where the model $w_k$ introduces a continuous output. Patterns closer to the target should contribute more to the prediction than other patterns. The similarity in terms of the distance between patterns can be defined as: In this study, k = 3 is used to construct the kNN model.
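A distance-weighted kNN regressor with k = 3 can be sketched as follows. The weighting formula did not survive in the text, so the inverse-distance weights and the Euclidean distance used here are assumptions, and the function name is hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a continuous value as the weighted mean of the k nearest targets.

    Assumption: weights are inverse Euclidean distances, so closer patterns
    contribute more to the prediction.
    """
    neighbours = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    for d, y in neighbours:
        if d == 0.0:                 # exact match: return its target directly
            return y
    weights = [1.0 / d for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)

# Toy one-dimensional data.
train_x = [[0.0], [1.0], [2.0], [10.0]]
train_y = [0.0, 1.0, 2.0, 10.0]
pred = knn_predict(train_x, train_y, [1.5], k=3)
```

In this example the two neighbours at distance 0.5 dominate the neighbour at distance 1.5, giving a prediction of 9/7.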

Support vector regressor (SVR)
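Let $[(x_1, y_1), \ldots, (x_n, y_n)]$ be a set of training data, where each $x_i \in \mathbb{R}^d$ denotes an input sample with corresponding target value $y_i \in \mathbb{R}$ for $i = 1, \ldots, n$ ($n$ is the size of the training data) [41]. The generic form of the SVR estimating function is:
$$f(x) = w \cdot \varphi(x) + b \quad (4)$$
In the above equation, $w$ is the weight vector, $b \in \mathbb{R}$ is the bias, and $\varphi$ represents the non-linear transformation from $\mathbb{R}^d$ to a high dimensional space. The objective is to identify $w$ and $b$ by minimizing the regression risk:
$$R_{reg}(f) = C \sum_{i=1}^{n} \Gamma\left(f(x_i) - y_i\right) + \frac{1}{2} \lVert w \rVert^2 \quad (5)$$
where $C$ is a constant and $\Gamma$ represents a cost function. In terms of the data points, the vector $w$ can be written as:
$$w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, \varphi(x_i) \quad (6)$$
where $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers. The generic equation using Eqs 4 and 6 can be rewritten as:
$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \left(\varphi(x_i) \cdot \varphi(x)\right) + b = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, k(x_i, x) + b \quad (7)$$
where $k(x_i, x)$ indicates the kernel function. The dot product in Eq (7) can be replaced with the kernel function $k(x_i, x)$. The mathematical representations of the kernel functions used in this study are as follows.
Linear kernel: $k(x_i, x_j) = x_i \cdot x_j$
Radial kernel: $k(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right)$, with $\gamma > 0$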
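The role of the kernel in the SVR estimating function can be illustrated with a short sketch that evaluates a prediction from already-fitted coefficients. The linear and radial kernels are written in their standard forms; the parameter name `gamma`, the function names, and the numeric coefficients are hypothetical, and no training is performed.

```python
import math

def linear_kernel(a, b):
    """k(a, b) = a . b"""
    return sum(u * v for u, v in zip(a, b))

def rbf_kernel(a, b, gamma=0.5):
    """k(a, b) = exp(-gamma * ||a - b||^2); gamma is an assumed parameter name."""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))

def svr_predict(x, support_vectors, dual_coefs, bias, kernel):
    """f(x) = sum_i coef_i * k(x_i, x) + b, with coef_i = alpha_i - alpha_i^*."""
    return bias + sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs))

# Hypothetical fitted model with two support vectors.
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
dual_coefs = [2.0, -1.0]
f_linear = svr_predict([1.0, 0.0], support_vectors, dual_coefs, 0.5, linear_kernel)
f_radial = svr_predict([1.0, 0.0], support_vectors, dual_coefs, 0.5, rbf_kernel)
```

Swapping `linear_kernel` for `rbf_kernel` changes only how similarity to the support vectors is measured, which is the difference between the SVR-L and SVR-R models.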

Evaluation measures
The root mean square error (RMSE) and mean absolute error (MAE) are the most commonly used measures for evaluating the performance of predictive models. Both measures range from 0 to ∞, and lower values indicate better model performance. The RMSE is determined by taking the square root of the mean square error (MSE) and can provide a complete error distribution scenario. MAE is the average of the absolute differences between the actual and predicted values. Mean bias error (MBE) is also used to estimate the average bias in the model, or average forecasting error. MBE represents the systematic tendency of the forecasting model to over- or under-forecast: a positive MBE indicates over-forecasting, whereas a negative MBE indicates under-forecasting. The mathematical equations used for computing RMSE, MAE, and MBE are given below.
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \lvert y_i - \hat{y}_i \rvert$$
$$MBE = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)$$
where $y_i$ represents the target (expected) values, $\hat{y}_i$ the model's predicted values, and $N$ the number of observations.
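The three evaluation measures can be computed with a few lines of Python. This is a minimal sketch with hypothetical function names; the sign convention for MBE (predicted minus actual) is chosen so that a positive value corresponds to over-forecasting, as stated in the text.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mbe(actual, predicted):
    """Mean bias error: positive means the model over-forecasts on average."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)

# Toy example with errors of +2, -2, and +3.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = (rmse(actual, predicted), mae(actual, predicted), mbe(actual, predicted))
```

Here the MAE is 7/3 and the MBE is 1.0: the two errors of opposite sign cancel in the MBE but not in the MAE, which is why MBE measures bias rather than accuracy.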

Results
In the first phase of each EMD-ML model, EMD is used to extract the data characteristics of the PM10 and PM2.5 time series by decomposing the historical data, as presented in Figure 2 and discussed above. The EMD algorithm is applied to both PM (PM10 and PM2.5) time series of Misfalah, Makkah, and 14 IMFs along with a single residual have been generated for each time series. Similarly, the EMD algorithm is applied to both PM (PM10 and PM2.5) time series of Dehli, India, and 11 IMFs along with a single residual for PM10 and 12 IMFs along with a single residual for PM2.5 have been generated (Figure 3). Table 1 shows that for the PM10 time series of Misfalah, Makkah, IMF3-IMF11 and IMF13-IMF14 have a strong correlation with the original data, and for the PM2.5 data of Misfalah, Makkah, IMF2-IMF14 and the residual have a strong correlation with the original data. For the PM10 data of Dehli, India, IMF2-IMF8 and IMF10-IMF11 have a strong correlation with the original data, and for the PM2.5 data of Dehli, India, IMF3-IMF4 and IMF6-IMF12 have a strong correlation with the original data.
Therefore, in the second phase of the EMD-ML models, only these IMFs are given to the ML algorithms for the prediction of each time series. The actual and predicted values are compared in Figure 4. The prediction curve produced by each of the EMD-ML models using setting 1 fits better at many points and follows the trend of the actual values more closely for both PM10 and PM2.5 concentrations than the single forecasting models. The trend of the values predicted by each of the EMD-ML models is close to the actual values, which clearly shows that the hybrid EMD-ML models can better forecast PM concentrations. Among all EMD-ML models, the EMD-FFNN model using setting 1 provides the best prediction of both PMs.
Similarly, for forecasting both the PM10 and PM2.5 time series of Dehli, India, the selected IMFs (each IMF has the same length as the original data) and the original data are organized according to setting 1 and setting 2; in setting 1, however, the train-set consists of the selected IMFs and original data from January 2018 to November 2019, while the test-set comprises the selected IMFs and original data of December 2019.
The design of the ML algorithms (RF, SVR-L, SVR-R, kNN, FFNN, and AdaBoost) follows the configurations detailed in the sections on learning algorithms and hybrid EMD-ML models. The RMSE, MAE, and MBE measures are computed to evaluate the performance of the learning algorithms used for forecasting the PM10 and PM2.5 time series. Exemplary plots of the actual and predicted values of PM10 and PM2.5 using the EMD-ML models according to both setting 1 and setting 2 are illustrated in Figure 5. The prediction curve produced by each of the EMD-ML models using setting 2 fits better at many points and follows the trend of the actual values more closely for both PM10 and PM2.5 concentrations. The trend of the values predicted by each of the EMD-ML models is close to the actual values, which clearly shows that the hybrid EMD-ML models can better forecast PM concentrations. Among all EMD-ML models, the EMD-kNN model using setting 2 for PM10 and the EMD-AdaBoost model using setting 2 for PM2.5 provide the best prediction. The scatter plots of observed and predicted values (predictions made using the EMD-ML models) of both PM10 and PM2.5 concentrations are presented in Figure 6. The figure shows good agreement between observed and predicted values.
In Table 2, the prediction results for both PM10 and PM2.5 using setting 1 and setting 2 in terms of RMSE, MAE, and MBE based on the EMD-ML models and single forecasting models are presented for Misfalah, Makkah. It is clear from the table that the EMD-ML models using setting 1 produced lower errors for the forecasted values of both PM10 and PM2.5 than setting 2 and the traditional single forecasting models. The lowest error rate in terms of RMSE and MAE for both PM10 (RMSE = 12.26 and MAE = 7.43) and PM2.5 (RMSE = 4.81 and MAE = 3.02) has been achieved using the EMD-FFNN model. In the case of single models, the lowest error rate in terms of RMSE and MAE for PM10 (RMSE = 22.18 and MAE = 11.98) has been achieved using the RF model and setting 1, and for PM2.5 (RMSE = 11.88 and MAE = 8.28) using the FFNN model and setting 1. The results clearly show that hybrid models with setting 1 of the data are a robust choice for the prediction of PM concentrations of Misfalah, Makkah.
MBE represents the systematic tendency of the forecasting model to over- or under-forecast. A positive MBE indicates that the predictive model overestimates, and a negative MBE that it underestimates. The MBE values in Table 2 show no appreciable bias for the RF, kNN, FFNN, and AdaBoost models. The SVR-L and SVR-R models exhibited the highest MBE values, showing model bias which needs to be corrected.
In Table 3, the EMD-ML model based prediction results for both PM10 and PM2.5 using setting 1 and setting 2 in terms of RMSE, MAE, and MBE are presented for Dehli, India. It is clear from the table that the EMD-ML models using setting 2 produced lower errors for the forecasted values of both PM10 and PM2.5 than setting 1. The lowest error rates in terms of RMSE and MAE for PM10 (RMSE = 20.56 and MAE = 12.87) and PM2.5 (RMSE = 15.29 and MAE = 9.45) have been achieved using the EMD-kNN and EMD-AdaBoost models, respectively. The results clearly show that hybrid models with setting 2 of the data are a robust choice for the prediction of PM concentrations of Dehli, India.
The MBE values in Table 3 show no appreciable bias for any model under setting 2. Various EMD-ML models for PM10 and PM2.5 using setting 1 exhibited the highest MBE values, showing model bias which needs to be corrected.
The feasibility of the EMD-ML models (EMD-RF, EMD-SVR-L, EMD-SVR-R, EMD-kNN, EMD-FFNN, EMD-AdaBoost) lies in the following two points. First, the PM (PM10 and PM2.5) concentrations, which are non-stationary and non-linear, can be decomposed into various IMFs using the EMD algorithm. Thus, IMFs having a strong correlation with the original data can be used as input for the EMD-ML models. Second, EMD-ML models are well suited for time-series data prediction and have achieved significant results in various fields such as wind speed forecasting [19], Chinese currency exchange rate forecasting [20], energy time series forecasting [21], detection of structural faults in rotating machinery [24], sudden cardiac death (SCD) prediction [26], and air quality index forecasting [27]. In comparison with the study of [27], which suggests EMD-SVR-Hybrid as an optimal predictive model for the forecasting of daily AQI with RMSE = 24.46 and MAE = 18.10, the EMD-ML models used in this study achieve lower errors (the optimal predictive model is EMD-FFNN with RMSE = 12.26 and MAE = 7.43 for forecasting PM10 and RMSE = 4.81 and MAE = 3.02 for forecasting PM2.5). Similarly, in comparison with the study of [29], which utilizes ensemble EMD in combination with a regression neural network (EEMD-GRNN) for the forecasting of PM2.5 with RMSE = 29.41 and MAE = 19.80, the EMD-ML models used in this study achieve lower errors for forecasting both PM10 and PM2.5.
In general, EMD-ML models can be better than single forecasting models for the prediction of PM (PM10 and PM2.5) concentrations. The results of the current study verify the validity and feasibility of EMD-ML models.
Figure caption: Comparison of actual and predicted curves for predicting A) PM10 time series data using each EMD-ML model and setting 1, B) PM10 time series data using each single forecasting model and setting 1, C) PM10 time series data using each EMD-ML model and setting 2, D) PM10 time series data using each single forecasting model and setting 2, E) PM2.5 time series data using each EMD-ML model and setting 1, F) PM2.5 time series data using each single forecasting model and setting 1, G) PM2.5 time series data using each EMD-ML model and setting 2, H) PM2.5 time series data using each single forecasting model and setting 2.

Conclusions
In this study, the EMD algorithm was applied to address the trends and random behavior of time series data in order to enhance the accuracy of PM forecasting. This study attempted to improve the predictions of ML models by coupling them with the EMD procedure. The EMD algorithm is used for multiscale characterization of PM10 and PM2.5 by decomposing the original time series into numerous IMFs. We used Spearman's correlation coefficient to select the strongly correlated IMFs of PM10 and PM2.5 to build a predictive model. The air quality time series data from Masfalah air station Makkah, Saudi Arabia, and Dehli city, India are utilized for the validation of the developed hybrid model. First, the EMD based predictive models are applied to predict monthly PM (PM10 and PM2.5) concentrations. For the hybridized models, the original time series data are decomposed into fourteen IMFs and one residual for the PM modeling process. The non-hybridized RF, SVR-L, SVR-R, kNN, FFNN, and AdaBoost models are also applied to forecast monthly PM (PM10 and PM2.5) concentrations using input data of pollutants (CO, NO2, and CO2), PMs (PM10 and PM2.5), and meteorological factors (Temp, WS, and RH). The results demonstrate that the correlated IMFs incorporated in the EMD-ML models improve the prediction of PM concentrations, and these models are therefore recommended for forecasting PM concentrations.
The EMD-ML models have accomplished good predictive performance and can be applied for the prediction of other pollutants present in the air as well as for other time-series data such as biological signals, financial time series, and energy time series. Other versions of EMD such as ensemble EMD (EEMD), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and multivariate EMD (MEMD) can also be used instead of EMD in EMD-ML models.