Domain adaptation based deep calibration of low-cost PM2.5 sensors

Abstract: Air pollution is a severe problem that is growing over time. A dense air-quality monitoring network is needed to keep people informed about the air pollution status in cities. A dense network based on low-cost sensor devices (LCSDs) is more viable than one based on continuous ambient air quality monitoring stations (CAAQMS). An in-field calibration approach is needed to improve the agreement of the LCSDs with the CAAQMS. The present work proposes a novel calibration method for the PM2.5 levels measured by LCSDs that uses a domain adaptation technique to reduce the collocation duration of LCSDs and CAAQMS. The dataset used for the experiments consists of hourly PM2.5 values and other parameters (PM10, temperature, and humidity) collected over a period of three months. We propose new features, formed by combining PM2.5, PM10, temperature, and humidity, that significantly improve the calibration performance. Further, the calibration model is adapted to the target location for a new LCSD with a collocation time of two days. The proposed model shows high correlation coefficient values (R^2) and significantly lower mean absolute percentage error (MAPE) than the other baseline models. Thus, the proposed model helps in reducing the collocation time while maintaining high calibration performance.


I. INTRODUCTION
Air pollution is a global problem that causes seven million deaths every year [1]. It has several adverse effects on human health and contributes significantly to the mortality rate in India [2]. Hence, public awareness of air pollution is essential, and it demands regular monitoring of the pollution level. Air pollution from particulate matter (PM) ranks as one of the leading causes of death globally [3], [4]. Continuous ambient air quality monitoring stations (CAAQMS) can provide real-time PM2.5 information to people. However, establishing a dense air pollution monitoring network using CAAQMS is not feasible due to their high cost [5]. As a result, the spatial resolution of CAAQMS air quality measurements is insufficient for extensive spatiotemporal mapping [6].

Sonu Kumar Jha and Mohit Kumar are with the Centre for Environmental Science and Engineering, Indian Institute of Technology Kanpur, India (e-mail: ksonu@iitk.ac.in; mohitk@iitk.ac.in). Vipul Arora is with the Department of Electrical Engineering (e-mail: vipular@iitk.ac.in). Sachchida Nand Tripathi is with the Department of Civil Engineering and Centre for Environmental Science and Engineering (e-mail: snt@iitk.ac.in). Vidyanand Motiram Motghare, A. A. Shingare, Karansingh A. Rajput, and Sneha Kamble are with the Maharashtra Pollution Control Board (e-mail: jdair@mpcb.gov.in; ms@mpcb.gov.in; srohq11@mpcb.gov.in; srohq10@mpcb.gov.in). Corresponding authors: Vipul Arora and Sachchida Nand Tripathi.
Small, portable, low-cost sensor devices (LCSDs) may improve the capacity to characterize PM2.5 concentrations with high spatial and temporal resolution [7]. LCSDs are a potential technology for meeting the requirements of a PM monitoring network in densely populated cities [6], [8]. These sensors capture spatial variability more effectively [9]. In recent years, various start-ups have emerged that provide compact and affordable wireless PM2.5 sensors. However, the data measured by low-cost sensors is less reliable than that of the CAAQMS [10], [11]. Therefore, it is essential to calibrate the LCSDs against the CAAQMS.
A lot of research has been carried out on the calibration of LCSDs in recent years [12], [13]. Various studies have reported satisfactory performance of low-cost PM2.5 sensors when compared against Federal Equivalent Methods (FEMs) or research-grade instruments [8], [14], [15]. LCSDs are found to be helpful for evaluating short-term changes in the aerosol environment [14]. The authors of [8] concluded that appropriate calibration models are essential for LCSDs to achieve high accuracy and precision over a wide range of PM2.5 concentrations. In [16], LCSDs and reference monitors are collocated to monitor gaseous pollutants and PM; the authors found that LCSDs adequately supported by data modeling tools show considerable potential for measuring air quality. A system based on linear regression and a Gaussian process regressor for the calibration of low-cost PM2.5 sensors is proposed in [17]. This method is effective only when PM2.5 has a high degree of urban homogeneity [17]. In [8], it is demonstrated that the quadratic calibration method performs better than its simple linear counterpart. The study in [5] focused on the inter-comparison of low-cost PM sensors at polluted sites and observed consistent performance of the PM sensors. In [18], a calibration model is developed for various air pollutants such as PM2.5 and CO2 and tested across sites and across devices; however, that work does not use domain adaptation.
The methods discussed above for calibrating PM2.5 values measured using LCSDs perform well at the same site at which they are trained. Deploying these models at target locations may degrade the performance, as the models are not location and device independent. Therefore, the present work proposes a calibration method that is developed at one location (source location s) for a device d and can be adapted for deployment at a target site (s') with a new device (d'). First, the base calibration model M for d is developed at s with machine learning algorithms, using a large training dataset of two months. After that, M is adapted at s' for d' using a smaller training dataset. Hence, the collocation time of LCSDs with CAAQMS can be significantly reduced. To achieve this objective, a domain adaptation-based method is implemented in the present work. To the best of the authors' knowledge, domain adaptation-based calibration of PM2.5 LCSDs has not been explored yet. This paper makes the following contributions.
1) The features derived from PM 2.5 , PM

III. METHODOLOGY
This work presents a methodology for the calibration of the PM2.5 measured using LCSDs. The overall method is developed in two phases. In the first phase, M is developed for d at s using different machine learning techniques. In the second phase, M is adapted for d' at s' using domain adaptation methods with a shorter collocation time. The steps carried out for developing the calibration model are shown in Fig. 1 and described as follows.

A. Data pre-processing and feature preparation
The obtained dataset contains outliers and missing values, so it needs to be pre-processed before the experimental work. We consider all the sensor readings at the same time instant (PM2.5, PM10, RH, Temp, and other derived features) as one data sample. If any of the readings is missing from a data sample or falls outside a fixed range, we discard that data sample and do not use it for training. The range is defined as 1 °C < Temp < 50 °C and 1 % < RH < 100 %. The features in the dataset have different ranges, and many machine learning and deep learning algorithms are affected by the scale of the input. Hence, we perform feature standardization to reduce the training sensitivity to the inputs' range and to make the features well-conditioned for optimization. The standardization is performed as follows [19]:

$$N_f = \frac{z - \bar{z}}{\sigma_z} \tag{1}$$

where $N_f$ and $z$ denote the standardized and actual input features, and $\bar{z}$ and $\sigma_z$ represent the mean and standard deviation of the actual input features. The output is not standardized in this work. The details of the standardization are provided in the supplementary material.

The performance of the calibration model is improved by selecting useful features. There are many possible combinations of base features. We have empirically selected a few simple combinations of the base features, which we call derived features, taking intuition from the studies [8], [20], [21]. In [8], it is shown that RH^2/(RH − 1) is a useful feature for forming the calibration model. In [20], the authors used an empirical approach to select the combination of base features using the trade-off between performance and functional complexity. Time-lag based features are also found to be useful for the calibration of LCSDs in [21].
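The filtering and standardization steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the column layout, the function names, and the choice to compute the mean and standard deviation on the cleaned data are assumptions made for the example.

```python
import numpy as np

def filter_valid_samples(X, temp_col, rh_col):
    """Drop samples with a missing reading or with Temp/RH outside the stated ranges."""
    X = np.asarray(X, dtype=float)
    complete = ~np.isnan(X).any(axis=1)
    temp, rh = X[:, temp_col], X[:, rh_col]
    # Ranges from the text: 1 degC < Temp < 50 degC and 1 % < RH < 100 %.
    in_range = (temp > 1.0) & (temp < 50.0) & (rh > 1.0) & (rh < 100.0)
    return X[complete & in_range]

def fit_standardizer(X):
    """Per-feature mean and standard deviation (assumed to come from training data)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # guard against constant columns
    return mean, std

def standardize(X, mean, std):
    """Z-score standardization of the input features; the output is left unscaled."""
    return (X - mean) / std

# Toy samples with the assumed column order [PM2.5, PM10, Temp, RH].
raw = np.array([
    [35.0, 60.0, 28.0, 55.0],
    [40.0, np.nan, 27.0, 60.0],   # missing PM10: discarded
    [42.0, 70.0, 55.0, 50.0],     # Temp out of range: discarded
    [50.0, 90.0, 30.0, 100.0],    # RH not below 100 %: discarded
    [20.0, 30.0, 25.0, 70.0],
])
clean = filter_valid_samples(raw, temp_col=2, rh_col=3)
mean, std = fit_standardizer(clean)
scaled = standardize(clean, mean, std)
```

In a deployment setting, the same `mean` and `std` would be reused for any data passed to the model later, so that training and test inputs share one scale.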
In this paper, effective time-lags [22], weather, time, and non-linear features are used for developing the model. The effective values of the lags for PM2.5 and PM10 are found using the cross-correlation coefficients [23]. The other features are also selected using cross-correlation coefficients. The Pearson correlation of each feature w.r.t. the CAAQMS PM2.5 is computed as follows:

$$r = \frac{\sum_{t}(x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t}(x_t - \bar{x})^2}\,\sqrt{\sum_{t}(y_t - \bar{y})^2}} \tag{2}$$

where $x_t$ and $y_t$ denote the $t$-th value of the feature and of the CAAQMS PM2.5, respectively, and $\bar{x}$ and $\bar{y}$ represent the mean values of the feature and of the CAAQMS PM2.5, respectively. During the feature selection process, the feature with the highest correlation coefficient is considered first; it is fed as input to the calibration model and the R^2 value is computed. More features are then added, one by one, in order of their correlation coefficients, until the improvement in R^2 becomes negligible.
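The correlation-ranked, one-by-one feature selection described above can be sketched as below. Several details are assumptions made for illustration: features are ranked by the absolute value of the Pearson coefficient, an ordinary least-squares fit stands in for the calibration model, R^2 is evaluated on the same data used for the fit, and the stopping threshold `min_gain` is a hypothetical parameter.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between one feature x and the reference series y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def r2_of_linear_fit(X, y):
    """R^2 of a least-squares fit; a stand-in for the calibration model."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return float(1.0 - (residual ** 2).sum() / ((y - y.mean()) ** 2).sum())

def select_features(X, y, min_gain=1e-3):
    """Add features in order of decreasing |r| until R^2 stops improving."""
    ranked = sorted(range(X.shape[1]), key=lambda j: -abs(pearson_r(X[:, j], y)))
    selected, best_r2 = [], -np.inf
    for j in ranked:
        r2 = r2_of_linear_fit(X[:, selected + [j]], y)
        if r2 - best_r2 < min_gain:
            break  # improvement is negligible: stop adding features
        selected.append(j)
        best_r2 = r2
    return selected, best_r2

# Synthetic check: columns 0 and 1 drive the target, column 2 is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=500)
selected, best_r2 = select_features(X, y)
```

On this synthetic data the procedure keeps the two informative columns and stops before the irrelevant one.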
In this way, we have selected the 27 features that show high correlation with the reference PM2.5 values. They are listed in Table I, and their correlation values are shown in Fig. 2. The indices 0, 5, 10, 15, 20, and 25 represent PM2.5, sin(2π × hour / 24), PM2.5 at lag t-23, PM10 at lag t-3, PM2.5 × RH, and PM10 × Temp × RH, respectively. The two features showing the highest correlation coefficients in Fig. 2, at indices 7 and 13, are PM2.5 and PM10 at lag 1, respectively. Hence, these features have the highest significance in the formation of the calibration model.

B. Techniques used to obtain the calibration model
The various techniques used to model the relationship between LCSDs and CAAQMS are summarised as follows:

1) Linear Regression (LR)
It models the linear relationship between input and output. A multivariate linear regression (MVR) model is a commonly used method for the calibration of low-cost sensors [24], [25].

2) Ridge Regression (RR)
It reduces the overfitting of the LR model using L2 regularization [26].

4) Elastic Net Regression (ENR)
It reduces the overfitting of the LR model by incorporating both L1 and L2 regularization [27].

5) Support Vector Regressor (SVR)
SVR can also be used to train a model that maps the low-cost sensors' data to the CAAQMS. It estimates the function using a support vector machine, and its prediction depends on the support vectors [28]. It has been used to calibrate low-cost sensors for measuring ozone concentrations [29].

6) Deep Neural Network (DNN)
Artificial neural networks are widely used in various time-series regression and forecasting problems [22], [30]. Such a network approximates the non-linear relationship between inputs and output for developing the calibration model. It has an input layer, an output layer, and a hidden layer between them [31]. A DNN has more than one hidden layer.
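For reference, the linear models in the list above differ only in the penalty added to the least-squares objective. The following is a sketch in standard textbook notation rather than the paper's own: the symbols $w$ (weights), $b$ (intercept), $N$ (number of training samples), and $\lambda$, $\lambda_1$, $\lambda_2$ (regularization strengths) are introduced here for illustration, and the $1/N$ scaling is one common convention among several.

```latex
\begin{align*}
\text{LR:}  \quad & \min_{w,\,b} \; \frac{1}{N}\sum_{t=1}^{N}\bigl(y_t - w^{\top}x_t - b\bigr)^2 \\
\text{RR:}  \quad & \min_{w,\,b} \; \frac{1}{N}\sum_{t=1}^{N}\bigl(y_t - w^{\top}x_t - b\bigr)^2 + \lambda \lVert w \rVert_2^2 \\
\text{ENR:} \quad & \min_{w,\,b} \; \frac{1}{N}\sum_{t=1}^{N}\bigl(y_t - w^{\top}x_t - b\bigr)^2 + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
\end{align*}
```

Setting the penalty weights to zero recovers plain linear regression, which is why these models are natural baselines for one another.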

C. Domain Adaptation
In the present work, a calibration model M is developed at (s, d) and adapted at (s', d') using a shorter duration of training data. This kind of approach is suitable for sites where CAAQMS monitors are available only for a short duration. The direct deployment of M at (s', d') may not show good performance due to domain shift [32], so domain adaptation-based techniques are required to overcome this issue. Domain adaptation refers to adapting the knowledge from one domain to work in a new domain [32], [33]. This kind of approach has been found useful for the calibration of haptic sensors [34]. The steps of the domain adaptation method for start-ups A and B are different. For start-up A, M is adapted at (s', d') in two steps as follows:

1. In step one, a new layer is added on top of the layers of M learned at (s, d). The layers trained at (s, d) are frozen, and only the newly added layer is trained with a much smaller dataset at (s', d').
2. In step two, the entire model is fine-tuned with a much smaller dataset at (s', d').

We consider M with parameters $\theta$ at (s, d), which maps LCSD observations ($x$) to calibrated PM2.5 ($\hat{y}$). We use a DNN model to implement M. The parameters of M at (s, d) are initialized randomly. The architecture of M consists of one input layer, seven hidden layers, and one output layer. The parameters $\theta$ of M are updated using the two months of data at (s, d) with the gradient descent algorithm over $D_S$ as follows:

$$\theta' = \theta - \alpha \nabla_{\theta} \mathcal{L} \tag{3}$$

Here, $\alpha \in \mathbb{R}$ is the learning rate, $\theta$ and $\theta'$ represent the randomly initialized weights and the learned weights of M, and $\nabla$ denotes the gradient. The loss function used, denoted here by $\mathcal{L}$, is the mean absolute error, which is defined as

$$\mathcal{L} = \frac{1}{K}\sum_{t=1}^{K} \lvert y_t - \hat{y}_t \rvert \tag{4}$$

where $y_t$ is the CAAQMS PM2.5 value, $\hat{y}_t$ is the calibrated value for sample $t$, and $K$ is the number of samples. Now, to adapt M at (s', d'), we discard the output layer of M and add a new layer on top of M. The weights of the layers trained at (s, d) are frozen, and the weights ($\phi$) of the added layer are trained using two days of data at (s', d'):

$$\phi' = \phi - \beta \nabla_{\phi} \mathcal{L} \tag{5}$$

Here, $\beta \in \mathbb{R}$ is the learning rate, which is kept higher than $\alpha$. $\phi$ and $\phi'$ are the randomly initialized weights and the learned weights of the new layer added on top of M. Finally, the parameters $(\theta', \phi')$ of the entire model are fine-tuned using two days of data at (s', d') as follows:

$$\Theta = (\theta', \phi') - \gamma \nabla_{(\theta', \phi')} \mathcal{L} \tag{6}$$

Here, $\gamma \in \mathbb{R}$ is the learning rate, which is kept lower than $\alpha$. $\Theta$ denotes the learned weights after the adaptation of the calibration model at (s', d'). The adapted model is tested using the rest of the data at (s', d') over $D_T$.

Algorithm 1, steps 5 to 8:
5. Use $D_{T1}$ to train the newly added top layer with a higher learning rate using (5).
6. Fine-tune the entire model with a lower learning rate on $D_{T1}$ using (6).
7. Use the newly learned regression model to predict the labels of $D_{T2}$ as $Y^{cal}_{T2} = f_{new}(X_{T2})$.
8. Finally, evaluate the performance of the proposed model using metrics such as R^2, MAE, MAPE, and SMAPE, as shown in (7)-(11).
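The two-step adaptation (train a new top layer on frozen source layers with a higher learning rate, then fine-tune everything with a lower one) can be illustrated with a small NumPy sketch. It is deliberately simplified and is not the authors' implementation: it uses a single hidden layer instead of seven, synthetic source and target data, full-batch gradient descent on the MAE loss, and illustrative step counts and learning rates chosen only so that β > α > γ. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(params, X):
    W1, b1, w2, b2 = params
    H = np.maximum(0.0, X @ W1 + b1)          # hidden layer (ReLU)
    return H, H @ w2 + b2                     # hidden features, prediction

def mae(params, X, y):
    return float(np.mean(np.abs(forward(params, X)[1] - y)))

def gd_step(params, X, y, lr, train_hidden=True):
    """One gradient-descent step on the MAE loss: p <- p - lr * dL/dp."""
    W1, b1, w2, b2 = params
    H, y_hat = forward(params, X)
    g = np.sign(y_hat - y) / len(y)           # subgradient of MAE w.r.t. y_hat
    grad_w2, grad_b2 = H.T @ g, g.sum()
    if train_hidden:                          # frozen layers skip this update
        gH = np.outer(g, w2) * (H > 0)
        W1 = W1 - lr * (X.T @ gH)
        b1 = b1 - lr * gH.sum(axis=0)
    return (W1, b1, w2 - lr * grad_w2, b2 - lr * grad_b2)

def train(params, X, y, lr, steps, train_hidden=True):
    for _ in range(steps):
        params = gd_step(params, X, y, lr, train_hidden)
    return params

def make_data(n, offset, slope):
    """Synthetic stand-in for (standardized sensor features, reference PM2.5)."""
    X = rng.normal(size=(n, 2))
    return X, offset + slope * X[:, 0] + 3.0 * X[:, 1]

alpha, beta, gamma = 0.05, 0.1, 0.01          # beta > alpha > gamma, as in the text
n_hidden = 8

X_src, y_src = make_data(500, 20.0, 5.0)      # large source set at (s, d)
X_adapt, y_adapt = make_data(48, 40.0, 6.0)   # two days of hourly data at (s', d')
X_test, y_test = make_data(500, 40.0, 6.0)    # held-out target data

# Base model M: random initialization, trained on the source domain.
init = (rng.normal(0, 0.5, (2, n_hidden)), np.zeros(n_hidden),
        rng.normal(0, 0.5, n_hidden), 0.0)
base = train(init, X_src, y_src, alpha, steps=3000)
direct_mae = mae(base, X_test, y_test)        # M deployed without adaptation

# Step 1: discard the output layer, add a new one, train only the new layer.
new_top = (base[0], base[1], rng.normal(0, 0.5, n_hidden), 0.0)
stage1 = train(new_top, X_adapt, y_adapt, beta, steps=1000, train_hidden=False)

# Step 2: fine-tune the entire model with the lower learning rate.
adapted = train(stage1, X_adapt, y_adapt, gamma, steps=300)
adapted_mae = mae(adapted, X_test, y_test)
```

Comparing `direct_mae` with `adapted_mae` mirrors the comparison between deploying M directly and deploying the adapted model. For start-up B, only the fine-tuning call (step 2) would be applied.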

D. Calibration model's performance evaluation criteria
The coefficient of determination (R^2), the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the symmetric mean absolute percentage error (SMAPE) are computed to evaluate the performance of the calibration model. These are expressed in (7)-(11), where $K$ denotes the total number of observations, and $z_t$ and $\hat{z}_t(x)$ are the CAAQMS and calibrated PM2.5 values for observation $t$, respectively.
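The four evaluation metrics can be computed as in the sketch below. The exact forms are assumptions based on common definitions, since several variants exist: R^2 is taken as one minus the ratio of residual to total sum of squares, and SMAPE uses the mean of the absolute reference and calibrated values in the denominator. MAPE and SMAPE are reported in percent.

```python
import numpy as np

def calibration_metrics(z, z_hat):
    """R^2, MAE, MAPE (%), and SMAPE (%) between reference z and calibrated z_hat."""
    z, z_hat = np.asarray(z, float), np.asarray(z_hat, float)
    err = z - z_hat
    r2 = 1.0 - (err ** 2).sum() / ((z - z.mean()) ** 2).sum()
    mae = np.abs(err).mean()
    mape = 100.0 * (np.abs(err) / np.abs(z)).mean()
    smape = 100.0 * (np.abs(err) / ((np.abs(z) + np.abs(z_hat)) / 2.0)).mean()
    return {"R2": float(r2), "MAE": float(mae),
            "MAPE": float(mape), "SMAPE": float(smape)}

# Toy example: reference values and calibrated values in ug/m^3.
metrics = calibration_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 36.0])
```

MAPE is undefined when a reference value is zero, so real data may need such samples handled separately.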

IV. EXPERIMENTAL RESULTS AND DISCUSSION
We have experimented with two kinds of tasks. In the first task, we have trained using the data at (s, d). The training data consists of parallel data recorded simultaneously with the CAAQMS and d. The training and testing of M are performed using the data obtained by d, which is collocated with the CAAQMS at s. For the experiments, the last two weeks of the dataset (336 hours) are used for testing, the two weeks before that are used for validation, and the remaining data is used for training. We have experimented with two sets of features: the base feature set and the base+derived feature set. These feature sets have already been described in Table I. The evaluation metrics, averaged over eight locations, for the different models are summarised in Tables III and IV for start-ups A and B, respectively. Here, UNC denotes the trivial model that outputs the uncalibrated PM2.5 values. We have found that the DNN model performs best for both start-ups A and B. Also, adding the derived features improves the performance of M over that obtained with the base features alone, for all the models and both start-ups. We have found this improvement to be statistically significant, as the p-value for the R^2 score is less than 0.05 for all the models applied to the uncalibrated data of start-up A.
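The chronological split used in Task I can be sketched as follows. The function name is hypothetical, and the example assumes a gap-free hourly series of 92 days (November 1 to January 31), which gives 2208 samples; with missing samples the counts would differ.

```python
def chronological_split(n_samples, test_hours=336, val_hours=336):
    """Split an hourly series by time: training, then validation, then test."""
    test_start = n_samples - test_hours
    val_start = test_start - val_hours
    if val_start <= 0:
        raise ValueError("series too short for the requested split")
    return (slice(0, val_start),
            slice(val_start, test_start),
            slice(test_start, n_samples))

# 92 days x 24 hours, assuming no missing samples.
train_idx, val_idx, test_idx = chronological_split(92 * 24)
```

Keeping the split chronological, rather than shuffling, avoids leaking future samples into training when lagged features are used.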
We have also shown the time series plots of the CAAQMS, UNC, and DNN-calibrated PM2.5 for start-up A at Airport and Nerul and for start-up B at Airport in the supplementary material (Fig. 1). It can be observed from this figure and Tables III and IV that the PM2.5 time series obtained after calibration shows better correlation with the CAAQMS PM2.5.
We have also computed the MAE for each location's calibration model. The cumulative MAE for the calibrated PM values at Airport, Mahape, and Powai is shown in Fig. 3 for start-up A and is found to be minimum for the DNN model.

In Task II, M is deployed at (s', d') in two ways to show the effectiveness of domain adaptation. First, M is deployed at (s', d') without domain adaptation, and the results are shown in Table V. It can be noted that the performance of M is not satisfactory when deployed at (s', d') without domain adaptation. A similar observation is reported in [18]. Although DNN performs slightly better than the other models deployed at (s', d') without domain adaptation, UNC is the best. Next, M is adapted at (s', d') using the domain adaptation method with a shorter collocation period. Only two days of data (48 samples) are used to adapt M, which is validated using the next seven days of data for both start-ups A and B. The remaining data is divided into two parts, named Testset-1 and Testset-2, to examine the performance of the adapted model over different time periods. Testset-1 contains 720 samples, and the number of samples in Testset-2 varies across locations from 322 to 1000, because the number of missing samples differs between locations. It should be noted that Testset-1 and Testset-2, shown in Tables V to VIII, are of the same duration. For start-up A, M is developed at Airport and adapted at seven target locations. For start-up B, M is developed at Nerul and adapted at three target locations.
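The partition of the target-site data in Task II can be sketched by sample counts as below. This is an assumption-laden illustration: it takes seven days of validation to mean 168 hourly samples, splits purely by sample position, and places everything after Testset-1 into Testset-2. The paper's handling of missing samples may differ.

```python
def target_split(n_samples, n_adapt=48, n_val=168, n_test1=720):
    """Consecutive index ranges: adaptation, validation, Testset-1, Testset-2."""
    sizes = {"adapt": n_adapt, "validation": n_val, "testset_1": n_test1}
    used = sum(sizes.values())
    if n_samples <= used:
        raise ValueError("not enough samples left for Testset-2")
    sizes["testset_2"] = n_samples - used
    parts, start = {}, 0
    for name, size in sizes.items():
        parts[name] = range(start, start + size)
        start += size
    return parts

# Hypothetical target site with 1500 valid hourly samples.
parts = target_split(1500)
```

With these sizes, only the first 48 samples require collocation data for training, which is the point of the adaptation experiment.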
To show the usefulness of the domain adaptation method in reducing the collocation time, the baseline models are also developed at (s', d') from scratch. For a fair comparison of the adapted model and the baseline models, the same split of training, validation, and testing data is used, and the performance is compared in Tables VI and VII for start-ups A and B, respectively. In these tables, LR1, SVR1, ENR1, RR1, LAR1, and DNN1 are the baseline models trained at (s', d') from scratch, and DNN2 is the adapted calibration model. It can be observed that DNN1 performs worse than LR1, SVR1, ENR1, RR1, and LAR1. This is due to the smaller size of the training dataset: DNN1 requires more training data than the other baseline models to perform well. However, with the domain adaptation method, the limitation of the smaller training dataset can be overcome. This is reflected in the performance of DNN2, which is formed using the domain adaptation method. For start-up A, DNN2 performs better at five out of seven target locations for Testset-1 and at two target locations for Testset-2. For Mahape, Nerul, and Borivali, the performance parameters are shown in Table VI. A summary of the performance evaluation at the remaining locations for start-up A is provided in the supplementary material (Table 1). From Table VII, it can be seen that for start-up B, DNN2 shows better performance at two out of three target locations. Even though different baseline models perform better than DNN2 at some locations and on some test datasets, the performance of DNN2 is found to be consistent overall, as seen in Table VIII. It should also be noted that DNN2 performs substantially better than M deployed at (s', d') without domain adaptation, as can be seen in Tables V and VIII.

In Fig. 4, the time series of PM2.5 for UNC and the PM2.5 calibrated using DNN2 are compared for start-up B at the Airport site; the corresponding plots for start-up A at the Vileparle and Worli sites are in the supplementary material (Fig. 2).

Finally, the performance of DNN and DNN2 is also compared on the same 14-day test dataset at (s', d'), and the results are shown in Tables IX and X. DNN is expected to perform better than DNN2 because the former is trained with much more data at (s', d') than the latter. However, it is interesting to note that the performance of the latter is not very far from that of the former. Moreover, at certain locations, the performance of DNN2 even surpasses that of DNN. These results show the potential of the proposed domain adaptation method in reducing the collocation time for low-cost sensor calibration.
We have used the scikit-learn and TensorFlow libraries of the Python programming language. Forming the DNN model takes 18 s to 40 s of training time and 0.005 s of testing time with the following configuration: Quadro P2200 GPU, 64 GB RAM, Intel i7-10700K processor, Python 3.7.10, and tensorflow-gpu 2.1.0. Adapting the model at (s', d') (DNN2) takes 45 s to 65 s of training time and 0.4 s of testing time with the following configuration: 4 GB RAM, Intel i3 processor, Python 3.7.10, and tensorflow-cpu 2.1.0. The dataset and code used in this work are available at https://github.com/madhavlab/2021ksonuSensorAdaptation.

V. CONCLUSION AND FUTURE WORK
In this work, a novel calibration method for PM2.5 levels measured using LCSDs is proposed. This method makes use of deep learning for high performance and domain adaptation for reducing the time required to collocate the LCSDs with the CAAQMS at the target location. Also, new input features are derived that improve the performance of the calibration model. We compare the performance of the proposed domain adaptation-based calibration method and the proposed features with several machine learning-based calibration methods and find improvements with the proposed method. With the proposed domain adaptation-based calibration model, we find that two days of collocation with the CAAQMS is sufficient

Fig. 1: The steps performed to obtain the proposed calibration model.
II. SENSORS DEPLOYMENT SITES AND DATA COLLECTION
In this work, two start-ups, A and B, have installed the LCSDs to measure PM2.5. Start-up A has installed the LCSDs at Mumbai Airport, Borivali, Kalyan, Mahape, Nerul, Powai, Vileparle, and Worli in the Mumbai Metropolitan Region (MMR), Maharashtra, India. These sensors are collocated with the CAAQMS sensors to measure the PM2.5 levels. Start-up A uses the Plantower PMS-7003 sensor to measure PM and the Bosch BME-280 sensor to measure Temp and RH. Start-up B has installed the LCSDs at Mumbai Airport, Mahape, Nerul, and Vileparle. Start-up B uses a Telaire sensor to measure PM and a Sensirion sensor to measure Temp and RH. In this study, the data collected from November 1, 2020 to January 31, 2021 is used to perform the experiments. The data is available at intervals of 15, 30, and 60 minutes for the CAAQMS sensors and at intervals of 1, 15, 30, and 60 minutes for the LCSDs of start-up A. Start-up B has provided the data at intervals of 30 seconds and 1 minute. All the experiments are performed using the data at an interval of 60 minutes; the data provided by start-up B is averaged over 60 minutes.
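Averaging the higher-rate readings of start-up B to the common 60-minute interval can be sketched with the standard library as below. The record format, the function name, and the convention of labelling each average by the start of its hour are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

def hourly_mean(records):
    """Average (timestamp, value) readings into 60-minute bins keyed by hour start."""
    buckets = defaultdict(list)
    for timestamp, value in records:
        if value is None:                     # skip missing readings
            continue
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

# Toy readings at sub-hourly intervals (PM2.5 in ug/m^3).
records = [
    (datetime(2020, 11, 1, 10, 0, 0), 10.0),
    (datetime(2020, 11, 1, 10, 0, 30), 20.0),
    (datetime(2020, 11, 1, 10, 59, 30), 30.0),
    (datetime(2020, 11, 1, 11, 0, 0), 40.0),
]
hourly = hourly_mean(records)
```

The same binning applied to both the LCSD and the CAAQMS series gives time-aligned pairs for training.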

Fig. 2: Correlation of the features w.r.t. the CAAQMS PM2.5 at the Airport site for start-up A.

Algorithm 1: Domain adaptation method
Input: A labeled source dataset $D_S = \{X_S, Y_S\}$, a small labeled target-domain dataset $D_{T1} = \{X_{T1}, Y_{T1}\}$, and a large unlabeled target-domain dataset $D_{T2} = \{X_{T2}\}$.
Output: Labels $Y_{T2}$ of the unlabeled data $X_{T2}$ in the target domain.
1. Standardize the source and target domain features using (1).
2. Learn a regression model $f : X_S \rightarrow Y^{cal}_S$ using (3) and (4).
3. Save the model learned in the source domain as the base model M.
4. Freeze the layers of M and add a new layer on top of M.
For start-up B, M is adapted at (s', d') using step 2 only. The proposed algorithm for the adaptation of M is summarised in Algorithm 1. The datasets of the source ($D_S$) and target ($D_T$) domains are split into training and testing data. The training dataset of $D_T$ is much smaller than that of $D_S$.

Fig. 3: Performance evaluation in terms of cumulative MAE (µg m⁻³), cumulated over deployment time, of the different models for start-up A in Task I.

Fig. 4: PM2.5 levels measured by CAAQMS, uncalibrated PM2.5 levels measured by LCSD, and calibrated PM2.5 values using DNN2 for Task-II (start-up B).

Fig. 5: Performance evaluation of DNN2 and baseline models for Task-II in terms of cumulative MAE (µg m⁻³), cumulated over deployment time, for start-up A ((a), (b)) and start-up B (c) for Testset-1.

developed model is adaptable at (s', d') with two days of collocation time.

The remainder of the paper is organized as follows: Section II gives a brief overview of sensor deployment and data acquisition. Section III describes data preprocessing and feature extraction, calibration models, domain adaptation, and the performance evaluation criteria of the calibration model. Section IV describes the case study. Section V provides the conclusion and future work.

B. Task II: Model adaptation to different site and device: (s, d) → (s', d')

TABLE I: Features used to train the calibration model.
Features measured by LCSD: PM2.5 in µg m⁻³, PM10 in µg m⁻³
Time-lag features: PM10 at t-1, t-2, t-3; PM10 at t-23, t-24, t-25
Non-linear features: PM2.5 × PM10, PM2.5 × RH, PM2.5 × Temp, PM10 × RH, PM10 × Temp, PM2.5 × Temp × RH, PM10 × Temp × RH, PM2.5 × PM10 × Temp × RH

The correlation coefficient of Temp is negative (index 3 in Fig. 2), which indicates that Temp and the CAAQMS PM2.5 have an inverse relationship. Further details of the features are provided in Table I. The order of features in Table I is the same as in Fig. 2. Finally, the obtained feature matrix is used to train the calibration model.
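The construction of the derived features can be sketched as below. The sketch covers only features that are recoverable from the text (the base readings, the sin(2π × hour / 24) time feature, lags of 1, 2, 3, 23, 24, and 25 hours, and the listed products), so it yields 25 columns rather than the paper's full set of 27. Applying the same lag set to PM2.5 as to PM10, the feature names, and the requirement of a gap-free hourly series are assumptions.

```python
import numpy as np

def build_features(pm25, pm10, temp, rh, hour, lags=(1, 2, 3, 23, 24, 25)):
    """Return {feature name: values} for each time index with a full lag history."""
    pm25, pm10, temp, rh, hour = (np.asarray(a, float)
                                  for a in (pm25, pm10, temp, rh, hour))
    t = np.arange(max(lags), len(pm25))       # earliest rows lack some lags
    f = {"PM2.5": pm25[t], "PM10": pm10[t], "Temp": temp[t], "RH": rh[t]}
    f["sin(2*pi*hour/24)"] = np.sin(2.0 * np.pi * hour[t] / 24.0)
    for name, series in (("PM2.5", pm25), ("PM10", pm10)):
        for lag in lags:
            f[f"{name} at t-{lag}"] = series[t - lag]
    # Non-linear (product) features.
    f["PM2.5 x PM10"] = pm25[t] * pm10[t]
    f["PM2.5 x RH"] = pm25[t] * rh[t]
    f["PM2.5 x Temp"] = pm25[t] * temp[t]
    f["PM10 x RH"] = pm10[t] * rh[t]
    f["PM10 x Temp"] = pm10[t] * temp[t]
    f["PM2.5 x Temp x RH"] = pm25[t] * temp[t] * rh[t]
    f["PM10 x Temp x RH"] = pm10[t] * temp[t] * rh[t]
    f["PM2.5 x PM10 x Temp x RH"] = pm25[t] * pm10[t] * temp[t] * rh[t]
    return f

# Toy hourly series of 30 samples.
n = 30
pm25 = np.arange(n, dtype=float)
features = build_features(pm25, 2.0 * pm25, np.full(n, 25.0),
                          np.full(n, 50.0), np.arange(n) % 24)
feature_matrix = np.column_stack(list(features.values()))
```

Stacking the dictionary values column-wise gives the feature matrix, with the first 25 time steps dropped because their lag history is incomplete.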

TABLE II: The selected values of the parameters for developing the base calibration model.

TABLE III: Average performance (over different locations) of different models for Task-I (start-up A).

TABLE IV: Average performance (over different locations) of different models for Task-I (start-up B).

TABLE V: Average performance (over different locations) of different calibration models deployed at (s', d') without domain adaptation for Task-II (start-up A and B).

TABLE VI: Performance comparison of different calibration models for Task-II (start-up A).

TABLE VII: Performance comparison of different calibration models for Task-II (start-up B).

These results demonstrate the usefulness of the proposed domain adaptation-based calibration model in reducing the collocation time of LCSDs with CAAQMS at target locations. The best-attained values of the evaluation parameters R^2, MAPE (%), and SMAPE (%) are shown in bold font in Tables VI and VII.

TABLE VIII: Average performance of different calibration models across different sites for Task-II (start-up A and B).

It can be observed from these figures (Fig. 4 and Fig. 2 of the supplementary material) and Table VII that the PM2.5 time series obtained after calibration shows better correlation with the CAAQMS PM2.5. We have also computed the MAE for DNN2 and the baseline models at (s', d'). The cumulative sum of MAE for DNN2 is found to be lower than or comparable to that of the baseline models at most of the sites. The cumulative MAE values for the calibrated PM values are shown in Fig. 5. These results indicate the robustness of the proposed calibration methodology.
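A cumulative error curve of the kind plotted in Fig. 3 and Fig. 5 can be sketched as below. The text does not give the exact definition, so this sketch assumes the curve is the running sum of absolute errors over deployment time; the function name and toy values are illustrative only.

```python
import numpy as np

def cumulative_abs_error(z, z_hat):
    """Running sum of |reference - calibrated| over deployment time."""
    z, z_hat = np.asarray(z, float), np.asarray(z_hat, float)
    return np.cumsum(np.abs(z - z_hat))

# Toy comparison of two models against the same reference (ug/m^3).
reference = [10.0, 20.0, 30.0]
curve_a = cumulative_abs_error(reference, [12.0, 18.0, 33.0])
curve_b = cumulative_abs_error(reference, [10.0, 25.0, 30.0])
```

A model whose curve stays below the others throughout the deployment period accumulates less error at every point in time, which is a stronger statement than a lower average alone.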