IoT Based Air Quality Prediction using SVM and Random Forest

Abstract: The Internet of Things (IoT) is a worldwide system of "smart devices" that sense and connect with their surroundings and interact with users and other systems. Global air pollution is one of the major concerns of our era. Pollution levels have risen over time because of population growth, increased vehicle use, industrialization, and urbanization, with harmful effects on the health of the exposed population. Air quality deteriorates when pollutants such as carbon dioxide, smoke, alcohol, benzene, NH3, and NO2 are present in the air in sufficient amounts. To analyze this, we are developing an IoT-based pollution monitoring system that monitors air quality over an internet server. Existing monitoring systems have inferior precision and low sensitivity, and they require laboratory analysis. Improved monitoring systems are therefore needed.


I. INTRODUCTION
The environment is everything that surrounds us. It is being polluted by human activities and natural disasters, and air pollution is among the most severe forms of this pollution. The concentration of air pollutants in ambient air is governed by meteorological parameters such as wind speed, wind direction, relative humidity, and temperature. When humidity is high, we feel hotter because sweat evaporates less readily into the atmosphere. Urbanization is one of the main causes of air pollution, because growth in transportation emits more pollutants into the atmosphere; industrialization is another main cause.
The major pollutants are Nitrogen Oxide (NO), Carbon Monoxide (CO), Particulate Matter (PM), SO2, etc. Carbon monoxide is produced by the incomplete combustion of fuels such as petroleum and gas, and it causes headaches and vomiting. Nitrogen oxide is produced when fuel is burned, and nitrogen oxides cause dizziness and nausea. Benzene is released by smoking and causes respiratory problems. Particulate matter with a diameter of 2.5 micrometers or less is the most harmful to human health. Measures must be taken to minimize air pollution in the environment.
The Air Quality Index (AQI) is used to measure the quality of air. Earlier, classical methods based on probability and statistics were used to predict air quality, but those methods are very complex. With advances in technology, it is now easy to collect data about air pollutants using sensors. Detecting the pollutants from raw data requires rigorous analysis. Convolutional neural networks, recursive neural networks, deep learning, and machine learning algorithms promise to predict future AQI so that appropriate measures can be taken. Machine learning, which comes under artificial intelligence, has three kinds of learning algorithms: supervised learning, unsupervised learning, and reinforcement learning. In the proposed work we have used a supervised learning approach. Supervised learning includes many algorithms, such as Linear Regression, Nearest Neighbor, SVM, kernel SVM, Naive Bayes, and Random Forest. Of these, SVM and Random Forest give better results than the other algorithms, so our approach selects them to predict air pollution accurately.
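
To make the AQI concrete, the following is a minimal sketch of how a sub-index can be computed from one pollutant concentration by linear interpolation between breakpoints. It assumes the US EPA piecewise-linear formula and the pre-2024 EPA breakpoints for 24-hour PM2.5; the AQI definition used in this work is not stated, and the names PM25_BREAKPOINTS and pm25_to_aqi are hypothetical.

```python
import math

# Assumed breakpoints (C_low, C_high, I_low, I_high) for 24-hour PM2.5 in
# micrograms per cubic metre, taken from the pre-2024 US EPA AQI table.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration):
    """Return the PM2.5 sub-index for a concentration in ug/m3."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    for c_low, c_high, i_low, i_high in PM25_BREAKPOINTS:
        # The first segment whose upper bound covers the value is used, so a
        # value that falls in the small gap between two segments is assigned
        # to the higher one.
        if concentration <= c_high:
            slope = (i_high - i_low) / (c_high - c_low)
            value = slope * (concentration - c_low) + i_low
            return math.floor(value + 0.5)  # round half up to an integer
    raise ValueError("concentration is above the top breakpoint")
```

Under these assumptions, a concentration of 12.0 maps to the top of the first band and 35.5 to the bottom of the third. The overall AQI would then be the largest sub-index across the measured pollutants, each with its own breakpoint table.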

II. LITERATURE SURVEY
Chakradhar Reddy K and Nagarjuna Reddy K [1] used the supervised machine learning technique (SMLT) to gather several pieces of information from the dataset, including variable recognition, univariate analysis, bivariate and multivariate analysis, missing value treatment and analysis, data cleaning/preparation, and data representation. Their findings provide a valuable guide to sensitivity analysis of model parameters, with success in air pollution prediction judged by accuracy.
Yun-Chia Liang and Yona Maimury [2] report that machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments on datasets for three different regions showed that the best prediction performance comes from the stacking ensemble, AdaBoost, and random forest: the stacking ensemble delivers consistently superior performance for R^2 and RMSE, while AdaBoost provides the best results for MAE.
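
The three metrics named above can be computed as in the following minimal sketch, which uses their standard definitions. The function name regression_metrics and the sample values in the usage note are hypothetical and are not taken from [2].

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalises large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 minus the ratio of the residual sum of
    # squares to the total sum of squares around the mean of y_true.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2}
```

For example, regression_metrics([50, 100, 150, 200], [60, 90, 160, 190]) has every error equal to 10 in magnitude, so MAE and RMSE coincide. Lower MAE and RMSE are better, while R^2 is better the closer it is to 1, which is why different models can rank first on different metrics.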
Madhuri VM and Samyama Gunjal GH [3] based their work on a supervised learning approach using different algorithms such as LR, SVM, DT, and RF. Their analysis of the results shows that the AQI predictions obtained through RF are promising.
Akshatha S and Jayaram M N [4] set out to use different sensors and servers to design an efficient air quality monitoring system that does not affect the natural habitat and that provides live updates to avoid conflicts.
Mauro Castelli and Fabiana Martins Clemente [5] proposed a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the various tested alternatives, the radial basis function (RBF) was the type of kernel that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables proved a more successful strategy than selecting features using principal component analysis. The presented results show that SVR with the RBF kernel can accurately predict hourly pollutant concentrations, such as carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5, as well as the hourly AQI for the state of California.
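
The RBF kernel mentioned above measures how similar two feature vectors are. The following minimal sketch uses the standard definition K(x, y) = exp(-gamma * ||x - y||^2); the function name rbf_kernel, the value of gamma, and the sample readings in the usage note are assumptions for illustration and are not taken from [5].

```python
import math


def rbf_kernel(x, y, gamma):
    """Return the RBF kernel value for two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Squared Euclidean distance between the two vectors.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    # Identical vectors give 1.0; the value decays toward 0 with distance.
    return math.exp(-gamma * squared_distance)
```

With two hypothetical sensor readings such as rbf_kernel([0.4, 0.1], [0.5, 0.3], gamma=0.5), the result lies between 0 and 1 and grows as the readings become more alike. A larger gamma makes the similarity fall off faster, so the fitted model reacts more strongly to nearby training points.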
Rajeev Tiwari, Shuchi Upadhyay, and Parv Singhal [6] aimed to develop an artificial neural network for air quality prediction that can work with a constrained dataset and is robust enough to handle data containing noise and errors. The dataset used deals with pollution in the U.S., covering four major pollutants (nitrogen dioxide, sulfur dioxide, carbon monoxide, and ozone) on a daily basis for the period 2008 to 2017.
Shivam Sharma and Nishu Soni [7] aimed to implement an air quality monitoring device using the Internet of Things (IoT) to control air pollution and improve air quality. The system measures the real-time air quality index, temperature, and humidity, which are displayed on a website over the internet.
Mrs. A. Gnana Soundari, Mrs. J. Gnana Jeslin M.E, and Akshaya A [8] proposed a model capable of successfully predicting the air quality index of an entire region, any state, or any bounded locality, given historical pollutant concentration data. By applying their proposed parameter-reducing formulations, they achieved better performance than the standard regression models. Their model has 96% accuracy in predicting the air quality index of the whole of India on the currently available dataset; they also use the AHP MCDM method to find the order of preference by similarity to ideal solution.
Vineeta, Ajit Bhat, Asha S Manek, and Pranay Mishra [9] collected data from two sources in the Bengaluru region: a government website and static sensors built using Arduino. The level of CO is predicted using three machine learning algorithms, namely Random Forest Regression (RFR), Decision Tree Regression (DTR), and Linear Regression (LR). The results show that RFR gives the least error of the three and hence the highest accuracy. A command line interface was also created to view the CO level prediction.
Fabiana Martins Clemente, Aleš Popovič, Sara Silva, and Leonardo Vanneschi [10] employ a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the various tested alternatives, the radial basis function (RBF) was the type of kernel that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables proved a more successful strategy than selecting features using principal component analysis. The presented results demonstrate that SVR with the RBF kernel can accurately predict hourly pollutant concentrations, such as carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5, as well as the hourly AQI for the state of California. Classification into the six AQI categories defined by the US Environmental Protection Agency was performed with an accuracy of 94.1% on unseen validation data.
S. L.  have worked in the brain tumor detection. N. Shelke et al. [16] have given the LRA-DNN method. Suneet Gupta et al. [17] worked on an end-user system. Gururaj Awate et al.
The system is designed as a combination of hardware and software. It includes a NodeMCU, which is used as the main controller; the gas sensors are interfaced to the NodeMCU, and GPS is used to obtain the location of the specific area. An SVM is trained on the data collected from the sensors, and the results are plotted on a map after the sensor data is received. For plotting, we use the pygmaps library. The alert is shown on the map according to location/area: red if pollution is high, orange if it is minor, and green if the air is clear.
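
The colour rule described above can be sketched as follows. The two threshold values are assumptions chosen only for illustration, since the cut-offs between clear, minor, and high pollution are not given here, and the names alert_colour and build_markers are hypothetical. The actual drawing on the map with pygmaps is left out of the sketch.

```python
def alert_colour(aqi, minor_threshold=100, high_threshold=200):
    """Map a predicted AQI value to the alert colour shown on the map.

    Both thresholds are assumed values, not taken from the described system.
    """
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    if aqi > high_threshold:
        return "red"      # pollution is high
    if aqi > minor_threshold:
        return "orange"   # pollution is minor
    return "green"        # air is clear


def build_markers(readings):
    """Turn (latitude, longitude, aqi) tuples into (latitude, longitude, colour)."""
    return [(lat, lon, alert_colour(aqi)) for lat, lon, aqi in readings]
```

Each marker produced by build_markers carries the GPS position of a sensor node and the colour for its area, which is the information a map-plotting library needs in order to place the alert.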
V. CONCLUSION
Our project checks the level of air pollution to which an area is exposed. It was designed to help a person detect and predict the air quality in a particular area. Air pollution is a major factor affecting not only our environment but also human health. A web-based application was developed to predict air quality. Gas sensors were used to identify the gases and GPS to obtain the location of the area, and the graph is plotted accordingly.
Figure: System Architecture