AQI Prediction Using Spatio-Temporal Data Mining and Machine Learning - A Case Study

This research applies data mining to spatial air pollution data.

The Air Quality Index (AQI) is a number used by government agencies to characterize air quality at a specific location. An AQI scheme transforms weighted values of air pollution parameters into a single number or set of numbers. AQI is used for local and regional air quality management in many metropolitan cities of the world. Naive Bayes classification and gradient descent algorithms were applied to minimize the error in predicting the air quality index in India.

1. Data Collection: We collected online data from air quality monitoring sites from 1990 to 2014. The air pollutant data in this study included the concentrations of O3, PM2.5 and SO2. We chose the meteorological factors that would influence the levels of air pollutants, including air temperature, relative humidity, wind speed and direction, rainfall, accumulated precipitation, visibility, dew point, pressure and weather conditions. Repository: https://github.com/ShubhangiLokhande123/AQI_prediction_Using_Temporal_data_Mining.git

2. Performance Evaluation: Statistical criteria such as mean absolute error (MAE), mean absolute percent error (MAPE), correlation coefficient (R), and root mean square error (RMSE) were selected to assess the performance of each regression model.
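The four criteria above can be computed in a few lines. The following is a minimal sketch in plain Python, not the project's own evaluation code: the function name `regression_metrics` and the sample values are hypothetical, R is taken to be the Pearson correlation coefficient, and MAPE assumes that no actual value is zero.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MAPE (in percent), RMSE and Pearson R for two equal-length sequences.

    Hypothetical helper for illustration; assumes no actual value is zero (needed for MAPE)
    and that neither sequence is constant (needed for R).
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")

    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient between actual and predicted values.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)

    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R": r}


# Made-up AQI values, for illustration only.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(metrics)
```

Lower MAE, MAPE and RMSE indicate smaller prediction errors, while R closer to 1 indicates that predictions track the observed values more closely.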

3. Naive Forecasting: An estimating method in which the actuals of the last period are used as the prediction for the current period, without adjusting them or trying to determine causal factors. It is used only as a benchmark for the forecasts of more advanced methods: comparing against it shows whether the final prediction actually improves on a simple baseline. Although the naive forecast is a primitive, fact-based baseline that provides a fundamental standard of comparison, it is still unexplored in many organizations. Organizations analyze whether the naive forecast is inferior or superior to their final forecast.
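The naive baseline described above can be sketched as follows. This is an illustrative example rather than the project's implementation: the names `naive_forecast` and `rmse` and the AQI values are hypothetical, and RMSE is used here as the comparison metric.

```python
import math


def naive_forecast(series):
    """Use each period's actual value as the forecast for the next period.

    Returns forecasts for periods 1..n-1; the first period has no forecast.
    """
    return list(series[:-1])


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up AQI readings for consecutive periods, for illustration only.
aqi = [120.0, 135.0, 128.0, 150.0, 142.0]

forecast = naive_forecast(aqi)   # predictions for periods 1..4
actual = aqi[1:]                 # observed values for the same periods
baseline_rmse = rmse(actual, forecast)
print(baseline_rmse)
```

A more advanced model is worth keeping only if its error on the same periods is lower than `baseline_rmse`.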

Variables:
RSPM: Respirable Suspended Particulate Matter
SPM: Suspended Particulate Matter
RMSE: Root mean square error
si: SO2 individual pollutant index
ni: NO2 individual pollutant index
rpi: RSPM individual pollutant index