Time-Series-Analysis-of-Seouls-Air-Pollution

By: Rashid Karriti

Overview

image

Business Understanding

image

Seoul, South Korea, is one of the most polluted cities in the developed world, ranking third highest by daily average pollution as of 2017. Seoul has also reached PM2.5 levels twice the annual limit that the WHO recommends. This project fits a series of time series models to give the South Korean Ministry of Health and Welfare, which is responsible for public health, insight into which areas have the worst PM2.5, PM10, and NO2 levels, since these pollutants can cause detrimental long-term health effects for citizens across all districts of Seoul. The project also investigates what could have caused these high levels of air pollution and what can be done to protect the public's health from these pollutants. Finally, I look at the root mean squared error (RMSE) for every monitoring station in Seoul to see within what range my models can predict harmful pollutant levels in specific districts.

Data Understanding

This project focuses on time series models, mainly SARIMAX and ARIMA, and runs several iterations to find within what range the models can predict the level of each dangerous pollutant in every one of Seoul's 25 districts. My dataset comes from the Seoul Metropolitan Government's public data portal, 'Open Data Plaza', and covers every hour of the day from 2017 to 2019 across the 25 districts of Seoul. I aggregated the hourly readings to daily averages, since these were a better indicator of pollutant levels and a better fit for time series models. Lastly, I split the data into a separate dataframe for each district, looked at the pollutants in that area, and focused on those that reached levels dangerous to public health.
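The aggregation and per-district split described above can be sketched roughly as follows. This is a minimal illustration on made-up readings, not the project's actual code: the column names `timestamp` and `station_code`, the helper `daily_means_by_station`, and the toy values are all assumptions.

```python
import pandas as pd

def daily_means_by_station(hourly: pd.DataFrame) -> dict:
    """Average hourly readings to one row per day, one DataFrame per station.

    Assumes hypothetical columns: 'timestamp', 'station_code', and one
    column per pollutant.
    """
    pollutants = ["PM2.5", "PM10", "NO2"]
    frames = {}
    for code, group in hourly.groupby("station_code"):
        # Resample each station's hourly series to calendar-day means.
        frames[code] = (
            group.set_index("timestamp")[pollutants].resample("D").mean()
        )
    return frames

# Toy data: 48 hourly readings for each of two stations (values are made up).
hours = pd.date_range("2017-01-01", periods=48, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({
    "timestamp": list(hours) * 2,
    "station_code": [101] * 48 + [122] * 48,
    "PM2.5": [10.0] * 24 + [20.0] * 24 + [30.0] * 48,
    "PM10": [40.0] * 96,
    "NO2": [0.03] * 96,
})

by_station = daily_means_by_station(toy)
print(by_station[101])
```

Keeping one dataframe per station makes it straightforward to fit and score a separate model for each district.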
image

image

What is PM10 & PM2.5?

According to the United States Environmental Protection Agency (EPA):

"PM stands for particulate matter (also called particle pollution): the term for a mixture of solid particles and liquid droplets found in the air. Some particles, such as dust, dirt, soot, or smoke, are large or dark enough to be seen with the naked eye. Others are so small they can only be detected using an electron microscope.

Particle pollution includes:

  • PM10 : inhalable particles, with diameters that are generally 10 micrometers and smaller; and
  • PM2.5 : fine inhalable particles, with diameters that are generally 2.5 micrometers and smaller.
Some are emitted directly from a source, such as construction sites, unpaved roads, fields, smokestacks or fires.

Most particles form in the atmosphere as a result of complex reactions of chemicals such as sulfur dioxide and nitrogen oxides, which are pollutants emitted from power plants, industries and automobiles."

image

What is NO2?
  • NO2 is a highly reactive gas that is representative of the larger group of nitrogen oxides. It comes from the emissions of motor vehicles and power plants. Breathing air with high levels of NO2 can irritate the lungs and, over the long term, can permanently damage the respiratory system.

The other dataset I am working with is Shanghai's Air Quality Index (AQI) from 2017 to 2019, focusing on the air pollutants PM2.5, PM10, and NO2. Scientific research suggests that a large amount of air pollution, specifically particulate matter (PM2.5 and PM10), travels from Shanghai to Seoul, so I use this data to show how Shanghai's rising pollutant levels have lasting effects on pollutant levels in Seoul.

image
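One simple way to look for this kind of Shanghai-to-Seoul relationship is a lagged correlation: shift the Shanghai series forward by k days and correlate it with the Seoul series. The sketch below uses synthetic data with a built-in two-day lag; the helper `lagged_correlations` and all values are hypothetical, and a high correlation on its own would not prove that pollutants were physically transported.

```python
import numpy as np
import pandas as pd

def lagged_correlations(source: pd.Series, target: pd.Series, max_lag: int) -> pd.Series:
    """Pearson correlation of `source` shifted forward by k days with `target`, k = 0..max_lag."""
    return pd.Series(
        {k: source.shift(k).corr(target) for k in range(max_lag + 1)},
        name="correlation",
    )

# Synthetic daily series (not real measurements).
rng = np.random.default_rng(0)
days = pd.date_range("2017-01-01", periods=200, freq="D")
shanghai = pd.Series(rng.normal(50, 10, size=200), index=days)
# Toy Seoul series: Shanghai's value from two days earlier plus small noise.
seoul = shanghai.shift(2) + rng.normal(0, 1, size=200)

corrs = lagged_correlations(shanghai, seoul, max_lag=5)
best_lag = int(corrs.idxmax())
print(corrs)
print("lag with highest correlation:", best_lag)
```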

Additionally, I incorporated daily weather data for Seoul from the National Oceanic and Atmospheric Administration (NOAA), specifically from its National Centers for Environmental Information (NCEI), to see whether the weather correlates with, or has a large effect on, air pollution in Seoul.
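A first check of this kind could join the daily weather table to the daily pollution table on date and inspect the correlations. The sketch below is only an illustration: the column names (`avg_temp_c`, `wind_speed_ms`) are assumptions, not the real NOAA field names, and the numbers are invented so that wind speed is perfectly negatively related to PM10.

```python
import pandas as pd

days = pd.date_range("2017-01-01", periods=6, freq="D")

# Made-up daily PM10 averages.
pollution = pd.DataFrame({
    "date": days,
    "PM10": [30.0, 45.0, 60.0, 52.0, 38.0, 25.0],
})

# Made-up daily weather; column names are hypothetical.
weather = pd.DataFrame({
    "date": days,
    "avg_temp_c": [1.0, 2.0, 0.5, 3.0, 2.5, 1.5],
    "wind_speed_ms": [7.0, 5.5, 4.0, 4.8, 6.2, 7.5],
})

# Keep only days present in both tables, then correlate every weather
# variable with PM10.
merged = pollution.merge(weather, on="date", how="inner")
corr_with_pm10 = merged.drop(columns="date").corr()["PM10"].drop("PM10")
print(corr_with_pm10)
```

Variables that correlate strongly could then be passed to SARIMAX as exogenous regressors, although correlation alone does not establish a causal effect.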

image

image

Results

My best model predicted PM10 levels very close to the actual levels at Station 122, the district with the highest PM10 levels in Seoul. The model is slightly off when PM10 reaches extremely high levels; with more data, however, I believe it can become even more accurate at predicting high levels of PM10.
image
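To make "how close" concrete, predictions can be scored per station with the root mean squared error, the square root of the mean of the squared differences between actual and predicted values. The sketch below uses invented numbers for two station codes and a hypothetical helper `rmse`; it is not the project's evaluation code.

```python
import numpy as np

def rmse(actual, predicted) -> float:
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# (actual, predicted) daily PM10 values per station; all numbers are made up.
results = {
    101: ([30.0, 40.0, 50.0, 60.0], [32.0, 38.0, 52.0, 58.0]),
    # Second pair mimics a spike that the model underpredicts.
    122: ([50.0, 120.0], [49.0, 113.0]),
}

scores = {station: rmse(a, p) for station, (a, p) in results.items()}
for station, score in scores.items():
    print(f"station {station}: RMSE = {score:.2f}")
```

Because the errors are squared before averaging, a single badly missed spike raises the RMSE much more than several small misses.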

When predicting the next few days, the model struggles to predict how high pollutant levels can go. It would most likely give a better indication of the following days with more data and a more precise identification of a specific area within a district.
image

Pollutant   Avg Train   Avg Test
NO2         0.03 ppm    0.08 ppm
PM10        26 µg/m³    51 µg/m³
PM2.5       13 µg/m³    27 µg/m³

Conclusion

South Korea's Ministry of Health and Welfare should:

A) Focus its attention on districts with high levels of PM2.5, PM10, and NO2.

B) Mandate that air purification measures be installed in homes, especially in areas with high concentrations of pollutants.

C) Invest more in clean energy and green development, which can reduce PM2.5 and PM10 levels.

Future Steps

For better and more accurate modeling in the future, the next steps will be:

A) Collect data from before 2017, which will reveal longer-term patterns and improve the models.

B) Examine other cities in Korea or China to see whether the models work just as well there and whether the same pollutants are as common.

Relevant Links