


Comparing Ensemble and Linear Methods for Predicting PM2.5 Levels from AOD Data

A major challenge in pollution monitoring is measuring the average PM2.5 concentration over large metropolitan areas. One useful way to predict PM2.5 over a relatively wide area is to use aerosol optical depth (AOD) measurements. This project compares ensemble and linear algorithms for that prediction task and finds the ensemble algorithms more effective. Pandas and scikit-learn were used for data analysis.
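As a rough illustration of the kind of comparison described above, the sketch below fits one linear model and one ensemble model on the same train/test split and reports R² for each. It is not the code in aodvspm25.py: the column names (`aod`, `humidity`, `pm25`), the choice of `RandomForestRegressor` as the ensemble method, and the synthetic data are all assumptions made for the example, so the printed scores say nothing about real AOD data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def compare_models(df, feature_cols, target_col, seed=0):
    """Fit a linear and an ensemble regressor on one split; return test R^2 per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[feature_cols], df[target_col], test_size=0.25, random_state=seed
    )
    models = {
        "linear": LinearRegression(),
        # Random forest stands in for "ensemble algorithms" here (assumption).
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = float(r2_score(y_test, model.predict(X_test)))
    return scores


# Synthetic placeholder data (NOT real measurements): PM2.5 depends on AOD
# through a made-up nonlinear interaction with humidity, plus noise.
rng = np.random.default_rng(0)
n = 500
aod = rng.uniform(0.05, 1.5, n)
humidity = rng.uniform(20, 90, n)
pm25 = 10 + 40 * aod * (humidity / 100) ** 2 + rng.normal(0, 2, n)
df = pd.DataFrame({"aod": aod, "humidity": humidity, "pm25": pm25})

scores = compare_models(df, ["aod", "humidity"], "pm25")
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Scoring both models on the same held-out split keeps the comparison fair. With real data, cross-validation would likely give a more stable estimate than a single split.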


Use the package manager pip to install scikit-learn and pandas before running.

pip3 install -U scikit-learn pandas


On line 13 of aodvspm25.py, add the folder names for the geographic locations whose AOD and PM2.5 data you want to analyze, then run the script:

python3 aodvspm25.py
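The script's actual folder layout and file format are not documented here, so the following is only a hypothetical sketch of how a list of location folder names could drive data loading with pandas. The names `LOCATIONS`, `load_location`, `aod.csv`, `pm25.csv`, and the `date` column are all invented for the example, and the sample values are placeholders written to a temporary directory so the snippet runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_location(root, folder):
    """Read one location's AOD and PM2.5 tables and join them on the date column."""
    base = Path(root) / folder
    aod = pd.read_csv(base / "aod.csv", parse_dates=["date"])
    pm25 = pd.read_csv(base / "pm25.csv", parse_dates=["date"])
    # Inner join keeps only dates present in both tables.
    merged = aod.merge(pm25, on="date", how="inner")
    merged["location"] = folder
    return merged


# Hypothetical list of location folders, analogous to what the README asks
# you to edit in the script.
LOCATIONS = ["city_a", "city_b"]

with tempfile.TemporaryDirectory() as root:
    # Write tiny placeholder files (not real measurements) for each location.
    for folder in LOCATIONS:
        base = Path(root) / folder
        base.mkdir()
        pd.DataFrame(
            {"date": ["2020-01-01", "2020-01-02"], "aod": [0.2, 0.4]}
        ).to_csv(base / "aod.csv", index=False)
        pd.DataFrame(
            {"date": ["2020-01-02", "2020-01-03"], "pm25": [12.0, 18.0]}
        ).to_csv(base / "pm25.csv", index=False)

    combined = pd.concat(
        [load_location(root, folder) for folder in LOCATIONS], ignore_index=True
    )

print(combined)
```

Joining on the date keeps only days that have both an AOD reading and a PM2.5 reading, which is what a model that predicts one from the other needs.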


Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.
