Sky Cast: A Comparison of Modern Techniques for Forecasting Time Series

Primary language: Jupyter Notebook. License: MIT.

Sky Cast

Machine Learning Nanodegree: Capstone Project

Author: Franklin Bradfield

Forecasting the future remains one of the most challenging problems in machine learning and in data science more generally. Recent successes have relied on sequence-to-sequence neural network models, which inspired this project. A recent example is the winning solution to a Kaggle competition, which used such an approach to forecast web time series.

In this machine learning project, I compare the performance of a traditional statistical technique, ARIMA, and recurrent neural networks (RNNs) on forecasting commercial airline data that I scraped from the U.S. Department of Transportation's publicly available sources.
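A comparison like this usually comes down to scoring each model's forecasts against a held-out tail of the series. The sketch below is not taken from the notebooks. It only illustrates that evaluation pattern, with a seasonal-naive baseline standing in for a real model such as ARIMA or an RNN. The function names, the toy series, and the choice of RMSE as the metric are all assumptions made for illustration.

```python
from __future__ import division, print_function  # keeps the sketch Python 2.7 friendly
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def seasonal_naive_forecast(history, horizon, season_length):
    """Forecast by repeating the last full season observed in `history`."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


# Hypothetical toy series with a period of 4 (not the airline data).
series = [10, 20, 30, 20, 12, 22, 32, 22, 14, 24, 34, 24]

# Chronological split: hold out the final season; never shuffle a time series.
horizon = 4
train, test = series[:-horizon], series[-horizon:]

forecast = seasonal_naive_forecast(train, horizon, season_length=4)
score = rmse(test, forecast)
print("forecast:", forecast)
print("RMSE:", score)
```

Any model that produces a list of `horizon` predictions from `train` could be swapped in for `seasonal_naive_forecast` and scored against the same `test` values, which keeps the comparison between techniques like for like.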

The code for this project was written in Python 2.7, using the then-latest versions of Python's standard scientific computing libraries (NumPy, pandas, Matplotlib, and scikit-learn), as well as statsmodels, Python's statistical modeling library, and TensorFlow, Google's deep learning library.

To reproduce the results, or simply to view and manipulate the code on your own machine, first set up and activate an environment with the required libraries below installed, then $ cd into the notebooks directory and run $ jupyter notebook. Exploratory data analysis can be found in data_analysis.ipynb, while ARIMA_time_series.ipynb and RNN_time_series.ipynb contain the ARIMA and RNN time series models, respectively.

Requirements

Further Reading

ARIMA for Time Series:

  • Aas, K., & Dimakos, X. K. (2004). Statistical modelling of financial time series. Norwegian Computing Center. Retrieved from https://www.nr.no/files/samba/bff/SAMBA0804.pdf

  • Adebiyi, A. A., Adewumi, A. O., & Ayo, C. K. (2014). Stock Price Prediction Using the ARIMA Model. International Conference on Computer Modelling and Simulation 2014. Retrieved from http://ijssst.info/Vol-15/No-4/data/4923a105.pdf

  • Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366), 427-431. JSTOR 2286348. doi:10.2307/2286348

  • Kohzadi, N., Boyd, M. S., Kermanshahi, B., & Kaastra, I. (1996). A comparison of artificial neural network and time series models for forecasting commodity prices. Neurocomputing, 10(2), 169-181. Retrieved from http://www.sciencedirect.com/science/article/pii/0925231295000208

Recurrent Neural Networks for Time Series: