ARMA Models in StatsModels - Lab

Introduction

In this lab, you'll use statsmodels to fit an ARMA model to a real-world dataset.

Objectives

In this lab you will:

  • Choose the AR and MA orders for an ARMA model by plotting and interpreting the ACF and PACF
  • Fit an ARMA model using StatsModels

Dataset

Run the cell below to import the dataset containing the historical winning times for the men's 400m at the Olympic Games.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning

warnings.simplefilter("ignore", ConvergenceWarning)

data = pd.read_csv("winning_400m.csv")
data["year"] = pd.to_datetime(data["year"].astype(str))
data.set_index("year", inplace=True)
data.index = data.index.to_period("Y")
# Preview the dataset
data

Plot this time series data.
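
As a rough illustration of the plotting step, the sketch below draws a small hypothetical series with the pandas `.plot()` method. The name `toy`, the column name, and the values are made up for this example and are not the lab's data; the same call should apply to `data` once it is loaded.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in values (seconds), indexed by year; NOT the lab's data
toy = pd.DataFrame(
    {"winning_times": [49.4, 49.2, 50.0, 48.2, 49.6, 47.6]},
    index=[1900, 1904, 1908, 1912, 1920, 1924],
)

# DataFrame.plot returns a matplotlib Axes that can be labeled afterwards
ax = toy.plot(figsize=(10, 4), marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Winning time (s)")
ax.set_title("Toy winning times")
```

A visible downward or upward drift in a plot like this is the usual first hint that a series is not stationary.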

# Plot the time series

If you plotted the time series correctly, you should notice that it is not stationary. Difference the data to obtain a stationary time series, and make sure to remove the missing values that differencing introduces.
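
To make the differencing step concrete, here is a minimal sketch on a hypothetical series (the values and the name `times` are assumptions for illustration, not the lab's data). It uses `diff()` for a first-order difference and `dropna()` to discard the leading missing value.

```python
import pandas as pd

# Hypothetical winning times in seconds, indexed by year; NOT the lab's data
times = pd.Series(
    [49.4, 49.2, 50.0, 48.2, 49.6],
    index=[1900, 1904, 1908, 1912, 1920],
    name="winning_times",
)

# First-order difference: each value minus the one before it
diffed = times.diff(periods=1)

# The first observation has no predecessor, so its difference is NaN; drop it
diffed = diffed.dropna()
```

The differenced series is one observation shorter than the original, which matters for short datasets when choosing how many lags to inspect later.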

# Difference the time series
data_diff = None
data_diff

Use statsmodels to plot the ACF and PACF of this differenced time series.

# Plot the ACF
# Plot the PACF

Based on the ACF and PACF, fit an ARMA model with the right orders for AR and MA. Feel free to try different models and compare AIC and BIC values, as well as significance values for the parameter estimates.

What is your final model? Why did you pick this model?

# Your comments here

Summary

Well done. In addition to manipulating and visualizing time series data, you now know how to create a stationary time series and fit ARMA models.