In 1958, Charles David Keeling (1928-2005) from the Scripps Institution of
Oceanography began recording
The location was chosen because it is not influenced by changing
Air samples are taken several times a day, and concentrations have been observed using the same measuring method for over 60 years. In addition, samples are stored in flasks and periodically reanalyzed for calibration purposes. The result is a data set with very few interruptions and very few inhomogeneities.
Let
- $F : t \mapsto F(t)$ accounts for the long-term trend;
- $t_i$ is the time at the middle of the $i^{th}$ month, measured in fractions of a year from the start of January 1958. Specifically, we take $t_i = (i+0.5)/12$, where $i = 0$ corresponds to January 1958; the $0.5$ is added because the first measurement falls halfway through the first month;
- $P_i$ is periodic in $i$ with a fixed period, accounting for the seasonal pattern;
- $R_i$ is the remaining residual that accounts for all other influences.
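The time index defined above can be sketched in a few lines of Python. This is only an illustration: the helper name `month_midpoints` is hypothetical and not taken from the project code.

```python
import numpy as np


def month_midpoints(n_months):
    """Return t_i = (i + 0.5) / 12 for i = 0, ..., n_months - 1.

    i = 0 corresponds to January 1958, so t_i is the middle of month i
    expressed in fractions of a year.
    """
    i = np.arange(n_months)
    return (i + 0.5) / 12.0


# Two years of monthly midpoints (hypothetical example length).
t = month_midpoints(24)
print(t[0])          # middle of January 1958, i.e. 1/24 of a year
print(t[12] - t[0])  # the same month one year later is exactly 1.0 apart
```

Placing each observation at the month's midpoint keeps the spacing uniform at 1/12 of a year, which is convenient when fitting a polynomial trend in $t$.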
Here is a brief summary; for further results and the methodology, see the report.
After trying several polynomial trends, we picked the quadratic trend as having the best bias-variance tradeoff, measured using MAPE (Mean Absolute Percentage Error) on the train and test datasets. Then, after removing the trend, the periodic component was derived by averaging all detrended values corresponding to the same calendar month. The resulting model has a MAPE of 0.21%.
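The procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the report: the function names are hypothetical, the series is synthetic (generated here, not the Mauna Loa measurements), and the train/test split point is arbitrary, so the printed MAPE values will not match the 0.21% quoted above.

```python
import numpy as np


def fit_trend_and_seasonal(t, c, month, degree=2):
    """Fit a polynomial trend, then average the detrended values per calendar month."""
    coeffs = np.polyfit(t, c, degree)            # least-squares polynomial trend
    detrended = c - np.polyval(coeffs, t)        # remove the trend
    seasonal = np.array([detrended[month == m].mean() for m in range(12)])
    return coeffs, seasonal


def predict(t, month, coeffs, seasonal):
    """Trend plus the periodic component for the given months."""
    return np.polyval(coeffs, t) + seasonal[month]


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))


# SYNTHETIC data for illustration only (assumed coefficients, not real CO2 data).
rng = np.random.default_rng(0)
i = np.arange(240)                    # 20 years of monthly values
t = (i + 0.5) / 12.0                  # month midpoints in years
month = i % 12                        # 0 = January, ..., 11 = December
c = (315.0 + 0.8 * t + 0.012 * t**2
     + 3.0 * np.sin(2.0 * np.pi * t)
     + rng.normal(0.0, 0.3, i.size))

split = 192                           # first 16 years train, last 4 years test
coeffs, seasonal = fit_trend_and_seasonal(t[:split], c[:split], month[:split])
train_mape = mape(c[:split], predict(t[:split], month[:split], coeffs, seasonal))
test_mape = mape(c[split:], predict(t[split:], month[split:], coeffs, seasonal))
print(f"train MAPE: {train_mape:.3f}%  test MAPE: {test_mape:.3f}%")
```

Comparing train and test MAPE for several values of `degree` is one way to reproduce the kind of bias-variance comparison mentioned above: a degree that lowers the train error but raises the test error is overfitting the trend.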
However, when we performed some statistical tests for measuring the stationarity of the residual
Data by Dr. Pieter Tans, NOAA/GML (gml.noaa.gov/ccgg/trends/) and Dr. Ralph Keeling, Scripps Institution of Oceanography (scrippsco2.ucsd.edu/).