HW1: ADF tests and auto.arima()
Closed this issue · 7 comments
Couple of questions on HW1.
First, should the adf.test()
and ur.df()
functions lead to the same conclusions as long as we specify the same assumptions/lags within each function? For example, I would expect that the following two lines of code should lead to the same conclusion (WRT rejecting or not rejecting the null). However, I am finding that that is not the case.
adf.test(dat, k = 0)
ur.df(dat, type = "trend", lags = 0)
For question 4b, when using the auto.arima()
function with trace=TRUE
, I am finding that none of the candidate models are within Delta AICc = 2 of the top model. Is this user error on my part, or am I missing something?
Hi @mbuonanduci. I'll let @eeholmes handle the first part about the Dickey Fuller tests, but when I tried 4b, I did indeed find several models within 2 AIC units of one another. I'd suggest you check that you're using the correct value for Species
, the correct range for Year
, and the correct response variable.
Thanks @mdscheuerell! I have to say, I'm a bit stumped with 4b. I am subsetting the landings
data using Species == "Anchovy"
, response variable log.metric.tons
, and start/end years of 1964 and 1987, respectively. The best fitting model has AICc = -5.727702, and while a couple of others come close, none are quite within Delta AICc = 2.
The key here is that the Q is asking
What models are within ΔAIC of 2? What is different about these models?
without mentioning anything about the "best" model. So, are any of your models within 2 units of one another?
@mbuonanduci Ah, that should have been anchovy 1964-2007 not 1964-1984 in that question. For the longer dataset, there are a number within 2 of the best model. However, as written, the answer would be 'none are quite within Delta AICc = 2'. If you would like, repeat with 1964-2007. Then you'll see 2 models within Delta AICc of the best (best plus 2nd best). Or write 'none within Delta AICc = 2'
@mbuonanduci Regarding the unit root test. You set it up correctly and yes they give the same result.
adf.test(dat, k = 0)
ur.df(dat, type = "trend", lags = 0)
The output of adf.test()
tells you
data: dat
Dickey-Fuller = -2.854, Lag order = 0, p-value = 0.247
alternative hypothesis: stationary
Test statistic is -2.854 and p-value is large so null (non-stationary) is not rejected.
The ur.df()
output is cryptic but you can see that the 1st test statistic is the same as that reported by adf.test()
###############################################################
# Augmented Dickey-Fuller Test Unit Root / Cointegration Test #
###############################################################
The value of the test statistic is: -2.854 2.8125 4.0747
I find it easiest to get the ur.df()
test levels the way I did in lecture:
test <- ur.df(dat, type = "trend", lags = 0)
attr(test, "teststat")
attr(test,"cval")
This gives
> attr(test, "teststat")
tau3 phi2 phi3
statistic -2.854003 2.81248 4.074721
> attr(test,"cval")
1pct 5pct 10pct
tau3 -4.38 -3.60 -3.24
phi2 8.21 5.68 4.67
phi3 10.61 7.24 5.91
tau3
is the test statistic you want. This is the one for
You can also get this info using
test <- ur.df(dat, type = "trend", lags = 0)
summary(test)
But the tau3
test statistic is not labeled in the output.
Value of test-statistic is: -2.854 2.8125 4.0747
Critical values for test statistics:
1pct 5pct 10pct
tau3 -4.38 -3.60 -3.24
phi2 8.21 5.68 4.67
phi3 10.61 7.24 5.91
This stackoverflow discussion of ur.df()
nicely lays the output that you see from summary()
and how to interpret it. https://stats.stackexchange.com/a/117212/291254
Thank you @eeholmes! I was interpreting the ur.df()
output incorrectly. Really appreciate the help!