Wrong F-statistics for robust SE
Opened this issue · 4 comments
The F-stats for the diagnostics are all wrong when using robust/clustered SE. They are simply not being updated with robust/clustered cases. And the documentation is incorrect for cov_type='cluster'. It should be "clustered".
Wu-Hausman is off, and the f.stat for first-stage diagnostics in IV2SLS is twice as high as it should be with robust SE. It is also using chi2-distribution instead of F-distribution to compute the p-values.
Can you provide a code example that shows what you are seeing?
I can't reproduce. When I change cov_type="robust"
to cov_type="unadjusted"
I get different F-statistic values.
As for chi2 vs F, I always use chi2(df) distributions rather than using the "asymptotic F" which is just chi2(df) / df.
Sure. Take this data set (the source is ivreg, see below):
Q P D F A
0 98.48 100.32 87.40 98.00 1
1 99.19 104.26 97.60 99.10 2
2 102.16 103.44 96.70 99.10 3
3 101.50 104.51 98.20 98.10 4
4 104.24 98.00 99.80 110.80 5
5 103.24 99.46 100.50 108.20 6
6 103.99 101.07 103.20 105.60 7
7 99.90 104.76 107.80 109.80 8
8 100.35 96.45 96.60 108.70 9
9 102.82 91.23 88.90 100.60 10
10 95.44 93.08 75.10 81.00 11
11 92.42 98.80 76.90 68.60 12
12 94.53 102.91 84.60 70.90 13
13 98.76 98.76 90.60 81.40 14
14 105.80 95.12 103.10 102.30 15
15 100.22 98.45 105.10 105.00 16
16 103.52 86.50 96.40 110.50 17
17 99.93 104.02 104.40 92.50 18
18 105.22 105.77 110.70 89.30 19
19 106.23 113.49 127.10 93.00 20
Run this:
def run_diagnostics(mods):
print("\n----- Coef. SE ------")
for cov, res in mods.items():
print(f"{cov:10s}: {res.std_errors.tolist()}")
print("\n----- Breusch-Pagan Test for Heteroskedasticity ------")
for cov, res in mods.items():
endo = res.model.endog.pandas.assign(Intercept=1)
residuals = res.resids
bp_test = sms.het_breuschpagan(residuals, endo)
print(f"{cov:10s}: {bp_test}")
print("\n----- Wu-Hausman Test ------")
for cov, res in mods.items():
print(f"{cov:10s}:\n {res.wu_hausman()}")
print("\n----- Sargan ------")
for cov, res in mods.items():
print(f"{cov:10s}:\n {res.sargan}")
print("\n----- Basmann ------")
for cov, res in mods.items():
print(f"{cov:10s}:\n {res.basmann}")
print("\n----- First-stage diagnostics ------")
for cov, res in mods.items():
print(f"{cov:10s}:\n {res.first_stage.diagnostics}")
res_unadj = ivreg.from_formula("Q ~ 1 + [P ~ F + A] + D", data=df).fit(cov_type='unadjusted')
res_robust = ivreg.from_formula("Q ~ 1 + [P ~ F + A] + D", data=df).fit(cov_type='robust')
mods = {'Unadjusted': res_unadj,
'Robust' : res_robust}
run_diagnostics(mods)
You will get
----- Coef. SE ------
Unadjusted: [7.3026520951185425, 0.04327991369214111, 0.08895412123517202]
Robust : [5.1474532209912285, 0.04292534502530685, 0.07589901329404808]
----- Breusch-Pagan Test for Heteroskedasticity ------
Unadjusted: (0.01697789593123389, 0.8963296692336946, 0.015293088561413301, 0.9029507926709841)
Robust : (0.01697789593123389, 0.8963296692336946, 0.015293088561413301, 0.9029507926709841)
----- Wu-Hausman Test ------
Unadjusted:
Wu-Hausman test of exogeneity
H0: All endogenous variables are exogenous
Statistic: 11.7617
P-value: 0.0034
Distributed: F(1,16)
Robust :
Wu-Hausman test of exogeneity
H0: All endogenous variables are exogenous
Statistic: 11.7617
P-value: 0.0034
Distributed: F(1,16)
----- Sargan ------
Unadjusted:
Sargan's test of overidentification
H0: The model is not overidentified.
Statistic: 2.9831
P-value: 0.0841
Distributed: chi2(1)
Robust :
Sargan's test of overidentification
H0: The model is not overidentified.
Statistic: 2.9831
P-value: 0.0841
Distributed: chi2(1)
----- Basmann ------
Unadjusted:
Basmann's test of overidentification
H0: The model is not overidentified.
Statistic: 2.8049
P-value: 0.0940
Distributed: chi2(1)
Robust :
Basmann's test of overidentification
H0: The model is not overidentified.
Statistic: 2.8049
P-value: 0.0940
Distributed: chi2(1)
----- First-stage diagnostics ------
Unadjusted:
rsquared partial.rsquared shea.rsquared f.stat f.pval f.dist
P 0.94 0.92 0.92 110.03 0.00 F(2,16)
Robust :
rsquared partial.rsquared shea.rsquared f.stat f.pval f.dist
P 0.94 0.92 0.92 284.68 0.00 chi2(2)
So, the only adjustment happens for the SE of the coefficients and the first-starge F-stat, but the latter does not match other implementations. See ivreg in R, for instance.
data("Kmenta", package = "ivreg")
res <- ivreg::ivreg(Q ~ D + P | D + F + A, data=df)
summary(res, vcov=sandwich::sandwich)
You get:
Call:
ivreg(formula = Q ~ D + P | D + F + A, data = df)
Residuals:
Min 1Q Median 3Q Max
-3.430 -1.243 -0.189 1.576 2.492
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 94.6333 5.1475 18.38 0.0000000000012 ***
D 0.3140 0.0429 7.31 0.0000012080527 ***
P -0.2436 0.0759 -3.21 0.0051 **
Diagnostic tests:
df1 df2 statistic p-value
Weak instruments 2 16 142.34 0.000000000064 ***
Wu-Hausman 1 16 21.90 0.00025 ***
Sargan 1 NA 2.98 0.08414 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.97 on 17 degrees of freedom
Multiple R-Squared: 0.755, Adjusted R-squared: 0.726
Wald test: 34.4 on 2 and 17 DF, p-value: 0.00000105
Your F-stat is twice the one here (see line "weak instruments")