bashtage/linearmodels

Wrong F-statistics for robust SE

Opened this issue · 4 comments

The F-stats for the diagnostics are all wrong when using robust/clustered SE. They are simply not being updated with robust/clustered cases. And the documentation is incorrect for cov_type='cluster'. It should be "clustered".

Wu-Hausman is off, and the f.stat for first-stage diagnostics in IV2SLS is twice as high as it should be with robust SE. It is also using chi2-distribution instead of F-distribution to compute the p-values.

Can you provide a code example that shows what you are seeing?

I can't reproduce. When I change cov_type="robust" to cov_type="unadjusted" I get different F-statistic values.

As for chi2 vs F, I always use chi2(df) distributions rather than using the "asymptotic F" which is just chi2(df) / df.

Sure. Take this data set (the source is ivreg, see below):

        Q      P      D      F   A
0   98.48 100.32  87.40  98.00   1
1   99.19 104.26  97.60  99.10   2
2  102.16 103.44  96.70  99.10   3
3  101.50 104.51  98.20  98.10   4
4  104.24  98.00  99.80 110.80   5
5  103.24  99.46 100.50 108.20   6
6  103.99 101.07 103.20 105.60   7
7   99.90 104.76 107.80 109.80   8
8  100.35  96.45  96.60 108.70   9
9  102.82  91.23  88.90 100.60  10
10  95.44  93.08  75.10  81.00  11
11  92.42  98.80  76.90  68.60  12
12  94.53 102.91  84.60  70.90  13
13  98.76  98.76  90.60  81.40  14
14 105.80  95.12 103.10 102.30  15
15 100.22  98.45 105.10 105.00  16
16 103.52  86.50  96.40 110.50  17
17  99.93 104.02 104.40  92.50  18
18 105.22 105.77 110.70  89.30  19
19 106.23 113.49 127.10  93.00  20

Run this:

def run_diagnostics(mods):
    print("\n----- Coef. SE ------")
    for cov, res in mods.items():
        print(f"{cov:10s}: {res.std_errors.tolist()}")

    print("\n----- Breusch-Pagan Test for Heteroskedasticity ------")
    for cov, res in mods.items():
        endo = res.model.endog.pandas.assign(Intercept=1)
        residuals = res.resids
        bp_test = sms.het_breuschpagan(residuals, endo)
        print(f"{cov:10s}: {bp_test}")

    print("\n----- Wu-Hausman Test ------")
    for cov, res in mods.items():
        print(f"{cov:10s}:\n {res.wu_hausman()}")

    print("\n----- Sargan ------")
    for cov, res in mods.items():
        print(f"{cov:10s}:\n {res.sargan}")

    print("\n----- Basmann ------")
    for cov, res in mods.items():
        print(f"{cov:10s}:\n {res.basmann}")

    print("\n----- First-stage diagnostics ------")
    for cov, res in mods.items():
        print(f"{cov:10s}:\n {res.first_stage.diagnostics}")
        

        
res_unadj  = ivreg.from_formula("Q ~ 1 + [P ~  F + A] + D", data=df).fit(cov_type='unadjusted')
res_robust = ivreg.from_formula("Q ~ 1 + [P ~  F + A] + D", data=df).fit(cov_type='robust')
mods = {'Unadjusted':  res_unadj,
        'Robust'    :  res_robust}
run_diagnostics(mods)

You will get

----- Coef. SE ------
Unadjusted: [7.3026520951185425, 0.04327991369214111, 0.08895412123517202]
Robust    : [5.1474532209912285, 0.04292534502530685, 0.07589901329404808]

----- Breusch-Pagan Test for Heteroskedasticity ------
Unadjusted: (0.01697789593123389, 0.8963296692336946, 0.015293088561413301, 0.9029507926709841)
Robust    : (0.01697789593123389, 0.8963296692336946, 0.015293088561413301, 0.9029507926709841)

----- Wu-Hausman Test ------
Unadjusted:
 Wu-Hausman test of exogeneity
H0: All endogenous variables are exogenous
Statistic: 11.7617
P-value: 0.0034
Distributed: F(1,16)
Robust    :
 Wu-Hausman test of exogeneity
H0: All endogenous variables are exogenous
Statistic: 11.7617
P-value: 0.0034
Distributed: F(1,16)

----- Sargan ------
Unadjusted:
 Sargan's test of overidentification
H0: The model is not overidentified.
Statistic: 2.9831
P-value: 0.0841
Distributed: chi2(1)
Robust    :
 Sargan's test of overidentification
H0: The model is not overidentified.
Statistic: 2.9831
P-value: 0.0841
Distributed: chi2(1)

----- Basmann ------
Unadjusted:
 Basmann's test of overidentification
H0: The model is not overidentified.
Statistic: 2.8049
P-value: 0.0940
Distributed: chi2(1)
Robust    :
 Basmann's test of overidentification
H0: The model is not overidentified.
Statistic: 2.8049
P-value: 0.0940
Distributed: chi2(1)

----- First-stage diagnostics ------
Unadjusted:
    rsquared  partial.rsquared  shea.rsquared  f.stat  f.pval   f.dist
P      0.94          0.92              0.92   110.03    0.00  F(2,16)
Robust    :
    rsquared  partial.rsquared  shea.rsquared  f.stat  f.pval   f.dist
P      0.94          0.92              0.92   284.68    0.00  chi2(2)

So, the only adjustment happens for the SE of the coefficients and the first-starge F-stat, but the latter does not match other implementations. See ivreg in R, for instance.

data("Kmenta", package = "ivreg")

res <- ivreg::ivreg(Q ~ D + P | D + F + A, data=df)
summary(res, vcov=sandwich::sandwich)

You get:

Call:
ivreg(formula = Q ~ D + P | D + F + A, data = df)

Residuals:
   Min     1Q Median     3Q    Max 
-3.430 -1.243 -0.189  1.576  2.492 

Coefficients:
            Estimate Std. Error t value        Pr(>|t|)    
(Intercept)  94.6333     5.1475   18.38 0.0000000000012 ***
D             0.3140     0.0429    7.31 0.0000012080527 ***
P            -0.2436     0.0759   -3.21          0.0051 ** 

Diagnostic tests:
                 df1 df2 statistic        p-value    
Weak instruments   2  16    142.34 0.000000000064 ***
Wu-Hausman         1  16     21.90        0.00025 ***
Sargan             1  NA      2.98        0.08414 .  
---
Signif. codes:  0***0.001**0.01*0.05.0.1 ‘ ’ 1

Residual standard error: 1.97 on 17 degrees of freedom
Multiple R-Squared: 0.755,	Adjusted R-squared: 0.726 
Wald test: 34.4 on 2 and 17 DF,  p-value: 0.00000105 

Your F-stat is twice the one here (see line "weak instruments")