Problems with CI coverage
nhejazi opened this issue · 3 comments
Something is badly wrong with the parameter estimate. The Monte Carlo variance is very high in small samples and still elevated in large samples, and confidence interval coverage suffers badly as a result:
n | est | var | bias | coverage |
---|---|---|---|---|
100 | 1.841118 | 0.6168012 | -0.1588817 | 0.282 |
10000 | 1.998961 | 0.0005663 | -0.0010387 | 0.917 |
The bug reported here was introduced at some point between ee46907 and 81175b1.
Based on manual inspection, nothing obviously differs between the two commits above in the code that computes the parameter estimates and the related inference. The only noteworthy difference is a change in the `eif_tol` argument, from an arbitrarily small fixed value (previously `1e-7`) to a value relative to the sample size (`1/length(Y)`). Unfortunately, re-running the simulation with a forced EIF tolerance of `1e-7` does not seem to resolve the issue:
Without Censoring
n | est | var_mc | var_avg | bias | coverage |
---|---|---|---|---|---|
100 | 1.841118 | 0.6168012 | 0.0676804 | -0.1588817 | 0.282 |
10000 | 1.998961 | 0.0005663 | 0.0004627 | -0.0010387 | 0.917 |
With Censoring
n | est | var_mc | var_avg | bias | coverage |
---|---|---|---|---|---|
100 | 1.787103 | 1.936063 | 0.1261214 | -0.2128974 | 0.094 |
10000 | 1.994722 | 0.0006802 | 0.0008079 | -0.0052783 | 0.967 |
From this, the simulation statistics at the larger sample size (n = 10000) look reasonable: the bias is small, `var_avg` is close to `var_mc`, and coverage is 0.917 without censoring and 0.967 with censoring. Perhaps the estimator is unstable at smaller sample sizes?
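For reference, here is a minimal sketch of how summary columns like the ones above could be computed from a set of simulation replicates. It is written in Python purely for illustration and is not the code used to produce the tables. The function name `summarize_simulation`, the Wald-style interval (estimate ± z × standard error), and the default 95% level are all assumptions; the thread does not state how the intervals were built.

```python
from statistics import NormalDist, fmean, variance


def summarize_simulation(estimates, variance_estimates, truth, level=0.95):
    """Summarize Monte Carlo replicates of an estimator.

    estimates          -- one point estimate per simulated dataset
    variance_estimates -- the estimated variance (squared standard error)
                          reported alongside each point estimate
    truth              -- the true parameter value in the simulation
    level              -- nominal CI level (assumed; not given in the thread)
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # A Wald-style interval covers the truth when the estimate lies
    # within z standard errors of it.
    covered = [
        abs(est - truth) <= z * var ** 0.5
        for est, var in zip(estimates, variance_estimates)
    ]
    return {
        "est": fmean(estimates),              # average point estimate
        "var_mc": variance(estimates),        # empirical variance across replicates
        "var_avg": fmean(variance_estimates), # average of the estimated variances
        "bias": fmean(estimates) - truth,
        "coverage": sum(covered) / len(covered),
    }
```

Under these assumptions, coverage falls below the nominal level whenever `var_avg` is much smaller than `var_mc`, because the intervals are then too narrow relative to the actual spread of the estimates.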
Also, @jeremyrcoyle notes that this "...might be because you’re not doing the 'realistic regime' thing where you don’t shift values already on the edge -- i.e., you’re shifting the already high values out of the range where you have positivity, which would lead to variance issues." This could very well be causing the aberrant small-sample behavior.
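To make the "realistic regime" idea concrete, here is a rough sketch in Python, for illustration only. It shifts an exposure value by `delta` only when the shifted value stays inside the observed range, and leaves values already on the edge unchanged. The helper name `realistic_shift` and the use of a single marginal upper bound are simplifying assumptions; a real implementation would more likely check support conditional on covariates, and this is not the package's code.

```python
def realistic_shift(exposures, delta, upper=None):
    """Shift each exposure by delta unless that leaves the supported range.

    exposures -- observed exposure values
    delta     -- size of the (positive) shift
    upper     -- upper edge of the supported range; defaults to the largest
                 observed exposure (a simplification, see lead-in)
    """
    if upper is None:
        upper = max(exposures)
    # Values whose shifted version would exceed the edge are left as-is,
    # so no unit is moved to a region without positivity.
    return [a + delta if a + delta <= upper else a for a in exposures]


# Example: the largest value is not shifted because 3.0 + 1.0 exceeds the edge.
print(realistic_shift([0.0, 1.0, 2.0, 3.0], delta=1.0))
```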
After running the same simulation over a wider range of sample sizes, I think the results are convincing evidence that the coverage problems are due to positivity violations. The results below come from 1000 simulations at each sample size:
Without Censoring
n | est | var_mc | var_avg | bias | coverage |
---|---|---|---|---|---|
100 | 1.841118 | 0.6168012 | 0.0676804 | -0.1588817 | 0.282 |
500 | 1.953483 | 0.0377445 | 0.0105528 | -0.0465169 | 0.648 |
1000 | 1.970875 | 0.0121869 | 0.0049859 | -0.0291253 | 0.778 |
5000 | 1.995515 | 0.0012752 | 0.0009348 | -0.0044854 | 0.908 |
10000 | 1.998961 | 0.0005663 | 0.0004627 | -0.0010387 | 0.917 |
With Censoring
n | est | var_mc | var_avg | bias | coverage |
---|---|---|---|---|---|
100 | 1.787103 | 1.936063 | 0.1261214 | -0.2128974 | 0.094 |
500 | 1.919533 | 0.0938503 | 0.0202032 | -0.0804669 | 0.518 |
1000 | 1.942053 | 0.0266887 | 0.0092032 | -0.057947 | 0.675 |
5000 | 1.989203 | 0.0018198 | 0.001657 | -0.010797 | 0.938 |
10000 | 1.994722 | 0.0006802 | 0.0008079 | -0.0052783 | 0.967 |
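One way to read these tables is to compare the average estimated variance (`var_avg`) with the Monte Carlo variance (`var_mc`) at each sample size. The short Python sketch below, for illustration only, computes that ratio from the values copied out of the tables; the variable and function names are made up for this example.

```python
# (n, var_mc, var_avg) rows copied from the two tables above.
WITHOUT_CENSORING = [
    (100, 0.6168012, 0.0676804),
    (500, 0.0377445, 0.0105528),
    (1000, 0.0121869, 0.0049859),
    (5000, 0.0012752, 0.0009348),
    (10000, 0.0005663, 0.0004627),
]
WITH_CENSORING = [
    (100, 1.936063, 0.1261214),
    (500, 0.0938503, 0.0202032),
    (1000, 0.0266887, 0.0092032),
    (5000, 0.0018198, 0.001657),
    (10000, 0.0006802, 0.0008079),
]


def variance_ratios(rows):
    """Map each sample size to var_avg / var_mc (1.0 means the two agree)."""
    return {n: var_avg / var_mc for n, var_mc, var_avg in rows}


for label, rows in (("without censoring", WITHOUT_CENSORING),
                    ("with censoring", WITH_CENSORING)):
    print(label)
    for n, ratio in variance_ratios(rows).items():
        print(f"  n={n:>5}: var_avg / var_mc = {ratio:.3f}")
```

In both settings the ratio is far below 1 at n = 100 (about 0.11 without censoring) and moves toward 1 as n grows. This suggests the small-sample under-coverage goes hand in hand with the variance estimate understating the actual sampling variability, which is what one might expect under positivity problems.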