RfastOfficial/Rfast

bic.corfsreg throws error in certain cases

lutzvdb opened this issue · 5 comments

Describe the bug
In rare cases, variable selection in bic.corfsreg fails with an error. We could not pin down exactly which inputs trigger the crash; in more than 99% of cases the function works fine for us.

To Reproduce
bug.csv
A minimal example CSV is attached here. Code to produce the error is:

library(data.table)
library(Rfast)

dt <- fread("bug.csv")
bic.corfsreg(dt$Y, as.matrix(dt[, .(X1, X2)]))

Expected behavior
I expect the usual forward-regression output, not an error.

Desktop:

  • OS: Windows 10 64-bit and Windows 11 64-bit
  • R version: 4.0.2 and 4.2.0
  • Rfast version: 2.0.6 and 1.9.9

Additional context
The bug seems highly dependent on certain input conditions. Consider the following example, where one row added or removed makes the difference between the bug appearing or not:

library(data.table)
library(Rfast)

dt <- fread("bug.csv")

dtx <- dt[1848:5052]; bic.corfsreg(dtx$Y, as.matrix(dtx[, .(X1, X2)])) # fails
dtx <- dt[1848:5051]; bic.corfsreg(dtx$Y, as.matrix(dtx[, .(X1, X2)])) # works
dtx <- dt[1847:5052]; bic.corfsreg(dtx$Y, as.matrix(dtx[, .(X1, X2)])) # works
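
To narrow down which row ranges trigger the failure, one option is to wrap the call in tryCatch and record whether it errors. This is only a minimal sketch: `fails` is a hypothetical helper name and not part of Rfast, and the commented usage assumes `dt` has been loaded from bug.csv as above.

```r
# Hypothetical helper (not part of Rfast): returns TRUE if calling f() raises an error.
fails <- function(f) {
  tryCatch({
    f()
    FALSE
  }, error = function(e) TRUE)
}

# Self-contained check of the helper itself
fails(function() stop("boom"))  # TRUE
fails(function() 1 + 1)         # FALSE

# Possible usage with the data from this issue (assumes dt <- fread("bug.csv")):
# ends <- 5045:5052
# sapply(ends, function(e) {
#   dtx <- dt[1848:e]
#   fails(function() bic.corfsreg(dtx$Y, as.matrix(dtx[, .(X1, X2)])))
# })
```

Scanning a handful of end rows this way should show whether the failure is tied to a single row or to a wider pattern in the data.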

A colleague pointed out that the problem may be a missing stopping criterion for the case where all predictors are exhausted. Increasing the BIC tolerance substantially makes the error disappear; compare:

bic.corfsreg(dt$Y, as.matrix(dt[, .(X1, X2)]), tol = 10) # fails
bic.corfsreg(dt$Y, as.matrix(dt[, .(X1, X2)]), tol = 100) # works

In fact, adding a break condition that respects the number of available predictors resolves the bug:

if (k > p + 1) break #insert below k <- k + 1

I am unsure, however, whether this alters the intended functionality and statistical properties of the algorithm.
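
To illustrate the idea of a stopping guard, here is a rough base-R sketch of BIC-based forward selection that cannot run past the number of available predictors. It is not the Rfast source and does not claim to reproduce bic.corfsreg: `bic_of` and `forward_bic` are hypothetical names, the next candidate is picked by its absolute correlation with the current residuals (an assumption made for simplicity), and the BIC is computed only up to an additive constant.

```r
# Illustrative sketch only; NOT the Rfast implementation.
# bic_of() and forward_bic() are hypothetical helpers.

# Fit y on an intercept plus the chosen columns; return BIC (up to a constant) and residuals.
bic_of <- function(y, x, cols) {
  n <- length(y)
  design <- cbind(rep(1, n), x[, cols, drop = FALSE])
  fit <- lm.fit(design, y)
  rss <- sum(fit$residuals^2)
  list(bic = n * log(rss / n) + ncol(design) * log(n), resid = fit$residuals)
}

forward_bic <- function(y, x, tol = 2) {
  p <- ncol(x)
  selected <- integer(0)
  current <- bic_of(y, x, selected)          # intercept-only model
  while (length(selected) < p) {             # guard: stop once predictors are exhausted
    candidates <- setdiff(seq_len(p), selected)
    score <- abs(cor(current$resid, x[, candidates, drop = FALSE]))
    best <- candidates[which.max(score)]
    trial <- bic_of(y, x, c(selected, best))
    if (current$bic - trial$bic <= tol) break  # improvement too small: stop
    selected <- c(selected, best)
    current <- trial
  }
  selected
}

# Simulated data in which both predictors matter, so the search uses up all columns
set.seed(1)
x <- matrix(rnorm(400), ncol = 2)
y <- 2 * x[, 1] - x[, 2] + rnorm(200, sd = 0.1)
forward_bic(y, x)
```

Here the loop condition plays the role of the proposed break: when every column has been selected, the search ends instead of looking for another candidate.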

Hi lutzvdb.

I will check the error at the weekend using your data. Cheers for this bug.
Would you like to email us your details so we can add your name to the acknowledgements?

Michail

Hey Michail,

thank you for the quick reply. My name is no secret: Lutz von der Burchard. Or did you mean other details?

Cheers and Merry Christmas to you

I will add your name, that's enough.

Merry Christmas to you too.

Hi Lutz, the bug is fixed and I have added your name to the acknowledgements. Once Rfast is updated, everything will be sorted.

Thank you very much for sorting it out this quickly!