paobranco/UBL

Error in neighbours(tgt, dat, dist, p, k) : long vectors (argument 10) are not supported in .Fortran


I got this error when running SmoteClassif(), and I'm trying to understand it so I can work around the issue. Is there a function that, given the number of columns in the dataset and the function's parameter settings, returns the maximum number of rows my dataset can have for the function to work (i.e. without passing long vectors to Fortran)? Understanding this would help me scale my analysis.

For what it's worth, I just asked the question on Stack Overflow as well; the post there gives an example of the problem I encountered. In short, the function worked with ~357,000 rows/~150 columns but failed with ~186,000 rows/~190 columns -- which made me wonder about the math inside the function that determines the vector lengths. Thanks so much for any input you may have, and for providing such a great function!

https://stackoverflow.com/questions/75377710/how-do-inputs-to-ublsmoteclassif-influence-vectors-lengths-passed-to-fortran

FWIW -- the 10th argument passed to the Fortran routine is distm, which is an n x n matrix, where n is the length of tgtData -- so 186,000 in my example above that does not work. So n x n here is much smaller than the largest 32-bit integer, so I still don't get why this error is occurring...
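
A quick way to check that arithmetic is sketched below. It assumes, as described above, that the argument is an n x n matrix passed as a single vector and that n = 186,000; the value of n that SmoteClassif() actually passes on a given call is not verified here. Under that assumption the element count is about 3.46e10, which is larger than 2^31 - 1 (the longest vector that is not a "long vector" in R), and the size depends only on the number of rows, not the number of columns.

```r
# Assumption: the 10th .Fortran argument is an n x n matrix stored as one vector.
n <- 186000                        # rows in tgtData, as quoted above
int_max <- .Machine$integer.max    # 2^31 - 1, the longest non-long vector in R

# Use doubles for the product; integer arithmetic would overflow to NA.
n_elements <- as.numeric(n)^2
exceeds_limit <- n_elements > int_max

# Largest n for which an n x n matrix still fits in a non-long vector.
max_n <- floor(sqrt(int_max))

cat("elements in n x n matrix:", format(n_elements, scientific = FALSE, big.mark = ","), "\n")
cat("exceeds 2^31 - 1:", exceeds_limit, "\n")
cat("largest n with n^2 <= 2^31 - 1:", max_n, "\n")
```

If the assumption holds, the row limit for a single call would be 46,340, since 46,341^2 already exceeds 2^31 - 1.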