Rows or columns?
First, this looks like a nice package. Great work.
From what I can tell, you're assuming that data points are the rows of the data matrix. This is common in statistics but the opposite of how several other Julia packages work (e.g., Distances.jl, MultivariateStats.jl). I believe those packages put data points in columns for performance reasons: Julia arrays are stored in column-major order, so each column is contiguous in memory (this is especially relevant for Distances.jl, which needs to iterate over all pairs of data points in O(N^2) fashion).
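To make the layout point concrete, here is a minimal sketch in plain Julia of a naive O(N^2) pairwise distance loop over observations stored as columns. The function name `pairwise_columns` and the toy data are made up for illustration; this is not how Distances.jl is implemented, only the access pattern I have in mind.

```julia
# Hypothetical toy data: 3 features, 4 observations, one observation per column.
X = [1.0 2.0 3.0 4.0;
     0.0 1.0 0.0 1.0;
     2.0 2.0 2.0 2.0]

# Naive pairwise Euclidean distances between the columns of X.
function pairwise_columns(X::AbstractMatrix{<:Real})
    d, n = size(X)
    T = float(eltype(X))
    D = zeros(T, n, n)
    for j in 1:n, i in 1:n
        s = zero(T)
        for k in 1:d                      # innermost loop walks down a column,
            s += (X[k, i] - X[k, j])^2    # which is contiguous in memory
        end
        D[i, j] = sqrt(s)
    end
    return D
end

D = pairwise_columns(X)

# Data stored with one observation per row can be converted once up front.
Xrows = permutedims(X)                    # 4×3, observations in rows
Drows = pairwise_columns(permutedims(Xrows))
```

With observations in rows, the same inner loop would stride across memory instead of reading consecutive elements, which is the cost I suspect those packages wanted to avoid.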
I'm not saying you need to switch, but I am suggesting that you document your expectations clearly. In your documentation for `lda`, for example, you just call `X` a "matrix of floats," which doesn't address your expectation for layout.
Thank you!
I ran into the same row-major/column-major consideration in a package I wrote for kernel matrix computation. I ended up supporting both layouts and it wasn't particularly arduous. I'll make the same enhancement to this package once I have the opportunity. For now, I added a note in the documentation.
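For reference, one way to support both layouts is a keyword argument naming the dimension that indexes observations. The sketch below is only an illustration of that idea: `feature_means` and `obsdim` are hypothetical names, not part of this package's current API.

```julia
# Hypothetical helper: `obsdim` is the dimension that indexes observations
# (1 = one observation per row, 2 = one observation per column).
function feature_means(X::AbstractMatrix{<:Real}; obsdim::Integer = 1)
    obsdim in (1, 2) || throw(ArgumentError("obsdim must be 1 or 2"))
    # Averaging over the observation dimension leaves one value per feature.
    return vec(sum(X; dims = obsdim) ./ size(X, obsdim))
end

Xrows = [1.0 10.0;
         3.0 30.0;
         5.0 50.0]              # 3 observations × 2 features
Xcols = permutedims(Xrows)      # 2 features × 3 observations

m_rows = feature_means(Xrows)               # default: observations in rows
m_cols = feature_means(Xcols; obsdim = 2)   # observations in columns
```

Keeping rows as the default would leave existing callers unaffected, while users coming from column-oriented packages could opt in explicitly.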