Loading files as rrays
Closed this issue · 2 comments
Hi!
I was wondering whether an I/O system like np.load() is planned for rray in the near future. It would be cool to have an rray_load() function to load matrices into memory. It would definitely be useful in my field (NLP), where word embeddings are stored as matrices in which every row is an n-dimensional vector encoding some form of semantic information about a word. Oftentimes in NLP systems these are the inputs to models that perform various tasks.
Let me know your thoughts on this!
Thanks
What are you trying to load data from? If it is an RDS file, you don't have to do anything special. CSV? You might be able to use tseries::read.matrix(), or just use readr or vroom and then convert to a matrix. Even though xtensor has methods to load from CSV (where you have to declare the type of the data ahead of time: double, integer, etc.), it feels a bit outside of rray's wheelhouse. I'd like to keep it as much about array manipulation as possible.
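As a rough illustration of the RDS and CSV routes mentioned above, here is a minimal sketch. It uses base R (`readRDS()`, `read.csv()`) rather than readr or vroom so that it runs without extra packages, and the temporary file paths and the small matrix `m` are made up for the example.

```r
# Sketch only: the matrix and the temporary files are placeholders, not real data.
m <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2, byrow = TRUE)

# RDS: the matrix comes back with its type and dimensions intact.
rds_path <- tempfile(fileext = ".rds")
saveRDS(m, rds_path)
from_rds <- readRDS(rds_path)

# CSV: read into a data frame first, then convert to a numeric matrix.
csv_path <- tempfile(fileext = ".csv")
write.csv(as.data.frame(m), csv_path, row.names = FALSE)
from_csv <- as.matrix(read.csv(csv_path))
dimnames(from_csv) <- NULL  # drop the V1, V2, ... column names

# Assumption: with rray installed, either result could then be passed
# to rray::as_rray() to get an rray.
```

The conversion step is the same whichever reader is used: once the data is a plain numeric matrix, nothing file-specific is left for rray to handle.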
In some cases it's a .bin file, and in others it's either a NumPy-specific format or a custom .vec file (I will have to check how it's defined or whether there is any documentation on it). Many times it's a .txt file with the first column as the row names; column names don't matter, since these vectors have arbitrary dimensions. But I do understand you not wanting rray to venture into I/O as well. I will try to get back to you with something useful if I find it!
Thanks!
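For the .txt case described above (first column holds the row names, the remaining columns hold the vector), a minimal sketch in base R might look like the following. The file contents are invented, and it assumes a space-delimited file with no header line; the .bin, NumPy and .vec formats are not covered here.

```r
# Hypothetical two-word, three-dimensional embedding file.
txt_path <- tempfile(fileext = ".txt")
writeLines(c("cat 0.1 0.2 0.3", "dog 0.4 0.5 0.6"), txt_path)

emb <- as.matrix(read.table(
  txt_path,
  header = FALSE,    # assumption: the file has no column names
  row.names = 1,     # first column holds the words
  quote = "",        # treat quote characters inside words literally
  comment.char = ""  # do not treat '#' as the start of a comment
))
colnames(emb) <- NULL  # column names don't matter for these vectors

# Look up one word's vector by row name.
emb["dog", ]
```

Keeping the words as row names means a vector can be looked up by word, and the numeric matrix itself can be handed on to whatever array type is needed.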