frostt-tensor/tensor_parser

Multiple non-zeros per row?

Is there a use case for generating multiple non-zeros per row of the CSV file? The current framework assumes a one-to-one mapping of rows to non-zeros (except that when a type function returns None, the non-zero is skipped).

One case that comes to mind is text parsing, in which we may want to take a CSV field, split it on whitespace, and generate a non-zero for each word (possibly stemming or otherwise normalizing each word via #17).
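
To make the text-parsing case concrete, here is a rough sketch of what a one-to-many type function could look like. None of these names exist in tensor_parser: `split_words` and `expand_row` are hypothetical, and the assumption is that a type function would be allowed to return a list of values (or None to skip the row) instead of a single value.

```python
import csv
import io


def split_words(field):
    """Hypothetical type function: returns a list of values, one per word."""
    return field.lower().split()


def expand_row(row, column, type_func):
    """Yield one copy of `row` per value that type_func produces for `column`.

    Returning None from type_func skips the row, mirroring the skip
    behavior described above. This is an assumed interface, not the
    parser's actual API.
    """
    values = type_func(row[column])
    if values is None:
        return
    for value in values:
        expanded = list(row)
        expanded[column] = value
        yield expanded


# Two CSV rows: a user id and a free-text field.
data = io.StringIO("u1,great sturdy tent\nu2,leaky tent\n")
nonzeros = [nz for row in csv.reader(data)
            for nz in expand_row(row, 1, split_words)]
```

Here two CSV rows turn into five non-zeros, one per (user, word) pair. Stemming would slot in inside `split_words`.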

Another case is generating a symmetric tensor: given indices i, j, k, we want to produce the non-zeros (i, j, k) and (j, i, k).
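
The symmetric case could be sketched the same way. `symmetrize` is a hypothetical helper, not something in the parser today, and skipping the mirrored entry when i == j is my own assumption (otherwise the same non-zero would be emitted twice).

```python
def symmetrize(i, j, k):
    """Yield (i, j, k) and its mirror (j, i, k) in the first two modes.

    Assumption: when i == j the mirror is identical, so it is emitted
    only once to avoid a duplicate non-zero.
    """
    yield (i, j, k)
    if i != j:
        yield (j, i, k)


# Each input row holds already-parsed indices (i, j, k).
rows = [(0, 1, 2), (3, 3, 4)]
nonzeros = [nz for row in rows for nz in symmetrize(*row)]
```

The first row produces two non-zeros and the second (where i == j) produces one.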

But at some point we will need to draw a line between raw data preparation and CSV parsing.

Any thoughts?