snuspl/harmony

Handle different in-memory data formats for the same input data


Different apps (e.g., MLR, GBT, Lasso) may use the same input data.
In some cases, however, they use different in-memory formats for exactly the same data.

  • MLR, which is for classification tasks, maintains values as integers.
  • Lasso, which is for regression tasks, maintains values as floats.
  • GBT, which is for both classification and regression, maintains values as floats.

This becomes a problem in #21, which makes jobs share one input table when they read the same input file.

One possible fix is to have all apps store the data in a single type (integer or float) and convert it on use.
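
A minimal sketch of the "store once, convert on use" idea, written in Python for brevity rather than in the project's own code. Everything here is hypothetical: `SharedInputTable`, `to_int`, and `CONVERTERS` are illustrative names, not Harmony APIs, and storing the values as floats is an assumption (the issue leaves the choice between integer and float open).

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


class SharedInputTable:
    """Hypothetical stand-in for an input table shared across jobs."""

    def __init__(self, raw_values: List[float]) -> None:
        # Assumption: the shared table keeps every value as a float.
        self._values = [float(v) for v in raw_values]

    def read(self, convert: Callable[[float], T]) -> List[T]:
        # Each app converts on read; the stored data is never modified.
        return [convert(v) for v in self._values]


def to_int(value: float) -> int:
    """Convert a stored float to an int, refusing lossy conversions."""
    if not value.is_integer():
        raise ValueError(f"value {value} is not integral")
    return int(value)


# Hypothetical per-app converters, following the types listed in the issue.
CONVERTERS: Dict[str, Callable[[float], object]] = {
    "MLR": to_int,   # classification: integer values
    "Lasso": float,  # regression: float values
    "GBT": float,    # classification and regression: float values
}

table = SharedInputTable([0, 1, 2, 1])
mlr_values = table.read(CONVERTERS["MLR"])
lasso_values = table.read(CONVERTERS["Lasso"])
```

With this layout, jobs can share one table for the same input file, and each app pays only a per-read conversion cost. Whether that cost is acceptable, and whether float is the right stored type, would need to be checked against the actual apps.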