schuderer/mllaunchpad

Request: optional `order_columns` parameter of `get_dataframe()`

schuderer opened this issue · 0 comments

  • ML Launchpad version: 1.0.0
  • Model Type used: Python
  • DataSource type(s) used: n/a
  • Python version: 3.x
  • Operating System: all

Description

Right now, to guarantee the column order of a dataframe, I need to pass it through `mllaunchpad.order_columns` in my own code.

This makes sense where my code wrangles multiple dataframes in a way that makes the resulting column order unpredictable from the initial one, and I just want to order the columns before I pass the data into my model.

While this works, having to call `order_columns` in three functions can be tedious in use cases that pass one dataframe from the datasource straight into the model. Some users might prefer to just call `data_sources['bla'].get_dataframe(order_columns=True)`.
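
To make the request concrete, here is a minimal sketch of what such a parameter could look like. It is only an illustration under stated assumptions: `ExampleDataSource` and `_order_columns` are hypothetical stand-ins, not ML Launchpad's actual classes, and the helper assumes that ordering means sorting column names alphabetically.

```python
import pandas as pd


def _order_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for mllaunchpad.order_columns.

    Assumption: "ordering" means sorting the column names alphabetically.
    """
    return df.reindex(sorted(df.columns), axis=1)


class ExampleDataSource:
    """Hypothetical data source, used only to illustrate the proposal."""

    def __init__(self, frame: pd.DataFrame):
        self._frame = frame

    def get_dataframe(self, params=None, order_columns=False):
        # Return a copy so callers cannot mutate the stored frame.
        df = self._frame.copy()
        # Proposed behavior: order the columns only when explicitly requested.
        if order_columns:
            df = _order_columns(df)
        return df


data_sources = {"bla": ExampleDataSource(pd.DataFrame({"b": [1], "a": [2], "c": [3]}))}

unordered = data_sources["bla"].get_dataframe()
ordered = data_sources["bla"].get_dataframe(order_columns=True)
```

Keeping the default at `False` would leave existing model code unaffected, since column order would only change for callers who opt in.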

Some users might also prefer an `order_columns: true` option (default `false`) for datasources in the configuration file.
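
One possible way the configuration option and a per-call argument could interact is sketched below. Everything here is an assumption for illustration: `ConfiguredDataSource` is a hypothetical class, the `options` dict stands in for the datasource's parsed configuration entry, and ordering is taken to mean alphabetical sorting of column names.

```python
import pandas as pd


class ConfiguredDataSource:
    """Hypothetical data source that reads an `order_columns` config option."""

    def __init__(self, frame: pd.DataFrame, options: dict):
        self._frame = frame
        # Assumed config key; defaults to False when absent, as proposed.
        self._order_by_default = bool(options.get("order_columns", False))

    def get_dataframe(self, order_columns=None):
        df = self._frame.copy()
        # An explicit argument wins; None means "use the configured default".
        should_order = self._order_by_default if order_columns is None else order_columns
        if should_order:
            # Assumption: ordering means sorting column names alphabetically.
            df = df.reindex(sorted(df.columns), axis=1)
        return df


frame = pd.DataFrame({"b": [1], "a": [2]})

# Stand-in for a parsed config entry containing `order_columns: true`.
configured = ConfiguredDataSource(frame, {"order_columns": True})
from_config = configured.get_dataframe()
overridden = configured.get_dataframe(order_columns=False)
```

Letting the call-site argument override the configured default is just one design choice; the option could equally be config-only.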

Not sure what the best way forward is here; suggestions are welcome!

(issue added for a user)