sfu-db/dataprep

Need data-type for each column in create_report function.

anthng opened this issue · 2 comments

Hi all,
I would like create_report() to accept a data-type (dtype) parameter, like the plot() function does. With it, I could generate a report that treats each feature as numerical or categorical as I specify, without the choice being affected by "Distinct Count".

The image below was generated automatically by create_report. However, my expected output is numerical statistics and visualizations.
image

My expected feature:

dttype = {c: "Continuous" for c in dataframe.columns}
create_report(dataframe, dtype=dttype)

If there is any solution to my problem, please let me know. Thanks.

I see. So it seems DataPrep automatically identified your columns as categorical. May I ask what is the output of dataframe.dtypes?

I cast every column of the dataframe to float before creating the report. In the example above, I also tried passing "Continuous" as the dtype, but it does not work.
My guess is that DataPrep decides whether a feature is numerical or categorical based on both its "distinct count" and its data type, but I am not sure about this.
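
To make that guess concrete, here is a minimal sketch of the kind of heuristic I have in mind. This is not DataPrep's actual detection code: the function name `guess_semantic_type` and the threshold `max_distinct` are hypothetical, and the labels "Continuous" and "Nominal" are only borrowed for illustration. It shows how a float column with few distinct values could still end up treated as categorical.

```python
def guess_semantic_type(values, max_distinct=10):
    """Hypothetical heuristic, NOT DataPrep's real implementation.

    A column is called "Continuous" only if every non-null value is a
    number AND it has more than `max_distinct` distinct values.
    Everything else is called "Nominal" (categorical).
    """
    non_null = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in non_null
    )
    distinct = len(set(non_null))
    if is_numeric and distinct > max_distinct:
        return "Continuous"
    return "Nominal"


# Both columns hold floats, but they differ in distinct count.
columns = {
    "rating": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],      # 3 distinct values
    "price": [float(i) * 1.5 for i in range(50)],  # 50 distinct values
}
guessed = {name: guess_semantic_type(vals) for name, vals in columns.items()}
print(guessed)  # {'rating': 'Nominal', 'price': 'Continuous'}
```

If the real logic is anything like this, casting to float alone would not be enough for low-cardinality columns, which is why an explicit dtype override in create_report would help.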