nicodv/kmodes

Different clusters when K-Prototypes trained on same data in numpy array and pandas dataframe

RoddyJaques opened this issue · 1 comments

I'm getting very different results when calling fit_predict() on a KPrototypes model with the same dataset passed once as a pandas DataFrame and once as a NumPy array.

The resulting clusters are very different. I've kept random_state constant, so the only difference is the format of the input data. I have checked the dtypes and they are all consistent.

I added tests for this scenario, but I can't reproduce this: f5532e0

Please provide a fully reproducible example, @RoddyJaques.