dllllb/pytorch-lifestream

PandasPreprocessor: low speed and high memory consumption


Hey there!

I'm not sure whether this is the right place to raise it, but I tested your library at work (SBER) and PandasPreprocessor performed poorly.
I had about 10 million events and about 25 features. PandasPreprocessor ran for about 2 hours, while my own script, which has linear complexity and uses a NumPy array instead of a pandas DataFrame, did the same job in 5 minutes.
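
To illustrate the kind of NumPy-based approach meant here, below is a minimal sketch. It assumes that the expensive step is grouping events into per-client, time-ordered sequences; that is an assumption, not something confirmed for PandasPreprocessor. It is neither the script mentioned above nor the library's algorithm, and the function name `group_events` and the array layout are hypothetical. Because it relies on one sort, it is O(n log n) rather than strictly linear; everything after the sort is a linear pass.

```python
import numpy as np

def group_events(ids, timestamps, features):
    """Group event rows into per-id sequences ordered by timestamp.

    Hypothetical helper: ids and timestamps are 1-D arrays of length n,
    features is an (n, k) array with one row per event.
    """
    if ids.size == 0:
        return {}
    # np.lexsort uses the LAST key as the primary key: sort by id, then by time.
    order = np.lexsort((timestamps, ids))
    ids_sorted = ids[order]
    feats_sorted = features[order]
    # Positions where the id changes mark the start of a new sequence.
    boundaries = np.flatnonzero(ids_sorted[1:] != ids_sorted[:-1]) + 1
    unique_ids = ids_sorted[np.r_[0, boundaries]]
    # np.split returns views into feats_sorted, so no per-group copy is made.
    sequences = np.split(feats_sorted, boundaries)
    return dict(zip(unique_ids.tolist(), sequences))

# Tiny usage example with 5 events, 3 clients, 2 features.
ids = np.array([2, 1, 2, 1, 3])
timestamps = np.array([5, 2, 1, 9, 0])
features = np.arange(10, dtype=float).reshape(5, 2)
grouped = group_events(ids, timestamps, features)
```

Keeping the features in one contiguous array and slicing it into views avoids building a separate pandas object per group, which is one plausible source of both the time and the memory difference.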

I'm raising this issue in the hope that you'll reconsider the preprocessor's algorithm.

Happy New Year!