Bug in process_datetime_data() converting datetime to int
efstathios-chatzikyriakidis opened this issue · 2 comments
Hi @avsolatorio,
I hope you are well.
The fix in #72 is correct and makes it possible to use the latest pandas package. However, I am still blocked by this line:
https://github.com/worldbank/REaLTabFormer/blob/main/src/realtabformer/data_utils.py#L265
There are cases where this can fail. For example, I tested it in a Windows conda env and it failed because the bare `int` was translated to `int32`. Don't ask me why! My conclusion so far is that it is related to the Windows implementation, since the same code and data succeeded in Google Colab and in an Ubuntu Linux container running under WSL on the same 64-bit Windows host.
I think we can be more explicit and use `int64`, since datetimes are actually 64-bit values. This would also be consistent with the following line:
https://github.com/worldbank/REaLTabFormer/blob/main/src/realtabformer/data_utils.py#L271
So, I suggest changing it from
series = (series.astype(int) / 1e9)
to:
series = (series.astype('int64') / 1e9)
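For reference, here is a minimal sketch of the conversion with the explicit dtype. The helper name `datetime_to_unix_seconds` and the sample dates are hypothetical and are not part of REaLTabFormer. A possible explanation for the failure, stated as an assumption that I have not verified: a bare `int` follows NumPy's platform default integer, which was 32-bit on Windows before NumPy 2.0, while `'int64'` means the same thing on every platform.

```python
import numpy as np
import pandas as pd


def datetime_to_unix_seconds(series: pd.Series) -> pd.Series:
    """Hypothetical helper: datetime64[ns] Series -> float seconds since the epoch."""
    # Explicit 'int64' keeps the integer width independent of the platform.
    # A bare `int` resolves to NumPy's default integer instead (assumption:
    # that default was int32 on Windows with NumPy < 2.0).
    return series.astype("int64") / 1e9


# Build the sample with an explicit nanosecond unit so that dividing by 1e9
# really yields seconds, regardless of how pandas infers the resolution.
dates = pd.Series(
    np.array(["1970-01-01T00:00:01", "2000-01-01T00:00:00"], dtype="datetime64[ns]")
)
seconds = datetime_to_unix_seconds(dates)

print(np.dtype(int))     # platform-dependent default integer dtype
print(seconds.tolist())  # seconds since 1970-01-01 as floats
```

Printing `np.dtype(int)` in the failing environment should show whether the default integer there is really 32-bit.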
Can you help me with this? I will also need a new PyPI version (1.0.7).
Thank you!
Hello @efstathios-chatzikyriakidis, thanks for letting me know that the root cause is likely the Windows env. The patch is already published.
I highly recommend that you create a PR if you find some of these changes in the future! 😀
Hi @avsolatorio,
Yes, in the future, if I find a bug and it is easy to suggest a solution like this one, I will provide a PR.
Thank you so much!