google/temporian

Automatic Conversion of Pandas Object to Datetime for Timestamps in Eventset Creation

sahusiddharth opened this issue · 3 comments

When creating an eventset from a Pandas DataFrame, an issue arises if the timestamp column has the object dtype: it has to be converted to datetime format before further processing. To streamline this workflow and improve the user experience, we propose an automatic conversion mechanism that detects the datatype of the timestamp column:

If the Timestamp Column is Pandas Object:
Automatically convert the column to datetime format, ensuring compatibility for eventset creation. This alleviates the need for users to manually handle datatype conversions.

If the Timestamp Column is Already in Datetime:
Retain the existing datetime format, maintaining consistency for users who have already prepared their data accordingly.

Steps to Reproduce:

  1. Create a Pandas DataFrame with a timestamp column stored as a Pandas object.
  2. Attempt to create a Temporian eventset using the timestamp column.
  3. Observe the need for manual conversion from object to datetime datatype.
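
A minimal sketch of these steps, assuming the object column holds day-first date strings (the column names and values are made up for illustration). The Temporian call is left as a comment so that the snippet only depends on pandas:

```python
import pandas as pd

# Hypothetical data: dates written as strings, which pandas does not parse on its own.
df = pd.DataFrame(
    {
        "timestamp": ["01/05/2024", "02/05/2024", "03/05/2024"],
        "value": [10.0, 11.5, 9.8],
    }
)
print(df["timestamp"].dtype)  # a string/object dtype, not datetime64

# The manual step this issue refers to. The format is stated explicitly
# (day-first is an assumption made for this example).
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d/%m/%Y")
print(df["timestamp"].dtype)  # now a datetime64 dtype

# With Temporian installed, the converted frame can then be handed over, e.g.:
#   import temporian as tp
#   evset = tp.from_pandas(df, timestamps="timestamp")
```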

The same could be done for other I/O formats as well.

Hey @sahusiddharth! Thanks for reporting.

Pandas objects can contain all kinds of types - could you clarify what values you are storing inside the column? Are they strings (e.g. df = pd.DataFrame({'date': ['2022-05-01', '2022-05-02', '2022-05-03']}))?

Hello @ianspektor

I was referring to the various ways in which dates and times are written as strings, for example 01/05/2024.

Got it!

Temporian stores timestamps as unix timestamps (floats) internally, and we try to have as little implicit behavior/magic as possible, so we'd rather force the user to explicitly convert those strings to a date or float manually (which means they understand how those strings are being converted - e.g. in what format they should be interpreted, and if they are in UTC or in their locale).
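
To illustrate why the interpretation matters, here is a small standard-library sketch (not Temporian code; the helper name to_unix_seconds is hypothetical) that parses the same string from this thread under two formats and converts each result to Unix seconds, assuming UTC:

```python
from datetime import datetime, timezone

raw = "01/05/2024"


def to_unix_seconds(text: str, fmt: str) -> float:
    """Parse text with an explicit format, treat it as UTC, and return Unix seconds."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


day_first = to_unix_seconds(raw, "%d/%m/%Y")    # read as 1 May 2024
month_first = to_unix_seconds(raw, "%m/%d/%Y")  # read as 5 January 2024

print(day_first, month_first)
print((day_first - month_first) / 86400)  # gap between the two readings, in days
```

The two readings are 117 days apart, which is the kind of silent error an implicit conversion could introduce.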

I agree though that it might be useful to allow specifying some parameters (such as format and locale) to allow string conversion. Will discuss with the team and report back.

Happy new year!