google/temporian

Extend EventSet's >, <, >=, <= to work with string literals

ianspektor opened this issue · 0 comments

  • EventSet < EventSet works with two numerical EventSets
  • EventSet < int works with a numerical EventSet
  • EventSet < EventSet works with two string EventSets
  • EventSet < str doesn't work with a string EventSet, and it should:
In [1]: import temporian as tp

In [2]: evset = tp.event_set([1], features={'a': [3]})

In [3]: evset < 4
Out[3]:
indexes: []
features: [('a', bool_)]
events:
     (1 events):
        timestamps: [1.]
        'a': [ True]
memory usage: 0.5 kB

In [4]: evset = tp.event_set([1], features={'a': ['a']})

In [5]: evset < 'b'
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 evset < 'b'

File ~/temporian/temporian/core/event_set_ops.py:1233, in EventSetOperations.__lt__(self, other)
   1229     from temporian.core.operators.scalar import less_scalar
   1231     return less_scalar(input=self, value=other)
-> 1233 self._raise_error("compare", other, "(int,float)")
   1234 assert False

File ~/temporian/temporian/core/event_set_ops.py:88, in EventSetOperations._raise_error(self, op_name, other, allowed_types)
     80 def _raise_error(
     81     self, op_name: str, other: Any, allowed_types: str
     82 ) -> None:
     83     """Raises an error message.
     84
     85     This utility method is used in operator implementations, e.g., +, - *.
     86     """
---> 88     raise ValueError(
     89         f"Cannot use operator '{op_name}' on {self._clsname} and"
     90         f" {type(other)} objects. Only {self._clsname} or values of type"
     91         f" ({allowed_types}) are supported."
     92     )

ValueError: Cannot use operator 'compare' on EventSet and <class 'str'> objects. Only EventSet or values of type ((int,float)) are supported.

Same goes for all other comparison ops.

The T_SCALAR tuple defined in temporian/core/event_set_ops.py only accounts for (int, float), but in some ops it should include str too. A second tuple needs to be defined for it and used where appropiate.

Note that equal and __ne__ in the same file use T_SCALAR + (bool, str), which could be defined as a third option.

Thorough tests needed to ensure we are accepting and rejecting the correct type on each op.