pandas-dev/pandas

API: indexing dates-with-datetime64

Opened this issue · 3 comments

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

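import pandas as pd  # the pyarrow-backed dtypes below also require pyarrow

ser = pd.Series(["2016-01-01"], dtype="date32[pyarrow]")
ser2 = ser.astype("timestamp[ns][pyarrow]")
ser3 = ser.astype("datetime64[ns]")

dti = pd.Index(ser3)  # DatetimeIndex
dti.get_loc(ser[0])  # scalar date lookup: raises KeyError
dti.get_indexer(ser.values)  # date32 array: -1s
dti.get_indexer(ser.values.astype(object))  # object array of dates: 0s; inconsistent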

Issue Description

DatetimeIndex.get_indexer has a special case (implemented in Index._maybe_downcast_for_indexing) for sequences of date objects. It is inconsistent with both scalar lookups (get_loc) and comparison op behavior. This was mostly benign before a date dtype existed, but it now has the potential to cause problems. The special case should be deprecated.
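The same three code paths can be exercised without pyarrow, using plain datetime.date objects. This is only a sketch: the variable names are illustrative, and no outputs are asserted because the results depend on the pandas version (per the description above, the scalar lookup and the comparison treat the date as a non-match while the object-dtype sequence lookup matches).

```python
import datetime as dt

import numpy as np
import pandas as pd

d = dt.date(2016, 1, 1)
dti = pd.DatetimeIndex(["2016-01-01"])

# Scalar lookup with a date object.
try:
    scalar_result = dti.get_loc(d)
except KeyError:
    scalar_result = "KeyError"

# Elementwise comparison with the same date object.
comparison_result = np.asarray(dti == d).tolist()

# Sequence lookup with an object-dtype array holding the same date;
# this is the path that goes through the special case.
indexer_result = dti.get_indexer(np.array([d], dtype=object)).tolist()

print("get_loc:    ", scalar_result)
print("==:         ", comparison_result)
print("get_indexer:", indexer_result)
```

If the three printed results disagree about whether the date matches, that is the inconsistency this issue describes.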

Expected Behavior

NA

Installed Versions

Replace this line with the output of pd.show_versions()

Hi @jbrockmendel, I was wondering if the correct fix here would be to make datetime.date objects never match a DatetimeIndex (so lookups always fail unless the user explicitly converts them to datetime64). Is this the right approach to resolve this issue, or should implicit matching still be supported? Please let me know if this is the recommended direction.
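For reference, the explicit conversion mentioned in this comment could look like the sketch below. It assumes plain datetime.date inputs and uses only pd.Timestamp and pd.to_datetime; the variable names are illustrative and not part of any proposed fix.

```python
import datetime as dt

import pandas as pd

dti = pd.DatetimeIndex(["2016-01-01", "2016-01-02"])
d = dt.date(2016, 1, 2)

# Scalar lookup: convert the date to a Timestamp first.
loc = dti.get_loc(pd.Timestamp(d))

# Sequence lookup: convert the dates to datetime64 values first.
indexer = dti.get_indexer(pd.to_datetime([d])).tolist()

print(loc, indexer)
```

Converting up front keeps the lookup independent of whether the implicit date matching is retained, deprecated, or removed.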

The special case should be removed/deprecated.

take