pola-rs/polars

Change `pl.date_range(...)` error message to suggest `pl.date_ranges(...)` when passed an `Expr` (in a `with_columns(...)` call)

Opened this issue · 2 comments

Description

I would have expected the following code to work:

import polars as pl

df = pl.DataFrame({
  'id': pl.Series([1, 2, 3]),
  'date_min': pl.Series(['2020-01-01', '2021-06-15', '2022-11-01']).cast(pl.Date),
  'date_max': pl.Series(['2020-12-01', '2021-12-15', '2023-05-01']).cast(pl.Date)
})

# call `pl.date_range(Expr, Expr, ...)` - fails
df.with_columns(
  pl.date_range(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)

It does not, though, which is unfortunate. Instead, you have to use a `.map_elements` call to work around it.

They do accept expressions.

You need pl.date_ranges() (plural) for multiple ranges.

df.with_columns(
   pl.date_ranges(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)
shape: (3, 4)
┌─────┬────────────┬────────────┬─────────────────────────────────┐
│ id  ┆ date_min   ┆ date_max   ┆ date                            │
│ --- ┆ ---        ┆ ---        ┆ ---                             │
│ i64 ┆ date       ┆ date       ┆ list[date]                      │
╞═════╪════════════╪════════════╪═════════════════════════════════╡
│ 1   ┆ 2020-01-01 ┆ 2020-12-01 ┆ [2020-01-01, 2020-02-01, … 202… │
│ 2   ┆ 2021-06-15 ┆ 2021-12-15 ┆ [2021-06-15, 2021-07-15, … 202… │
│ 3   ┆ 2022-11-01 ┆ 2023-05-01 ┆ [2022-11-01, 2022-12-01, … 202… │
└─────┴────────────┴────────────┴─────────────────────────────────┘

Maybe the current error message could hint to date_ranges()?

df.with_columns(
    pl.date_range(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)
# ComputeError: `start` must contain exactly one value, got 3 values
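
To make the suggestion concrete, here is a rough sketch of what a hinted message could look like. It is plain Python and not the actual Polars implementation: the helper name `check_single_value`, the hint wording, and the use of `ValueError` as a stand-in for `ComputeError` are all assumptions made for illustration.

```python
def check_single_value(name: str, n_values: int) -> None:
    """Hypothetical check: a range bound must hold exactly one value.

    Not Polars internals; ValueError stands in for ComputeError here.
    """
    if n_values != 1:
        raise ValueError(
            f"`{name}` must contain exactly one value, got {n_values} values\n\n"
            # Assumed hint wording, pointing at the plural function.
            "Hint: to create one range per row, use `pl.date_ranges(...)` (plural)."
        )


# Example: mimic the failing call above, where `start` had 3 values.
try:
    check_single_value("start", 3)
except ValueError as err:
    print(err)
```

The first line of the message is kept identical to the current one, so anything matching on the existing text would presumably keep working; only the hint is appended.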

Nice, that indeed works. I would love it if the error message were updated! I will update the issue description to focus on updating the error message.