Change `pl.date_range(...)` error message to suggest `pl.date_ranges(...)` when passed an `Expr` (in a `with_columns(...)` call)
Opened this issue · 2 comments
DeflateAwning commented
Description
I would have expected the following code to work:
import polars as pl

df = pl.DataFrame({
    'id': pl.Series([1, 2, 3]),
    'date_min': pl.Series(['2020-01-01', '2021-06-15', '2022-11-01']).cast(pl.Date),
    'date_max': pl.Series(['2020-12-01', '2021-12-15', '2023-05-01']).cast(pl.Date),
})

# call `pl.date_range(Expr, Expr, ...)` - fails
df.with_columns(
    pl.date_range(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)
It does not, though, which is unfortunate. Instead, you have to work around it with a `.map_elements` call.
cmdlineluser commented
They do accept expressions. You need `pl.date_ranges()` (plural) for multiple ranges.
df.with_columns(
    pl.date_ranges(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)
shape: (3, 4)
┌─────┬────────────┬────────────┬─────────────────────────────────┐
│ id  ┆ date_min   ┆ date_max   ┆ date                            │
│ --- ┆ ---        ┆ ---        ┆ ---                             │
│ i64 ┆ date       ┆ date       ┆ list[date]                      │
╞═════╪════════════╪════════════╪═════════════════════════════════╡
│ 1   ┆ 2020-01-01 ┆ 2020-12-01 ┆ [2020-01-01, 2020-02-01, … 202… │
│ 2   ┆ 2021-06-15 ┆ 2021-12-15 ┆ [2021-06-15, 2021-07-15, … 202… │
│ 3   ┆ 2022-11-01 ┆ 2023-05-01 ┆ [2022-11-01, 2022-12-01, … 202… │
└─────┴────────────┴────────────┴─────────────────────────────────┘
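To make "multiple ranges" concrete, here is a plain-Python illustration of what each row's list holds in the output above. This is not how Polars computes it; `month_range` is a hypothetical helper, and the sketch assumes the start day is 28 or lower so that every month contains that day.

```python
from datetime import date


def month_range(start: date, end: date) -> list[date]:
    """Dates from start to end (inclusive), stepping one calendar month.

    Hypothetical helper for illustration only; assumes start.day <= 28.
    """
    if start.day > 28:
        raise ValueError("this sketch only handles start days 1-28")
    out = []
    year, month = start.year, start.month
    while True:
        current = date(year, month, start.day)
        if current > end:
            break
        out.append(current)
        # Advance one calendar month, rolling over into the next year.
        month += 1
        if month > 12:
            month = 1
            year += 1
    return out


# Same (date_min, date_max) pairs as the DataFrame in this issue.
rows = [
    (date(2020, 1, 1), date(2020, 12, 1)),
    (date(2021, 6, 15), date(2021, 12, 15)),
    (date(2022, 11, 1), date(2023, 5, 1)),
]

# One list per row, analogous to the list[date] column.
ranges = [month_range(lo, hi) for lo, hi in rows]
```

The singular `pl.date_range()` corresponds to a single call of such a helper, which is why it rejects a column holding three start values.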
Maybe the current error message could hint at `date_ranges()`?
df.with_columns(
    pl.date_range(pl.col('date_min'), pl.col('date_max'), '1mo').alias('date')
)
# ComputeError: `start` must contain exactly one value, got 3 values
DeflateAwning commented
Nice, that indeed works. Would love if the error message was updated! Will update issue description to point to updating the error message.
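For discussion, a rough sketch of what the requested hint could look like. This is plain Python, not the Polars implementation: the local `ComputeError` class is a stand-in for `polars.exceptions.ComputeError`, and `check_single_value` and the hint wording are hypothetical.

```python
class ComputeError(Exception):
    """Local stand-in for polars.exceptions.ComputeError (assumption)."""


def check_single_value(name: str, n_values: int, plural_fn: str = "date_ranges") -> None:
    """Raise if an argument holds more than one value, pointing at the plural function.

    Hypothetical helper; the first line of the message mirrors the current error.
    """
    if n_values != 1:
        raise ComputeError(
            f"`{name}` must contain exactly one value, got {n_values} values\n\n"
            f"Hint: to create one range per row, use `pl.{plural_fn}` instead."
        )


# Usage: capture the message that a three-row `start` column would produce.
try:
    check_single_value("start", 3)
except ComputeError as exc:
    message = str(exc)
```

Keeping the existing first line unchanged would mean anyone matching on the current message text still finds it, with the hint appended after it.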