[Data] error: Argument of type "(df: DataFrame) -> DataFrame" cannot be assigned to parameter
tekumara opened this issue · 4 comments
tekumara commented
What happened + What you expected to happen
vscode / pyright type error:
test.py:14:28 - error: Argument of type "(df: DataFrame) -> DataFrame" cannot be assigned to parameter "fn" of type "CallableClass | ((DataBatch) -> DataBatch)" in function "map_groups"
    Type "(df: DataFrame) -> DataFrame" cannot be assigned to type "CallableClass | ((DataBatch) -> DataBatch)"
      "function" is incompatible with "CallableClass"
      Type "(df: DataFrame) -> DataFrame" cannot be assigned to type "(DataBatch) -> DataBatch"
        Parameter 1: type "DataBatch" cannot be assigned to type "DataFrame"
          Type "DataBatch" cannot be assigned to type "DataFrame"
            "bytes" is incompatible with "DataFrame" (reportGeneralTypeIssues)
Versions / Dependencies
ray 2.4.0
Reproduction script
import ray.data
import pandas as pd
def sum(df: pd.DataFrame) -> pd.DataFrame:
    key = df.iloc[0][0]
    sum = df.sum()["B"] + df.sum()["C"]
    return pd.DataFrame({"key": [key], "sum": [sum]})

df = pd.DataFrame(
    {"A": ["a", "a", "b"], "B": [1, 1, 3], "C": [4, 6, 5]}
)
ds = ray.data.from_pandas(df)
grouped = ds.groupby("A")
sumdf = grouped.map_groups(sum).to_pandas()
Issue Severity
Low: It annoys or frustrates me.
amogkam commented
It seems this may be a problem with VS Code? The DataBatch type is correctly defined here: https://github.com/ray-project/ray/blob/master/python/ray/data/block.py#L119
tekumara commented
This works:
def sum(df: DataBatch) -> DataBatch:
i.e. the function needs to take the whole Union and handle all cases.
If DataBatch were a generic, it would allow the function to take only one of the values in the Union.
I can raise a PR to demonstrate if that's helpful?
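A minimal sketch of the idea, assuming nothing about Ray's actual definitions: FakeDataFrame, BatchUnion, BatchT, map_groups_union and map_groups_generic are all hypothetical names invented for illustration, and the union members are placeholders. It contrasts a parameter typed with the Union itself, which a type checker would reject for a DataFrame-only function, with one typed through a constrained TypeVar, which binds to a single member of the union.

```python
from typing import Callable, List, TypeVar, Union


class FakeDataFrame:
    """Hypothetical stand-in for pandas.DataFrame (keeps the sketch dependency-free)."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows


# Union alias in the style of DataBatch; the members here are illustrative only.
BatchUnion = Union[FakeDataFrame, bytes]

# Constrained TypeVar: each use binds to exactly one of the listed types.
BatchT = TypeVar("BatchT", FakeDataFrame, bytes)


def map_groups_union(
    fn: Callable[[BatchUnion], BatchUnion], batch: BatchUnion
) -> BatchUnion:
    # fn must accept every member of the union, so a type checker rejects
    # a function that only accepts FakeDataFrame (parameters are contravariant).
    return fn(batch)


def map_groups_generic(fn: Callable[[BatchT], BatchT], batch: BatchT) -> BatchT:
    # fn only has to accept the one batch type it is actually called with.
    return fn(batch)


def total(df: FakeDataFrame) -> FakeDataFrame:
    # Handles a single batch type, like the function in the reproduction script.
    return FakeDataFrame([sum(df.rows)])


# Type-checks: BatchT is solved as FakeDataFrame.
result = map_groups_generic(total, FakeDataFrame([1, 1, 3]))

# A type checker would flag this call, mirroring the error reported above:
# map_groups_union(total, FakeDataFrame([1, 1, 3]))
```

Both functions behave identically at runtime; the difference is only in what a static checker such as pyright accepts. Whether a constrained TypeVar fits map_groups, which takes only fn, is an assumption that would need checking against the real signature.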
amogkam commented
If there's a way to do this without having to include the Union directly in the method signature that would be great!
tekumara commented
Looks like this has all changed on master (ray 2.5) to use a new UserDefinedFunction type alias, so I'll close this.