ray-project/ray

[Data] error: Argument of type "(df: DataFrame) -> DataFrame" cannot be assigned to parameter

tekumara opened this issue · 4 comments

What happened + What you expected to happen

VS Code / Pyright reports this type error:

test.py:14:28 - error: Argument of type "(df: DataFrame) -> DataFrame" cannot be assigned to parameter "fn" of type "CallableClass | ((DataBatch) -> DataBatch)" in function "map_groups"
    Type "(df: DataFrame) -> DataFrame" cannot be assigned to type "CallableClass | ((DataBatch) -> DataBatch)"
      "function" is incompatible with "CallableClass"
      Type "(df: DataFrame) -> DataFrame" cannot be assigned to type "(DataBatch) -> DataBatch"
        Parameter 1: type "DataBatch" cannot be assigned to type "DataFrame"
          Type "DataBatch" cannot be assigned to type "DataFrame"
            "bytes" is incompatible with "DataFrame" (reportGeneralTypeIssues)

Versions / Dependencies

ray 2.4.0

Reproduction script

import ray.data
import pandas as pd

def sum(df: pd.DataFrame) -> pd.DataFrame:
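    # Every row in a group shares the same key, so read it from the first cell.
    key = df.iloc[0, 0]
    # Use a local name that does not shadow the enclosing function `sum`.
    total = df["B"].sum() + df["C"].sum()
    return pd.DataFrame({"key": [key], "sum": [total]})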

df = pd.DataFrame(
    {"A": ["a", "a", "b"], "B": [1, 1, 3], "C": [4, 6, 5]}
)
ds = ray.data.from_pandas(df)
grouped = ds.groupby("A")
sumdf = grouped.map_groups(sum).to_pandas()

Issue Severity

Low: It annoys or frustrates me.

It seems this may be a problem with VS Code? The DataBatch type is correctly defined here: https://github.com/ray-project/ray/blob/master/python/ray/data/block.py#L119

This works:

def sum(df: DataBatch) -> DataBatch:

i.e. the function has to accept the whole Union and handle every case in it.
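A minimal sketch of that workaround, using only the standard library. `Batch`, `map_groups` and `total` here are hypothetical stand-ins, not Ray's actual definitions; the real DataBatch alias has different members. The callback is annotated with the whole Union so the checker accepts it, then narrows at runtime to the one shape it supports:

```python
from typing import Callable, Dict, List, Union

# Hypothetical stand-in for a Union alias like DataBatch (not Ray's real members).
Batch = Union[bytes, List[int], Dict[str, List[int]]]


def map_groups(fn: Callable[[Batch], Batch], batch: Batch) -> Batch:
    """Stand-in for an API whose callback parameter is typed with the Union."""
    return fn(batch)


def total(batch: Batch) -> Batch:
    # Annotated with the full Union so it is assignable to the callback type,
    # then narrowed with isinstance to the single shape this function handles.
    if not isinstance(batch, dict):
        raise TypeError(f"expected a dict batch, got {type(batch).__name__}")
    return {"sum": [sum(batch["B"]) + sum(batch["C"])]}


result = map_groups(total, {"B": [1, 1], "C": [4, 6]})
```

The cost is that the narrowing moves from the signature into the function body, where a mismatch surfaces at runtime rather than in the type checker.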

If DataBatch were generic (for example, a constrained type variable rather than a plain Union), a function could accept just one member of the Union and still type-check.
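One way the generic idea could look, as a sketch under assumptions: `BatchT`, `map_groups` and `total` are hypothetical names, and the constraint list is a stand-in for the real Union members. A constrained `TypeVar` lets the checker pick a single member per call, so the callback only needs to handle that one type:

```python
from typing import Callable, Dict, List, TypeVar

# Hypothetical: a constrained TypeVar in place of a Union alias.
# The checker resolves BatchT to exactly one of these types per call.
BatchT = TypeVar("BatchT", bytes, List[int], Dict[str, List[int]])


def map_groups(fn: Callable[[BatchT], BatchT], batch: BatchT) -> BatchT:
    """Stand-in for an API that is generic over the batch type."""
    return fn(batch)


def total(batch: Dict[str, List[int]]) -> Dict[str, List[int]]:
    # Only the dict member is handled; no Union in the signature.
    return {"sum": [sum(batch["B"]) + sum(batch["C"])]}


result = map_groups(total, {"B": [1, 1], "C": [4, 6]})
```

In this sketch the batch is passed alongside the callback so the type variable can be solved from the call; an API where the data lives on the object would presumably need the class itself to be generic over the batch type.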

I can raise a PR to demonstrate if that's helpful?

If there's a way to do this without having to include the Union directly in the method signature that would be great!

Looks like this has all changed on master (ray 2.5) to use a new UserDefinedFunction type alias, so I'll close this.