Remove code not being used in drop_duplicates_pkey

Question

brayanjuls opened this issue 2 years ago · 2 comments

I was trying to port this function to the Jodie project and found that the code below will never be executed because of the validation applied to the duplication_columns input parameter. I think it would be good to remove that branch to keep the code clear and avoid confusion.

    else:
        duplicate_records = (
            data_frame.withColumn(
                "row_number",
                row_number().over(
                    Window().partitionBy(primary_key).orderBy(primary_key)
                ),
            )
            .filter(col("row_number") > 1)
            .drop("row_number")
            .distinct()
        )
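
To illustrate why a branch like this can be dead code, here is a minimal plain-Python sketch. It is not the library's actual implementation: the issue does not show the validation itself, so the guard, the function name `drop_duplicates_sketch`, and the returned labels are all assumptions made for illustration.

```python
def drop_duplicates_sketch(primary_key, duplication_columns):
    """Hypothetical stand-in that only reports which branch would run."""
    # Assumed guard: reject a missing or empty column list up front.
    if not duplication_columns:
        raise TypeError("Duplication columns must be specified")

    if duplication_columns:
        # Always taken: the guard above guarantees a non-empty list here.
        return "dedupe on primary_key and duplication_columns"
    else:
        # Unreachable: any input that would land here already raised above.
        return "dedupe on primary_key only"
```

If the validation works roughly like the guard above, every call either raises or takes the first branch, so the `else` can be deleted without changing behavior.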

Answer 1 · 2023-02-02T21:12:42.000Z

@brayanjuls - thanks for flagging this.

@robertkossendey - can you please take a look?

Answer 2 · 2023-02-15T21:54:15.000Z

This is fixed now @brayanjuls, thanks for flagging!