MrPowers/mack

Remove code not being used in drop_duplicates_pkey

I was trying to port drop_duplicates_pkey to the Jodie project and found that the code below can never execute, because the validation applied to the duplication_columns input parameter rejects the inputs that would reach it. I think it would be good to remove that branch to keep the code clear and avoid confusion.

    else:
        duplicate_records = (
            data_frame.withColumn(
                "row_number",
                row_number().over(
                    Window().partitionBy(primary_key).orderBy(primary_key)
                ),
            )
            .filter(col("row_number") > 1)
            .drop("row_number")
            .distinct()
        )
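
To make the reasoning concrete, here is a minimal pure-Python sketch of the control flow I am describing. It is not the actual mack source: the function name `pick_dedup_branch`, the exact form of the check, the exception type, and the returned strings are all assumptions for illustration. The only point is that once a guard raises for every falsy duplication_columns value, a later `else` on the same condition is dead code.

```python
def pick_dedup_branch(primary_key, duplication_columns):
    """Hypothetical stand-in for the branching described in this issue."""
    # Assumed validation: reject a missing or empty duplication_columns up front.
    if not duplication_columns:
        raise TypeError("A duplication column must be specified")

    if duplication_columns:
        # Always taken once the guard above has passed.
        return "dedupe using duplication_columns"
    else:
        # Unreachable: every falsy value already raised in the guard.
        return "fallback: row_number over primary_key"


# Truthy input takes the first branch.
print(pick_dedup_branch("id", ["col1", "col2"]))

# Falsy inputs never reach the else branch; they fail validation instead.
for falsy in (None, [], ""):
    try:
        pick_dedup_branch("id", falsy)
    except TypeError as exc:
        print(f"{falsy!r} rejected: {exc}")
```

If the real validation is looser than assumed here (for example, if it only checks the type), the `else` branch could still be reachable, so that is worth confirming before deleting it.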

@brayanjuls - thanks for flagging this.

@robertkossendey - can you please take a look?

This is fixed now @brayanjuls, thanks for flagging!