Remove code not being used in drop_duplicates_pkey
brayanjuls opened this issue · 2 comments
brayanjuls commented
While porting this function to the Jodie project, I found that the code below can never be executed because of the validation applied to the duplication_columns input parameter. I think it would be good to remove that part to keep the code clear and avoid confusion.
else:
    duplicate_records = (
        data_frame.withColumn(
            "row_number",
            row_number().over(
                Window().partitionBy(primary_key).orderBy(primary_key)
            ),
        )
        .filter(col("row_number") > 1)
        .drop("row_number")
        .distinct()
    )
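To illustrate why a branch like this is dead code, here is a minimal sketch in plain Python. It does not reproduce the library's actual validation. The function name pick_branch, the error message, and the returned labels are all hypothetical, and I am assuming the validation rejects an empty duplication_columns before the if/else is reached.

```python
def pick_branch(duplication_columns):
    """Hypothetical stand-in for the control flow described above."""
    # Assumed validation: an empty or missing column list is rejected up front.
    if not duplication_columns:
        raise ValueError("duplication_columns must not be empty")

    if duplication_columns:
        # Always taken once the validation above has passed.
        return "with_duplication_columns"
    else:
        # Unreachable: any value that would land here already raised above.
        return "primary_key_only"
```

Under that assumption, every input either raises or takes the first branch, so the else block can be deleted without changing behavior.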
MrPowers commented
@brayanjuls - thanks for flagging this.
@robertkossendey - can you please take a look?
MrPowers commented
This is fixed now @brayanjuls, thanks for flagging!