MrPowers/mack

Brainstorm Python interface for ALTER TABLE

Opened this issue · 3 comments

ALTER TABLE is currently only exposed via the SQL interface.

It'd be nice to run ALTER TABLE with Python code.

Take a look at the code from this blog post for example:

ALTER TABLE delta.`/tmp/delta-table/` ADD COLUMNS (blah string)

There is already this syntax for creating a Delta table:

from delta.tables import DeltaTable
from pyspark.sql.types import IntegerType

deltaTable = (DeltaTable.create(sparkSession)
    .tableName("testTable")
    .addColumn("c1", dataType="INT", nullable=False)
    .addColumn("c2", dataType=IntegerType(), generatedAlwaysAs="c1 + 1")
    .partitionedBy("c1")
    .execute())

Perhaps we could use this syntax for altering a Delta table:

(mack.alter(delta_table)
    .addColumn("blah", dataType="string", nullable=False)
    .execute())

Should we overload it to handle modifying an existing column as well?
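To make the proposal concrete, here is a minimal sketch of how such a builder could translate chained addColumn calls into the ALTER TABLE statement shown earlier. Everything in it is hypothetical: `AlterTableBuilder`, `alter`, `to_sql`, and the `run_sql` parameter are names invented for illustration, not part of mack or Delta Lake. To stay runnable without Spark, it takes a table identifier string and a callable instead of a DeltaTable. It only covers adding columns, and it does not check whether the engine accepts NOT NULL on an added column.

```python
from typing import Callable, List


class AlterTableBuilder:
    """Hypothetical builder: collects columns, then emits one ALTER TABLE statement."""

    def __init__(self, table_identifier: str, run_sql: Callable[[str], object]):
        # table_identifier is used verbatim, e.g. "delta.`/tmp/delta-table/`".
        self._table = table_identifier
        # run_sql is whatever executes SQL; with Spark this could be spark.sql.
        self._run_sql = run_sql
        self._columns: List[str] = []

    def addColumn(self, name: str, dataType: str, nullable: bool = True) -> "AlterTableBuilder":
        column = f"{name} {dataType}"
        if not nullable:
            # Assumption: the engine accepts NOT NULL here; this is not verified.
            column += " NOT NULL"
        self._columns.append(column)
        return self  # returning self is what allows chaining

    def to_sql(self) -> str:
        if not self._columns:
            raise ValueError("no columns to add")
        return f"ALTER TABLE {self._table} ADD COLUMNS ({', '.join(self._columns)})"

    def execute(self):
        return self._run_sql(self.to_sql())


def alter(table_identifier: str, run_sql: Callable[[str], object]) -> AlterTableBuilder:
    """Hypothetical entry point mirroring the proposed mack.alter(...)."""
    return AlterTableBuilder(table_identifier, run_sql)


# Capture the generated SQL in a list instead of running it against Spark.
statements: List[str] = []
(alter("delta.`/tmp/delta-table/`", statements.append)
    .addColumn("blah", dataType="string")
    .execute())
print(statements[0])
```

With a real session, passing spark.sql as run_sql would execute the statement instead of collecting it. Keeping SQL generation in to_sql, separate from execute, makes the builder easy to unit test.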

Please don't; make a PR to Delta Lake's DeltaTable instead 🙏

delta-io/delta#1656

@jaceklaskowski - thanks for commenting. I agree that this would be better in DeltaTable.

The Python API for adding constraints would probably be better as an official API as well.

Perhaps we can add these as experimental APIs here in mack to allow for quick iteration? We could even make the import something like import mack.experimental.alter to make it extra clear that the API is experimental. Of course, we can skip all this work and go with what's added to Delta Lake itself if the issue you created is completed in the short term.
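As a rough illustration of the "extra clear" part, one option beyond the module name is to have experimental functions warn when called. The sketch below assumes a hypothetical `experimental` decorator and a stand-in `alter` function; neither exists in mack today.

```python
import functools
import warnings


def experimental(func):
    """Hypothetical decorator: warn on every call that the API may change."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is experimental and may change or be removed",
            FutureWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return func(*args, **kwargs)

    return wrapper


@experimental
def alter(table_identifier):
    # Stand-in body for illustration; a real version would return a builder.
    return f"ALTER TABLE {table_identifier}"
```

FutureWarning is used here because Python shows it to end users by default, whereas DeprecationWarning is hidden by default outside of code run directly in `__main__`.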