laughingman7743/PyAthena

Running a DELETE query on Iceberg table results in a 404 error

dancoates opened this issue · 3 comments

Iceberg tables in Athena support DELETE statements, e.g.:

DELETE FROM my_table
WHERE timestamp <= cast('2023-05-01' as timestamp)

When running the above query through pyathena I get the following error:

An error occurred (404) when calling the HeadObject operation: Not Found

I assume this is because the delete query produces no result file for pyathena to read?
Is there a way for pyathena to execute such queries without returning results?

Are you using the default Cursor, or a different one such as PandasCursor?

Sorry, I should have specified that. This is with the Arrow cursor. I've just tested again using the default cursor, and the error doesn't occur there.

This is the code that causes the error for me.

from pyathena import connect
from pyathena.arrow.cursor import ArrowCursor

cursor = connect(
    s3_staging_dir="s3://bucket/results",
    cursor_class=ArrowCursor
).cursor(unload=True)

result = cursor.execute("""
    DELETE FROM my_table
    WHERE timestamp <= cast('2023-05-01' as timestamp)
""").as_arrow()

It certainly isn't a major issue, since it is easy to work around by using a different cursor for deletes, but it would be nice to be able to use the same cursor for reads and deletes for consistency's sake.
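For reference, a minimal sketch of that workaround. The helper names `is_dml` and `run` are hypothetical and not part of PyAthena, and the keyword check is only a rough heuristic. It assumes the connection was created with the default cursor class and that `Connection.cursor()` accepts a cursor class plus keyword arguments, as in my snippet above.

```python
# Hypothetical helpers (not part of PyAthena): route statements that return
# no rows to the default cursor and everything else to ArrowCursor.

# Rough heuristic: leading keywords of statements assumed to produce no rows.
DML_KEYWORDS = {"delete", "insert", "update", "merge"}


def is_dml(sql: str) -> bool:
    """Return True if the first keyword of the statement is a DML keyword."""
    words = sql.split()
    return bool(words) and words[0].lower() in DML_KEYWORDS


def run(conn, sql: str):
    """Run DML on the default cursor; run reads on ArrowCursor.

    Assumes `conn` comes from pyathena.connect(s3_staging_dir=...) without a
    custom cursor_class, so conn.cursor() returns the default cursor.
    """
    # Imported here so the helpers above can be used without pyarrow installed.
    from pyathena.arrow.cursor import ArrowCursor

    if is_dml(sql):
        conn.cursor().execute(sql)  # no result set is read back
        return None
    return conn.cursor(ArrowCursor, unload=True).execute(sql).as_arrow()
```

With this, `run(conn, "DELETE FROM my_table WHERE ...")` goes through the default cursor, while a SELECT still comes back as an Arrow table.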

Thank you.
Yes, I think it should be addressed so that the error does not occur no matter which cursor is used. I will check this weekend.