databricks/iceberg-kafka-connect

Record projection: Index out of bounds error

Closed this issue · 1 comment

Hi,

I'm getting an "Index 2 out of bounds" error when writing deletes via writer.deleteKey(keyProjection.wrap(row)); I couldn't pinpoint the cause, but it looks like it could be a bug in the Iceberg code?

It seems that in ParquetValueWriters, the delete-file writer's writers array is built from the full record schema instead of the key schema.

The write loop then iterates over every record field, which is more fields than the key projection exposes, and fails with the index error as soon as it tries to look up a non-key field.

https://github.com/apache/iceberg/blob/ab2c6f889d07eeee51a1f58605be248e9330d91b/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueWriters.java#L578-L583
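To make the failure mode concrete, here is a minimal, self-contained sketch (not Iceberg code; the names are made up for illustration): a writer array sized from a three-field table schema walking a two-field key projection blows up exactly at index 2, matching the error above.

```java
// Hypothetical stand-in for the linked StructWriter loop: writers is sized
// from the full record schema, but the wrapped key projection only exposes
// the key fields, so the lookup overruns once i passes the key field count.
public class ProjectionMismatchDemo {
  public static void main(String[] args) {
    String[] writers = {"idWriter", "nameWriter", "tsWriter"}; // one writer per table field
    Object[] keyProjection = {42L, "some-key"};                // key fields only

    for (int i = 0; i < writers.length; i++) {
      // Throws ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
      System.out.println(writers[i] + " writes " + keyProjection[i]);
    }
  }
}
```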

Here is a test that reproduces it: #287

cc @bryanck

Just found the issue! It was in how the GenericAppenderFactory was being created: the full table schema was passed for the equality-delete row schema instead of the key schema.

https://github.com/tabular-io/iceberg-kafka-connect/blob/595f835f5d9174e57660b12f407dabc84781e500/kafka-connect/src/test/java/io/tabular/iceberg/connect/data2/IcebergUtil.java#L96-L103
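For anyone hitting the same thing, here is a sketch of the corrected construction, using Iceberg's TypeUtil.select and the five-argument GenericAppenderFactory constructor; the helper name and its parameters are made up for illustration.

```java
import java.util.Set;

import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.data.GenericAppenderFactory;
import org.apache.iceberg.types.TypeUtil;

public class AppenderFactoryFix {

  // Hypothetical helper showing the fix: derive the key schema from the
  // equality field IDs and pass it as eqDeleteRowSchema. Passing
  // table.schema() there (the bug) makes the delete writer build one column
  // writer per table field, which overruns the key-only projected row.
  static GenericAppenderFactory createAppenderFactory(Table table, Set<Integer> equalityFieldIds) {
    Schema keySchema = TypeUtil.select(table.schema(), equalityFieldIds);
    int[] fieldIds = equalityFieldIds.stream().mapToInt(Integer::intValue).toArray();
    return new GenericAppenderFactory(
        table.schema(), // row schema for data files
        table.spec(),
        fieldIds,       // equality field IDs
        keySchema,      // key schema for equality-delete files (was table.schema())
        null);          // position-delete row schema (default)
  }
}
```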