snowflakedb/snowflake-kafka-connector

Struct array elements are being serialized before writing to Array column

Closed this issue · 4 comments

I have a field in my Avro schema that is an array of structs, and the target table has an existing column of type ARRAY. The array elements are being serialized before the write, so the Snowflake column ends up holding an array of stringified JSON. Since the column is an ARRAY, it can hold objects (struct/variant) as elements. Is there any way to disable this serialization and have the elements treated as variants instead?

Here's a quick comparison:
Expected value in column:

[
  null,
  {
    "xCategory": null,
    "xEntityId": null,
    "xEntityName": null,
    "xId": "89asda9s0a"
  }
]

Actual:

[
  "null",
  "{\"xCategory\":null,\"xEntityId\":null,\"xId\":\"89asda9s0a\",\"xEntityName\":null}"
]

I was digging into the codebase, and it looks like RecordService#getMapFromJsonNodeForStreamingIngest() does not handle objects as array elements and just stringifies them. Is this intentional? What about recursively building a List<Object> instead of a List<String>?
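
To make the suggestion concrete, here is a minimal sketch of the two approaches. It is not the connector's code: the class ArrayElementSketch and all of its methods are hypothetical, and plain java.util maps and lists stand in for Jackson's JsonNode tree so the example runs without dependencies. stringifyElements models the behaviour I'm seeing (every element ends up as a String), while convert walks the value recursively and keeps objects as maps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch for illustration only; not the connector's actual code.
class ArrayElementSketch {

  // Minimal JSON rendering, standing in for serializing a JSON node to text.
  // Only escapes backslashes and double quotes, which is enough for this demo.
  static String toJson(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof String) {
      String escaped = ((String) value).replace("\\", "\\\\").replace("\"", "\\\"");
      return "\"" + escaped + "\"";
    }
    if (value instanceof Map) {
      return ((Map<?, ?>) value).entrySet().stream()
          .map(e -> toJson(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()))
          .collect(Collectors.joining(",", "{", "}"));
    }
    if (value instanceof List) {
      return ((List<?>) value).stream()
          .map(ArrayElementSketch::toJson)
          .collect(Collectors.joining(",", "[", "]"));
    }
    return String.valueOf(value); // numbers and booleans
  }

  // Models the reported behaviour: each array element becomes a String,
  // so a null turns into "null" and a struct turns into JSON text.
  static List<String> stringifyElements(List<?> elements) {
    List<String> out = new ArrayList<>();
    for (Object e : elements) {
      out.add(e instanceof String ? (String) e : toJson(e));
    }
    return out;
  }

  // Proposed alternative: recurse into maps and lists and keep their structure.
  static Object convert(Object value) {
    if (value instanceof Map) {
      Map<String, Object> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        out.put(String.valueOf(e.getKey()), convert(e.getValue()));
      }
      return out;
    }
    if (value instanceof List) {
      List<Object> out = new ArrayList<>();
      for (Object e : (List<?>) value) {
        out.add(convert(e));
      }
      return out;
    }
    return value; // null, String, Number and Boolean pass through unchanged
  }

  public static void main(String[] args) {
    Map<String, Object> struct = new LinkedHashMap<>();
    struct.put("xCategory", null);
    struct.put("xId", "89asda9s0a");

    List<Object> column = new ArrayList<>();
    column.add(null);
    column.add(struct);

    System.out.println(stringifyElements(column)); // elements are Strings
    System.out.println(convert(column));           // elements keep their types
  }
}
```

With the recursive version, the null element stays a real null and the struct stays a map, which is the shape I'd expect to land in the ARRAY column. Whether the ingest SDK accepts nested maps and lists as array elements is an assumption here and would need to be checked.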

It is indeed not clear why array elements are stringified. The SDK would then treat each element as a string.
@sfc-gh-japatel @sfc-gh-tzhang wdyt?

@sfc-gh-zefan is it expected this way for Snowpipe?

Looks like this is being addressed in #730.

Closing as this is fixed in #730.