How do I serialize a DynamoDB column of string set datatype?
Hi guys, thanks for creating this project, it has been of great help to me and I have enjoyed using it so far.
I have a column in my table that is of string set datatype, and it is currently being inferred as an Array[String],
which gets persisted as a list of strings when written back to DynamoDB. I have tried coercing it to Set[String],
but it is still written back to DynamoDB as a list of strings. What datatype should I coerce it to in order to write the column as a string set?
Expected
"names": {
  "SS": [
    "dummy-name"
  ]
}
Actual
"names": {
  "L": [
    {
      "S": "dummy-name"
    }
  ]
}
Hello!
Thank you for using our library.
The root cause is that Spark does not have a Set type, so the best option when reading is to represent the set as an array. Once that happens, the information that the column was originally a Set is lost, and on write it becomes a List (due to the array->List conversion).
I can imagine a few solutions:
- Maintain some kind of metadata in Spark about the field's origin type in Dynamo, and use this when writing back into Dynamo
- Add an option to write arrays as Set instead of List, perhaps on a per-column basis
I would prefer solution 1. We will consider building it if we have time. PRs are welcome :)
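To make the difference concrete, here is a rough sketch of what the second option (a per-column choice between Set and List) could look like. It is plain Scala with no Spark or AWS SDK dependency, and it is not the library's code: `AttributeEncoder`, `encodeColumn`, and the `setColumns` parameter are hypothetical names made up for illustration, and the output is just the DynamoDB JSON notation used in the Expected/Actual snippets above.

```scala
// Hypothetical sketch, not the library's API: choose the DynamoDB
// representation of a string array on a per-column basis.
sealed trait DynamoType
case object StringSet extends DynamoType // written as "SS"
case object ListType extends DynamoType  // written as "L" (current behaviour)

object AttributeEncoder {
  // Quote a string for JSON output, escaping backslashes and double quotes.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Render a sequence of strings in DynamoDB JSON notation.
  def encode(values: Seq[String], target: DynamoType): String = target match {
    case StringSet =>
      // Set elements are unique, so duplicates are dropped.
      // Note that DynamoDB rejects empty sets; this sketch does not guard against that.
      values.distinct.map(quote).mkString("{\"SS\": [", ", ", "]}")
    case ListType =>
      values.map(v => s"""{"S": ${quote(v)}}""").mkString("{\"L\": [", ", ", "]}")
  }

  // Columns named in `setColumns` (a hypothetical user-supplied option)
  // are written as string sets; every other column stays a list.
  def encodeColumn(name: String, values: Seq[String], setColumns: Set[String]): String =
    encode(values, if (setColumns.contains(name)) StringSet else ListType)
}
```

With `setColumns = Set("names")`, `encodeColumn("names", Seq("dummy-name"), setColumns)` yields the "SS" form from the Expected snippet, while an empty `setColumns` yields the "L" form from the Actual snippet. Solution 1 would make the same decision, but from the origin type recorded at read time instead of from a user option; Spark's per-field schema metadata (`StructField` metadata, built with `MetadataBuilder`) is one place that information might be kept, though that is an assumption about a possible design rather than something that exists today.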