Azure/azure-cosmosdb-spark

JSONObject$Null error when selecting a null struct from CosmosDB


I get the error below when querying Cosmos DB. I also get a similar error in the ToSQL method when trying to save a document back. The section of JSON causing the issue is shown below. I suspect the connector does not handle the LoyaltyCustomerLink struct being null on the second item in the array.

…,
"LoyaltyAccountLinks": [
{
"LoyaltyBrandCode": "Unknown",
"LoyaltyBrand": "Unknown",
"LoyaltySystemId": "771719",
"LoyaltyCustomerLink": {
"id": "CUST34241_1514493c-f9ec-4ae3-8629-c5831a6749db",
"PartitionKey": "CUST34241"
}
},
{
"LoyaltyBrandCode": "99",
"LoyaltyBrand": "brand1",
"LoyaltySystemId": "46201616",
"LoyaltyCustomerLink": null
}
],
...
The schema inferred by Spark looks OK, i.e. it allows nulls on this property.

|-- LoyaltyAccountLinks: array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- LoyaltySystemId: string (nullable = true)
| | |-- LoyaltyCustomerLink: struct (nullable = true)
| | | |-- PartitionKey: string (nullable = true)
| | | |-- id: string (nullable = true)
| | |-- LoyaltyBrandCode: string (nullable = true)
| | |-- LoyaltyBrand: string (nullable = true)

The error is as follows:

Error in SQL statement: SparkException: Job aborted due to stage failure: Task 0 in stage 64.0 failed 4 times, most recent failure: Lost task 0.3 in stage 64.0 (TID 129, 10.139.64.5, executor 0): scala.MatchError: null (of class cosmosdb_connector_shaded.org.json.JSONObject$Null)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$$anonfun$toSQL$1.apply(CosmosDBRowConverter.scala:104)
at scala.Option.map(Option.scala:146)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$.toSQL(CosmosDBRowConverter.scala:99)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$$anonfun$1$$anonfun$apply$5.apply(CosmosDBRowConverter.scala:93)
at scala.Option.map(Option.scala:146)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$$anonfun$1.apply(CosmosDBRowConverter.scala:93)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$$anonfun$1.apply(CosmosDBRowConverter.scala:84)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$.recordAsRow(CosmosDBRowConverter.scala:84)
at com.microsoft.azure.cosmosdb.spark.schema.CosmosDBRowConverter$$anonfun$toSQL$1.apply(CosmosDBRowConverter.scala:108)
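
For illustration, here is a minimal sketch of the failure mode I suspect, not the connector's actual code. In org.json, a JSON null is represented by the non-null sentinel object JSONObject.NULL (whose toString returns "null") rather than by a Java null, which would explain the "MatchError: null (of class ...JSONObject$Null)" message if the pattern match in toSQL has no case for that sentinel. Everything named below (JsonNull, NullMatchSketch, convertUnsafe, convertSafe) is a hypothetical stand-in invented for this sketch, so it runs without Spark or the connector on the classpath.

```scala
// Hypothetical stand-in for org.json's JSONObject.NULL: a non-null sentinel
// object that represents a JSON null value and prints as "null".
object JsonNull {
  override def toString: String = "null"
}

object NullMatchSketch {
  // Assumed shape of the problem: no case handles the sentinel, so matching
  // on it throws scala.MatchError.
  def convertUnsafe(value: Any): Any = value match {
    case s: String    => s
    case m: Map[_, _] => m
  }

  // Same conversion with an explicit case that maps the sentinel to null,
  // which is what a nullable struct field would need.
  def convertSafe(value: Any): Any = value match {
    case JsonNull     => null
    case s: String    => s
    case m: Map[_, _] => m
  }

  def main(args: Array[String]): Unit = {
    // Corresponds to "LoyaltyCustomerLink": null in the second array element.
    val link: Any = JsonNull

    println(s"safe conversion: ${convertSafe(link)}")

    try {
      convertUnsafe(link)
    } catch {
      case e: MatchError => println(s"unsafe conversion failed: ${e.getMessage}")
    }
  }
}
```

If this guess is right, the nullable schema is not the problem; the converter would need a case for the JSON null sentinel before it tries to treat the value as a struct.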