GoogleCloudDataproc/spark-bigquery-connector

Map type with Complex Value not supported any more

walmaaoui opened this issue · 1 comments

I am seeing what could be a regression.

The following code works with connector version 0.25.2 but fails with 0.34.0. Is this an expected or intended change?

    import spark.implicits._ // needed for .toDS; assumes a SparkSession named `spark`

    case class Complex(v: Int)
    case class NestedComplexMapType(nested: Map[Int, Complex])

    val ds1 = Seq(NestedComplexMapType(Map(0 -> Complex(1)))).toDS

    ds1.write
      .format("bigquery")
      .option("temporaryGcsBucket", temporalBucket)
      .option("intermediateFormat", "orc")
      .option("dataset", "it_test")
      .mode("overwrite")
      .save("nested_map_complex_type")

The dataframe schema is

    root
     |-- nested: map (nullable = true)
     |    |-- key: integer
     |    |-- value: struct (valueContainsNull = true)
     |    |    |-- v: integer (nullable = false)

The above:

  • Works with version 0.25.2 and produces a table on BQ (its schema is shown in the attached screenshot, "Screenshot 2024-03-05 at 15 42 51")
  • Fails with version 0.34.0 with the following exception:
[info]   java.lang.IllegalArgumentException: Data type not expected: struct<v:int>
[info]   at com.google.cloud.spark.bigquery.SchemaConverters.toBigQueryType(SchemaConverters.java:607)
[info]   at com.google.cloud.spark.bigquery.SchemaConverters.createBigQueryColumn(SchemaConverters.java:510)
[info]   at com.google.cloud.spark.bigquery.SchemaConverters.sparkToBigQueryFields(SchemaConverters.java:468)
[info]   at com.google.cloud.spark.bigquery.SchemaConverters.toBigQuerySchema(SchemaConverters.java:456)
[info]   at com.google.cloud.spark.bigquery.write.BigQueryWriteHelper.<init>(BigQueryWriteHelper.java:95)

Although BQ doesn't support Maps:

  • Up to 0.25.2, the connector accepted Maps and converted them to a List of Records (the exact schema depends on the intermediate file format chosen)
  • In 0.34.0, the connector still accepts Maps whose values are simple types but rejects Maps whose values are complex types, which is inconsistent
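
To make the "List of Records" shape concrete, here is a minimal plain-Scala sketch of that conversion, done by hand. The names `MapAsEntries`, `Entry`, `NestedAsEntries`, and `toEntries` are hypothetical and chosen only for illustration, as are the `key` and `value` field names. This is not the connector's internal code, and the schema it yields has not been checked against what 0.25.2 produced.

```scala
object MapAsEntries {
  // Same case classes as in the reproduction above.
  case class Complex(v: Int)
  case class NestedComplexMapType(nested: Map[Int, Complex])

  // Hypothetical record type: one key/value pair of the original map.
  case class Entry(key: Int, value: Complex)

  // Hypothetical row type: the map column replaced by a list of records.
  case class NestedAsEntries(nested: Seq[Entry])

  // Convert one row. Sorting by key is only there to make the
  // resulting order deterministic; a Map itself has no defined order.
  def toEntries(row: NestedComplexMapType): NestedAsEntries =
    NestedAsEntries(
      row.nested.toSeq.sortBy(_._1).map { case (k, v) => Entry(k, v) }
    )
}
```

In a Spark job, something like `ds1.map(MapAsEntries.toEntries)` before `.write` could serve as a workaround on affected versions. That assumes `spark.implicits._` is in scope and that `ds1` is built from the case classes inside `MapAsEntries`; the column then becomes an array of structs rather than a map.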

This is fixed; the fix will be available in the next release of the connector (0.38).