YotpoLtd/metorikku

Hudi: how to use no partitions or a non-date partition?


How do I define no partitions, an 'unknown' column partition, or a partition that is a string?

Caused by: java.lang.IllegalArgumentException: Partition path default is not in the form yyyy/mm/dd
at com.uber.hoodie.hive.SlashEncodedDayPartitionValueExtractor.extractPartitionValuesInPath(SlashEncodedDayPartitionValueExtractor.java:54)
at com.uber.hoodie.hive.HoodieHiveClient.getPartitionEvents(HoodieHiveClient.java:216)
at com.uber.hoodie.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:160)
... 62 more

We have two options. Either omit the partitionBy config, in which case partitions are ignored and com.uber.hoodie.NonpartitionedKeyGenerator is used as the key generator, or write a simple constant 'unknown' partition:

steps:
  - dataFrameName: table_data
    sql:
      SELECT *, 'default' as default
      FROM table

output:
  - outputType: Hudi
    dataFrameName: table_data
    outputOptions:
      path: path.parquet
      keyColumn: id
      timeColumn: date
      saveMode: Append
      hivePartitions: default
      partitionBy: default
      tableName: hive_table
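
For the first option (no partitions), a minimal sketch of the output block might look like the following. It reuses the option names from the config above and simply drops partitionBy. Leaving out hivePartitions as well is an assumption, not something confirmed in this thread. With this variant the SQL step would not need the extra constant column.

```yaml
output:
  - outputType: Hudi
    dataFrameName: table_data
    outputOptions:
      path: path.parquet
      keyColumn: id
      timeColumn: date
      saveMode: Append
      # partitionBy is omitted on purpose, so partitions are ignored
      # hivePartitions is omitted too (assumption: not needed without partitions)
      tableName: hive_table
```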