TPC-DS.. dataGen.. format
kgebaly opened this issue · 2 comments
kgebaly commented
table.genData(tableLocation, format, overwrite, clusterByPartitionColumns,
What value does format take when generating TPC-DS benchmarks?
npaluskar commented
format is the type of the output data. It has to be passed as a string; that is what I found in Tables.scala:
def genData(
    location: String,
    format: String,
    overwrite: Boolean,
    clusterByPartitionColumns: Boolean,
    filterOutNullPartitionValues: Boolean,
    numPartitions: Int)
e.g. "text"
So you can call something like tables.genData("/path/to_Data", "text", true, true, true, true, true)
sridharpothamsetti commented
We can also use parquet, avro, etc. I tried with parquet.
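To illustrate what a format string such as "text" or "parquet" means to Spark, here is a minimal sketch. It assumes that genData forwards the string to Spark's DataFrameWriter.format, which the thread does not show. The object name FormatSketch, the two-row DataFrame, and the /tmp/format_sketch path are hypothetical stand-ins, not part of the benchmark code.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object FormatSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("format-sketch")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Stand-in data; the real generator produces the TPC-DS tables.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Hypothetical output location.
    val location = "/tmp/format_sketch"

    // The format string selects the Spark data source used for writing.
    // Built-in short names include "parquet", "orc", "json", "csv" and "text".
    val format = "parquet"

    df.write.format(format).mode(SaveMode.Overwrite).save(location)

    spark.stop()
  }
}
```

Two caveats when trying other values: Spark's "text" source only accepts a single string column, so this two-column stand-in would fail with "text", and "avro" is provided by the separate spark-avro module, which has to be on the classpath.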