Private Schema and other private methods and constructors
degloff opened this issue · 1 comment
degloff commented
Is there a reason why the convenience object Schema is private?
private[timeseries] object Schema
For instance:
// preferred, but does not compile because Schema is private
val tsRdd = TimeSeriesRDD.fromRDD(
  sc.parallelize(data, defaultNumPartitions),
  Schema("time" -> LongType, "id" -> IntegerType, "price" -> DoubleType)
)(isSorted = true, timeUnit = TimeUnit.NANOSECONDS)

// current workaround: build the StructType by hand
val schema = StructType(
  StructField("time", LongType) ::
  StructField("id", IntegerType) ::
  StructField("price", DoubleType) :: Nil)
val tsRdd1 = TimeSeriesRDD.fromRDD(
  sc.parallelize(data, defaultNumPartitions),
  schema
)(isSorted = true, timeUnit = TimeUnit.NANOSECONDS)
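The hand-built StructType above can be wrapped in a small user-side helper that gives roughly the same call shape as Schema. This is only a sketch: `SchemaHelper` and `SchemaHelperExample` are hypothetical names that are not part of Flint, the code relies only on Spark SQL's public types, and it makes no attempt to replicate any validation or defaults the private Schema object may apply.

```scala
import org.apache.spark.sql.types.{DataType, DoubleType, IntegerType, LongType, StructField, StructType}

// Hypothetical user-side helper (not part of Flint): builds a StructType
// from (column name -> data type) pairs, preserving the given order.
object SchemaHelper {
  def of(columns: (String, DataType)*): StructType =
    StructType(columns.map { case (name, dataType) => StructField(name, dataType) })
}

object SchemaHelperExample {
  // Same columns as the hand-built schema above; fields are nullable,
  // which is StructField's default.
  val schema: StructType =
    SchemaHelper.of("time" -> LongType, "id" -> IntegerType, "price" -> DoubleType)
}
```

The resulting `schema` can then be passed to `TimeSeriesRDD.fromRDD` in the same way as the hand-built one.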
Also, some TimeSeriesRDD constructors that would be useful are private:
private[timeseries] def fromSeq(
  sc: SparkContext,
  rows: Seq[InternalRow],
  schema: StructType,
  isSorted: Boolean,
  numSlices: Int = 1
): TimeSeriesRDD

private[flint] def fromOrderedRDD(
  rdd: OrderedRDD[Long, Row],
  schema: StructType
): TimeSeriesRDD = {
  val converter = CatalystTypeConvertersWrapper.toCatalystRowConverter(schema)
  TimeSeriesRDD.fromInternalOrderedRDD(rdd.mapValues {
    case (_, row) => converter(row)
  }, schema)
}
Also, for testing, access to the underlying OrderedRDD would be valuable, but that accessor is private as well:
private[flint] def orderedRdd: OrderedRDD[Long, InternalRow]
Exposing it may open up the implementation too much, though.
icexelloss commented
Hi,
The Schema object can probably be made public for convenience, but it shouldn't be considered a stable public API.
The private constructors are purely for internal use and can change drastically, so I prefer not to open them.