Support projections in ParquetAvroFileOperations/ParquetAvroSortedBucketIO
clairemcginty opened this issue · 1 comments
clairemcginty commented
ParquetAvroFileOperations always overrides the "projection" option with the full reflected schema, so you can't supply a projection for a SpecificRecord class.
clairemcginty commented
#5083 provides a workaround for this via the Configuration parameter:
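import com.spotify.scio.parquet.ParquetConfiguration
import org.apache.avro.Schema
import org.apache.beam.sdk.extensions.smb.ParquetAvroSortedBucketIO
import org.apache.parquet.avro.AvroReadSupport

// Avro schema containing only the fields to read
val projection: Schema = ...
val configuration = ParquetConfiguration.empty()
// Store the projection in the Hadoop Configuration that is passed to the reader
AvroReadSupport.setRequestedProjection(configuration, projection)
val read = ParquetAvroSortedBucketIO
  .read(tupleTag, classOf[TestRecord])
  .from(...)
  .withConfiguration(configuration)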
In 0.14 we can add projection as a Builder method to ParquetAvroSortedBucketIO.
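The workaround above leaves the `projection` schema elided. As a rough sketch only, one way to derive such a schema is to copy a subset of top-level fields out of the full record schema using the plain Avro API. The `ProjectionSketch` object, its `project` helper, and the `ExampleRecord` schema are hypothetical names introduced for illustration and are not part of Scio or Parquet. The sketch assumes Avro 1.9+ (for the `Schema.Field` copy constructor) and Scala 2.13 (for `scala.jdk.CollectionConverters`).

```scala
import org.apache.avro.Schema
import scala.jdk.CollectionConverters._

object ProjectionSketch {
  // Hypothetical helper: builds a record schema that keeps only the named
  // top-level fields of `full`, preserving the record name and namespace.
  def project(full: Schema, keep: Set[String]): Schema = {
    val fields = full.getFields.asScala
      .filter(f => keep.contains(f.name))
      // A Schema.Field cannot be attached to two schemas, so copy each one.
      .map(f => new Schema.Field(f, f.schema))
      .asJava
    Schema.createRecord(full.getName, full.getDoc, full.getNamespace, full.isError, fields)
  }

  // Illustrative schema only; a real job would start from the SpecificRecord's schema.
  val full: Schema = new Schema.Parser().parse(
    """{"type":"record","name":"ExampleRecord","namespace":"example",
      |"fields":[{"name":"id","type":"int"},{"name":"name","type":"string"}]}""".stripMargin
  )

  val projection: Schema = project(full, Set("id"))
}
```

The resulting `projection` could then be passed to `AvroReadSupport.setRequestedProjection` as in the workaround. Keeping the original record name and namespace is a design choice made here so the projected schema still resolves against the written schema; nested-field projections would need a recursive variant.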