samelamin/spark-bigquery

Bug: Streaming Window does not write window.StartTime and window.endTime

Closed this issue · 1 comments

Hi Sam,

Hope you are still enjoying some of the relaxation of your hols and are well rested!

Quick one: I've been trying to build a streaming DataFrame using windowing and watermarking. I'm getting data written to BQ, but the target table and schema produced in BQ (and thus the data written) have NO window start and end time, which Spark usually produces itself when the windowing function is applied, e.g.:

        .withWatermark("timestamp", "10 minutes")
        .groupBy(
          window($"timestamp","10 minutes","5 minutes")
        )

See the screenshot below:

[Image: screenshot 2017-08-11 21 23 07]

Data does get written to BQ, but I don't have the usual window start and end time columns that grouping by a window function normally gives.
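For reference, here is a minimal sketch of what grouping by `window(...)` normally yields in Spark: a single struct column named `window` with `start` and `end` timestamp fields. It uses a small batch DataFrame rather than a stream (and so omits the watermark) to keep it self-contained. The helper `flattenWindow` and the column names `window_start` / `window_end` are hypothetical, and promoting the nested fields to top-level columns is only a possible workaround to try, on the assumption that the nested struct is what gets lost on write. The cause has not been confirmed here.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, window}

object WindowColumnsSketch {
  // Hypothetical helper: promote the nested window.start / window.end
  // fields to top-level columns and drop the struct column.
  def flattenWindow(df: DataFrame): DataFrame =
    df.withColumn("window_start", col("window.start"))
      .withColumn("window_end", col("window.end"))
      .drop("window")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("window-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Two sample events; a batch DataFrame stands in for the stream.
    val events = Seq(
      Timestamp.valueOf("2017-08-11 21:03:00"),
      Timestamp.valueOf("2017-08-11 21:07:00")
    ).toDF("timestamp")

    // Same windowing as in the snippet above: 10-minute windows sliding every 5 minutes.
    val windowed = events
      .groupBy(window($"timestamp", "10 minutes", "5 minutes"))
      .count()

    // Schema contains a struct column `window` with fields `start` and `end`.
    windowed.printSchema()

    // After flattening, start and end are plain top-level timestamp columns.
    flattenWindow(windowed).printSchema()

    spark.stop()
  }
}
```

If the flattened columns do show up in BQ while the nested `window` struct does not, that would point at how struct columns are handled on write.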

Cheers
Kurt

I have an idea as to why; will investigate.