GoogleCloudPlatform/DataflowJavaSDK

Streaming dataflow fails with org.xerial.snappy.SnappyIOException [v2.1.0]

ankurcha opened this issue · 6 comments

While running a streaming Dataflow pipeline against messages from Pub/Sub, I encountered this error:

java.lang.RuntimeException: java.lang.IllegalArgumentException: unable to deserialize WindowFn
    at com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:288)
    at com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:258)
    at com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:55)
    at com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:43)
    at com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:78)
    at com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:145)
    at com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:932)
    at com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:134)
    at com.google.cloud.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:778)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: unable to deserialize WindowFn
    at org.apache.beam.sdk.util.SerializableUtils.deserializeFromByteArray(SerializableUtils.java:75)
    at org.apache.beam.runners.core.construction.WindowingStrategyTranslation.windowFnFromProto(WindowingStrategyTranslation.java:380)
    at org.apache.beam.runners.core.construction.WindowingStrategyTranslation.fromProto(WindowingStrategyTranslation.java:336)
    at com.google.cloud.dataflow.worker.GroupAlsoByWindowParDoFnFactory.deserializeWindowingStrategy(GroupAlsoByWindowParDoFnFactory.java:239)
    at com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory.create(AssignWindowsParDoFnFactory.java:59)
    at com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:66)
    at com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:365)
    at com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:276)
    ... 11 more
Caused by: org.xerial.snappy.SnappyIOException: [EMPTY_INPUT] Cannot decompress empty stream
    at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:94)
    at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:59)
    at org.apache.beam.sdk.util.SerializableUtils.deserializeFromByteArray(SerializableUtils.java:70)
    ... 18 more

It seems the pipeline does not do any work at all, as it fails during startup.

Job Id: "2017-09-08_15_26_06-8381780606637465333"

We're also seeing this, but related to a BigQuery write. All of our Dataflow jobs are failing because of it.

We are experiencing the same issue while running batch jobs that read from BigQuery.
It first appeared around 2017-09-08 23:00 UTC; we made no changes to the pipeline.

Version: Apache Beam SDK for Java 2.1.0
Region: us-central1-f... (Output truncated)

We've rerun the pipelines several times, and they always crash just as the worker machines become ready. None of the functions report any elements in either their input or output collections.

jkff commented

We are working on a rollback of a faulty 2.1.0 worker release. In the meantime, the workaround is to go back to 2.0.0.
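For Maven users, rolling back amounts to pinning the SDK version in `pom.xml`. A minimal sketch, assuming you depend on the Apache Beam artifacts directly (adjust the `artifactId`s if you use the `google-cloud-dataflow-java-sdk-all` distribution instead):

```xml
<!-- Pin the Beam SDK and the Dataflow runner back to 2.0.0
     until the faulty 2.1.0 worker release is rolled back. -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-core</artifactId>
  <version>2.0.0</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
  <version>2.0.0</version>
</dependency>
```

After the change, rebuild and resubmit the job so the workers pick up the 2.0.0 SDK.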

Thanks for the quick turnaround!

jkff commented

We believe the issue has been fixed. Please try again.

Thanks! Confirmed that the exceptions are gone and messages are being processed in new pipelines.