Blocking Issue when running on K8s on GCP
richiesgr opened this issue · 1 comments
richiesgr commented
Hi
I run Secor on Google Cloud Kubernetes.
The node pool uses preemptible machines, meaning the machines restart every 24 hours.
I've deployed 7 pods.
The data in Kafka is Avro and I write Avro files; the files are uploaded to a bucket in GCS.
I read data from 5 topics together; the data in the topics is filled using MirrorMaker.
On some pods, after a few hours, I get an exception that sends the pod into a crash loop.
I CANNOT RECOVER FROM HERE. All the failing pods crash the same way when restarted:
java.lang.RuntimeException: Failed to write message Message <message binary>
at com.pinterest.secor.consumer.Consumer.handleWriteError(Consumer.java:272)
at com.pinterest.secor.consumer.Consumer.consumeNextMessage(Consumer.java:232)
at com.pinterest.secor.consumer.Consumer.run(Consumer.java:164)
Caused by: java.io.FileNotFoundException: /mnt/secor_csv/message_logs/partition/1_25/prod-og-monitoring_agg_impressions_sg/dt=2020-12-02/hr=02/1_229_00000000000189724222.gz (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at org.apache.avro.file.SyncableFileOutputStream.<init>(SyncableFileOutputStream.java:58)
at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:134)
at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory$AvroFileWriter.<init>(AvroFileReaderWriterFactory.java:131)
at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory.BuildFileWriter(AvroFileReaderWriterFactory.java:73)
at com.pinterest.secor.util.ReflectionUtil.createFileWriter(ReflectionUtil.java:156)
at com.pinterest.secor.common.FileRegistry.getOrCreateWriter(FileRegistry.java:138)
at com.pinterest.secor.writer.MessageWriter.write(MessageWriter.java:104)
at com.pinterest.secor.consumer.Consumer.writeMessage(Consumer.java:256)
at com.pinterest.secor.consumer.Consumer.consumeNextMessage(Consumer.java:229)
This happens only when using the AvroFileReaderWriterFactory.
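For readers following the trace: `java.io.FileOutputStream` does not create missing parent directories, and it reports a missing parent with the same "(No such file or directory)" text seen above. Whether that is what happens inside Secor here is not confirmed in this thread. The sketch below only reproduces the JDK behaviour in isolation; the class name `MissingParentDemo` and the temporary paths are hypothetical and are not part of Secor.

```java
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MissingParentDemo {
    /** Tries to open the file for writing; returns the exception message, or null on success. */
    public static String tryOpen(Path file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            return null; // the file was created (or truncated) successfully
        } catch (FileNotFoundException e) {
            return e.getMessage(); // thrown when the file cannot be created, e.g. missing parent
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout that mimics the dt=/hr= partition directories in the trace.
        Path root = Files.createTempDirectory("secor-demo");
        Path file = root.resolve("dt=2020-12-02").resolve("hr=02").resolve("example.gz");

        // The parent directories do not exist yet, so the open fails.
        System.out.println("before createDirectories: " + tryOpen(file));

        // Creating the parent chain first lets the same open succeed.
        Files.createDirectories(file.getParent());
        System.out.println("after createDirectories: " + tryOpen(file));
    }
}
```

If the local directory under /mnt/secor_csv is removed while the consumer is running (for example by a cleanup step or a node restart), the same failure would be expected on the next file creation; that is an assumption to verify, not a diagnosis.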
HenryCaiHaiying commented
This is a bit weird. Maybe file creation is delayed on GCP, or there is a
directory-permission or system-quota problem? If the problem is intermittent,
you could look at the last few classes in the call stack to see whether
there is a race condition and whether there are debug flags you can turn
on for debugging.
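One way to check the directory, permission and quota guesses from inside a failing pod is a small standalone probe run against the path in the exception message. This is a minimal sketch using only standard JDK calls; `PathDiagnostics` is a hypothetical helper and not something Secor ships.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathDiagnostics {
    /** Returns the deepest existing ancestor of the path (or the path itself), or null if none exists. */
    public static Path nearestExisting(Path p) {
        Path cur = p.toAbsolutePath();
        while (cur != null && !Files.exists(cur)) {
            cur = cur.getParent();
        }
        return cur;
    }

    /** Builds a short report: does the parent exist, where does the chain break, is it writable, how much space is left. */
    public static String describe(Path target) {
        Path parent = target.toAbsolutePath().getParent();
        Path existing = nearestExisting(target);
        StringBuilder sb = new StringBuilder();
        sb.append("target: ").append(target).append('\n');
        sb.append("parent exists: ").append(parent != null && Files.isDirectory(parent)).append('\n');
        sb.append("nearest existing ancestor: ").append(existing).append('\n');
        if (existing != null) {
            sb.append("ancestor writable: ").append(Files.isWritable(existing)).append('\n');
            sb.append("usable bytes: ").append(existing.toFile().getUsableSpace()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: java PathDiagnostics <path from the exception message>");
            return;
        }
        System.out.println(describe(Paths.get(args[0])));
    }
}
```

If "parent exists" is false, the nearest existing ancestor shows where the directory chain breaks; a non-writable ancestor or a very low usable-bytes figure would point toward the permission or quota explanations instead.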
In reply to Richard Grossman's message of Wed, Dec 2, 2020 at 1:05 AM.