Presto 0.157.1 + Lzop: NullPointerException
idanh opened this issue · 1 comments
Hey guys,
I've been successfully using your library with EMR (emr-5.3.1) and Hive (2.1.1) with LZOP_X1 (no constraints). Now that I'm moving to Presto (0.157.1), I get the following stack trace:
com.facebook.presto.spi.PrestoException: java.lang.reflect.InvocationTargetException
at com.facebook.presto.hive.HiveSplitSource.propagatePrestoException(HiveSplitSource.java:137)
at com.facebook.presto.hive.HiveSplitSource.isFinished(HiveSplitSource.java:115)
at com.facebook.presto.split.ConnectorAwareSplitSource.isFinished(ConnectorAwareSplitSource.java:63)
at com.facebook.presto.split.BufferingSplitSource.fetchSplits(BufferingSplitSource.java:59)
at com.facebook.presto.split.BufferingSplitSource.lambda$fetchSplits$1(BufferingSplitSource.java:65)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:580)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at com.facebook.presto.hive.HiveUtil.isSplittable(HiveUtil.java:276)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:246)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:78)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:179)
at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
... 4 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.facebook.presto.hive.HiveUtil.isSplittable(HiveUtil.java:273)
... 9 more
Caused by: java.lang.NullPointerException
at com.hadoop.mapred.DeprecatedLzoTextInputFormat.isSplitable(DeprecatedLzoTextInputFormat.java:101)
... 14 more
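For anyone reading the chain of causes above: the trace shows HiveUtil.isSplittable calling Method.invoke, and Java reflection wraps any exception thrown by the invoked method in an InvocationTargetException. The real failure is therefore the innermost cause, the NullPointerException. A minimal, self-contained sketch of that wrapping follows; FakeInputFormat and rootCauseOfReflectiveCall are hypothetical names for illustration and are not Presto or hadoop-lzo code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReflectionCauseSketch {
    // Hypothetical stand-in for an input format whose check dereferences null.
    public static class FakeInputFormat {
        public boolean isSplitable(String path) {
            String index = null;      // simulates a lookup that found nothing
            return !index.isEmpty();  // throws NullPointerException
        }
    }

    // Calls isSplitable reflectively and returns the underlying cause,
    // or null if the call completed without throwing.
    public static Throwable rootCauseOfReflectiveCall(Object target, String path) {
        try {
            Method method = target.getClass().getMethod("isSplitable", String.class);
            method.invoke(target, path);
            return null;
        } catch (InvocationTargetException e) {
            // Method.invoke wraps whatever the target method threw.
            return e.getCause();
        } catch (ReflectiveOperationException e) {
            // Lookup or access problems are unrelated to the target's own failure.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Throwable cause = rootCauseOfReflectiveCall(new FakeInputFormat(), "s3://bucket/data.lzo");
        System.out.println(cause.getClass().getName()); // prints java.lang.NullPointerException
    }
}
```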
The query that triggers this exception works fine in Hive. It's basically:
select * from table limit 10;
I've added an .lzo.index next to my lzop file in S3, but to no avail.
As far as I can tell, DeprecatedLzoTextInputFormat.class has a member called indexes which, if not populated properly, leads to an NPE here: https://github.com/twitter/hadoop-lzo/blob/master/src/main/java/com/hadoop/mapred/DeprecatedLzoTextInputFormat.java#L101, since no null check is made on LzoIndex index.
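The failure mode described above can be sketched in isolation. This is an illustration under stated assumptions, not the hadoop-lzo source: FakeLzoIndex, the String-keyed map, and both method names are hypothetical stand-ins. It assumes only what the post describes, namely that the index is looked up in a map and then used without a null check.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitCheckSketch {
    // Hypothetical stand-in for an LZO index: the block offsets read from an .lzo.index file.
    public static final class FakeLzoIndex {
        private final long[] blockOffsets;

        public FakeLzoIndex(long... blockOffsets) {
            this.blockOffsets = blockOffsets;
        }

        public boolean isEmpty() {
            return blockOffsets.length == 0;
        }
    }

    // Unguarded lookup: throws NullPointerException when the path was never registered.
    public static boolean isSplitableUnguarded(Map<String, FakeLzoIndex> indexes, String path) {
        FakeLzoIndex index = indexes.get(path);
        return !index.isEmpty();
    }

    // Guarded lookup: a missing index simply means the file is treated as not splittable.
    public static boolean isSplitableGuarded(Map<String, FakeLzoIndex> indexes, String path) {
        FakeLzoIndex index = indexes.get(path);
        return index != null && !index.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, FakeLzoIndex> indexes = new HashMap<>();
        indexes.put("s3://bucket/indexed.lzo", new FakeLzoIndex(0L, 262144L));

        System.out.println(isSplitableGuarded(indexes, "s3://bucket/indexed.lzo")); // prints true
        System.out.println(isSplitableGuarded(indexes, "s3://bucket/unknown.lzo")); // prints false
    }
}
```

If the map is only filled by a separate step that the caller skips, the unguarded version fails for every path, whether or not an index file exists next to the data. That would be consistent with the symptom here, though whether it is the actual cause is an assumption.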
I presumed that with your library I could get past that check, but it seems like it's not working.
I'm using aircompressor-0.9.jar. I've copied it to /usr/lib/presto/plugin/hive-hadoop2 and removed any older version that was in there.
I am confident that your code is actually being called (from the stack trace, and from many tests I've done with and without the aircompressor jar).
So, my question: did you guys ever manage to resolve this?
Relevant EMR cluster configuration:
{
  "classification": "core-site",
  "properties": {
    "io.compression.codec.lzo.class": "io.airlift.compress.lzo.LzopCodec",
    "io.compression.codecs": "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,io.airlift.compress.lzo.LzoCodec,io.airlift.compress.lzo.LzopCodec"
  },
  "configurations": []
}
Thank you very much!
- Idan
Looking at DeprecatedLzoTextInputFormat, it doesn't seem to have anything to do with the compression implementation; it appears to be about creating splits instead. If you are still having this issue with Presto, I suggest filing an issue there (maybe with instructions to reproduce).