qubole/rubix

Class not found: com.facebook.presto.hive.PrestoS3FileSystem

Closed this issue · 8 comments

I get the above error when trying to create a table in Hive using the rubix scheme. The log output shared at the bottom is from hive.log.

I installed Rubix as:

rubix_admin installer install --cluster-type presto
rubix_admin daemon start

I set fs.rubix.impl directly in core-site.xml rather than setting it interactively. I'm not sure whether that makes a difference.
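For context, Hadoop picks a FileSystem implementation for a path by looking up the configuration key `fs.<scheme>.impl`, so a mapping in core-site.xml applies to every client that reads that file, Hive included. The sketch below only imitates that lookup with a plain map; `SchemeLookupSketch`, `resolveImpl`, and the example URI are hypothetical names for illustration, not Hadoop or Rubix APIs.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for Hadoop's scheme lookup; not Hadoop's actual code.
class SchemeLookupSketch {
    // Builds the configuration key consulted for a given URI scheme.
    static String implKey(String scheme) {
        return "fs." + scheme + ".impl";
    }

    // Returns the class name configured for the URI's scheme, if any.
    static Optional<String> resolveImpl(Map<String, String> conf, URI location) {
        String scheme = location.getScheme();
        if (scheme == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(conf.get(implKey(scheme)));
    }

    public static void main(String[] args) {
        // The map plays the role of core-site.xml (assumed contents).
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.rubix.impl", "com.qubole.rubix.presto.CachingPrestoS3FileSystem");

        // Hypothetical table location using the rubix scheme.
        URI location = URI.create("rubix://example-bucket/warehouse/table");
        System.out.println(resolveImpl(conf, location).orElse("no mapping"));
    }
}
```

Whatever class name comes back is then instantiated reflectively by the client, which is the `ReflectionUtils.newInstance` frame in the trace below.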

What's interesting is that the class name com.facebook.presto.hive.PrestoS3FileSystem does not match what I see in the source code.
The Rubix source code tries to instantiate com.facebook.presto.hive.**s3**.PrestoS3FileSystem, but I'm not sure why I see a different class name when running on a cluster. I'm running PrestoDB rather than PrestoSQL.

2020-05-04T22:58:41,336 ERROR [a5dea5fa-bb54-4adb-8d69-f3eeab84bcf9 main([])]: ql.Driver (SessionState.java:printError(1130)) - FAILED: SemanticException java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
org.apache.hadoop.hive.ql.parse.SemanticException: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1659)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1651)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1647)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:11988)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:11040)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11153)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:134)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2858)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2896)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2878)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:392)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.tryQualifyPath(BaseSemanticAnalyzer.java:1669)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1656)
        ... 24 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:132)
        ... 32 more
Caused by: java.lang.TypeNotPresentException: Type com.facebook.presto.hive.PrestoS3FileSystem not present
        at sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:117)
        at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
        at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
        at sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68)
        at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138)
        at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
        at sun.reflect.generics.repository.ClassRepository.getSuperclass(ClassRepository.java:90)
        at java.lang.Class.getGenericSuperclass(Class.java:777)
        at com.qubole.rubix.core.CachingFileSystem.getTypeParameterClass(CachingFileSystem.java:75)
        at com.qubole.rubix.core.CachingFileSystem.<init>(CachingFileSystem.java:83)
        at com.qubole.rubix.presto.CachingPrestoS3FileSystem.<init>(CachingPrestoS3FileSystem.java:31)
        ... 37 more
Caused by: java.lang.ClassNotFoundException: com.facebook.presto.hive.PrestoS3FileSystem
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
        ... 47 more

I think I have figured this out. The repo at version rubix-root-0.3.3.1 does indeed use the older class name while I am running a Presto release which has changed the class name.

I think a new RPM build needs to be published.
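A quick way to confirm which of the two class names a given classpath actually provides is to probe for both. This is a minimal sketch, assuming it is run with the same classpath as the failing process; `ClasspathProbe` and `isLoadable` are hypothetical helper names, and only the two class names come from the traces in this thread.

```java
// Hypothetical diagnostic helper; not part of Rubix or Presto.
class ClasspathProbe {
    // Returns true if the named class can be located by the given loader,
    // without running its static initializers.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        String[] candidates = {
            "com.facebook.presto.hive.PrestoS3FileSystem",
            "com.facebook.presto.hive.s3.PrestoS3FileSystem",
        };
        for (String name : candidates) {
            String status = isLoadable(name, loader) ? "found" : "missing";
            System.out.println(name + " -> " + status);
        }
    }
}
```

If both names report missing, the Presto jars are not on that classpath at all, which points to a different problem than a renamed class.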

A few points:

  1. You are right, I see the rpms for rubix-admin have not been updated for quite some time.
  2. I guess you are trying Rubix with prestosql. In that case you do not need to install Rubix using rubix-admin. You don't even need to make changes to core-site.xml; create tables from Hive as you normally would. The failure in your case happens during Hive table creation because, due to the change in core-site.xml, Hive is also trying to use Rubix.

I'm using Rubix with Presto on EMR. As far as I can see, EMR uses prestodb (not prestosql), and prestodb changed the class name in a very old commit, prestodb/presto@f38ec41.

Regarding point 2 above, are you pointing to the fact that Rubix is included as a plugin within prestosql by default? Is the same true for prestodb? (Not to my knowledge.)

Correct me if anything below seems wrong.

In that case for PrestoDB, the steps are:

  1. rubix_admin installer install --rpm path-to-compiled-from-source-rpm
  2. rubix_admin daemon start

No changes need to be made to hive-site.xml?

I was talking about the prestosql commit (trinodb/trino@540e14c). Rubix is not available in prestodb.

For your use case, do this:

  1. Revert the fs.rubix.impl mapping change in core-site.xml. It is getting in the way of Hive queries.
  2. Set up Rubix using rubix-admin. We can get to the latest rpms next, but right now you need a working setup, and the existing rpms should work for that.
  3. Create the table using the rubix scheme as described in the documentation, then access it via Presto.

Got the following error:

2020-05-11T20:43:43,171 INFO  [7d7564f2-82c5-4597-b2c1-ee5d1ea3b5a2 main([])]: parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(11150)) - Starting Semantic Analysis
2020-05-11T20:43:43,182 INFO  [7d7564f2-82c5-4597-b2c1-ee5d1ea3b5a2 main([])]: parse.CalcitePlanner (SemanticAnalyzer.java:analyzeCreateTable(11896)) - Creating table default.bag_s3_parquet_rubix position=22
2020-05-11T20:43:43,293 ERROR [7d7564f2-82c5-4597-b2c1-ee5d1ea3b5a2 main([])]: ql.Driver (SessionState.java:printError(1130)) - FAILED: SemanticException java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
org.apache.hadoop.hive.ql.parse.SemanticException: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1659)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1651)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1647)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:11988)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:11040)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11153)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:134)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2858)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2896)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2878)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:392)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.tryQualifyPath(BaseSemanticAnalyzer.java:1669)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.toReadEntity(BaseSemanticAnalyzer.java:1656)
	... 24 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:132)
	... 32 more
Caused by: java.lang.TypeNotPresentException: Type com.facebook.presto.hive.s3.PrestoS3FileSystem not present
	at sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:117)
	at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:125)
	at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
	at sun.reflect.generics.visitor.Reifier.reifyTypeArguments(Reifier.java:68)
	at sun.reflect.generics.visitor.Reifier.visitClassTypeSignature(Reifier.java:138)
	at sun.reflect.generics.tree.ClassTypeSignature.accept(ClassTypeSignature.java:49)
	at sun.reflect.generics.repository.ClassRepository.getSuperclass(ClassRepository.java:90)
	at java.lang.Class.getGenericSuperclass(Class.java:777)
	at com.qubole.rubix.core.CachingFileSystem.getTypeParameterClass(CachingFileSystem.java:85)
	at com.qubole.rubix.core.CachingFileSystem.<init>(CachingFileSystem.java:93)
	at com.qubole.rubix.presto.CachingPrestoS3FileSystem.<init>(CachingPrestoS3FileSystem.java:31)
	... 37 more
Caused by: java.lang.ClassNotFoundException: com.facebook.presto.hive.s3.PrestoS3FileSystem
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at sun.reflect.generics.factory.CoreReflectionFactory.makeNamedType(CoreReflectionFactory.java:114)
	... 47 more

Sorry for this being so painful. I see that the documentation is wrong where it says to set fs.rubix.impl=com.qubole.rubix.presto.CachingPrestoS3FileSystem on the Hive side, since Hive will not have the Presto jars.

Instead, on the Hive side you need to set fs.rubix.impl=org.apache.hadoop.fs.s3a.S3AFileSystem. I see that rubix-admin sets up Presto fine with the right fs mapping.
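To illustrate why the missing Presto jars surface as a TypeNotPresentException: judging from the stack traces above, the `CachingFileSystem` constructor calls `Class.getGenericSuperclass`, and the JVM only tries to load the class used as the type argument at that moment. The sketch below shows the same reflection pattern with hypothetical classes (`CachingWrapper`, `WrappedBackend`, `CachingWrappedBackend`); it is not the Rubix source.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical classes for illustration; these are not Rubix's sources.
abstract class CachingWrapper<T> {
    // Reads the type argument that the concrete subclass supplied for T.
    // The JVM loads that class here; if it is not on the classpath,
    // getGenericSuperclass throws TypeNotPresentException.
    Class<?> typeParameterClass() {
        Type superType = getClass().getGenericSuperclass();
        ParameterizedType parameterized = (ParameterizedType) superType;
        return (Class<?>) parameterized.getActualTypeArguments()[0];
    }
}

// Plays the role of the wrapped file system class.
class WrappedBackend {}

// Plays the role of the caching subclass that names the wrapped class.
class CachingWrappedBackend extends CachingWrapper<WrappedBackend> {}

class GenericSuperclassSketch {
    public static void main(String[] args) {
        Class<?> wrapped = new CachingWrappedBackend().typeParameterClass();
        System.out.println(wrapped.getSimpleName());
    }
}
```

Here `WrappedBackend` is always present, so the lookup succeeds. In the Hive case the type argument is a Presto class that Hive's classpath does not contain, so the same call fails before any file system work starts.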

Thanks for sticking with me and helping me resolve this @stagraqubole.