salesforce/TransmogrifAI

Does not work on CDH 6.3.2: throws NoClassDefFoundError (com.fasterxml.jackson.module.scala.modifiers.EitherModule)

faceany opened this issue · 3 comments

Describe the bug
CDH 6.3.2 ships Jackson 2.9.9, but TransmogrifAI 0.7.0 uses 2.7.3. When the job runs on a YARN cluster, it throws NoClassDefFoundError: com.fasterxml.jackson.module.scala.modifiers.EitherModule
Jackson jars on the cluster: [screenshot]

To Reproduce
Train any model.

Expected behavior
Training completes successfully.

Logs or screenshots

21/01/28 17:35:29 INFO yarn.Client: 
         client token: N/A
         diagnostics: User class threw exception: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/modifiers/EitherModule
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at com.salesforce.op.utils.json.JsonLike$class.toJson(JsonUtils.scala:179)
        at com.salesforce.op.stages.impl.feature.LinearScalerArgs.toJson(ScalingArgs.scala:52)
        at com.salesforce.op.stages.impl.feature.ScalerMetadata.toMetadata(ScalerTransformer.scala:108)
        at com.salesforce.op.stages.impl.feature.OpScalarStandardScaler.fitFn(OpScalarStandardScaler.scala:70)
        at com.salesforce.op.stages.base.unary.UnaryEstimator.fit(UnaryEstimator.scala:94)
        at com.salesforce.op.stages.base.unary.UnaryEstimator.fit(UnaryEstimator.scala:56)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:264)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:263)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at com.salesforce.op.utils.stages.FitStagesUtil$.com$salesforce$op$utils$stages$FitStagesUtil$$fitAndTransformLayer(FitStagesUtil.scala:263)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:226)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:224)
        at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
        at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
        at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:186)
        at com.salesforce.op.utils.stages.FitStagesUtil$.fitAndTransformDAG(FitStagesUtil.scala:224)
        at com.salesforce.op.OpWorkflow.fitStages(OpWorkflow.scala:407)
        at com.salesforce.op.OpWorkflowV2.train(OpWorkflowV2.scala:21)
        at workflow.models.training.TrainModel.trainModel(TrainModel.scala:58)
        at workflow.models.training.BinaryClassificationModel.train(BinaryClassificationModel.scala:30)
        at workflow.utils.ModelTrainUtil$.train(ModelTrainUtil.scala:53)
        at workflow.ModelApp$.main(ModelApp.scala:114)
        at workflow.ModelApp.main(ModelApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.module.scala.modifiers.EitherModule
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 55 more
         ApplicationMaster host: agent1.cdh
         ApplicationMaster RPC port: 38811
         queue: root.tiwisdom
         start time: 1611826496668
         final status: FAILED
         tracking URL: http://agent1.cdh:8088/proxy/application_1611193745794_0438/
         user: qxadmin
21/01/28 17:35:29 ERROR yarn.Client: Application diagnostics message: User class threw exception: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/modifiers/EitherModule
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at com.salesforce.op.utils.json.JsonLike$class.toJson(JsonUtils.scala:179)
        at com.salesforce.op.stages.impl.feature.LinearScalerArgs.toJson(ScalingArgs.scala:52)
        at com.salesforce.op.stages.impl.feature.ScalerMetadata.toMetadata(ScalerTransformer.scala:108)
        at com.salesforce.op.stages.impl.feature.OpScalarStandardScaler.fitFn(OpScalarStandardScaler.scala:70)
        at com.salesforce.op.stages.base.unary.UnaryEstimator.fit(UnaryEstimator.scala:94)
        at com.salesforce.op.stages.base.unary.UnaryEstimator.fit(UnaryEstimator.scala:56)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:264)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$20.apply(FitStagesUtil.scala:263)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at com.salesforce.op.utils.stages.FitStagesUtil$.com$salesforce$op$utils$stages$FitStagesUtil$$fitAndTransformLayer(FitStagesUtil.scala:263)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:226)
        at com.salesforce.op.utils.stages.FitStagesUtil$$anonfun$17.apply(FitStagesUtil.scala:224)
        at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
        at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
        at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:186)
        at com.salesforce.op.utils.stages.FitStagesUtil$.fitAndTransformDAG(FitStagesUtil.scala:224)
        at com.salesforce.op.OpWorkflow.fitStages(OpWorkflow.scala:407)
        at com.salesforce.op.OpWorkflowV2.train(OpWorkflowV2.scala:21)
        at workflow.models.training.TrainModel.trainModel(TrainModel.scala:58)
        at workflow.models.training.BinaryClassificationModel.train(BinaryClassificationModel.scala:30)
        at workflow.utils.ModelTrainUtil$.train(ModelTrainUtil.scala:53)
        at workflow.ModelApp$.main(ModelApp.scala:114)
        at workflow.ModelApp.main(ModelApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.module.scala.modifiers.EitherModule
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 55 more

Exception in thread "main" org.apache.spark.SparkException: Application application_1611193745794_0438 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1158)
        at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1606)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Additional context
Spark 2.4.0
CDH 6.3.2
TransmogrifAI 0.7.0

What does your POM look like? Are you excluding Jackson from your TransmogrifAI dependency?

@leahmcguire
The Jackson, Hadoop, and Spark jars are provided. The core POM configuration is as follows:

    <properties>
        <encoding>UTF-8</encoding>
        <java.version>1.8</java.version>
        <scala.version>2.11.8</scala.version>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <spring-cloud.version>Greenwich.SR4</spring-cloud.version>
        <mysql.version>5.1.34</mysql.version>
        <commons.version>3.5</commons.version>
        <config.version>1.3.1</config.version>
        <scalikejdbc.version>2.5.0</scalikejdbc.version>
        <scala.binary.version>2.11</scala.binary.version>
        <json4s-version>3.2.11</json4s-version>
        <postgresql-version>42.2.14</postgresql-version>
        <commons-codec-version>1.15</commons-codec-version>
        <hive.version>2.2.0</hive.version>
    </properties>
        <profile>
            <id>x86_spark24</id>
            <properties>
                <transmogrifai.version>0.7.0</transmogrifai.version>
                <spark.version>2.4.0</spark.version>
                <xgboost4j.version>0.90</xgboost4j.version>
                <spark.dependence.scope>provided</spark.dependence.scope>
            </properties>

            <activation>
                <activeByDefault>true</activeByDefault>
            </activation>
        </profile>
            <dependency>
                <groupId>org.json4s</groupId>
                <artifactId>json4s-native_${scala.binary.version}</artifactId>
            </dependency>

            <dependency>
                <groupId>commons-codec</groupId>
                <artifactId>commons-codec</artifactId>
                <version>${commons-codec-version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${spark.dependence.scope}</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-sql_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${spark.dependence.scope}</scope>
            </dependency> 
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-hive_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${spark.dependence.scope}</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-mllib_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${spark.dependence.scope}</scope>
            </dependency>
            <dependency>
                <groupId>mysql</groupId>
                <artifactId>mysql-connector-java</artifactId>
                <version>${mysql.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.commons</groupId>
                <artifactId>commons-lang3</artifactId>
                <version>${commons.version}</version>
            </dependency>
            <dependency>
                <groupId>ml.dmlc</groupId>
                <artifactId>xgboost4j</artifactId>
                <version>${xgboost4j.version}</version>
            </dependency>
            <dependency>
                <groupId>ml.dmlc</groupId>
                <artifactId>xgboost4j-spark</artifactId>
                <version>${xgboost4j.version}</version>
            </dependency>
            <dependency>
                <groupId>com.salesforce.transmogrifai</groupId>
                <artifactId>transmogrifai-core_2.11</artifactId>
                <version>${transmogrifai.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j</artifactId>
                    </exclusion>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j-spark</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>
            <dependency>
                <groupId>com.salesforce.transmogrifai</groupId>
                <artifactId>transmogrifai-readers_2.11</artifactId>
                <version>${transmogrifai.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j</artifactId>
                    </exclusion>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j-spark</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>

            <dependency>
                <groupId>com.salesforce.transmogrifai</groupId>
                <artifactId>transmogrifai-features_2.11</artifactId>
                <version>${transmogrifai.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j</artifactId>
                    </exclusion>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j-spark</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>

            <dependency>
                <groupId>com.salesforce.transmogrifai</groupId>
                <artifactId>transmogrifai-utils_2.11</artifactId>
                <version>${transmogrifai.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j</artifactId>
                    </exclusion>
                    <exclusion>
                        <groupId>ml.dmlc</groupId>
                        <artifactId>xgboost4j-spark</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>

The only non-standard Jackson dependency we use is jackson-dataformat-yaml:

<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-yaml</artifactId>
<version>2.7.3</version>

Moreover, we exclude the com.fasterxml.jackson.core group from it (i.e. no transitive Jackson dependencies are included either):

compile ("com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:$jacksonVersion") { exclude group: "com.fasterxml.jackson.core" }

Therefore, I don't think this is what causes the error you're getting.

Instead, please make sure you have jackson-module-scala on your classpath - https://mvnrepository.com/artifact/com.fasterxml.jackson.module/jackson-module-scala_2.11

Update: after looking at your screenshot, I realized that you should already have jackson-module-scala version 2.9.9 on your classpath. It does contain EitherModule, but under a different package name, com.fasterxml.jackson.module.scala, while we expect it to be in com.fasterxml.jackson.module.scala.modifiers. I think it's worth the effort to try to rework the TransmogrifAI code to use the newer package name.
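
In case it helps confirm this on the cluster, here is a minimal diagnostic sketch (not part of TransmogrifAI). The class name JacksonClasspathCheck and the helper locate are hypothetical names chosen for illustration. It only uses Class.forName and the class's CodeSource to report which of the two package names resolves and from which jar. It assumes it is launched with the same classpath as the failing job; in yarn-cluster mode the driver's classpath can differ from the submitting machine's, so the result is only indicative.

```java
import java.security.CodeSource;

public class JacksonClasspathCheck {

    /** Returns the location a class is loaded from, or null if it cannot be loaded. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers
            Class<?> cls = Class.forName(className, false,
                    JacksonClasspathCheck.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "(no code source, e.g. a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is absent, or present but not loadable against this classpath
            return null;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            // Package name reported in the stack trace
            "com.fasterxml.jackson.module.scala.modifiers.EitherModule",
            // Package name reported for jackson-module-scala 2.9.9 in this thread
            "com.fasterxml.jackson.module.scala.EitherModule",
            // Reference point: shows which jackson-databind jar wins
            "com.fasterxml.jackson.databind.ObjectMapper"
        };
        for (String name : candidates) {
            String location = locate(name);
            System.out.println(name + " -> " + (location == null ? "NOT FOUND" : location));
        }
    }
}
```

If the first name prints NOT FOUND while the second resolves to a 2.9.9 jar, that would be consistent with the package-name mismatch described above.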