RedisLabs/spark-redis

NullPointerException save to redis cluster

I get the following error when saving a DataFrame to a Redis cluster:
java.lang.NullPointerException
at java.util.regex.Matcher.getTextLength(Matcher.java:1283)
at java.util.regex.Matcher.reset(Matcher.java:309)
at java.util.regex.Matcher.<init>(Matcher.java:229)
at java.util.regex.Pattern.matcher(Pattern.java:1093)
at scala.util.matching.Regex.findFirstIn(Regex.scala:388)
at org.apache.spark.util.Utils$$anonfun$redact$1$$anonfun$apply$15.apply(Utils.scala:2643)
at org.apache.spark.util.Utils$$anonfun$redact$1$$anonfun$apply$15.apply(Utils.scala:2643)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.util.Utils$$anonfun$redact$1.apply(Utils.scala:2643)
at org.apache.spark.util.Utils$$anonfun$redact$1.apply(Utils.scala:2641)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.util.Utils$.redact(Utils.scala:2641)
at org.apache.spark.util.Utils$.redact(Utils.scala:2608)
at org.apache.spark.sql.internal.SQLConf$$anonfun$redactOptions$1.apply(SQLConf.scala:2083)
at org.apache.spark.sql.internal.SQLConf$$anonfun$redactOptions$1.apply(SQLConf.scala:2083)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.internal.SQLConf.redactOptions(SQLConf.scala:2083)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.simpleString(SaveIntoDataSourceCommand.scala:52)
at org.apache.spark.sql.catalyst.plans.QueryPlan.verboseString(QueryPlan.scala:177)
at org.apache.spark.sql.catalyst.trees.TreeNode.generateTreeString(TreeNode.scala:548)
at org.apache.spark.sql.catalyst.trees.TreeNode.treeString(TreeNode.scala:472)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$4.apply(QueryExecution.scala:197)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$4.apply(QueryExecution.scala:197)
at org.apache.spark.sql.execution.QueryExecution.stringOrError(QueryExecution.scala:99)
at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:197)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:75)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
at org.yl.excutor.SparkApplication.start(SparkApplication.java:271)

My code looks like this:
datasetResult.write()
    .format("org.apache.spark.sql.redis")
    .mode("append")
    .option("table", jobDetail.getTo_table())
    .option("host", jobDetail.getTo_host())
    .option("port", jobDetail.getTo_port())
    .option("auth", jobDetail.getTo_passwd())
    .option("ttl", jobDetail.getRedis_table_ttl())
    .option("key.column", "id")
    .save();

The DataFrame itself prints fine with:
datasetResult.show();

+---+---------+---+------------------+
| id|     name|age|           address|
+---+---------+---+------------------+
|  1| zhangsan| 22|    fsdfsadfasdfsd|
|  2|hanmeimei| 23|       bbbbccccddd|
+---+---------+---+------------------+

However, when jobDetail.getTo_host() points to a standalone Redis instance, the same code works.
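
Looking at the trace, the NullPointerException is thrown inside Spark's Utils.redact, which applies the redaction regex to every write-option key and value while rendering the query plan. So my guess (unconfirmed) is that one of the option values, for example the password or TTL, comes back null for the cluster job. Below is a minimal sketch of what I mean, assuming the jobDetail getters return Strings and can return null:

import org.apache.spark.sql.DataFrameWriter;
import org.apache.spark.sql.Row;

// Sketch only: skip options whose values might be null, so Spark's option
// redaction never regex-matches a null string. That the getters can return
// null is an assumption on my part, not a confirmed cause.
DataFrameWriter<Row> writer = datasetResult.write()
    .format("org.apache.spark.sql.redis")
    .mode("append")
    .option("table", jobDetail.getTo_table())
    .option("host", jobDetail.getTo_host())
    .option("port", jobDetail.getTo_port())
    .option("key.column", "id");

if (jobDetail.getTo_passwd() != null) {
    writer = writer.option("auth", jobDetail.getTo_passwd());
}
if (jobDetail.getRedis_table_ttl() != null) {
    writer = writer.option("ttl", jobDetail.getRedis_table_ttl());
}
writer.save();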
My dependency:
<dependency>
    <groupId>com.redislabs</groupId>
    <artifactId>spark-redis_2.11</artifactId>
    <version>2.4.2</version>
</dependency>
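
For completeness, the connection can also be configured once on the SparkSession (spark.redis.host, spark.redis.port and spark.redis.auth are documented in the spark-redis README), so the per-write options only carry the table settings. This is just a sketch with placeholder connection values, not something I have verified against the cluster:

import org.apache.spark.sql.SparkSession;

// Sketch with placeholder values: connection settings on the SparkSession
// (config keys from the spark-redis README), table settings on the write.
SparkSession spark = SparkSession.builder()
    .appName("redis-write")
    .config("spark.redis.host", "redis-node-1")  // placeholder host
    .config("spark.redis.port", "6379")
    .config("spark.redis.auth", "secret")        // placeholder password
    .getOrCreate();

datasetResult.write()
    .format("org.apache.spark.sql.redis")
    .mode("append")
    .option("table", jobDetail.getTo_table())
    .option("key.column", "id")
    .save();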

Any ideas on what might be going wrong? Thanks!