RedisLabs/spark-redis

Cannot use in Databricks JedisConnectionException: Could not get a resource from the pool

Opened this issue · 4 comments

I'm currently testing this in pyspark

df.write\
  .format("org.apache.spark.sql.redis")\
  .option("table", "mytable")\
  .option("infer.schema", True)\
  .option("spark.redis.host","somehost")\
  .option("host","somehost")\
  .option("spark.redis.port", "6666")\
  .option("port", "6666")\
  .option("spark.redis.ssl", False)\
  .option("auth", "")\
  .option("timeout", 5000)\
  .option("key.column", "key")\
  .save()
# JedisConnectionException: Could not get a resource from the pool

I've installed this
spark_redis_2_4_0_jar_with_dependencies.jar
From here: https://repo1.maven.org/maven2/com/redislabs/spark-redis/2.4.0/
The notebook currently runs: 10.4 LTS ML (includes Apache Spark 3.2.1, Scala 2.12)

I'm able to connect to redis from the notebook using the redis lib from python

Ok so I was facing exactly the same issue and I managed to solve it. I tested it with version spark-redis 3.1.0, scala 2.12 and Spark 3.2.1 (Databricks runtime 10.4 LTS).

You must set these variables in the Spark configuration before launching the cluster. If you instead set them on the Spark session through spark.conf.set("", ""), or pass them directly when reading/writing your dataframe as .option(...), it will raise JedisConnectionException.

[screenshot: the cluster's Spark config section in Databricks]

spark.redis.host <your_host>
spark.redis.port <your_port> // usually 6379
spark.redis.auth <your_auth_token> // if needed
spark.redis.ssl true // in case you connect using TLS (port 6380)
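Once the cluster has restarted, you can confirm the values actually landed in the Spark conf. A minimal sketch (the helper name `redis_conf` is mine, not part of spark-redis):

```python
# Sanity-check sketch (hypothetical helper): read the spark.redis.*
# settings back from an active SparkSession to confirm the cluster-level
# Spark config was picked up.
def redis_conf(spark):
    keys = ["spark.redis.host", "spark.redis.port",
            "spark.redis.auth", "spark.redis.ssl"]
    # spark.conf.get(key, default) returns the configured value or the default
    return {k: spark.conf.get(k, None) for k in keys}
```

In a notebook cell, `redis_conf(spark)` should echo the values you entered above; keys that were never set come back as None.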

Example code (in Scala)

case class Person(name: String, age: Int)

val personSeq = Seq(Person("John", 30), Person("Peter", 45))
val df = spark.createDataFrame(personSeq)

df.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person-db")
  .save()

// Read the same table back (new name: re-declaring `val df` in the same
// scope only works in a notebook/REPL cell, not in compiled code)
val df2 = spark.read
  .format("org.apache.spark.sql.redis")
  .option("table", "person-db")
  .load()
df2.show()
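For pyspark (as in the original post), the same flow looks roughly like this. A sketch under the same assumption: the spark.redis.* settings live in the cluster's Spark config, so no host/port/auth options are passed to the reader/writer; the helper names are mine.

```python
# Hypothetical pyspark equivalent of the Scala example above. Connection
# settings are deliberately absent -- they come from the cluster's Spark
# config -- so only the table name is passed as an option.
REDIS_FORMAT = "org.apache.spark.sql.redis"  # spark-redis data source name
TABLE = "person-db"

def write_people(spark):
    # spark: the active SparkSession on the Databricks cluster
    df = spark.createDataFrame([("John", 30), ("Peter", 45)], ["name", "age"])
    df.write.format(REDIS_FORMAT).option("table", TABLE).save()

def read_people(spark):
    return spark.read.format(REDIS_FORMAT).option("table", TABLE).load()
```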

@tonofll hey sorry for asking in an old topic, I am having issues even adding the JAR to the cluster. How did you do it?

To install the JAR in the cluster, just go to the cluster configuration and open the Libraries tab:

[screenshot: the cluster's Libraries tab]

Afterwards, click Install new and search for the spark-redis library in the Maven Central repository:

[screenshots: the Maven install dialog, searching for and selecting the spark-redis package]

Once installed, simply restart the cluster and it should work properly. To avoid the JedisConnectionException, follow the steps in my previous comment.

Oh yeah, I just noticed you switched to Maven Central from Spark Packages. When browsing there, the latest version shown is 2.3.0. I worked around this today by pasting the coordinates and repository directly and clicking Install without browsing. That worked too. Thanks!
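For reference, installing by coordinates means entering the artifact string in the Coordinates field of the Maven install dialog. Assuming Scala 2.12 and the 3.1.0 release mentioned earlier in this thread, the coordinate would be:

```
com.redislabs:spark-redis_2.12:3.1.0
```

The Scala suffix (`_2.12`) must match your cluster runtime's Scala version.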