RedisLabs/spark-redis

Can we get an error if an RDD insert fails?

rparme opened this issue · 0 comments

Hi,

We are using your library to insert an RDD into Redis; we do the following:
spark.sparkContext.toRedisKV(rdd)(redisConfig = new RedisConfig(RedisEndpoint()))

The problem is that if the Redis memory is full, we don't get any error from the command.

For testing purposes, I've set up a Redis instance on my local machine with the following configuration:
maxmemory 2mb
I then ran this snippet of code:

import java.nio.charset.StandardCharsets
import spark.implicits._

val arr = Array.fill[Byte](3 * 1024 * 1024)(1)
val df = Seq(("1", arr), ("2", arr)).toDF
val rdd = df.map(row => (
  row.getAs[String]("_1"),
  new String(row.getAs[Array[Byte]]("_2"), StandardCharsets.ISO_8859_1)
)).rdd

val redisConfig = new RedisConfig(RedisEndpoint())

spark.sparkContext.toRedisKV(rdd)(redisConfig = new RedisConfig(RedisEndpoint()))

But I don't get any error. I assume this is because the commands are batched with a Pipeline, but I wonder if there is any way to know that the insert has failed.
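For context, this is how the error surfaces when talking to Jedis directly: in a pipeline, error replies are not thrown during the writes. A minimal sketch (assuming a local Redis on the default port and a `payload` string; this is plain Jedis, not spark-redis internals) showing how pipelined errors can still be collected after `syncAndReturnAll()`:

```scala
import redis.clients.jedis.Jedis
import redis.clients.jedis.exceptions.JedisDataException
import scala.jdk.CollectionConverters._

val jedis = new Jedis("localhost", 6379)          // assumed local instance
val pipeline = jedis.pipelined()
pipeline.set("1", payload)
pipeline.set("2", payload)

// syncAndReturnAll() returns one reply per queued command; commands that
// failed (e.g. with OOM) come back as JedisDataException values in the
// list instead of being thrown at write time.
val results = pipeline.syncAndReturnAll().asScala
val errors  = results.collect { case e: JedisDataException => e }
if (errors.nonEmpty) throw errors.head
```

So in principle the per-command errors are recoverable after the pipeline sync; the question is whether spark-redis inspects them and propagates a failure.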

On the other hand, I've tested a single insert using Jedis:

val arr = Array.fill[Byte](3 * 1024 * 1024)(1)
val j = new Jedis()
j.append("1", new String(arr, StandardCharsets.ISO_8859_1))

And I get :

redis.clients.jedis.exceptions.JedisDataException: OOM command not allowed when used memory > 'maxmemory'.  
	at redis.clients.jedis.Protocol.processError(Protocol.java:132)
	at redis.clients.jedis.Protocol.process(Protocol.java:166)

Am I missing anything?