Read timeout while reading data from redis in batch
Opened this issue · 1 comment
I am trying to read all the fields and their values from a hash key in Redis using PySpark with the spark-redis jar. I can read 500-600 fields, but I get a Read timed out error (snippet pasted below) when reading 5K-6K fields.
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:205)
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
at redis.clients.jedis.Protocol.process(Protocol.java:155)
at redis.clients.jedis.Protocol.read(Protocol.java:220)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:318)
at redis.clients.jedis.Connection.getBinaryMultiBulkReply(Connection.java:270)
at redis.clients.jedis.Jedis.hgetAll(Jedis.java:942)
I tried increasing the timeout via the spark.redis.timeout setting (tried values of 3000 and higher), but then I get an Unexpected end of stream error instead (snippet below).
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:202)
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
at redis.clients.jedis.Protocol.process(Protocol.java:155)
at redis.clients.jedis.Protocol.read(Protocol.java:220)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:318)
at redis.clients.jedis.Connection.getBinaryMultiBulkReply(Connection.java:270)
at redis.clients.jedis.Jedis.hgetAll(Jedis.java:942)
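In Jedis, Unexpected end of stream generally means the connection was closed from the server side rather than timing out on the client. One generic thing worth checking (not confirmed as the cause here) is whether the Redis server itself closes idle client connections:

```shell
# On the Redis server: a non-zero value means idle clients are
# disconnected after that many seconds, which can surface in Jedis
# as "Unexpected end of stream" on long reads.
redis-cli CONFIG GET timeout
```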
I am reading the data using the following command:

data = spark.read.format("org.apache.spark.sql.redis") \
    .option("keys.pattern", "search_solution") \
    .option("infer.schema", "true") \
    .load()
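As a possible workaround outside spark-redis, the hash could be fetched in smaller batches with HMGET instead of a single HGETALL, so no one reply has to stream 5K-6K fields at once. The batching helper below is plain Python; the redis-py calls in the comment are illustrative assumptions, not something from this repo:

```python
def batched(items, size):
    """Yield successive fixed-size batches from a list of hash field names."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Illustrative use with redis-py (assumed client `r`, key from the issue):
#   fields = r.hkeys("search_solution")
#   for batch in batched(fields, 500):
#       values = r.hmget("search_solution", *batch)

# Demo with a synthetic field list the size reported in the issue:
fields = [f"field_{i}" for i in range(5500)]
batches = list(batched(fields, 500))
print(len(batches))  # 11 batches of 500 fields each
```

500 per batch matches the size the issue reports as reliably readable; the right chunk size would need tuning against the actual timeout.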
Are there any suggestions or possible areas to look at to avoid these errors? Let me know if you have any insight into this issue.
Any news on this? Have you managed to solve it?