RedisLabs/spark-redis

DataFrame write to Redis with save mode set to overwrite might cause an error

ljw7630 opened this issue · 2 comments

Some Redis cluster implementations do not support SCAN, so when the save mode is set to overwrite, the write fails with "Error: Invalid Node response".
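For context, here is a minimal sketch of the kind of write that triggers it (host, table name, and data are made-up placeholders):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("redis-overwrite-repro")
  .config("spark.redis.host", "my-redis-host") // placeholder host
  .config("spark.redis.port", "6379")
  .getOrCreate()

import spark.implicits._
val df = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")

// In Overwrite mode spark-redis first SCANs for the existing keys under
// the table prefix so it can delete them; on a cluster whose proxy does
// not support SCAN, this step fails with "Invalid Node response".
df.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person")
  .mode(SaveMode.Overwrite)
  .save()
```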

fe2s commented

Hi @ljw7630 ,
I'm not aware of a Redis implementation that doesn't support SCAN, could you please share more information about that?
Spark-redis relies heavily on SCAN, so I'm not sure how this can be fixed.

ljw7630 commented

I'm using the Tencent Cloud Redis cluster edition. I think the problem exists in many unofficial Redis cluster architectures.
Here is the Tencent Cloud Redis doc: https://intl.cloud.tencent.com/document/product/239/18336#xianzhi

I think the problem can be divided into two parts:

  1. write
  2. read

In the write scenario, people usually don't care about which keys already exist in Redis; they can set a TTL to let keys expire on their own.
So in overwrite save mode, we could skip the SCAN by default, unless users explicitly ask for it. See the sketch below.
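Something like this is what I have in mind as a workaround today, using the existing ttl option with Append mode so no SCAN is needed up front (table name and TTL value are just examples):

```scala
import org.apache.spark.sql.SaveMode

// Append mode skips the SCAN-and-delete step of Overwrite; the TTL lets
// stale keys expire on their own instead of being deleted beforehand.
df.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person")
  .option("ttl", "3600") // keys expire after one hour
  .mode(SaveMode.Append)
  .save()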

In the read scenario, using the cluster API to get the underlying partitions and then scanning them one by one might be a good idea. But there are many unofficial Redis cluster implementations, so adapting to all of them might be difficult.
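A rough sketch of that per-node idea with plain Jedis, not spark-redis code; it assumes the cluster still answers the standard CLUSTER commands (which is how getClusterNodes discovers the nodes), so a proxy that hides the topology would defeat this too. Host, port, and key pattern are made up:

```scala
import redis.clients.jedis.{HostAndPort, JedisCluster, ScanParams}
import scala.collection.JavaConverters._
import scala.collection.mutable.ListBuffer

// Enumerate the cluster's nodes and SCAN each node directly, instead of
// issuing SCAN against the cluster/proxy endpoint.
val cluster = new JedisCluster(new HostAndPort("my-redis-host", 6379))
val params  = new ScanParams().`match`("person:*").count(100)

val keys = ListBuffer.empty[String]
for (pool <- cluster.getClusterNodes.values.asScala) {
  val jedis = pool.getResource
  try {
    var cursor = ScanParams.SCAN_POINTER_START
    do {
      val result = jedis.scan(cursor, params)
      keys ++= result.getResult.asScala
      cursor = result.getCursor
    } while (cursor != ScanParams.SCAN_POINTER_START) // "0" means done
  } finally jedis.close()
}
```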