RedisLabs/spark-redis

Spark Dataset Write is too Slow - Java

salil3591 opened this issue · 1 comment

Hi,

I am trying to save a Spark Dataset to Redis in Java. My Dataset has 117K rows and 350 columns.

The issue is that during the write, the OPS on the Redis servers do not go beyond 600. Is this expected? If yes, is there an alternative? I eventually have to save 10M rows.
Spark Write Code:

    dataset.write()
           .format("org.apache.spark.sql.redis")
           .option("table", tartgetTbl)
           .option("key.column", keyColumnNm)
           .option("max.pipeline.size", "100000")
           //.option("batchsize", "100000")
           .mode(SaveMode.Overwrite)
           .save();
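
One alternative I was considering, assuming write throughput scales with the number of partitions writing in parallel (the repartition count of 200 below is only an illustrative guess, not a value I have benchmarked):

    // Hedged sketch: spread the write across more partitions so that more
    // executor cores push data to Redis concurrently.
    dataset.repartition(200)
           .write()
           .format("org.apache.spark.sql.redis")
           .option("table", tartgetTbl)
           .option("key.column", keyColumnNm)
           .option("max.pipeline.size", "100000")
           .mode(SaveMode.Overwrite)
           .save();

The idea is simply to have more concurrent connections open against Redis; I am not sure whether the 600 OPS cap would still apply.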

I have already set the Spark context with the parameters below.

.config("spark.redis.max.pipeline.size","100000") .config("spark.redis.scan.count","100000")

Note: I am able to read much faster, with OPS of around 15K.