Multiple output files - SFTP
shbadawy opened this issue · 3 comments
Hello,
I am trying to export data from Redshift / S3, using their input plugins, to an SFTP server as CSV using the SFTP output plugin. The output is always split into 4 sequential files.
For example, if my data is 4 MB, I get 4 files of 1 MB each (0_test.csv, 1_test.csv, 2_test.csv, 3_test.csv).
Is there a way to get them into one file?
Thanks
Hi @shbadawy, try the following exec settings. They may solve the problem.
exec:
  max_threads: 1
  min_output_tasks: 1
in:
  type: something
  ...
out:
  type: sftp
  ...
The documentation for these options may also be helpful:
https://www.embulk.org/docs/built-in.html
The min_output_tasks option enables “page scattering”. The feature is enabled if number of input tasks is less than min_output_tasks. It uses multiple filter & output threads for each input task so that one input task can use multiple threads. Setting larger number here is useful if embulk doesn’t use multi-threading with enough concurrency due to too few number of input tasks. Setting 1 here disables page scattering completely.
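To make the rule quoted above concrete, here is a rough Python sketch of how the number of output tasks (and so, presumably, output files) could follow from the two numbers involved. The function names are hypothetical. The ceiling-based scatter count and the idea that each output task writes one file are assumptions for illustration, not something taken from the Embulk documentation. Only the "enabled if input tasks < min_output_tasks" condition comes from the text above.

```python
import math


def page_scattering_enabled(input_tasks: int, min_output_tasks: int) -> bool:
    """Rule from the docs: scattering is on when there are fewer
    input tasks than min_output_tasks."""
    return 0 < input_tasks < min_output_tasks


def estimated_output_tasks(input_tasks: int, min_output_tasks: int) -> int:
    """Simplified model (assumption): each input task is fanned out to
    ceil(min_output_tasks / input_tasks) output tasks when scattering
    is enabled, and to exactly one output task otherwise."""
    if page_scattering_enabled(input_tasks, min_output_tasks):
        scatter_count = math.ceil(min_output_tasks / input_tasks)
    else:
        scatter_count = 1
    return input_tasks * scatter_count


if __name__ == "__main__":
    # Hypothetical case matching the question: one input task and
    # min_output_tasks resolving to 4 on that machine.
    print(estimated_output_tasks(1, 4))
    # With min_output_tasks: 1, scattering can never be enabled.
    print(estimated_output_tasks(1, 1))
```

Under this model, min_output_tasks: 1 only guarantees a single file when the input side produces a single task. If the input plugin itself creates several tasks, each of them would still get its own output task.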