YotpoLtd/metorikku

download input from sftp

Closed this issue · 6 comments

Is there any way I can give the input dataframe path as an SFTP server, so that Metorikku downloads the file and uses it?

Yes.
If you're running Metorikku with spark-submit, use the following command:

spark-submit --packages com.springml:spark-sftp_2.11:1.1.5 --class com.yotpo.metorikku.Metorikku metorikku.jar -c config.yaml

Then, in the inputs section of your config file, define:

input_sftp:
  file:
    path: /sample.csv
    format: com.springml.spark.sftp
    options:
      host: SFTP_HOST
      username: SFTP_USER
      password: SFTP_PASSWORD

Check out the documentation of all available options here:
https://github.com/springml/spark-sftp

I am getting the exception below:
Exception in thread "main" java.util.NoSuchElementException: None.get

My config file looks like this:
sample

I think the YAML may not be formatted correctly. Can you send the full YAML?

Hi, please have a look.
sample.zip

inputs:
  movies:
    file:
      path: /home/movies.csv
      format: com.springml.spark.sftp
      options:
        host: HOST
        username: USER
        password: PASSWD
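
Since the error in this thread was suspected to come from a badly formatted config, a quick structural check before running spark-submit may help narrow things down. The following is a minimal Python sketch and is not part of Metorikku: `check_sftp_inputs` is a hypothetical helper, and the keys it requires (`inputs`, `file`, `path`, `format`, and `options` with `host`, `username`, `password`) are assumed from the config shown in this thread, not taken from Metorikku's own validation. It expects the config to be already parsed into a plain dict by whatever YAML parser you use.

```python
def check_sftp_inputs(config):
    """Return a list of structural problems found in a parsed config dict.

    Hypothetical helper: the expected layout is assumed from the config
    in this thread, not from Metorikku's schema.
    """
    inputs = config.get("inputs") if isinstance(config, dict) else None
    if not isinstance(inputs, dict) or not inputs:
        return ["missing or empty top-level 'inputs' mapping"]

    problems = []
    for name, spec in inputs.items():
        file_spec = spec.get("file") if isinstance(spec, dict) else None
        if not isinstance(file_spec, dict):
            problems.append(f"{name}: missing 'file' mapping")
            continue
        # Keys that sit directly under 'file'
        for key in ("path", "format"):
            if not file_spec.get(key):
                problems.append(f"{name}: missing 'file.{key}'")
        # Connection settings that sit under 'file.options'
        options = file_spec.get("options")
        if not isinstance(options, dict):
            options = {}
        for key in ("host", "username", "password"):
            if not options.get(key):
                problems.append(f"{name}: missing 'file.options.{key}'")
    return problems


# The config from this thread, written as the dict a YAML parser would produce
config = {
    "inputs": {
        "movies": {
            "file": {
                "path": "/home/movies.csv",
                "format": "com.springml.spark.sftp",
                "options": {"host": "HOST", "username": "USER", "password": "PASSWD"},
            }
        }
    }
}
for problem in check_sftp_inputs(config):
    print(problem)
```

For the config above the helper returns an empty list, so nothing is printed. A config that lacks the top-level `inputs` key, or that nests `options` at the wrong level, would produce one message per missing key.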

Please reopen if it is still not working.