mmolimar/kafka-connect-fs

Consume remote files using the Kafka Connect FileSystem Connector

nareshkpotti opened this issue · 3 comments

I am using the Kafka Connect FileSystem Connector from Confluent to read files from a remote file server over the SFTP protocol; the connector processes files through Hadoop FS. I can connect to the remote server, read a file, extract each record, and publish it to a Kafka topic. However, the source connector only reads files from the /home/usr directory, never from any other directory. My source connector configuration is included below. How can I configure it to read from a different folder, such as /systemname/domain/inbound?

name=file-stream-demo-standalone
connector.class=com.github.mmolimar.kafka.connect.fs.FsSourceConnector
tasks.max=1
# SFTP endpoint to poll (no directory path in the URI)
fs.uris=sftp://username:password@hostserver
topic=demo_file_reader_sftp
policy.class=com.github.mmolimar.kafka.connect.fs.policy.SleepyPolicy
# properties prefixed with policy.fs. are passed through to the Hadoop FileSystem
policy.fs.fs.sftp.host=hostserver
policy.sleepy.sleep=10000
policy.recursive=false
# match only files ending in .OUT (dot escaped so it matches literally)
policy.regexp=^.*\.OUT$
policy.batch_size=0
policy.cleanup=none
file_reader.class=com.github.mmolimar.kafka.connect.fs.file.reader.TextFileReader
file_reader.batch_size=0

Hi!
That is due to how your SFTP server is configured. Try setting a different home directory for that user account and it should work.
It is not possible to change to another directory from within the connector.
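For context: the connector delegates sftp:// URIs to Hadoop's SFTPFileSystem, and as far as I can tell that filesystem pins its working directory to wherever the SFTP session starts, i.e. the login user's home directory. A minimal sketch to verify this, reusing the placeholder host and credentials from the config above (the class name is just for illustration):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class SftpHomeDirCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // register Hadoop's SFTP filesystem explicitly, in case it is not in core-default.xml
        conf.set("fs.sftp.impl", "org.apache.hadoop.fs.sftp.SFTPFileSystem");
        FileSystem fs = FileSystem.get(
                URI.create("sftp://username:password@hostserver"), conf);
        // for SFTP this is the directory the session starts in, i.e. the user's home,
        // which is why the connector always lists files under /home/usr
        System.out.println(fs.getWorkingDirectory());
    }
}

So the directory has to be changed on the server side, as mentioned above.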

Does this mean we can only read from the user's home directory? Isn't that a limitation?

The connector reads from the directory configured in the SFTP server. If you change that on the server, it isn't a limitation.
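In case it helps anyone landing here later: assuming the file server is a Linux box with a local user account (not stated in this thread), changing that account's home directory is the simplest fix, e.g. as root on the file server:

usermod -d /systemname/domain/inbound username

Note that usermod -d only changes the registered home directory; add -m if the existing files should move with it. If the server runs OpenSSH and moving the real home directory is not an option, a Match block in sshd_config (ChrootDirectory plus ForceCommand internal-sftp -d) can instead pin SFTP sessions for that user to a given start directory.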