Stream from get ends before file data is fully read
joshrussell opened this issue · 1 comments
Running into an issue with not being able to fully download files. I've resumed the stream returned by get and attached data and end listeners. In some cases the end event fires before all the rows of the file I'm downloading have been transferred, and sometimes I lose thousands of data rows. When I download the same file from the FTP server with FileZilla, I get the full contents. I've also added a 5-minute delay between first finding the file and attempting the download, to make sure the file is fully written. Here's the code I'm using to handle the stream:
function unwrapPausedStream(pausedStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    pausedStream
      .on('data', (chunk) => {
        console.log('stream data');
        chunks.push(chunk);
      })
      // Reject on stream errors so the promise cannot hang forever.
      .on('error', reject)
      .on('end', () => {
        try {
          console.log('stream end');
          resolve(Buffer.concat(chunks));
        } catch (e) {
          reject(e);
        }
      });
    // The stream arrives paused; start the flow once the listeners are attached.
    pausedStream.resume();
  });
}
The method is being used this way:
const ftpResult = await this.get(getPath);
const fileContents = await unwrapPausedStream(ftpResult);
this.endFtpSession();
resolve(fileContents);
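A possible variation on the stream handling above, offered only as a sketch: instead of resolving on end, wait for the stream's close event, and optionally compare the number of bytes received against a known size. This assumes the client is a node-ftp-style library whose get returns the paused data socket. As far as I recall, the README examples for that family of libraries wait for close rather than end, but whether that explains the truncation here is unconfirmed. The names collectUntilClosed and expectedBytes are hypothetical and not part of any library.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper (not from the original post or any library):
// buffers a paused readable stream and settles only after 'close'.
function collectUntilClosed(stream, expectedBytes) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    let received = 0;
    let settled = false;

    const fail = (err) => {
      if (settled) return;
      settled = true;
      reject(err);
    };

    const finish = () => {
      if (settled) return;
      // Optional sanity check: only runs when the caller supplied a size.
      if (expectedBytes !== undefined && received !== expectedBytes) {
        fail(new Error(`expected ${expectedBytes} bytes, received ${received}`));
        return;
      }
      settled = true;
      resolve(Buffer.concat(chunks));
    };

    stream.on('data', (chunk) => {
      chunks.push(chunk);
      received += chunk.length;
    });
    stream.once('error', fail);
    // Assumption: the stream emits 'close' once the underlying resource is
    // fully torn down (sockets do; so do modern Node streams by default).
    stream.once('close', finish);
    stream.resume();
  });
}
```

It would slot into the usage above as `const fileContents = await collectUntilClosed(ftpResult);`. If the remote file size is known beforehand, however it is obtained, passing it as the second argument turns a silent truncation into a rejected promise that can be retried.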
Here's the implementation of get:
get(getPath) {
  return this.connect()
    .then(() => this.getClient().get(getPath))
    .catch((err) => {
      throw new FTPConnectionError(err);
    });
}
And connect:
connect() {
  return this.getClient().connect(this.connectionOptions);
}
Connection options are here:
this.connectionOptions = {
  host: process.env.FTP_HOST,
  user: this.username,
  password: this.password,
  port: 21,
  connTimeout: 20000,
  autoReconnect: true,
  preserveCwd: true,
  pasvTimeout: 20000,
};
Any thoughts on why I'm not always able to get the full data file? This is causing issues in a production system so any help is appreciated. If there's anything I missed here that would help with an answer, let me know.
Hi, I have the same issue too. Did you find a solution?