Auto-compute SHA1 sum for streams
Related to #32. Applies to uploadPart and uploadFile.
If hash is not passed and data is a stream, the hash can be computed on the fly and appended to the end of the request body, with the header X-Bz-Content-Sha1: hex_digits_at_end telling the server to read the digest from there. It would be nice if the client wrapped up this logic itself.
This change is simpler than it seems at first. I wrote the following transform stream that hashes the content as it passes through, then emits the hash before the stream ends. We are using this in production successfully.
const crypto = require('crypto');
const stream = require('stream');

function makeSha1AppendingStream() {
  const d = crypto.createHash('sha1');
  return new stream.Transform({
    // Hash each chunk, then pass it through unchanged.
    transform(chunk, encoding, cb) {
      d.update(chunk, encoding);
      this.push(chunk, encoding);
      cb();
    },
    // Once the input ends, emit the 40-character hex digest as the final chunk.
    flush(cb) {
      this.push(d.digest('hex'));
      cb();
    },
  });
}
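One way to sanity-check the transform above is to pipe a known buffer through it and compare the last 40 bytes of the output with a SHA1 computed directly. This is only a sketch: collect and checkTrailer are hypothetical helper names introduced for the example, and the transform is repeated so the snippet runs on its own.

```javascript
const crypto = require('crypto');
const stream = require('stream');

// Same transform as in the issue, repeated so this snippet is self-contained.
function makeSha1AppendingStream() {
  const d = crypto.createHash('sha1');
  return new stream.Transform({
    transform(chunk, encoding, cb) {
      d.update(chunk, encoding);
      this.push(chunk, encoding);
      cb();
    },
    flush(cb) {
      this.push(d.digest('hex'));
      cb();
    },
  });
}

// Hypothetical helper: read a stream to completion and return one Buffer.
function collect(readable) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readable.on('data', (c) => chunks.push(Buffer.from(c)));
    readable.on('end', () => resolve(Buffer.concat(chunks)));
    readable.on('error', reject);
  });
}

// Hypothetical helper: split the output into body and trailer, and compute
// the digest the trailer is expected to match.
async function checkTrailer(payload) {
  const source = stream.Readable.from([payload]);
  const out = await collect(source.pipe(makeSha1AppendingStream()));
  const body = out.subarray(0, out.length - 40);
  const trailer = out.subarray(out.length - 40).toString('ascii');
  const expected = crypto.createHash('sha1').update(payload).digest('hex');
  return { body, trailer, expected };
}
```

Calling checkTrailer(Buffer.from('some payload')) should resolve with trailer equal to expected and body equal to the original payload.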
Usage looks like this (adjust variable names as needed):
if (hash === undefined && typeof data.pipe === 'function') {
  const hashStream = makeSha1AppendingStream();
  // pipe() does not forward errors, so propagate them by hand.
  data.on('error', err => { hashStream.emit('error', err); });
  data = data.pipe(hashStream);
  hash = 'hex_digits_at_end';
  // The hex-encoded SHA1 digest adds 40 bytes to the body.
  contentLength += 40;
}
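If the client were to absorb this, the snippet above could live in a small helper that uploadPart and uploadFile both call before building the request. The following is a rough sketch, not the client's actual API: withTrailingSha1 and its argument shape are hypothetical, and it uses destroy(err) rather than emit('error', err) to forward source errors.

```javascript
const crypto = require('crypto');
const stream = require('stream');

const SHA1_HEX_LENGTH = 40; // a hex-encoded SHA1 digest is 40 characters

// Same transform as in the issue, repeated so this snippet is self-contained.
function makeSha1AppendingStream() {
  const d = crypto.createHash('sha1');
  return new stream.Transform({
    transform(chunk, encoding, cb) {
      d.update(chunk, encoding);
      this.push(chunk, encoding);
      cb();
    },
    flush(cb) {
      this.push(d.digest('hex'));
      cb();
    },
  });
}

// Hypothetical helper: the name and argument shape are illustrative only.
// Returns the (possibly wrapped) data, hash header value, and content length.
function withTrailingSha1({ data, hash, contentLength }) {
  const isStream = data != null && typeof data.pipe === 'function';
  if (hash !== undefined || !isStream) {
    // Caller supplied a hash, or data is not a stream: leave everything as-is.
    return { data, hash, contentLength };
  }
  const hashStream = makeSha1AppendingStream();
  // pipe() does not forward errors; destroy(err) surfaces them downstream.
  data.on('error', (err) => hashStream.destroy(err));
  return {
    data: data.pipe(hashStream),
    hash: 'hex_digits_at_end',
    contentLength: contentLength + SHA1_HEX_LENGTH,
  };
}
```

Returning new values instead of reassigning the caller's variables keeps the helper easy to reuse from both upload paths.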
Side note: if streams are used, all retrying and redirect-following should be disabled. Replaying a request is either unsafe, since the stream has already been consumed, or likely to consume a large amount of memory, because the entire request body has to be buffered in case the request needs to be replayed. We had to pass maxRedirects: 0 to axios; otherwise process memory would balloon (we're uploading several-hundred-MB files and this was killing us).
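For illustration, the request options for a streamed upload might be assembled along these lines. This is a sketch under assumptions: streamingUploadConfig and its parameters are hypothetical names, and a real upload needs further headers that are omitted here. The keys method, url, headers, data, and maxRedirects are standard axios request options.

```javascript
// Hypothetical helper: builds an axios-style request config for a streamed
// body. The function name and parameter names are illustrative only.
function streamingUploadConfig({ uploadUrl, authorizationToken, data, contentLength, hash }) {
  return {
    method: 'post',
    url: uploadUrl,
    data, // the stream, already wrapped so the SHA1 trails the content
    headers: {
      Authorization: authorizationToken,
      'Content-Length': contentLength, // includes the 40 trailing hex digits
      'X-Bz-Content-Sha1': hash, // 'hex_digits_at_end' for streamed bodies
    },
    // A consumed stream cannot be replayed, so do not follow redirects.
    maxRedirects: 0,
  };
}
```

The resulting object can be passed to axios as its request config; any retry wrapper around the call would also need to skip requests whose body is a stream.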