node-fetch/fetch-blob

use a byte sequence instead of a buffer

jimmywarting opened this issue · 0 comments

imagine how you would go about solving this:

const File = require('fetch-file') // creates a FIle-like IDL wrapper around fs
const Blob = require('fetch-blob')

const file = File.from('./package.json')
new Blob([ file ])

the blob's constructor spec says:

  1. Let bytes be an empty sequence of bytes. (it's not like a long ArrayBuffer holding all data)
  2. For each element in parts:
    1. ...
    2. ...
    3. If element is a Blob, append the bytes it represents to bytes.

2.3 don't say read the the blob and copy the content to a new instance. Beside the fetch-blob can't read the content of the File in a synchronous way to be able to successfully handle the blob parts. so the slicing method is also wrong, it should not slice the Buffer like fetch-blob dose today. it should instead update the byte sequences and and create a new offset of where it should start reading stuff from. This also means that arrayBuffer, text and stream is wrong too. fetch-blob should instead read each blobPart individually at a later stage to successfully be able to handle the File - the constructor should not read the file content.


when you take 2x2 GiB of files from a <input type="file" multiple> and concatenate b = new Blob([fileA, fileB]) then you won't end up with 4 GiB of RAM, it really means that you have a new blob container with two references point to where a snapshot is located. And if you slice it b.slice(0, -500) then you have one references to fileA and another reference to fileB with a offset from what it should represent. [[start to end of fileA], [start to end minus 500 of fileB]]

image taken from: https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit

Originally posted by @jimmywarting in node-fetch/node-fetch#835 (comment)

things needs fixing:

  • this[BUFFER] should be this[PARTS] since you can't concatenate all parts
  • read methods should loop over all parts and concatenate all parts into one stream/text/arraybuffer and taking the internal offset into consideration
  • slicing becomes more complicated since you are now dealing with parts instead of a buffer
    • what really should be done is just copying all parts (clone the blob) and kinda change the internal this[OFFSET] from where it should start reading from.
var b1 = new Blob([new Uint8Array(1000)])
var b2 = b1.slice(500) // uses the same parts with an offset=500
// b2 should not take up 500 byte more ram...