whatwg/html

enforce lowercase letter on multipart boundary

jimmywarting opened this issue · 4 comments

One thing that has bothered me before is that the boundary in the Content-Type header and the Blob's type can't be kept in sync.

What do I mean by that?
The boundary generated by WebKit-derived engines (Blink and Safari) contains uppercase letters. Firefox seems to use only numbers, and that is fine. But if you want to generate a Blob from a FormData for later use (postMessage, caching, reuse, or whatnot), that's when you run into issues.

When you construct a Blob (or File), the type is specified to be converted to all lowercase letters.

const fd = new FormData()
const res = new Response(fd)
res.headers.get('content-type') // "multipart/form-data; boundary=----WebKitFormBoundary4Abu3QeNVjbdNztI"
const blob = await res.blob() // Blob { type: "multipart/form-data; boundary=----webkitformboundary4abu3qenvjbdnzti" }

res.headers.get('content-type') === blob.type // false
new Request(url, { method: 'POST', body: blob }) // sends the wrong Content-Type boundary (the blob's lowercased type is used instead)
  • I hit this issue a while back when I built my FormData polyfill, which sends a blob generated from the FormData entries.
    (The boundary originally mixed uppercase and lowercase letters; I resolved it by removing the uppercase letters when generating the boundary.)
  • I also had to explain it in a Stack Overflow answer about reusing the content more than once: you can't rely on the type of the blob produced by new Response(formData).blob().
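
The polyfill workaround mentioned above (generating a boundary without uppercase letters) could look roughly like the sketch below. The function name, the prefix, and the default length are assumptions made for illustration, not the polyfill's actual code.

```javascript
// Hypothetical helper: build a multipart boundary from lowercase letters and
// digits only, so that lowercasing it (as Blob's type does) is a no-op.
function generateLowercaseBoundary(length = 24) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
  let suffix = '';
  for (let i = 0; i < length; i++) {
    suffix += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  // The prefix is an arbitrary choice for this sketch.
  return '----formdata-polyfill-' + suffix;
}

const boundary = generateLowercaseBoundary();
const contentType = 'multipart/form-data; boundary=' + boundary;
// contentType survives a round trip through toLowerCase() unchanged.
```

Because the boundary contains no uppercase letters, the header value and the lowercased Blob type stay identical.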

This could be resolved if all browsers simply avoided uppercase letters in the boundary (or if the Blob type weren't cast to all lowercase letters).
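
Until one of those changes happens, one possible workaround when reusing the body is to store the original header value next to the blob and set it explicitly when sending. The helper names and the URL below are hypothetical. The sketch assumes that an explicitly supplied Content-Type header takes precedence over the blob's type, which is how the Fetch body-extraction steps are written.

```javascript
// Hypothetical helper: serialize a FormData once and keep the header value
// with its original casing alongside the resulting blob.
async function formDataToReusableBody(formData) {
  const res = new Response(formData);
  const contentType = res.headers.get('content-type'); // boundary keeps its case
  const blob = await res.blob();                        // blob.type is lowercased
  return { blob, contentType };
}

// Hypothetical helper: build a request that sends the stored header value
// instead of falling back to blob.type.
function buildRequest(url, { blob, contentType }) {
  return new Request(url, {
    method: 'POST',
    body: blob,
    headers: { 'Content-Type': contentType },
  });
}

// Usage (the URL is a placeholder):
// const body = await formDataToReusableBody(new FormData());
// const req = buildRequest('https://example.com/upload', body);
```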

The question of whether content types should be case-sensitive or case-insensitive is a surprisingly messy one. From this example it does seem "obvious" to me that the boundary= parameter of multipart/form-data should be treated as case-sensitive throughout, if there's a way to do that without breaking anything else, but I'm not sure there is.

IMO the boundary was an old, bad design decision from the moment it was invented. If each body part had a Content-Length header saying how large that part is, a decoder could just let that many bytes pass through instead of scanning for a boundary separator.

  • a decoder would probably do a faster job at decoding each part
  • you would know how large each file is before everything has been uploaded.
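
To make the idea concrete, here is a minimal sketch of a made-up length-prefixed framing. It is not multipart/form-data or any existing standard, and the function names are invented for illustration. Each part is written as a "Content-Length: n" header, a blank line, then exactly n bytes, so the decoder reads the short header and skips the payload without inspecting it.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Encode each part as "Content-Length: <n>\r\n\r\n" followed by n raw bytes.
function encodeParts(parts) {
  const chunks = [];
  for (const bytes of parts) {
    chunks.push(encoder.encode(`Content-Length: ${bytes.length}\r\n\r\n`), bytes);
  }
  const out = new Uint8Array(chunks.reduce((n, c) => n + c.length, 0));
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Find the index of the CRLF CRLF that terminates the header starting at `from`.
function findHeaderEnd(data, from) {
  for (let i = from; i + 3 < data.length; i++) {
    if (data[i] === 13 && data[i + 1] === 10 && data[i + 2] === 13 && data[i + 3] === 10) {
      return i;
    }
  }
  throw new Error('truncated header');
}

// Decode by reading each length and jumping over the payload; the payload
// bytes are never scanned for a separator.
function decodeParts(data) {
  const parts = [];
  let offset = 0;
  while (offset < data.length) {
    const headerEnd = findHeaderEnd(data, offset);
    const header = decoder.decode(data.subarray(offset, headerEnd));
    const match = /^Content-Length: (\d+)$/.exec(header);
    if (!match) throw new Error('missing Content-Length');
    const length = Number(match[1]);
    const start = headerEnd + 4;
    if (start + length > data.length) throw new Error('truncated part');
    parts.push(data.subarray(start, start + length));
    offset = start + length;
  }
  return parts;
}
```

Only the header is searched for its terminator. A payload may contain any byte sequence, including CRLF CRLF, because the decoder skips it by length.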

I think we should fix Blob's type instead, as per w3c/FileAPI#43. Not being able to represent all MIME types is a problem there.

That sounds reasonable.
MIME parameter values should not be lowercased.

Restricting the boundary to lowercase letters solves only one of the many problems with Blob's type.

You can close this if you wish.