brendanhay/gogol

gogol-storage: Multipart Upload incomplete/corrupted

Closed this issue · 6 comments

Hi,

I'm trying to upload a 2 MB image to a bucket, but the uploaded object appears to be corrupted: as the session below shows, it ends up 274 bytes larger than the local file. I'm running the example from examples/gogol-examples.

> stack ghci gogol-examples
ghci> example "my-bucket" "image.jpg"
...

ghci> :! gsutil ls -l gs://my-bucket
2035740  2016-03-14T20:59:51Z  gs://my-bucket/image.jpg
TOTAL: 1 objects, 2035740 bytes (1.94 MiB)
ghci> :! stat -f "%z" image.jpg
2035466

I discovered that the library uses AltMedia for the uploadType query parameter, which renders as the value "media". That is wrong for multipart uploads; the value should be "multipart" (see https://cloud.google.com/storage/docs/json_api/v1/objects/insert).

As a quick fix, I changed line 65 to:

instance ToText AltMedia where toText = const "multipart"

and the image is now correctly uploaded.

> stack ghci gogol-examples
ghci> example "my-bucket" "image.jpg"
ghci> :! gsutil ls -rl gs://my-bucket/
   2035466  2016-03-14T21:10:40Z  gs://my-bucket/image.jpg
TOTAL: 1 objects, 2035466 bytes (1.94 MiB)
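
The quick fix above renders every AltMedia value as "multipart". As a rough illustration of the distinction involved, here is a standalone Haskell sketch that models the uploadType values from the JSON API documentation as their own type. The names UploadType, uploadTypeParam, chooseUploadType, and uploadQuery are hypothetical and are not part of gogol; this is a sketch of the idea, not the library's implementation.

```haskell
-- Hypothetical sketch: none of these names exist in gogol.

-- | The uploadType values accepted by the objects.insert endpoint.
data UploadType
  = Media      -- request body is the raw object data only
  | Multipart  -- request body carries a metadata part and a data part
  | Resumable  -- upload is performed in a separate resumable session
  deriving (Eq, Show)

-- | Render the value used for the uploadType query parameter.
uploadTypeParam :: UploadType -> String
uploadTypeParam Media     = "media"
uploadTypeParam Multipart = "multipart"
uploadTypeParam Resumable = "resumable"

-- | Assumed selection rule for this sketch: use a multipart upload whenever
-- object metadata is sent in the same request as the data.
chooseUploadType :: Bool -> UploadType
chooseUploadType hasMetadata
  | hasMetadata = Multipart
  | otherwise   = Media

-- | Build the query-string fragment for the chosen upload type.
uploadQuery :: UploadType -> String
uploadQuery t = "uploadType=" ++ uploadTypeParam t
```

With this shape, a request that sends both metadata and data would use uploadQuery (chooseUploadType True), while a plain media upload keeps "media".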

Thanks for the helpful debugging. I'm currently traveling, so a fixed Hackage/Stackage version might take a week or so to land.

Hi,

No problem. I found an additional, less critical problem. I'm setting the content type of the object via the provided lens, like this: object' & objContentType ?~ "image/jpeg"

But even with my quick fix from above, the object's content type remains "application/octet-stream". Without my fix it is completely wrong: the content type becomes the "multipart/form-data; Webkit-Boundary=" token.

Accidentally closed via PR. Reopening to continue the Content-Type issue.

Hi,

The missing content type is becoming a problem, because the Google Cloud Storage API will start rejecting mismatched content types on 1 August 2016. I received an "Action Required" mail today.

Excerpt:

Dear Google Cloud Storage JSON API customer,

Important: An upcoming change to improve Google Cloud Storage's JSON API request validation will cause some of the types of object uploads that you have made recently to be rejected as invalid requests.

Google Cloud Storage is making a bug-fix change to ensure that objects are always given the Content-Type intended by the uploader. You are being contacted as a project-owner of a project whose buckets have received upload requests in the last two weeks that would, under the new validation, be rejected because the Content-Type as specified by the media does not match the Content-Type as specified by the metadata. The following buckets, listed by project, have been identified as having received such upload requests:

[...]

If you upload content using:

  • uploadType=media, there is nothing to address in that code path, as this upload type is not impacted by the change.
  • uploadType=multipart, then if you specify a contentType in the JSON body of the first part, and you specify a Content-Type header of the second part, and they do not match, the request will be rejected.
  • uploadType=resumable, then if you specify a contentType in the JSON body of the request to initiate the resumable upload session, and you specify a X-Upload-Content-Type header on that same request, and they do not match, the request will be rejected.

We plan to make the change on 1st August 2016. We are monitoring the requests that would be affected by this change, and will send out reminders leading up to the change date. If you have any questions or concerns, please do not hesitate to contact us.
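
The multipart rule from the excerpt can be sketched as a small standalone check. The MultipartUpload record and wouldBeRejected function are hypothetical names for illustration only. The exact string comparison is also an assumption: the mail does not say how the service compares the two values (for example, whether case or parameters matter).

```haskell
-- Hypothetical sketch of the validation rule described in the mail.

-- | The two places where a multipart upload can state a content type.
data MultipartUpload = MultipartUpload
  { metadataContentType :: Maybe String  -- contentType in the JSON body of the first part
  , mediaContentType    :: Maybe String  -- Content-Type header of the second part
  } deriving (Eq, Show)

-- | An upload is rejected only when both content types are specified and
-- they do not match. Plain equality is an assumption of this sketch.
wouldBeRejected :: MultipartUpload -> Bool
wouldBeRejected (MultipartUpload (Just meta) (Just media)) = meta /= media
wouldBeRejected _                                          = False
```

For example, metadata declaring "image/jpeg" combined with a second part sent as "application/octet-stream" would be flagged by this check, whereas leaving either side unspecified would not.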

I would love to help fix this, but I guess this code is part of the generated code base? And I currently don't have the time to wrap my head around it without an entry point or guide.

Hi @MaxDaten, I'm currently in the midst of some GHC 8 build issues; once those are sorted I'll focus on this issue.

No problem. I'm aware the generation components are opaque. Since the caveats regarding 0.1 being a pre-release still apply, I haven't yet invested time in documenting the generator; the actual library documentation and usage examples are a higher priority at this stage.

Thanks a lot for your effort.