greghendershott/aws

multipart-put and multipart-put/file: Better error handling

greghendershott opened this issue · 6 comments

As discussed in #46, the convenience functions multipart-put and multipart-put/file ought to handle things like exn:fail:network?. After all, being able to resume an interrupted upload is one of the main advantages of multipart uploads. Also, as a default in case the user doesn't want to deal with attempting to resume, the functions should automatically use abort-multipart-upload to clean up (so the user isn't paying for parts sitting on S3).

Quick/rough brain dump:

  • First, just double-check that the upload-part function is handling things like 5xx errors with exponential retry. In other words, better not to fail at all, if possible.
  • Perhaps the put functions should take a new failure-proc argument that defaults to abort-multipart-upload, but may be a user-supplied function that stores the upload-id and parts-list and can give them to a new resume-multipart-upload function? (See the sketch after this list.)
  • Should there be a new suspend-multipart-upload function to interrupt intentionally? (That could be called from a break handler, for example?)
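For the failure-proc idea in the second bullet, here's a rough sketch of what the caller-facing side might look like. To be clear: multipart-put/file takes no #:failure-proc keyword today, and save-upload-state! and resume-multipart-upload are hypothetical names from this brain dump, not anything in the library.

```racket
#lang racket
(require aws/s3)

;; HYPOTHETICAL sketch: multipart-put/file does not currently accept a
;; #:failure-proc keyword. The idea is that, on an unrecoverable failure,
;; the library would call this proc with the upload-id and parts-list
;; instead of (or before) the default of abort-multipart-upload.
(define (save-upload-state! upload-id parts)
  ;; Persist enough to hand to a (hypothetical) resume-multipart-upload later.
  (with-output-to-file "upload-state.rktd" #:exists 'replace
    (lambda () (write (list upload-id parts)))))

;; Hypothetical call site:
;; (multipart-put/file "my-bucket/big.bin"
;;                     (string->path "big.bin")
;;                     #:failure-proc save-upload-state!)
```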

I think a failure-proc would be helpful, to get the upload-id and parts-list and decide how to proceed from there. A function for suspending an upload would be nice, but it's nothing I really need at the moment; I can't think of a situation where I would need it.

A suspend function will be needed internally, to use when handling e.g. exn:fail exceptions. Multipart uploads use a pool of 4 threads. If one fails unrecoverably then the whole pool needs to be stopped gracefully.
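To make the shape of that concrete -- this is a toy sketch, not the library's actual worker-pool code -- the idea is that a worker hitting an unrecoverable exn:fail should break its siblings so the pool winds down instead of continuing to upload parts that will only be thrown away:

```racket
#lang racket
;; Toy sketch (NOT the library's actual implementation): a pool of 4 worker
;; threads draining a shared queue of parts. If any worker hits an
;; unrecoverable exn:fail, it breaks the other workers so the whole pool
;; stops gracefully, then re-raises the original exception.
(define (run-pool parts do-part)
  (define todo (box parts))            ;remaining parts
  (define lock (make-semaphore 1))
  (define (next-part!)                 ;pop one part, or #f when none remain
    (call-with-semaphore lock
      (lambda ()
        (match (unbox todo)
          ['() #f]
          [(cons p rest) (set-box! todo rest) p]))))
  (define workers '())
  (define (worker)
    (with-handlers ([exn:fail?
                     (lambda (e)
                       ;; Stop the rest of the pool, then re-raise.
                       (for ([w (in-list workers)]
                             #:unless (eq? w (current-thread)))
                         (break-thread w))
                       (raise e))])
      (let loop ()
        (define p (next-part!))
        (when p
          (do-part p)
          (loop)))))
  (set! workers (for/list ([_ (in-range 4)]) (thread worker)))
  (for ([w (in-list workers)]) (thread-wait w)))
```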

Once that's figured out correctly, I think it would be helpful to provide it publicly, too.

Example: Racket presents break, kill, and hangup signals as exn:break exceptions. If the aws client is going to be killed intentionally, it would be helpful for it to catch these using with-handlers or call-with-exception-handler and suspend the multipart upload in a way that can be resumed later. (For example, in many places residential broadband is not so fast, especially for uploads. Having to start over from scratch isn't great.)

Unlike exn:fail, I feel breaks should be left to the client of the aws library to handle, and it might want to use a suspend. Although I suppose I could handle breaks and re-raise them for the client. Just thinking out loud, here.
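From the client's side, that might look something like the following. with-handlers and exn:break? are plain Racket; suspend-multipart-upload is the hypothetical function floated above, not something the library provides today.

```racket
#lang racket
(require aws/s3)

;; Sketch of a client handling breaks itself. suspend-multipart-upload is
;; HYPOTHETICAL (the suspend function discussed above); the point is just
;; the shape: catch the break, save enough state to resume later, and then
;; re-raise so the break still reaches the rest of the program.
(define (put-with-suspend bucket+path file)
  (with-handlers ([exn:break?
                   (lambda (e)
                     ;; (suspend-multipart-upload ...)  ;hypothetical
                     (raise e))])
    (multipart-put/file bucket+path (string->path file))))
```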


In any case, it may be a while before I can work on this item at all....

Yeah, sounds interesting and a really nice feature. For me, though, it wouldn't be a high priority -- more of a "nice to have" enhancement.

Edit: For my part, I try to monitor my S3 usage with CloudWatch, and if something looks weird I can dive into the multipart details with the AWS CLI.

@krrrcks Thanks for letting me know that -- it helps me prioritize.

I shouldn't do this, at least not soon. [Unless I have time and want to work on it for fun. :)]

So this was bothering me and I kept working on it.

I spent some time exploring how to make the worker pool handle exn:break cleanly, and return lists of "done" and "to-do" parts. Then I realized it didn't matter. I could focus on resuming, regardless of how cleanly it got interrupted (and without the need to persist a list of done/to-do parts locally).

So I pushed a commit with a couple "experimental" functions: incomplete-multipart-put/file and resume-multipart-put/file. Although the package docs haven't rebuilt yet, the commit message and aws.scrbl changes should explain it pretty well?
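Until the docs rebuild, here's roughly the intended flow. The argument lists below are my approximation -- see the aws.scrbl changes in the commit for the actual contracts:

```racket
#lang racket
(require aws/s3)

;; Approximate usage sketch; see the commit's aws.scrbl changes for the
;; actual contracts. The flow: ask S3 (via List Parts plus the Content-MD5
;; checks) whether an earlier multipart-put/file of this same file can be
;; picked up again; if so resume it, otherwise start from scratch.
(define bucket+path "my-bucket/big.bin")
(define file (string->path "big.bin"))

(define maybe-upload-id
  (incomplete-multipart-put/file bucket+path file)) ;#f if nothing to resume

(if maybe-upload-id
    (resume-multipart-put/file bucket+path file maybe-upload-id)
    (multipart-put/file bucket+path file))
```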

I marked it experimental because my testing worked, but was fairly limited. Although I'm fairly confident it's OK to use List Parts this way, because I'm providing Content-MD5 checksums on the uploaded parts, and ensuring they match... I'm not 100% sure.

This has been open for a while and I'm satisfied the commit closes this issue, until/unless someone feels otherwise.