Discussion: Reconsidering requestor pausing
Opened this issue · 2 comments
Currently, we have an implementation of requestor pausing that works as follows:
- Requestor can be paused in a block hook or a imperatively (calling PauseRequest on graphsync with a request ID)
- "Pausing" works by simply cancelling the request. "Resuming" works by resending the request with DoNotSendFirstBlocks
IMHO, this is ridiculous. The reason I did it is that I wanted the requestor to stop accepting blocks entirely as soon you paused. This once upon a time approach was based on trying to get payment channels to work for data transfer push requests. (see #346 for explanation).
I believe the proper way to do this is:
- add a Pause request type -- see #345
- have the responder when it receives a pause request pause the response and send back a ResponsePaused code
- requestor keeps receiving data until pause response is received.
- PauseRequest imperative call blocks until roundtrip completes
Questions:
- does this make sense?
- should this require some kind of validation? when I download stuff over HTTP, I don't seem to require anyone's auth to pause & resume?
- what happens if the responder never sends back a pause?
- should a responder time out a paused request that was paused by the requestor after certain amount of inactivity? (protect against dosing with lots of paused requests)
- should we remove pausing on the requestor side entirely -- move it to simply sending update requests with extensions and letting the responder pause if it wants to?
- What about requestor resuming? Do we need a resume request type as well?
Why should we have imperative requestor pausing?
- I'm not sure it's used today, but it certainly makes sense from a the standpoint of being like HTTP. If one imagines a download manager, one certainly imagines the ability to hit a pause button
- makes sense
- I don't see a case off hand for validation, maybe there is one but I don't have the markets coverage to see it?
- a
PauseRequest()
call could also have a timeout on the caller side, such that if the timeout is reached and the responder doesn't send back a pause, a forced cancel is initiated and an error propagated—seems appropriate for a misbehaving responder, no? - definitely yes there should be a timeout, you're asking for not just a connection to remain active but a lot of state to remain in memory—this is a big problem with http/2, maybe it'd be worth looking in to how that's being dealt with (I tuned out on that stuff a couple of years ago but it was interesting to watch the challenges of state management and easy DoS vectors)
- for resume, we either need to subtly rename "New" or introduce a new type that's obvious (this is where "Play/Pause" works nicely in the AV world). Maybe "New" should actually be "Start"?
Upon further reading, I have learned that HTTP's pause/resume works essentially exactly like GraphSyncs -- the client cancels the request, then sends a second range request to resume. As many often say "HTTP is a stateless protocol".
I guess it can be for graphsync as well :P
I think a better question is how to do better versions of our start / stop extensions -- we started with DoNotSendCids, then moved to DoNotSendFirstBlocks, and I think the next thing would be to support "start traversal at path" -- ideally in a way that server could pick up without maintaining state.