Syntax for adding other HTTP methods to URLFETCH
Closed this issue · 8 comments
Currently the URLFETCH fonction only support GET method. The signatures are:
url:STRING headers:MAP<STRING> URLFETCH
urls:LIST<STRING> headers:MAP<STRING> URLFETCH
with headers being optional.
To support additional methods, I suggest:
request:MAP<STRING> headers:MAP<STRING> URLFETCH
requests:LIST<MAP<STRING>> headers:MAP<STRING> URLFETCH
where request shall have the fields 'url', 'method' and optionally 'message', and headers are still being optional.
Is that signature good ?
It looks good, even if I think the keys of the map could have a different name (url, verb and body or payload).
Too bad I added the header signature without thinking about that kind of signature before because it would definitively make sense to have also a header key in the map you want to define. As it is a fresh addition (2.7.0) maybe it should be considered whether to break the header signature to allow
request:MAP<STRING> URLFETCH
requests:LIST<MAP<STRING>> URLFETCH
with requests
containing a header key.
Not sure about verb, I think method is more explicit. I agree with body or payload instead of message though. I think body is better.
Putting a MAP inside a MAP makes for a bit clunky codes.
Maybe, in order not to break the header signature, we can add only the signature request:MAP<STRING> headers:MAP<STRING> URLFETCH
, and not the second one, since anyway, the second signature is not that much of a gain compared to using FOREACH or LMAP, and I agree it doesn't always make sense to use the same headers for all requests, especially when they are not all GET methods.
Agreed on url/method/body keys.
I'm not convinced that maps of maps are a problem but breaking the 2.7.0 is more of a problem, I think @hbs is best to arbitrate on that. If we want to keep backward-compatibility I think both
request:MAP<STRING> headers:MAP<STRING> URLFETCH
requests:LIST<MAP<STRING>> headers:MAP<STRING> URLFETCH
are nice to have, even if, I do agree, it can be done with FOREACH
or LMAP
.
Also,there might be an issue for the signature request:MAP<STRING> headers:MAP<STRING> URLFETCH
. Since headers is optional, URLFETCH would have to check if the first map it retrieves from the stack is a headers or request. I don't see a way to do this without limiting the names of the fields that are allowed in headers.
So I think that if we choose that signature, headers cannot be optional.
I intend to run many URLFETCH at once (it's a data platform and users can run their scripts as they please) which can quickly hog all memory so a streaming option could be really useful. Otherwise, I need to limit URLFETCH maxsize which is not ideal.
Maybe something like requests:LIST<MAP<STRING>> headers:MAP<STRING> chunk_size:LONG URLFETCH
. chunk_size
can be -1 by default but if set, it returns chunks of content with that size as an iterator.
Another way could be requests:LIST<MAP<STRING>> headers:MAP<STRING> start_byte:LONG end_byte:LONG URLFETCH end:BOOL content:STRING
. When start_byte
and end_byte
are defined URLFETCH reads and drops all data before start and reads until end_byte
. If the end of the stream is reached, end
is set to True.
I expect it to be slower but scales much better.
Maybe an idea similar to #855 can be used with a byte delimiter (or a chunk size) and a macro to be run for each of these chunks.
The other way around, for POST, a macro could create chunks to be sent, but that may be overengineering.
Good suggestion. It is added in PR #891