Add a client-side alternative to echoData
Gallaecio opened this issue · 5 comments
While working on documentation for https://docs.zyte.com/ about setting request metadata, I am starting to think that maybe we should not send echoData
to the server, and instead keep track of it on the client side.
I get the usefulness of echoData
for generic HTTP clients that have no request metadata tracking feature, but a client with that feature, or one made specifically for Zyte Data API, should probably keep track of request metadata on the client side. In fact, maybe https://github.com/scrapy-plugins/scrapy-zyte-api should discourage its use altogether, in favor of Request.meta
.
This is of course based solely on the current implementation of Zyte Data API. It is possible that future features will make it worthwhile to include echoData (e.g. some web UI that allows to keep track of your requests, where getting access to echoData
would be useful).
Thoughts?
This makes sense to me, in general.
It seems echoData is not used in scrapy-zyte-api; are you thinking about documentation changes?
Here echoData is used in the CLI tool, so that the output contains source URL by default. What alternative do you see?
maybe we should not send echoData to the server, and instead keep track of it on the client side.
If it's still called echoData
in the client, then it can be surprising if it's not sent to the server IMO for someone aware of the HTTP API, e.g. if you're doing some debugging.
It seems echoData is not used in scrapy-zyte-api; are you thinking about documentation changes?
Yes. In the docs.zyte.com content I am preparing about echoData
, in the code example tab set, we could cover how to use echoData
in Scrapy but explain how Request.meta may be a better choice. I would also remove echoData
from the example in https://github.com/scrapy-plugins/scrapy-zyte-api/blob/main/README.rst#how-to-use.
If it's still called
echoData
in the client, then it can be surprising if it's not sent to the server IMO for someone aware of the HTTP API, e.g. if you're doing some debugging.
Yes, that is a very good point.
For python-zyte-api, maybe we should leave things as they are for now, but in the future, if there is demand for it, add an AsyncClient
parameter and command-line interface option to track echoData
on the client side. I don’t expect demand for it any time soon, but it may be a good option for people hitting the API request length limit because of their echoData
content.
I wonder if it'd be better to have this feature, but not have it tied to echoData. Or, alternatively, make echoData handling optional.
I wonder if it'd be better to have this feature, but not have it tied to echoData.
Makes a lot of sense. In the scenario where echoData
causes the API request to be too long, users could move some data from echoData
to a client-specific field (e.g. clientData
, requestMetadata
), and still keep some data in echoData
(in case there is a benefit in the future to have data there beyond getting it back in the client).