stac-utils/pystac-client

Search is creating multiple http requests

chiarch84 opened this issue · 3 comments

When I perform this search through the pystac-client, it does not send a single request to the API's GET /search endpoint. Instead, it first calls the landing page, then fetches all the collections, and only then performs the search. I checked this by inspecting the traffic going through the proxy, and it seemed quite odd. In particular, this causes problems when I try to do simultaneous searches from parallel processes, since the web server treats the burst of requests as a sort of attack.

from pystac_client import Client

catalog = Client.open("...")
my_search = catalog.search(
    max_items=100,
    collections=["EO.Copernicus.S2.L2A"],
    bbox=[11.2, 46.4, 11.4, 46.5],
    query={"eo:cloud_cover": {"lt": 70}},
    datetime=["2023-01-01T00:00:00Z", "2023-01-02T00:00:00Z"],
    method="GET",
)

this causes problems when I try to do simultaneous searches from parallel processes

That is a fair point. In the future it might make sense for pystac-client to make it easier to skip that initial GET.

In the meantime, I took a look at the code, and it doesn't look like ignore_conformance=True will skip the initial GET. That argument only means the conformance classes are not considered when the user requests certain actions.

If you want to avoid any superfluous network calls, I would recommend skipping the Client object entirely and just using ItemSearch directly. Here is what that would look like:

from pystac_client import ItemSearch

search = ItemSearch(url="https://earth-search.aws.element84.com/v1/search", collections=["cop-dem-glo-30"], max_items=1)

Notice that the url ends in /search.
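
For completeness, here is a rough sketch of how the same idea could be wrapped so each parallel worker builds its own ItemSearch and never opens a Client. The helper names search_endpoint and iter_items are made up for illustration, and the sketch assumes the search endpoint lives at the API root plus /search, which is common but not guaranteed (the landing page's "search" link is the authoritative source). The search arguments are passed straight through to ItemSearch.

```python
from typing import Any, Iterator


def search_endpoint(api_root: str) -> str:
    """Return the assumed /search URL for a STAC API root (hypothetical helper)."""
    # Assumption: the API exposes item search at <root>/search.
    return api_root.rstrip("/") + "/search"


def iter_items(api_root: str, **search_kwargs: Any) -> Iterator[Any]:
    """Yield items from a search without creating a Client (hypothetical helper)."""
    # Imported here so search_endpoint() works without pystac-client installed.
    from pystac_client import ItemSearch

    search = ItemSearch(
        url=search_endpoint(api_root),
        method="GET",
        **search_kwargs,
    )
    # items() pages through the results of the search.
    yield from search.items()


# Example call (performs network requests when iterated):
# for item in iter_items(
#     "https://earth-search.aws.element84.com/v1",
#     collections=["cop-dem-glo-30"],
#     bbox=[11.2, 46.4, 11.4, 46.5],
#     max_items=1,
# ):
#     print(item.id)
```

Since this goes around the Client, nothing checks the API's conformance classes for you, so it is up to you to only pass parameters the server supports.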

Thanks for the suggestion! I will try!