stac-utils/pystac-client

Client.search().pages() using paging does not work correctly

honza801 opened this issue · 1 comment

Hi,

I'm trying to count all items in a collection using paging, but paging does not seem to work correctly: Client.search().pages() appears to yield only the first page.

According to the pystac-client API reference for the pystac_client.Client.search method:

max_items – The maximum number of items to return from the search
limit – A recommendation to the service as to the number of items to return per page of results

I wrote code based on the Usage/ItemSearch documentation. The following code yields 1 page and 100 items, but I expected 2 pages and 200 items.

results = client.search(limit=100, max_items=200, collections=collection)
pages = 0
items = 0
for page in results.pages():
    pages+=1
    for item in page.items:
        items+=1
print(pages, items)

Setting limit=200 and max_items=100 does not work either.

Am I doing something wrong?

Thanks,
Jan

Yup, I can confirm a bug, though it's not quite as you describe. With the following code:

from pystac_client import Client

client = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
item_search = client.search(collections=["sentinel-2-l2a"], limit=100, max_items=200)
total_items = 0
for i, page in enumerate(item_search.pages()):
    total_items += len(page)
    print(f"Page {i}, num_items={len(page)}, total_items={total_items}")

the iterator runs forever. This is because we only check max_items in our items_as_dicts iterator:

if self._max_items and nitems >= self._max_items:
    return
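To make the asymmetry concrete, here is a toy model of the situation. ToySearch is a hypothetical stand-in, not pystac-client's real class, and its "server" holds a fixed number of items rather than calling an API. It only borrows the method names mentioned above to show how a max_items check that lives in the item iterator leaves the page iterator unbounded.

```python
from typing import Any, Dict, Iterator, Optional


class ToySearch:
    """Hypothetical stand-in for an item search; not pystac-client's real class."""

    def __init__(self, total: int, limit: int, max_items: Optional[int]) -> None:
        self._total = total          # items the pretend server holds
        self._limit = limit          # page size
        self._max_items = max_items  # cap requested by the caller

    def pages_as_dicts(self) -> Iterator[Dict[str, Any]]:
        # Walks every page the pretend server has; max_items is never consulted.
        for start in range(0, self._total, self._limit):
            stop = min(start + self._limit, self._total)
            yield {"features": [{"id": i} for i in range(start, stop)]}

    def items_as_dicts(self) -> Iterator[Dict[str, Any]]:
        nitems = 0
        for page in self.pages_as_dicts():
            for feature in page["features"]:
                # The only place where max_items is enforced.
                if self._max_items and nitems >= self._max_items:
                    return
                yield feature
                nitems += 1


search = ToySearch(total=1000, limit=100, max_items=200)
n_items = sum(1 for _ in search.items_as_dicts())
n_pages = sum(1 for _ in search.pages_as_dicts())
print(n_items, n_pages)  # 200 items, but all 10 pages
```

In the toy the page iterator stops after 10 pages only because the pretend server runs out of items; against a very large collection the same loop would keep requesting pages.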

This check should be pushed down to pages_as_dicts.
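
A minimal sketch of what enforcing the cap at the page level might look like, written as a standalone generator rather than as the actual pystac-client change. The names limit_pages and fake_pages are hypothetical, and pages are assumed to be GeoJSON-style dicts with a "features" list. Here None means "no cap" and a cap of 0 yields nothing, which may differ from how the library treats those values.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def limit_pages(
    pages: Iterable[Dict[str, Any]], max_items: Optional[int]
) -> Iterator[Dict[str, Any]]:
    """Yield pages until max_items features have been emitted.

    The last page is trimmed so the total never exceeds max_items.
    """
    if max_items is None:
        yield from pages
        return
    remaining = max_items
    if remaining <= 0:
        return
    for page in pages:
        features = page.get("features", [])[:remaining]
        yield {**page, "features": features}
        remaining -= len(features)
        if remaining <= 0:
            # Stop before asking the source for another page.
            return


def fake_pages(limit: int) -> Iterator[Dict[str, Any]]:
    """Hypothetical endless page source standing in for a paged API."""
    start = 0
    while True:
        features = [{"id": i} for i in range(start, start + limit)]
        yield {"type": "FeatureCollection", "features": features}
        start += limit


sizes = [len(p["features"]) for p in limit_pages(fake_pages(100), max_items=250)]
print(sizes)  # [100, 100, 50]
```

Checking the remaining count right after each yield, rather than at the top of the loop, avoids fetching one extra page once the cap is reached. It also covers the limit=200, max_items=100 case from the report by trimming the first page to 100 features.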