stac-utils/pgstac

Add support for loading from pystac objects

hrodmn opened this issue · 3 comments

Would it make sense to add support for loading Collection, Item, and ItemCollection objects from pystac? It could be useful in Python workflows that generate STAC metadata and load it directly into pgstac.

Right now, I do something like this:

import io
import json

from pypgstac.db import PgstacDB
from pypgstac.load import Loader, Methods
from pystac import ItemCollection
from stactools.package import create_item

# fire up pgstac loader
db = PgstacDB()
loader = Loader(db=db)

# create item collection
item_collection = ItemCollection(
    create_item(href) for href in hrefs
)

# write to ndjson and load into pgstac
buffer = io.BytesIO()
for item in item_collection:
    item.collection_id = COLLECTION_ID_FORMAT.format(
        region=region.value, product=product.value
    )

    buffer.write((json.dumps(item.to_dict()) + "\n").encode("utf-8"))

buffer.seek(0)

loader.load_items(buffer, insert_mode=Methods.upsert)

It isn't that hard to write to ndjson and pass that to load_items, but it would be nice if that operation were handled by pypgstac!

Looks like you can pass a sequence of dicts to load_items and load_collections, so this could work already:

items = [item.to_dict() for item in item_collection]
loader.load_items(items, insert_mode=Methods.upsert)
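If building the full list in memory is a concern, the same idea can be written as a generator. This is only a sketch: `SupportsToDict` and `as_dicts` are hypothetical names, not part of pypgstac or pystac, and it assumes (as described above) that load_items accepts an iterable of dicts.

```python
from typing import Any, Dict, Iterable, Iterator, Protocol


class SupportsToDict(Protocol):
    """Hypothetical protocol: any object with a pystac-style to_dict() method."""

    def to_dict(self) -> Dict[str, Any]: ...


def as_dicts(objects: Iterable[SupportsToDict]) -> Iterator[Dict[str, Any]]:
    """Lazily yield the dict form of each object, one at a time."""
    for obj in objects:
        yield obj.to_dict()


# Assumed usage with the loader from the snippet above:
# loader.load_items(as_dicts(item_collection), insert_mode=Methods.upsert)
```

Because a generator is itself an iterator, this form should also satisfy a parameter annotated as Iterator[Any].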

Yeah, that does work. I think I was thrown off by a bad type hint. Passing a list does in fact work, but my type checker complains that list[Unknown] is incompatible with Iterator[Any]. Maybe this signature should use Iterable[Any] instead of Iterator[Any]:

def read_json(file: Union[Path, str, Iterator[Any]] = "stdin") -> Iterable:

True, for read_json that could be Iterable: the code checks for Iterable and iterates over the variable, but never calls iter() on the variable.

However, Iterator inherits from Iterable, and load_items at least needs the parameter to be an Iterator, so unless you call read_json manually you won't be able to pass the narrower type (I have no idea how to deal with list[Unknown] vs Iterator[Any], though).
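One possible way around the list[Unknown] vs Iterator[Any] mismatch, sketched here with plain standard-library types: a list is an Iterable but not an Iterator, while iter(list) returns an Iterator. The variable names are illustrative, and whether a given type checker is then satisfied depends on the hint the function actually declares.

```python
from collections.abc import Iterable, Iterator
from typing import Any, Dict, List

# Stand-in for [item.to_dict() for item in item_collection]
items: List[Dict[str, Any]] = [{"id": "a"}, {"id": "b"}]

# A list supports iteration but has no __next__, so it is not an Iterator.
list_is_iterable = isinstance(items, Iterable)
list_is_iterator = isinstance(items, Iterator)

# Wrapping the list with iter() produces an object that is an Iterator.
wrapped = iter(items)
wrapped_is_iterator = isinstance(wrapped, Iterator)

# Assumed usage with the loader from earlier in the thread:
# loader.load_items(iter(items), insert_mode=Methods.upsert)
```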