aleph-im/pyaleph

Feature: new endpoint to get all aggregates


The problem

For custom domains, we need to retrieve the value of a specific aggregate key for all users. At the moment, the only endpoint is GET /aggregates/<address>, so a client must first fetch the list of all addresses on the network and then call this endpoint once per address, resulting in N+1 API calls for N addresses.

Proposed solution

Add a new GET /aggregates endpoint with appropriate filters and pagination.

Request parameters

The following query parameters must be supported:

  • addresses: A comma-separated list of addresses.
  • keys: A comma-separated list of aggregate keys to fetch.
  • created_after/created_before: Datetime range selecting aggregates created at or after created_after and before created_before, i.e. the half-open range [created_after, created_before).
  • updated_after/updated_before: Datetime range selecting aggregates updated at or after updated_after and before updated_before, i.e. the half-open range [updated_after, updated_before).
  • pagination: Maximum number of aggregates per page. Default should be 20, like we do for messages.
  • page: Offset in pages. Starts at 1.
  • sort_by: How to sort the aggregates (paginated endpoints must always be sorted). One of created, last_updated or address. Defaults to created.
  • with_info: Whether to include metadata about the aggregate in the response. Defaults to false.
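As a rough illustration of the parameters above, here is a minimal sketch of how they could be parsed and validated. The names AggregateQuery, parse_aggregate_query and in_half_open_range are hypothetical and not part of pyaleph; treating naive datetimes as UTC and accepting only "true"/"1" for with_info are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Mapping, Optional

# Sort keys listed in this proposal.
ALLOWED_SORT_KEYS = ("created", "last_updated", "address")


def split_csv(value: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated parameter; None means "no filter"."""
    if value is None:
        return None
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_datetime(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 datetime. Assumption: naive values are UTC."""
    if value is None:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


@dataclass(frozen=True)
class AggregateQuery:
    addresses: Optional[List[str]] = None
    keys: Optional[List[str]] = None
    created_after: Optional[datetime] = None
    created_before: Optional[datetime] = None
    updated_after: Optional[datetime] = None
    updated_before: Optional[datetime] = None
    pagination: int = 20
    page: int = 1
    sort_by: str = "created"
    with_info: bool = False

    @property
    def offset(self) -> int:
        """Number of aggregates to skip; pages start at 1."""
        return (self.page - 1) * self.pagination


def parse_aggregate_query(params: Mapping[str, str]) -> AggregateQuery:
    """Build a validated AggregateQuery from raw query-string values."""
    query = AggregateQuery(
        addresses=split_csv(params.get("addresses")),
        keys=split_csv(params.get("keys")),
        created_after=parse_datetime(params.get("created_after")),
        created_before=parse_datetime(params.get("created_before")),
        updated_after=parse_datetime(params.get("updated_after")),
        updated_before=parse_datetime(params.get("updated_before")),
        pagination=int(params.get("pagination", 20)),
        page=int(params.get("page", 1)),
        sort_by=params.get("sort_by", "created"),
        # Assumption: only "true" and "1" enable with_info.
        with_info=params.get("with_info", "false").lower() in ("true", "1"),
    )
    if query.pagination < 1 or query.page < 1:
        raise ValueError("pagination and page must be >= 1")
    if query.sort_by not in ALLOWED_SORT_KEYS:
        raise ValueError(f"sort_by must be one of {ALLOWED_SORT_KEYS}")
    return query


def in_half_open_range(value, after, before) -> bool:
    """True if value lies in [after, before); a None bound is unbounded."""
    return (after is None or value >= after) and (before is None or value < before)
```

With the default pagination of 20, page=3 corresponds to skipping the first 40 aggregates. The half-open range means an aggregate created exactly at created_before is excluded, so consecutive ranges never overlap.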

Response

The endpoint shall return a dictionary mapping each address to its aggregate data, plus per-key metadata under info when with_info=true. Example:

{
    "address1": {
        "data": {
            "key1": ...,
            "key2": ...,
        },
        "info": {
            "key1": {
                "created": "2023-01-01T00:00:00Z",
                "last_updated": "2023-03-01T00:00:00Z",
                "original_item_hash": ...,
                "last_update_item_hash": ...,
            },
            "key2": ...
        },
    },
    "address2": {
        "data": {
            "key1": ...,
            "key2": ...,
        },
        "info": {
            "key1": {
                "created": "2023-01-01T00:00:00Z",
                "last_updated": "2023-01-01T00:00:00Z",
                "original_item_hash": ...,
                "last_update_item_hash": ...,
            },
            "key2": ...
        },
    }
}
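To make the response shape concrete, the following sketch groups flat rows into the address -> data/info structure shown above. AggregateRow and build_response are hypothetical names, and the row layout is an assumption rather than the actual pyaleph data model.

```python
from typing import Any, Dict, Iterable, NamedTuple


class AggregateRow(NamedTuple):
    """Hypothetical flat row: one aggregate key for one address."""
    address: str
    key: str
    content: Any
    created: str
    last_updated: str
    original_item_hash: str
    last_update_item_hash: str


def build_response(rows: Iterable[AggregateRow], with_info: bool = False) -> Dict[str, Dict[str, Any]]:
    """Group rows by address; add per-key metadata only if with_info is set."""
    response: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        entry = response.setdefault(row.address, {"data": {}})
        entry["data"][row.key] = row.content
        if with_info:
            entry.setdefault("info", {})[row.key] = {
                "created": row.created,
                "last_updated": row.last_updated,
                "original_item_hash": row.original_item_hash,
                "last_update_item_hash": row.last_update_item_hash,
            }
    return response
```

Omitting the info entry entirely when with_info=false keeps the default response small, which matters when the result spans many addresses.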

Possible implementation issues

Indexes

Currently, the aggregates table only has an index on addresses, so we need an additional index on creation_datetime. The update datetime can currently be fetched from the aggregate_elements table through a join. We need to check whether the performance of that join is acceptable, or find an alternative (e.g. denormalize the data and add a last_updated column to aggregates).
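The two options can be compared with a small self-contained sketch. It uses an in-memory SQLite database as a stand-in for the real database, and the schema is a simplified assumption: only the table names, creation_datetime and the proposed last_updated column come from this issue, while the other columns and the index names are hypothetical.

```python
import sqlite3

# Hypothetical, simplified schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE aggregates (
    address TEXT NOT NULL,
    key TEXT NOT NULL,
    creation_datetime TEXT NOT NULL,
    PRIMARY KEY (address, key)
);
CREATE TABLE aggregate_elements (
    item_hash TEXT PRIMARY KEY,
    address TEXT NOT NULL,
    key TEXT NOT NULL,
    creation_datetime TEXT NOT NULL
);
-- The additional index proposed in this issue.
CREATE INDEX ix_aggregates_creation_datetime ON aggregates (creation_datetime);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO aggregates VALUES (?, ?, ?)",
    [
        ("addr1", "key1", "2023-01-01T00:00:00Z"),
        ("addr2", "key1", "2023-01-01T00:00:00Z"),
    ],
)
conn.executemany(
    "INSERT INTO aggregate_elements VALUES (?, ?, ?, ?)",
    [
        ("hash-a", "addr1", "key1", "2023-01-01T00:00:00Z"),
        ("hash-b", "addr1", "key1", "2023-03-01T00:00:00Z"),
        ("hash-c", "addr2", "key1", "2023-01-01T00:00:00Z"),
    ],
)

# Option 1: compute the update datetime through a join at query time.
join_rows = conn.execute(
    """
    SELECT a.address, a.key, MAX(e.creation_datetime) AS last_updated
    FROM aggregates AS a
    JOIN aggregate_elements AS e ON e.address = a.address AND e.key = a.key
    GROUP BY a.address, a.key
    ORDER BY last_updated, a.address
    """
).fetchall()

# Option 2: denormalize into a last_updated column that can be indexed.
conn.execute("ALTER TABLE aggregates ADD COLUMN last_updated TEXT")
conn.execute(
    """
    UPDATE aggregates SET last_updated = (
        SELECT MAX(e.creation_datetime) FROM aggregate_elements AS e
        WHERE e.address = aggregates.address AND e.key = aggregates.key
    )
    """
)
conn.execute("CREATE INDEX ix_aggregates_last_updated ON aggregates (last_updated)")
denormalized_rows = conn.execute(
    "SELECT address, key, last_updated FROM aggregates ORDER BY last_updated, address"
).fetchall()
```

Both queries return the same rows. The denormalized column lets sorting and filtering on last_updated use an index, at the cost of keeping the column in sync whenever a new element is processed.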