Add pagination to list APIs.
sgotti opened this issue · 10 comments
Current "list" APIs (with few exceptions) return all the results. This was meant to be only temporary since a sort of pagination is always needed to avoid returning too much data.
There are different options:
- classic pagination style (see github v3 api). Classic pagination has the downside of becoming slow on the query side when asking for deeper pages (since the sql query will require the use of offset). To avoid this, some APIs, limit the maximum page number.
- cursor based pagination. Here we can return data after the provided id and could use an opaque cursor that could contain data about the last id and also the API arguments (sorting by, filtering etc...). This will also avoid the use of sql offset in underlying sql queries. One of the cons is that this doesn't provide the ability to jump to a specific page but can only fetch the elements serially.
I'll prefer cursor based pagination (like already done for user/project get runs APIs).
NOTES
This will break current API but the APIs are not yet stable.
Additional required Actions
- Due to API breakage changes to also agola-web are required since it currently doesn't expect paginated results and should handle them (add a "load more" button, automatically fetch when scrolling, fetch all the whole data doing multiple calls etc...).
We can use a Cursor with this data
StartID string
PointsNext bool
OrderBy string
SortOrder SortOrder
Filters map[string]string
Limit int
StartID
contain the starting data id(excluded) to fetch
PointsNext
if the user ask for next page is true
Filters
can contains some filter, else it is empty
When the client will call the gateway API it will can provide some query values like limit, orderby, asc and some filters(if expected), for example:
/projects?orderby=name&asc&limit=10
or /projects?limit=10
There are some APIs(example /runs?start&limit=10
) that use start query value that, if we implement cursor pagination, we can remove start(because it can use cursor to ask next page): @sgotti do you think is better to maintain start or remove it?
There are some agola cmd list. We should also add here the cursor asking him in the args. Do you can confirm?
The APIs will return the data and pagination for example:
{
Orgs []*OrgResponse
PaginationInfo *htypes.PaginationInfo
}
where PaginationInfo contains:
{
NextCursor string
PrevCursor string
}
The client will use for the next calls only NextCursor for ask the next page or PrevCursor for ask the previous page, if the values are present(for example the first page don't have PrevCursor).
NextCursor and PrevCursor contain a base64 string of the Cursor described above.
The Gateway take the query values(limit, orderby,ecc.), or the cursor(that contains all the data needed), and use them to call the configstore/runservice.
For example /projects?cursor=eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0%3D
The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors.
The Gateway only copy the PaginationInfo to the reponse(and the data).
There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.
PointsNext
if the user ask for next page is true
I'm not sure I understand this.
we can remove start(because it can use cursor to ask next page)
We could also keep start if useful for some api to fetch from a specific id with specific option. Then for the next results the cursor could be used.
There are some agola cmd list. We should also add here the cursor asking him in the args. Can you confirm?
The APIs will return the data and pagination for example:
{ Orgs []*OrgResponse PaginationInfo *htypes.PaginationInfo }
where PaginationInfo contains:
{ NextCursor string PrevCursor string }
The output should contain a cursor. But there will only be one cursor. It doesn't makes sense to have a prev or next cursor. It'll just be a cursor that point to the next batch of data based on the original query. If the original query has a sort order the cursor will use that.
The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors.
It's the gateway that manages the exposed API and so it MUST also manage the cursor, since it could call multiple underlying services to create a response. The underlying services don't require a cursor since they aren't exposed.
There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.
I don't see why this won't work. Can you explain it?
Remember that if you're going to implement this, also agola-web should be changed and the related PRs should be created to be able to test them.
I remove PointsNext since we maintain only one cursor for the next page, so the cursor contains this data:
StartID string
OrderBy string
SortOrder SortOrder
Filters map[string]string
Limit int
The response of the gateway can be, for example:
{
Orgs []*OrgResponse
Cursor string
}
or we can create a generic response like this:
{
Data any
Cursor string
}
What do you like is better?
We keep start for some api where is already implemented.
@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user.
For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=
Filters map[string]string
Don't be rigid on this. Filters, but also the other fields could be different based on the api.
What do you like is better?
The latter doesn't make sense.
@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user.
For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=
By default the client could list all the entries by fetching all of them doing multiple calls to the api.
Filters map[string]string
Don't be rigid on this. Filters, but also the other fields could be different based on the api.
@sgotti I can define the Cursor asmap[string]interface{}
@sgotti I can define the Cursor as
map[string]interface{}
There's no need to make cursor the same type for every api.
@sgotti when the user call the last page do you think is better to return an empty cursor(so it know there are no other pages to ask) or the cursor with the last data id(the same as the other calls)?
- In the first case the gateway must call the configstore/runservice APIs with a limit+1 to check if there are other pages.
- In the second case Instead, the client will understand there are no other data when the gateway return no data or len data < limit.
@sgotti when the user call the last page do you think is better to return an empty cursor
I'll return an empty cursor.
In the first case the gateway must call the configstore/runservice APIs with a limit+1 to check if there are other pages.
The underlying services could return if there's more data themself.
@sgotti with the APIs list secrets/variables I think the pagination can not work when the client use the removeoverridden
parameter, because this filter is implemented by the gateway.
For example we have this Secrets:
{
"name": "secret1","parent_path": "user/user01",
"name": "secret1","parent_path": "user/user01/subgroup1",
"name": "secret1","parent_path": "user/user01/subgroup2",
"name": "secret2","parent_path": "user/user01",
}
If the client use parameter limit=2 the Gateway should return {"name": "secret1","parent_path": "user/user01"}
and {"name": "secret2","parent_path": "user/user01"}
, but calling the configstore whith limit=2, and next making the overriden filter it will return only {"name": "secret1","parent_path": "user/user01"}
that is wrong(it should return also {"name": "secret2","parent_path": "user/user01"}
).
We should avoid this implementing the limit by the gateway(instead of by the configstore), but it can be much complex and with too many calls to the configstore.
I suggest to avoid/ignore limit when the client use removeoverridden parameter, or avoid limit for this APIs(secrets/variables).
@alessandro-sorint I'm not sure I understand. If the api returns reproducible output as it should, then the cursor will contain the information on the next batch of data. How you get them is an implementation detail.
Limit int
I don't think Limit should be part of the cursor but should be set every time by the call and on the server side there'll be an hard limit.