coreinfrastructure/best-practices-badge

enhance user API to query for info on many users at once

Opened this issue · 6 comments

My dashboard for ONAP uses the user API (for example, https://www.bestpractices.dev/en/users/1597?format=json) to look up information on the editors for each of our projects. This returns information such as:

{"id":1597,"name":"Tony Hansen","nickname":null,"uid":null,"provider":"local","created_at":"2017-08-16T13:25:18.530Z","updated_at":"2021-04-08T17:21:41.539Z","projects":[],"additional_rights":{"1718":["edit"],"1737":["edit"],"3777":["edit"]}}

Given the number of editors listed for each of these projects, this leads to many API calls and subsequently hits the rate limit. It would be really good if there were an alternative way to request multiple users at once and receive an array of results, such as https://www.bestpractices.dev/en/users/1597,32875?format=json returning:

[
{"id":1597,"name":"Tony Hansen","nickname":null,"uid":null,"provider":"local","created_at":"2017-08-16T13:25:18.530Z","updated_at":"2021-04-08T17:21:41.539Z","projects":[],"additional_rights":{"1718":["edit"],"1737":["edit"],"3777":["edit"]}},
{"id":32875,"name":null,"nickname":"mrsjackson76","uid":"88408910","provider":"github","created_at":"2024-03-14T16:51:47.387Z","updated_at":"2024-03-14T16:51:47.387Z","projects":[1197,1441,1519,1540,1578,1579,1588,1591,1601,1602,1604,1608,1614,1629,1630,1631,1658,1702,1703,1706,1718,1720,1722,1737,1738,1742,1743,1751,1759,1771,1774,1799,1820,1821,2147,2192,2259,2303,2315,2316,3777,4398],"additional_rights":{}}
]
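For illustration, here is roughly how a dashboard might consume an array response like the one above. This is only a sketch: the array form is the proposal, not something the API returns today, `SAMPLE` is an abbreviated copy of the two records above, and `index_users_by_id` and `display_name` are hypothetical helper names.

```python
import json

# Abbreviated copy of the two records shown above (most fields omitted).
SAMPLE = (
    '[{"id":1597,"name":"Tony Hansen","nickname":null,'
    '"additional_rights":{"1718":["edit"]}},'
    '{"id":32875,"name":null,"nickname":"mrsjackson76",'
    '"additional_rights":{}}]'
)


def index_users_by_id(payload: str) -> dict:
    """Turn the proposed array response into a dict keyed by user ID."""
    return {user["id"]: user for user in json.loads(payload)}


def display_name(user: dict) -> str:
    """Prefer the name, fall back to the nickname, then to the numeric ID."""
    return user.get("name") or user.get("nickname") or str(user["id"])


users = index_users_by_id(SAMPLE)
for user_id, user in users.items():
    print(user_id, display_name(user))
```

The fallback matters because, as the sample shows, local accounts may have a name but no nickname, while GitHub accounts may have the reverse.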

We don't have a specific limit on the number of editors, so this query system would probably need to support a paged response.

We could provide up to N (pick N) answers in an initial reply, as for almost all projects there's a relatively small number of additional users.

The proposed API format isn't the usual REST syntax. If you wanted to know about one specific user, you'd make a request for that user. In this case, you'd be querying against a collection, so I think it'd look more like this:

https://www.bestpractices.dev/en/users.json?ids=1597,32875
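On the client side, a dashboard could then split its editor IDs into batches and build one collection URL per batch. A minimal sketch, assuming the `users.json?ids=` endpoint suggested here were implemented (it is a proposal, not an existing API); `build_batch_urls` and `max_ids` are hypothetical names.

```python
from typing import Iterable, List

# Proposed collection endpoint from this thread; not an existing API.
BASE_URL = "https://www.bestpractices.dev/en/users.json"


def build_batch_urls(user_ids: Iterable[int], max_ids: int) -> List[str]:
    """Return one URL per batch of at most max_ids unique user IDs."""
    if max_ids < 1:
        raise ValueError("max_ids must be at least 1")
    # De-duplicate editors who appear on several projects.
    unique_ids = sorted(set(user_ids))
    urls = []
    for start in range(0, len(unique_ids), max_ids):
        batch = unique_ids[start:start + max_ids]
        urls.append(BASE_URL + "?ids=" + ",".join(str(i) for i in batch))
    return urls


print(build_batch_urls([1597, 32875, 1597], max_ids=2))
```

De-duplicating before batching keeps the request count tied to the number of distinct editors rather than the number of project/editor pairs.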

Also: Formats are much better selected using an extension. Embedding them in the query string (format=...) technically works, but causes problems with some caching systems that are a pain to work around. Putting the format explicitly in the URL path, rather than in the query string, eliminates a lot of subtle problems.

What do you think?

users.json?ids= would work perfectly fine for me. (I'm not concerned with how to do it, rather that I am able to do it.)

Regarding paging:

In practice, a URL is limited to about 2k characters. That forces an upper limit of ~500 per query, using an average size of 5-digit user IDs. Switching to a POST could allow more per query.
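The URL-length estimate can be checked with a quick calculation. This is a rough sketch under stated assumptions: a 2000-character budget, the proposed `users.json?ids=` prefix, and fixed-width IDs separated by commas; `max_ids_in_url` is a hypothetical helper. Because each ID also costs a separator, five-digit IDs fit a few hundred per URL, somewhat fewer than shorter IDs would.

```python
def max_ids_in_url(base_url: str, digits_per_id: int, url_limit: int = 2000) -> int:
    """How many comma-separated IDs of a given width fit within url_limit characters."""
    budget = url_limit - len(base_url)
    if budget < digits_per_id:
        return 0
    # n IDs need n * digits + (n - 1) commas, so n <= (budget + 1) / (digits + 1).
    return (budget + 1) // (digits_per_id + 1)


# Assumed prefix: the collection endpoint proposed earlier in this thread.
base = "https://www.bestpractices.dev/en/users.json?ids="
for digits in (3, 4, 5):
    print(digits, "digit IDs:", max_ids_in_url(base, digits))
```

Whatever cap is chosen, it should sit comfortably below the figure for the longest IDs in use.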

Paging on the projects returns files that are consistently ~500k in size. Using an average response size of ~250 characters per user ID, that suggests an upper limit of 2000 per query in a POST.

However, unless you think we should support a way to return the data on ALL users and not just a list of users, I'm not convinced that providing an explicit "page=N" makes that much sense.

I think setting a cap on the number of IDs that can be queried at one time would make more sense. 500?
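A cap like this would presumably be enforced when the `ids` parameter is parsed. The sketch below shows one way that check could look; it is written in Python purely for illustration and is not the project's actual server code. `parse_ids_param` and `MAX_IDS` are hypothetical, and 500 is only the value floated above.

```python
from typing import List

MAX_IDS = 500  # cap floated in this thread; not a settled value


def parse_ids_param(raw: str, max_ids: int = MAX_IDS) -> List[int]:
    """Parse a comma-separated `ids` value, rejecting malformed or oversized input."""
    ids: List[int] = []
    seen = set()
    for part in raw.split(","):
        part = part.strip()
        # Rejects empty, negative, and non-numeric entries.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid user id: {part!r}")
        value = int(part)
        if value not in seen:  # ignore duplicates, keep first-seen order
            seen.add(value)
            ids.append(value)
    if len(ids) > max_ids:
        raise ValueError(f"too many ids: {len(ids)} > {max_ids}")
    return ids


print(parse_ids_param("1597,32875"))
```

Rejecting an oversized request outright, rather than silently truncating it, makes it obvious to the client that it needs to split the list.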

PS. ONAP has ~40 unique editors.

@david-a-wheeler, any additional thoughts on this?