open-contracting/cove-ocds

Big files: the Web result page is huge

Closed this issue · 5 comments

Hello!

As we are about to publish French award data, I had to validate it (4 days ago): https://standard.open-contracting.org/review/data/55390859-63fd-453a-989b-21c612d69687

If you click the link above and the validation has not expired:

  1. it's going to be a little while before you see something
  2. then your browser may be struggling a bit to display the page

That's not surprising: it's trying to display 120,000+ releases.

I don't think displaying a release table with that many rows is useful, especially when it costs so much on both the client and the server side.

Would it make sense to disable the release table above a certain number of releases?
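As a rough sketch of the idea only: the threshold name `RELEASES_TABLE_MAX`, its value, and the function `releases_table_context` are hypothetical and not existing cove-ocds code. The view would build its template context so that the table rows are omitted once the release count passes the threshold, while the total count is still reported.

```python
# Hypothetical threshold; the value 1000 is an assumption, not a cove-ocds default.
RELEASES_TABLE_MAX = 1000


def releases_table_context(releases, max_rows=RELEASES_TABLE_MAX):
    """Build a context dict that omits the table rows for very large files.

    `releases` is assumed to be a list of release dicts already loaded
    from the submitted file.
    """
    total = len(releases)
    show_table = total <= max_rows
    return {
        # Always report the count so the user knows how much data was reviewed.
        "release_count": total,
        # The template can check this flag and show a short notice instead.
        "show_release_table": show_table,
        # Pass no rows at all when the table is disabled, so nothing is rendered.
        "releases": releases if show_table else [],
    }
```

With something like this, a 120,000-release file would still get its summary, but neither the server nor the browser would have to build the full table.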

More generally, should the review process be optimized for big files? That could mean changes to cove-ocds, but also releasing a command-line tool that could be run locally.

#35 deals well with valid files. In one comment:

This change makes a big difference for valid data, but not so much for invalid data (which has long lists of information about each error).

I don't think a super long list (e.g. 42000 entries, like when submitting one of the files mentioned in #35) is useful to any user.

I think we can have a configurable setting to limit the number of results returned. To address performance issues, we can set a high limit that still exceeds usefulness, like 1000.

If we want a smaller number like 100, we'll want to randomize the results returned, so that we're not simply reporting, e.g., the first 100 errors, all caused by old data, and none of the errors caused by newer data. Publishers who are only making improvements to new data are likely to ignore the results if they only seem to pertain to old data.
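A minimal sketch of the limit-plus-randomization idea, assuming the results are available as a plain list. The names `MAX_RESULTS` and `limit_results` are hypothetical and are not part of cove-ocds or lib-cove. It samples uniformly across the whole list and then restores the original order, so errors from old and new data are both represented.

```python
import random

# Hypothetical configurable setting; 1000 is the "high limit" mentioned above.
MAX_RESULTS = 1000


def limit_results(results, limit=MAX_RESULTS, rng=None):
    """Return at most `limit` results, sampled uniformly, in original order."""
    if limit is None or len(results) <= limit:
        # Under the limit (or no limit configured): return everything.
        return list(results)
    rng = rng or random.Random()
    # Sample indices rather than items so the original ordering can be restored.
    chosen = sorted(rng.sample(range(len(results)), limit))
    return [results[i] for i in chosen]
```

Passing a seeded `random.Random` makes the sample reproducible, which may be useful if the same results page is expected to look the same on reload.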

The narrower follow-up issue is #59.