biglocalnews/court-scraper

Refactor OK to expand case discovery to all counties

Closed this issue · 0 comments

Oklahoma's site class currently uses the Daily Filings by County search to support the search_by_date functionality. That search covers only a fraction of Oklahoma's counties.

This alternate search page allows searching by county and filing date, among a number of other parameters.

Here's an example search for Alfalfa County.

These search results provide more useful fielded data. However, there are some important caveats:

  • Search results pages contain an entry for every party to a case, so we'll need to deduplicate results from such pages.
  • Results are truncated at 500 records. Combined with the duplicate entries, this means results are routinely truncated for larger counties such as Tulsa, even when the search is restricted to a single day.
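The deduplication step could look roughly like the sketch below. It assumes each parsed result row is a dict with `case_number`, `filing_date`, and `party` keys; those field names are hypothetical and would need to match whatever the results page actually provides.

```python
from __future__ import annotations

from typing import Iterable


def dedupe_by_case_number(rows: Iterable[dict]) -> list[dict]:
    """Collapse one-row-per-party results into one record per case.

    Assumes (hypothetically) that every row carries 'case_number',
    'filing_date' and 'party' keys.
    """
    cases: dict[str, dict] = {}
    for row in rows:
        # The first row seen for a case creates the record; later rows
        # for the same case only contribute their party name.
        case = cases.setdefault(
            row["case_number"],
            {
                "case_number": row["case_number"],
                "filing_date": row["filing_date"],
                "parties": [],
            },
        )
        case["parties"].append(row["party"])
    # Dicts preserve insertion order, so cases come back in first-seen order.
    return list(cases.values())
```

Because the 500-record cap applies to the per-party rows, the number of distinct cases returned by a truncated search can be well below 500.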

To achieve full case discovery coverage for all OK counties, we may want a blended approach: use Daily Filings by County for the counties it supports, and use the alternate search page with a single filing date for all other counties. The alternate search shows a warning on the results page when a search has been truncated to 500 records, so we could use that warning to flag the possibility of missing data.
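A minimal sketch of that dispatch logic follows. Everything here is an assumption for illustration: `daily_search` and `alternate_search` stand in for the two scrapers, `daily_counties` is whatever set of counties the Daily Filings search turns out to support, and the alternate scraper is assumed to return a `(rows, truncated)` pair where `truncated` reflects the warning on the results page.

```python
import logging
from datetime import date
from typing import Callable, Collection

logger = logging.getLogger(__name__)


def blended_search_by_date(
    county: str,
    filing_date: date,
    daily_counties: Collection[str],
    daily_search: Callable[[str, date], list],
    alternate_search: Callable[[str, date], tuple],
) -> list:
    """Route a single-day search to the appropriate search page.

    All callables and the (rows, truncated) return shape are
    hypothetical stand-ins, not existing court-scraper APIs.
    """
    if county in daily_counties:
        # Daily Filings by County is available for this county.
        return daily_search(county, filing_date)

    # Fall back to the alternate search page for everything else.
    rows, truncated = alternate_search(county, filing_date)
    if truncated:
        # Surface the results-page warning instead of silently dropping data.
        logger.warning(
            "Results for %s on %s were truncated; cases may be missing",
            county,
            filing_date.isoformat(),
        )
    return rows
```

Passing the scrapers in as callables keeps the routing decision separate from the page-parsing code, which should make the truncation alert easy to test on its own.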

A final refinement to address the truncation issue would be to support additional search criteria, e.g. search by case type. Searches for a single day for specific case types should stand a better chance of avoiding truncation, especially in smaller counties.
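One way this refinement might work is to retry a truncated search once per case type. The sketch below assumes a hypothetical `search` callable that accepts an optional `case_type` argument and returns `(rows, truncated)`; the list of case type codes would have to come from the search form itself.

```python
from datetime import date
from typing import Callable, Optional, Sequence


def search_with_case_type_fallback(
    search: Callable[[str, date, Optional[str]], tuple],
    county: str,
    filing_date: date,
    case_types: Sequence[str],
) -> tuple:
    """Run a single-day search, splitting by case type if it is truncated.

    Returns (rows, still_truncated), where still_truncated lists the case
    types whose narrower searches were also cut off. The `search` callable
    and its (rows, truncated) return shape are hypothetical.
    """
    rows, truncated = search(county, filing_date, None)
    if not truncated:
        return rows, []

    # The broad search hit the cap, so discard it and query each type.
    rows = []
    still_truncated = []
    for case_type in case_types:
        subset, cut = search(county, filing_date, case_type)
        rows.extend(subset)
        if cut:
            still_truncated.append(case_type)
    return rows, still_truncated
```

Any case types reported in `still_truncated` are the ones where data may still be missing and an alert is warranted.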