`sc` scraped docket numbers are incorrect
grossir opened this issue · 2 comments
grossir commented
For example, in this case we collect "28236" as the docket number, since that number is available on the HTML results page, where the CSS class labels it as "case number".
However, in the downloaded files themselves, that value is labeled as the "Opinion Number". The docket number also appears in the extracted text, as "Appellate Case No. 2021-001296". We could correct this with a second pass of extract_from_text over the already extracted content.
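A rough sketch of what that second pass could look like, written as a standalone function rather than the actual scraper method. The regex is inferred only from the single example quoted above, and the nested return shape (`{"Docket": {"docket_number": ...}}`) and the sample text are assumptions for illustration, so they may need adjusting to match what the scraper and other `sc` opinions really use:

```python
import re

# Assumed pattern: four-digit year, a hyphen, then six digits, as in
# "Appellate Case No. 2021-001296". Other opinions may use other formats.
DOCKET_RE = re.compile(r"Appellate\s+Case\s+No\.\s*(\d{4}-\d{6})")


def extract_from_text(scraped_text: str) -> dict:
    """Return the docket number found in the opinion text, if any.

    The returned dict shape is an assumption made for this sketch.
    """
    match = DOCKET_RE.search(scraped_text)
    if not match:
        # Nothing found: return an empty dict so the caller keeps
        # whatever value it already has.
        return {}
    return {"Docket": {"docket_number": match.group(1)}}


# Hypothetical snippet of extracted text, built from the values in this issue.
sample = "Appellate Case No. 2021-001296\nOpinion No. 28236"
result = extract_from_text(sample)
```

Returning an empty dict when nothing matches would leave the HTML-scraped value in place, so the fix would only overwrite the docket number when the text actually contains one.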