biglocalnews/court-scraper

AttributeError: 'CaseInfo' object has no attribute 'html'


In an effort to run an Odyssey search for the first time, I created a login on the Kern County site, added it to my config.yaml, and tried to search for a case using a number I dug out of the site. Here's the error I got, which I'm not sure how to interpret. Even if I've misconfigured my credentials, I suspect this error could be more helpful.

$ pipenv run court-scraper search -p ca_kern -c S1500CL255317
Loading .env environment variables...
Executing search for ca_kern
Traceback (most recent call last):
  File "/home/palewire/Code/court-scraper/court_scraper/base/runner.py", line 54, in cache_detail_pages
    page_source = case.page_source
AttributeError: 'CaseInfo' object has no attribute 'page_source'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/bin/court-scraper", line 33, in <module>
    sys.exit(load_entry_point('court-scraper', 'console_scripts', 'court-scraper')())
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/palewire/.local/share/virtualenvs/court-scraper-Z6Ojlrhv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/palewire/Code/court-scraper/court_scraper/cli.py", line 85, in search
    runner.cache_detail_pages(results)
  File "/home/palewire/Code/court-scraper/court_scraper/base/runner.py", line 56, in cache_detail_pages
    page_source = case.html
AttributeError: 'CaseInfo' object has no attribute 'html'

@palewire As part of the refactorings leading up to the release of v0.1.0, Site.search methods were standardized to not download the HTML for cases by default. In other words, they only return case metadata from the search results page, rather than "clicking through" to get case details. The goal was to simplify and speed up the process of building an index of known cases in scripts. Those refactorings also added a runner.py for all platforms, which is the mechanism the CLI uses to instantiate Site and configure the search. The issue appears to be that the runner does not configure the search methods to download case detail pages, which causes this downstream error when running the CLI.
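To make the failure mode concrete, here is a minimal sketch of how the caching step could report the problem more helpfully. It is not the actual runner code: the CaseInfo stand-in, the get_page_source helper, and the number field are assumptions for illustration. Only the page_source and html attribute names come from the traceback above.

```python
class CaseInfo:
    """Hypothetical stand-in: exposes whatever fields the search returned."""

    def __init__(self, data):
        for key, value in data.items():
            setattr(self, key, value)


def get_page_source(case):
    """Return the detail-page HTML, or raise a descriptive error.

    Checks the two attribute names seen in the traceback instead of
    letting a bare AttributeError escape from the fallback.
    """
    for attr in ("page_source", "html"):
        value = getattr(case, attr, None)
        if value is not None:
            return value
    raise ValueError(
        f"Case {getattr(case, 'number', '?')} has no detail page HTML; "
        "the search likely returned metadata only."
    )


# A metadata-only result (what search returns by default) versus one
# that carries the detail page.
metadata_only = CaseInfo({"number": "S1500CL255317"})
with_details = CaseInfo({"number": "S1500CL255317", "html": "<html>...</html>"})

print(get_page_source(with_details))
try:
    get_page_source(metadata_only)
except ValueError as err:
    print(err)
```

With a check along these lines, a metadata-only result would produce a message pointing at the missing detail pages rather than at a missing attribute.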

In the case of Kern County, we can fix the bug by passing the case_details=True argument to Site.search in the Odyssey runner.py:

data = site.search(case_numbers=case_numbers, case_details=True)
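As a rough illustration of what that flag changes, the sketch below uses a fake site class that needs no network access. FakeSite and the dictionary fields are hypothetical; only the case_numbers and case_details keyword arguments are taken from the call above, and the real Site.search may return different objects.

```python
class FakeSite:
    """Hypothetical stand-in for a platform Site class; no scraping happens."""

    def search(self, case_numbers=None, case_details=False):
        results = []
        for number in case_numbers or []:
            # Metadata that would come from the search results page.
            case = {"number": number}
            if case_details:
                # The real scraper would click through to the detail page;
                # here the HTML is faked.
                case["html"] = f"<html>details for {number}</html>"
            results.append(case)
        return results


site = FakeSite()
default = site.search(case_numbers=["S1500CL255317"])
detailed = site.search(case_numbers=["S1500CL255317"], case_details=True)

print("html" in default[0])   # False: metadata only
print("html" in detailed[0])  # True: detail page included
```

The default call leaves nothing for the caching step to write out, which matches the error in the report.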

I suspect similar fixes will be required for WI and OK, though we may want to create separate tickets for those.

Anyhow, shout back if you'd like to take a stab at the fix, or I can add it to my list of To Dos.