Garmelon/PFERD

Login: 'NoneType' object is not subscriptable

compilebreak opened this issue · 7 comments

At first I got this error only occasionally, but for the past few days I have been getting it on every run.

Running crawl:Foo
Crawled     '.'

Error An unexpected exception occurred

Traceback (most recent call last):
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/pferd.py", line 156, in run
    await crawler.run()
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/http_crawler.py", line 193, in run
    await super().run()
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/crawler.py", line 85, in wrapper
    return await f(*args, **kwargs)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/crawler.py", line 338, in run
    await self._run()
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 196, in _run
    await self._crawl_course(self._target)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 210, in _crawl_course
    await self._crawl_url(root_url, expected_id=course_id)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 248, in _crawl_url
    await gather_elements()
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 98, in wrapper
    return await f(*args, **kwargs)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 231, in gather_elements
    soup = await self._get_page(url)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 718, in _get_page
    await self.authenticate(auth_id)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/http_crawler.py", line 84, in authenticate
    await self._authenticate()
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 98, in wrapper
    return await f(*args, **kwargs)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 755, in _authenticate
    await self._shibboleth_login.login(self.session)
  File "/home/user/projects/pferd_upgrade/pferd_env/lib/python3.9/site-packages/PFERD/crawl/ilias/kit_ilias_web_crawler.py", line 821, in login
    action = form["action"]
TypeError: 'NoneType' object is not subscriptable

When I run PFERD for the first time of the day, it can usually crawl one course before crashing. From then on it crashes every time. However, after deleting the cookies file and using --no-share-cookies, it gets back to 'first time of the day' kind of stability.

Hm, I can't reproduce that. What version of PFERD are you using?

I am using master, but latest has the same issue.

Hm, I am not sure how I can debug this remotely considering I can't get it to appear locally. Could you add a print(soup) before here and then post the output somewhere? You can also send it in a DM, but I think the page itself shouldn't contain any sensitive information.
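For anyone debugging something similar: the traceback ends at form["action"] with form being None, so the login page apparently did not contain the form the code was looking for. Below is a minimal sketch of a guard that reports what the page looked like instead of failing with a bare TypeError. The names form_action and LoginPageError are hypothetical and not part of PFERD; the form is modelled as a plain mapping (or None) so the snippet runs without any third-party parser.

```python
from typing import Mapping, Optional


class LoginPageError(Exception):
    """Hypothetical error type: the expected login form was not on the page."""


def form_action(form: Optional[Mapping[str, str]], page_text: str) -> str:
    """Return the form's "action" attribute, or raise a descriptive error.

    `form` stands in for whatever the HTML lookup returned (None if nothing
    matched); `page_text` is the raw page, used only for the error message.
    """
    if form is None:
        # Include the start of the page so the unexpected response is visible.
        snippet = page_text[:200]
        raise LoginPageError(
            f"no login form found; page starts with: {snippet!r}"
        )
    return form["action"]
```

With a guard like this, an unexpected page (for example an error page or an already-logged-in page) would show up in the error message directly, which is roughly what the print(soup) suggestion achieves by hand.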

Ok, I found something... I ran my config from last semester and it runs fine. Then I edited global options of my config for this semester from

videos = yes
forums = yes
links = plaintext
transform = (.*) -re->> "{g1.lower().replace(' ', '_').replace(',', '')}"

to

videos = no
forums = no
links = ignore
# transform = (.*) -re->> "{g1.lower().replace(' ', '_').replace(',', '')}"

and PFERD runs without any issues. (Also this transform rule is not present in my last semester config.)

After deleting all the files I had just downloaded with PFERD, I re-enabled videos, links and transform without a problem. However, when I set forums back to yes, it starts crashing again (I can get rid of the error by setting forums = no AND deleting all .cookies files).
I can reproduce this with multiple config files and with no forums on my hard drive. (Currently I'm using master.)
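In case it helps others apply the same workaround, here is a rough sketch of how the cookie files could be found and removed in one go. It assumes the files can be recognised by a name ending in ".cookies" and that they live somewhere below a directory you pass in; both are assumptions based on my setup, and the helper names are made up. It only lists the files unless dry_run is set to False.

```python
import os
from pathlib import Path
from typing import List


def find_cookie_files(root: Path) -> List[Path]:
    """Collect every file below `root` whose name ends in '.cookies'."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".cookies"):
                found.append(Path(dirpath) / name)
    return sorted(found)


def delete_cookie_files(root: Path, dry_run: bool = True) -> List[Path]:
    """Return the matching files; delete them only when dry_run is False."""
    files = find_cookie_files(root)
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Calling delete_cookie_files(Path("some/output/dir")) first and checking the returned list before rerunning with dry_run=False avoids removing anything unintended.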

Same behavior on branch latest

Hm, I cannot reproduce it, and nothing in particular comes to mind. Do you mind sharing which course it crashes on (Discord DM, here, email, or somewhere else)?

fb4631b fixes the issue. Thank you!