jsfenfen/990-xml-reader

Better error message for missing / mangled xml

Closed this issue · 3 comments

Most of my 2016v3.0 returns process fine, but object ID 201711459349300346 fails both when calling xml_runner.run_filing(201711459349300346) and when running irsx 201711459349300346. It throws:

  File "/usr/local/bin/irsx", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/irsx/irsx_cli.py", line 107, in main
    run_main(args_read)
  File "/usr/local/lib/python3.6/site-packages/irsx/irsx_cli.py", line 93, in run_main
    verbose=args_read.verbose
  File "/usr/local/lib/python3.6/site-packages/irsx/xmlrunner.py", line 101, in run_filing
    this_filing.process(verbose=verbose)
  File "/usr/local/lib/python3.6/site-packages/irsx/filing.py", line 168, in process
    self._set_version()
  File "/usr/local/lib/python3.6/site-packages/irsx/filing.py", line 71, in _set_version
    self.version_string = self.raw_irs_dict['Return']['@returnVersion']
KeyError: 'Return'

Never mind, this was just a botched XML download. I should have investigated first.

Reopening, because this case deserves a better error message. It happens fairly often, at least for me: when I kill a process mid-download, the XML file gets mangled.

It should probably throw something specific, like an InvalidXMLError. Downstream, one could then erase the bad file and rerun once before failing?
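
A rough sketch of what that could look like, not the library's actual code. The names InvalidXMLError, get_version_string, and run_with_one_retry are hypothetical, and the process and download callables stand in for whatever the real pipeline uses. Only the raw_irs_dict['Return']['@returnVersion'] lookup is taken from the traceback above.

```python
import os


class InvalidXMLError(Exception):
    """Hypothetical exception: the filing's XML is missing, truncated, or mangled."""


def get_version_string(raw_irs_dict, object_id):
    """Return the filing's version string, or raise InvalidXMLError with a useful message.

    Wraps the same lookup that raised the bare KeyError in the traceback.
    """
    try:
        return raw_irs_dict['Return']['@returnVersion']
    except (KeyError, TypeError) as exc:
        # KeyError: an expected key is absent. TypeError: the parsed result
        # is None or not a dict at all (e.g. an empty or garbage file).
        raise InvalidXMLError(
            "Filing %s: no Return/@returnVersion found; the XML file may be "
            "missing, truncated, or mangled. Try deleting it and downloading "
            "it again." % object_id
        ) from exc


def run_with_one_retry(object_id, filepath, process, download):
    """Process a filing; on InvalidXMLError, erase the file, re-download, retry once.

    `process` and `download` are caller-supplied callables (assumptions here).
    A second InvalidXMLError is not caught and propagates to the caller.
    """
    try:
        return process(object_id)
    except InvalidXMLError:
        if os.path.exists(filepath):
            os.remove(filepath)  # drop the bad download
        download(object_id)      # fetch a fresh copy
        return process(object_id)
```

Retrying only once keeps a filing that is genuinely broken at the source from looping forever, while still recovering from the common interrupted-download case.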