Parsing some enex files fails with AttributeError: 'NoneType' object has no attribute 'set_mime'
sweth opened this issue · 3 comments
Running on about 30k enex files, this error was returned on a few hundred of them. I haven't been able to identify anything that the files that produce the error have in common other than having mime tags; around 18k of the enex files have mime sections, though, and only a few hundred of them fail like this. Is there any way you could add some error handling or debugging that might help identify where specifically in the files in question the problem lies? A sample traceback from one of the errors is:
File "/Users/sweth/projects/exomut/evernote-dump/source/run_script.py", line 5, in <module>
evernote_dump.main(sys.argv[1:])
File "/Users/sweth/projects/exomut/evernote-dump/source/evernote_dump/evernote_dump.py", line 143, in main
run_parse(args)
File "/Users/sweth/projects/exomut/evernote-dump/source/evernote_dump/evernote_dump.py", line 126, in run_parse
parser.parse(args[i])
File "/Users/sweth/.pyenv/versions/3.6.4/Python.framework/Versions/3.6/lib/python3.6/xml/sax/expatreader.py", line 111, in parse
xmlreader.IncrementalParser.parse(self, source)
File "/Users/sweth/.pyenv/versions/3.6.4/Python.framework/Versions/3.6/lib/python3.6/xml/sax/xmlreader.py", line 125, in parse
self.feed(buffer)
File "/Users/sweth/.pyenv/versions/3.6.4/Python.framework/Versions/3.6/lib/python3.6/xml/sax/expatreader.py", line 217, in feed
self._parser.Parse(data, isFinal)
File "/private/var/folders/6w/kst472s93_1dx7sw6lpg24_w0000gn/T/python-build.20180416180227.67807/Python-3.6.4/Modules/pyexpat.c", line 282, in CharacterData
File "/Users/sweth/projects/exomut/evernote-dump/source/evernote_dump/evernote_dump.py", line 100, in characters
self.attachment.set_mime(content_stream)
AttributeError: 'NoneType' object has no attribute 'set_mime'
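One possible way to get the kind of debugging asked for above is the locator that xml.sax hands to every content handler, which reports the line and column the parser has reached. The sketch below is not the project's code: the class name LocatingHandler, the problems list, and the sample document are made up for illustration, and the handler deliberately mimics the failing behaviour by creating an attachment only when it sees a "data" tag.

```python
import xml.sax


class LocatingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that records where a <mime> tag has no attachment."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.current_tag = None
        self.attachment = None
        self.problems = []  # (line, column, text) for each suspicious <mime>

    def setDocumentLocator(self, locator):
        # The parser calls this once, before parsing starts.
        self.locator = locator

    def startElement(self, name, attrs):
        self.current_tag = name
        if name == "data":
            # Assumed to mirror the old check: attachments start at <data>.
            self.attachment = object()

    def endElement(self, name):
        self.current_tag = None

    def characters(self, content):
        if self.current_tag == "mime" and self.attachment is None:
            # Record the position instead of raising AttributeError.
            self.problems.append(
                (self.locator.getLineNumber(),
                 self.locator.getColumnNumber(),
                 content)
            )


# Made-up sample: a resource with a <mime> tag and no <data> tag before it.
SAMPLE = b"<note>\n<resource>\n<mime>image/png</mime>\n</resource>\n</note>\n"

handler = LocatingHandler()
xml.sax.parseString(SAMPLE, handler)
for line, column, text in handler.problems:
    print("mime without attachment at line", line, "column", column, repr(text))
```

Run against one of the failing files, something along these lines should point to the first "mime" tag that arrives before any attachment exists.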
Hello! Thanks for your help on the project. Unfortunately, I don't have the time to add error checking to my project. Maybe some day when I get some free time. I think I found the problem though. Thanks for adding the traceback, that helped a lot!
I was checking for attachments by finding the first instance of the tag "data". However, it seems that some of your files have the tag "mime" come before the "data" tag. So I moved the check one level up to "resource", which I should have done from the beginning!
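A minimal sketch of the idea described here, not the actual commit: the attachment is created when the parser enters "resource", so "mime" and "data" can arrive in either order, or one of them can be missing. The Attachment and NoteHandler classes and their attributes are hypothetical stand-ins, and only set_mime is a name taken from the traceback.

```python
import xml.sax


class Attachment:
    """Hypothetical stand-in for the project's attachment class."""

    def __init__(self):
        self.mime = None
        self.data = ""

    def set_mime(self, mime):
        self.mime = mime


class NoteHandler(xml.sax.ContentHandler):
    """Creates one Attachment per <resource>, whatever its child order."""

    def __init__(self):
        super().__init__()
        self.current_tag = None
        self.attachment = None
        self.attachments = []

    def startElement(self, name, attrs):
        self.current_tag = name
        if name == "resource":
            # The attachment exists before any child tag is seen.
            self.attachment = Attachment()

    def characters(self, content):
        text = content.strip()
        if self.attachment is None or not text:
            return
        if self.current_tag == "mime":
            # Simplification: assumes the MIME type arrives in one chunk.
            self.attachment.set_mime(text)
        elif self.current_tag == "data":
            self.attachment.data += text

    def endElement(self, name):
        if name == "resource":
            self.attachments.append(self.attachment)
            self.attachment = None
        self.current_tag = None


def parse_attachments(xml_bytes):
    """Parse a document and return the attachments it contains."""
    handler = NoteHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.attachments
```

With this shape, a resource such as `<resource><mime>image/png</mime><data>QUJD</data></resource>` and the same resource with the two child tags swapped should produce the same attachment.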
I won't mark this issue as resolved until you have replied saying it works. (I hope it does!)
Thanks again for your help.
Heh. So despite being a total Python newbie, I did some poking around and noticed that there was no "data" tag in the notes in question, just a "resource" tag, and so I did a quick local edit to startElement to start attachments on either data or resource, which did work. :)
However: I then went to create a PR for that, but saw your latest commit, and so I deleted that local branch, and only too late remembered that the branch I had been in locally was one where I had made some tweak that I can no longer remember in order to make running the script from the CLI work, and thus the script is failing again. I'm invoking it as python3 source/run_script.py $FILE, and when I do so, I get:
Traceback (most recent call last):
File "/Users/sweth/projects/exomut/evernote-dump/source/run_script.py", line 2, in <module>
from .evernote_dump import evernote_dump
ModuleNotFoundError: No module named '__main__.evernote_dump'; '__main__' is not a package
I can't for the life of me remember what the tweak was that I had made in that local branch to make run_script correctly find evernote_dump. Any thoughts?
I am not sure, but module relations have always caused me a ton of trouble. Try running the script directly from the directory.
cd source
python3 run_script.py FILENAME
instead of
python3 source/run_script.py FILENAME
You could also try the GUI:
python3 main.py
Tell me if it works out!
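For background on the ModuleNotFoundError above: a relative import such as from .evernote_dump import evernote_dump only resolves when the file is loaded as part of a package. When a file is run directly as a script it becomes __main__ and has no parent package, whichever directory it is started from. The sketch below reproduces this in a throwaway directory. The names pkg, helper.py and VALUE are made up, and whether running the real project with -m works depends on its other imports, which is an assumption not checked here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run the same file as a plain script and as a module; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "pkg"  # hypothetical stand-in for the source directory
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 42\n")
        (pkg / "run_script.py").write_text(
            "from .helper import VALUE\nprint(VALUE)\n"
        )

        # Equivalent to: python3 pkg/run_script.py (no parent package is known)
        as_script = subprocess.run(
            [sys.executable, str(pkg / "run_script.py")],
            cwd=tmp,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )

        # Equivalent to: python3 -m pkg.run_script (pkg is the parent package)
        as_module = subprocess.run(
            [sys.executable, "-m", "pkg.run_script"],
            cwd=tmp,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    return as_script, as_module


if __name__ == "__main__":
    script_result, module_result = run_both_ways()
    print("as script, exit code:", script_result.returncode)
    print("as module, exit code:", module_result.returncode)
```

If the same holds for this repository, two things that might be worth trying are python3 -m source.run_script FILENAME from the repository root, or changing the import in run_script.py to an absolute one.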