matthewwithanm/python-markdownify

"UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 57218: character maps to <undefined>"

hadupa opened this issue · 2 comments

hadupa commented

I'm testing this out on a few webpages.

I'm running the CLI command:
markdownify htmlFile.html > mdFile.md

Error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 57218: character maps to <undefined>

This is the first error I've run into, and I couldn't find anything about it in the existing issues, so I'm reporting it here. Judging from this traceback:

  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  ...
    print(markdownify(**vars(args)))
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  ...
    return MarkdownConverter(**options).convert(html)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ...
    soup = BeautifulSoup(html, 'html.parser')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ...
    markup = markup.read()
             ^^^^^^^^^^^^^
  ...
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 57218: character maps to <undefined>

It may be an issue with BeautifulSoup?
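
For anyone landing here with the same message: the traceback fails inside markup.read(), so the input file is being decoded before any HTML parsing happens. A plausible cause, though the report does not confirm it, is that the file is UTF-8 while Python opens it with a Windows default such as cp1252, which has no character at byte 0x9d. The sketch below reproduces the same kind of error with the standard library only; the helper name try_decode and the choice of cp1252 are assumptions made for illustration.

```python
# Assumption: the HTML file is UTF-8 and contains a curly closing quote (U+201D).
# Its UTF-8 encoding ends in byte 0x9d, which cp1252 leaves undefined.
utf8_bytes = "\u201d".encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Hypothetical helper: return the decoded text, or the error as a string."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc}"


# Decoding with the assumed Windows default fails on byte 0x9d.
cp1252_result = try_decode(utf8_bytes, "cp1252")
# Decoding with the encoding the bytes were written in succeeds.
utf8_result = try_decode(utf8_bytes, "utf-8")

print(utf8_bytes)
print(cp1252_result)
print(utf8_result)
```

If the cp1252 line reports a 'charmap' error on byte 0x9d, the mismatch between the file's encoding and the platform default is the likely culprit rather than BeautifulSoup itself.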

hadupa commented

I also tried running it with the --code-language 'utf-8' option to see if that would resolve the issue. Same error.
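
As far as I know, --code-language only sets the language label used for converted code blocks and has no effect on how the input file is read, which would explain why the error is unchanged. A possible workaround, assuming the file really is UTF-8, is to read it with an explicit encoding and pass the resulting string to the converter from Python. The function name convert_html_file and its parameters below are hypothetical; only the standard library is used in the sketch itself.

```python
from pathlib import Path
from typing import Callable


def convert_html_file(
    src: str,
    dst: str,
    convert: Callable[[str], str],
    encoding: str = "utf-8",  # assumption: the HTML file is UTF-8
) -> None:
    """Read src with an explicit encoding, convert it, and write the result to dst."""
    # Passing encoding explicitly avoids falling back to the platform default.
    html = Path(src).read_text(encoding=encoding)
    Path(dst).write_text(convert(html), encoding=encoding)
```

With markdownify installed, this could be called as convert_html_file("htmlFile.html", "mdFile.md", markdownify) after "from markdownify import markdownify". Enabling Python's UTF-8 mode (for example by setting the PYTHONUTF8=1 environment variable before running the CLI) may also help, since it changes the default encoding used when files are opened, but I have not verified it against this exact setup.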