kurtmckee/feedparser

`io.StringIO` not working in `parse()`


I'm using feedparser 6.0.11 on Python 3.12.

The docs for parse() say:

...snip...
Wrap an untrusted string in a :class:`io.StringIO` or :class:`io.BytesIO` to avoid this. Do not pass untrusted strings to this function.
...snip...

If you pass an io.BytesIO, this works fine:

>>> import feedparser
>>> import io
>>> feedparser.parse(io.BytesIO(b"<rss></rss>"))
{'bozo': False, 'entries': [], 'feed': {}, 'headers': {}, 'encoding': 'utf-8', 'version': 'rss', 'namespaces': {}}

However, if you pass an io.StringIO, it raises a TypeError:

>>> import feedparser
>>> import io
>>> feedparser.parse(io.StringIO("<rss></rss>"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mchristo/.local/share/virtualenvs/rss_temple-pQQQnncW/lib/python3.12/site-packages/feedparser/api.py", line 230, in parse
    data = convert_to_utf8(result['headers'], data, result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mchristo/.local/share/virtualenvs/rss_temple-pQQQnncW/lib/python3.12/site-packages/feedparser/encodings.py", line 189, in convert_to_utf8
    xml_encoding_match = RE_XML_PI_ENCODING.match(tempdata)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot use a bytes pattern on a string-like object
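
Until this is fixed, a possible workaround is to encode the text and hand parse() an io.BytesIO instead, since that path works in the example above. The following is only a minimal sketch: `to_bytes_stream` is a hypothetical helper name, not part of feedparser, and it assumes UTF-8 is an acceptable encoding for the feed text.

```python
import io


def to_bytes_stream(source):
    """Hypothetical helper (not part of feedparser): return a binary stream.

    Text input is encoded as UTF-8 (an assumption); bytes are wrapped as-is.
    """
    if isinstance(source, io.TextIOBase):
        # io.StringIO is a TextIOBase, so read() returns str.
        source = source.read()
    if isinstance(source, str):
        return io.BytesIO(source.encode("utf-8"))
    if isinstance(source, (bytes, bytearray)):
        return io.BytesIO(bytes(source))
    # Anything else is assumed to already be a binary file-like object.
    return source


stream = to_bytes_stream(io.StringIO("<rss></rss>"))
print(stream.read())  # b'<rss></rss>'
stream.seek(0)  # rewind so the next reader starts at the beginning

# With feedparser installed, the stream can then be passed the same way as
# the io.BytesIO example above:
#     feedparser.parse(stream)
```

One caveat: if the text carries an XML declaration naming a different encoding (for example `encoding="iso-8859-1"`), re-encoding it as UTF-8 may contradict that declaration, so this sketch is probably only safe for feeds without one or with a UTF-8 declaration.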