j-andrews7/kenpompy

FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version

Closed this issue · 0 comments

esqew commented

The latest run of our test cases has highlighted that the way we currently pass raw HTML from mechanicalsoup into a pandas DataFrame is deprecated as of pandas@2.1.0, so warnings are now emitted when running against pandas >= 2.1.0:

Deprecated since version 2.1.0: Passing html literal strings is deprecated. Wrap literal string/bytes input in io.StringIO/io.BytesIO instead.

Source

This will necessitate a small change to several lines in the current codebase, namely:

I'm not up to speed on the rationale for this change, but the fix itself should be easy, even accounting for backwards compatibility with the Python versions we currently test (>= 3.8), since io.StringIO has been part of the Python standard library since before 3.x. Using conference.py:63 as an example, the fixed version would become:

from io import StringIO
# ...
conf_df = pd.read_html(StringIO(str(table)))[0]
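
If the same wrapping ends up repeated at each call site, it could be factored into a small helper. The following is only a sketch: the names html_to_buffer and read_first_table are hypothetical and do not exist in kenpompy today, and table is assumed to be anything whose str() yields the table's HTML (as in the line above).

```python
from io import StringIO


def html_to_buffer(table) -> StringIO:
    """Hypothetical helper: wrap a table's HTML in a file-like object.

    `table` is assumed to be a str or an object (e.g. a parsed tag)
    whose str() representation is the raw HTML of the table.
    """
    return StringIO(str(table))


def read_first_table(table):
    """Hypothetical helper: parse the HTML and return the first DataFrame."""
    # Imported here so html_to_buffer itself has no pandas dependency.
    import pandas as pd

    # read_html accepts a file-like object, which avoids the
    # "Passing literal html" FutureWarning on pandas >= 2.1.0.
    return pd.read_html(html_to_buffer(table))[0]
```

With something like this in place, the conference.py example would read conf_df = read_first_table(table). Whether a helper is worth it over the inline StringIO(str(table)) call is a judgment call; the inline form keeps the diff smaller.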