FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version
The latest run of our test cases has highlighted that the way we currently pass raw HTML from mechanicalsoup to a pandas DataFrame is deprecated as of pandas 2.1.0, and warnings are now emitted as a result when using pandas >= 2.1.0:
Deprecated since version 2.1.0: Passing html literal strings is deprecated. Wrap literal string/bytes input in io.StringIO/io.BytesIO instead.
This will necessitate a small change to several lines in the current codebase, namely:
conference.py:63
conference.py:66
conference.py:83
conference.py:110
conference.py:142
conference.py:171
misc.py:31
misc.py:69
misc.py:106
misc.py:135
misc.py:174
misc.py:237
misc.py:266
summary.py:40
summary.py:98
summary.py:156
summary.py:204
summary.py:250
summary.py:349
summary.py:367
summary.py:418
summary.py:434
FanMatch.py:55
team.py:29
team.py:92
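One way to check that no call site in the list above has been missed is to escalate this specific warning to an error while the tests run. The following is a minimal sketch using only the standard library; the function name escalate_read_html_deprecation is hypothetical, and the message pattern is taken from the start of the warning text quoted in this issue's title.

```python
import warnings


def escalate_read_html_deprecation():
    """Turn the read_html literal-HTML FutureWarning into an exception.

    Hypothetical helper for test setup. The `message` argument is a regex
    matched against the start of the warning text.
    """
    warnings.filterwarnings(
        "error",
        message="Passing literal html",
        category=FutureWarning,
    )
```

Calling this once during test setup should make any remaining literal-HTML call fail loudly instead of only logging a warning.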
While I can't say I'm quite up to speed on the rationale for this change, the fix itself should be easy, even accounting for backwards compatibility with the Python versions (>= 3.8) we currently test against, since io.StringIO has been available in the Python standard library since before 3.x. Using conference.py:63 as an example, the fixed version would become:
from io import StringIO
# ...
conf_df = pd.read_html(StringIO(str(table)))[0]
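Since the same change applies to every line listed above, it may be worth centralizing the wrapping in a small helper. The sketch below is only a suggestion: the name as_html_buffer is hypothetical and not part of the current codebase. It follows the deprecation notice by choosing io.BytesIO for bytes input and io.StringIO for everything else.

```python
from io import BytesIO, StringIO


def as_html_buffer(markup):
    """Wrap literal HTML in a file-like object suitable for pd.read_html.

    Hypothetical helper: bytes-like input goes into BytesIO; anything else
    (for example a parsed table element) is converted with str() and
    wrapped in StringIO.
    """
    if isinstance(markup, (bytes, bytearray)):
        return BytesIO(bytes(markup))
    return StringIO(str(markup))


# Intended use at a call site (assumes pandas is imported as pd):
#     conf_df = pd.read_html(as_html_buffer(table))[0]
```

Passing a file-like object to read_html was already supported before pandas 2.1.0, so this approach should not require a pandas version check.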