JasonKessler/scattertext

saving to HTML breaks encoding

Stevod opened this issue · 2 comments

Thank you for submitting a bug report!

Steps to Reproduce

Run this code; it fails when saving the HTML to disk:

# Here is the code snippet

import scattertext as st

df = st.SampleCorpora.ConventionData2012.get_data().assign(
    parse=lambda df: df.text.apply(st.whitespace_nlp_with_sentences)
)

corpus = st.CorpusFromParsedDocuments(
    df, category_col='party', parsed_col='parse'
).build().get_unigram_corpus().compact(st.AssociationCompactor(2000))

html = st.produce_scattertext_explorer(
    corpus,
    category='democrat', category_name='Democratic', not_category_name='Republican',
    minimum_term_frequency=0, pmi_threshold_coefficient=0,
    width_in_pixels=1000, metadata=corpus.get_df()['speaker'],
    transform=st.Scalers.dense_rank
)
open('./demo_compact.html', 'w').write(html)

Expected behavior

The saved file is expected to open as an interactive .html file in a browser.

What is the error you're seeing?

Since I can't reproduce this locally, and the bug report is incomplete, I'm closing the issue. Happy to reopen it if some actionable information is provided.
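
For anyone landing here with a similar symptom: the report never states the actual error, so the following is only a guess. One common cause of encoding problems when writing HTML from Python is that `open(path, 'w')` uses the platform's default encoding, which is not always UTF-8. A minimal sketch of a workaround, assuming that is what happened here, is to pass `encoding='utf-8'` explicitly. The `save_html` helper, the sample string, and the output file name are hypothetical and not part of scattertext.

```python
import os
import tempfile


def save_html(html: str, path: str) -> None:
    """Write an HTML string to disk as UTF-8 (hypothetical helper)."""
    # encoding='utf-8' overrides the platform default encoding,
    # which may be something else (for example cp1252 on some Windows setups).
    with open(path, 'w', encoding='utf-8') as f:
        f.write(html)


# Stand-in for the string returned by st.produce_scattertext_explorer(...);
# it contains non-ASCII characters on purpose.
sample_html = '<html><head><meta charset="utf-8"></head><body>café “naïve”</body></html>'

out_path = os.path.join(tempfile.gettempdir(), 'demo_compact_utf8.html')
save_html(sample_html, out_path)

# Read the file back with the same encoding to check the round trip.
with open(out_path, encoding='utf-8') as f:
    round_tripped = f.read()
```

With the snippet from the report, this would amount to replacing the last line with `save_html(html, './demo_compact.html')`. If the problem persists, posting the full traceback along with the OS and Python version would make the report actionable.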