How to produce a Scattertext plot without showing collocations?
Lipahak opened this issue · 1 comment
Lipahak commented
Steps to Reproduce
I use the following steps to generate a Scattertext plot from my data:
# Here is the code snippet
import scattertext as st
import spacy

nlp = spacy.load("en_core_web_sm")

# corpus (df is my DataFrame; it has 'tag', 'text' and 'text_remove' columns)
corpus = st.CorpusFromPandas(df,  # edit
                             category_col='tag',
                             text_col='text',
                             nlp=nlp).build()

# scattertext plot
path1 = "xxx"
html = st.produce_scattertext_explorer(corpus,
                                       category='Y',
                                       category_name='A',
                                       not_category_name='B',
                                       width_in_pixels=1000,
                                       metadata=df['text_remove'])  # edit
Expected behavior
Several terms in the plot appear as collocations (two-word phrases).
I expected the plot to show individual words only, without any collocations.
How can I avoid this?
I would deeply appreciate your answer.
JasonKessler commented
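Run corpus = corpus.get_unigram_corpus() after building the corpus and before calling produce_scattertext_explorer. This removes the bigrams, so only single words are plotted.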
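To illustrate the effect, here is a plain-Python sketch. It is not Scattertext's actual implementation. It only assumes that a bigram term is stored as two words joined by a space, so dropping bigrams amounts to keeping the terms that contain no space. The helper name keep_unigrams and the sample term counts are hypothetical.

```python
def keep_unigrams(term_counts):
    """Return only the single-word terms from a term -> count mapping."""
    # A term with a space in it is treated as a multi-word phrase (bigram).
    return {term: n for term, n in term_counts.items() if " " not in term}


# Hypothetical term counts, mixing single words and one two-word phrase.
term_counts = {"machine": 4, "learning": 5, "machine learning": 3, "model": 2}

unigram_counts = keep_unigrams(term_counts)
print(sorted(unigram_counts))  # ['learning', 'machine', 'model']
```

In the snippet from the question, the get_unigram_corpus() call would go between the CorpusFromPandas(...).build() step and the call to st.produce_scattertext_explorer.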