Blue bars change their width when no topic is selected or when a different topic is selected
scherbakovdmitri opened this issue · 0 comments
Maybe I don't get how it works, but I am following the example from the text2vec package:
library(text2vec)
data("movie_review")
N = 500
tokens = word_tokenizer(tolower(movie_review$review[1:N]))
it = itoken(tokens, ids = movie_review$id[1:N])
v = create_vocabulary(it)
v = prune_vocabulary(v, term_count_min = 5, doc_proportion_max = 0.2)
dtm = create_dtm(it, vocab_vectorizer(v))
lda_model = LDA$new(n_topics = 10)
doc_topic_distr = lda_model$fit_transform(dtm, n_iter = 20)
# run LDAvis visualisation if needed (make sure LDAvis package installed)
lda_model$plot()
Notice how the blue bars for the token "end" differ: one crosses the tick and the other does not.
This becomes more obvious when the corpus has few tokens; the width then changes considerably.
Is there an explanation for this? Thanks!