Datafable/epu-index

Long and frequently occurring words sometimes fall off the word cloud

Closed this issue · 2 comments

I inserted the word "hypothalamus" 14 times into the cleaned text of an article, making it the most frequently occurring word in the dataset. When the word cloud is rendered, "hypothalamus" is not always there. Decreasing maxFontSize from 60 to 40 increases the chances of the word appearing in the word cloud, but even then it is sometimes missing.

Is there a way to make sure this does not happen? This side effect affects the most important terms in our dataset, so it is definitely not desirable.

This is apparently a known issue in the word cloud library.

In the real-world data, stopwords are removed, so the chances that this becomes an issue are lower. Furthermore, we position words only horizontally, which makes it even less likely that large words won't fit.
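Should it become a problem after all, one possible workaround would be to check which words were actually placed and rerun the layout with a smaller maxFontSize until the top terms show up. Below is a minimal sketch of that idea, not something we have implemented. The names Word, Layout and layoutWithRetry are hypothetical, and the layout callback is a stand-in for whatever the word cloud library does: it is assumed to take the words and a maximum font size and to return the texts it managed to place.

```typescript
// Hypothetical types for illustration; not the API of any word cloud library.
interface Word {
  text: string;
  count: number;
}

// Assumed stand-in for the library call: returns the texts that were placed.
type Layout = (words: Word[], maxFontSize: number) => string[];

interface RetryResult {
  placed: string[];     // texts placed in the last attempt
  missing: string[];    // required texts still missing after the last attempt
  maxFontSize: number;  // font size used in the last attempt
}

// Rerun the layout with a shrinking maxFontSize until the `mustHave` most
// frequent words are all placed, or until the next step would drop below
// minFontSize.
function layoutWithRetry(
  words: Word[],
  layout: Layout,
  mustHave: number,
  startFontSize = 60,
  minFontSize = 20,
  step = 5,
): RetryResult {
  // The most frequent words are the ones that must not fall off.
  const required = [...words]
    .sort((a, b) => b.count - a.count)
    .slice(0, mustHave)
    .map((w) => w.text);

  let size = startFontSize;
  while (true) {
    const placed = layout(words, size);
    const placedSet = new Set(placed);
    const missing = required.filter((t) => !placedSet.has(t));
    // Stop on success, or when no smaller allowed font size is left.
    if (missing.length === 0 || size - step < minFontSize) {
      return { placed, missing, maxFontSize: size };
    }
    size -= step;
  }
}
```

Since the word is only missing some of the time, retrying once or twice at the same font size before shrinking might also be enough. Either way this costs extra layout runs, and the result still reports any required words that could not be placed.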

Labeling this issue as "wontfix".