Nikronic opened this issue 6 years ago · 2 comments
We could use the most frequent words in the dataset as stopwords for the first stage.
I have already applied a stopword list and referenced the repository it comes from. We can also remove the most and least frequent words.
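A minimal sketch of the frequency-based filtering, assuming the dataset is already tokenized into lists of words. The function names (`frequency_stopwords`, `remove_words`) and the thresholds `top_n` and `min_count` are hypothetical and not taken from the repository; they would need tuning on the real data.

```python
from collections import Counter


def frequency_stopwords(docs, top_n=1, min_count=2):
    """Return the words to drop: the top_n most frequent words
    plus every word that occurs fewer than min_count times.

    docs is a list of token lists. top_n and min_count are
    illustrative thresholds (assumptions), not tuned values.
    """
    counts = Counter(token for doc in docs for token in doc)
    most_frequent = {word for word, _ in counts.most_common(top_n)}
    rare = {word for word, count in counts.items() if count < min_count}
    return most_frequent | rare


def remove_words(docs, to_remove):
    """Filter every document, keeping only tokens not in to_remove."""
    return [[token for token in doc if token not in to_remove] for doc in docs]


# Toy corpus, for illustration only.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat saw the dog".split(),
]

to_remove = frequency_stopwords(docs, top_n=1, min_count=2)
filtered = remove_words(docs, to_remove)
print(filtered)
```

On this toy corpus, "the" is dropped as the most frequent word, and "mat", "log", and "saw" are dropped because each occurs only once. The resulting set could be merged with the existing stopword list before filtering.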