
Finding word stems, removing stop words, most frequent stems, listing n-grams, scoring bigrams, POS tagging and numOfTags

Primary Language: Python

Python NLP project on English-language text using the NLTK library and scikit-learn.

  • Reading a CSV file.
  • Finding the stem of each word in the text.
  • Removing stop words using the stop word list in NLTK.
  • Finding the 10 most frequent stems.
  • Listing n-grams.
  • Scoring bigrams.
  • Using a POS tagger that attaches a part-of-speech tag to each word.
  • Vectorizing text.
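
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the inline CSV data, the `text` column name, the tiny stop word set, the whitespace tokenizer, and the helper names `preprocess` and `count_pos_tags` are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

import nltk
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder
from nltk.stem import PorterStemmer
from nltk.util import ngrams
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical inline CSV; the project reads a real file instead.
CSV_DATA = "id,text\n1,The cats are running\n2,The dog runs while the cat sleeps\n"

# Tiny illustrative stop word set (an assumption for this sketch).
STOP_WORDS = {"the", "are", "and", "while"}


def preprocess(text, stop_words):
    """Lowercase, split on whitespace, drop stop words, stem what is left."""
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in text.lower().split() if t not in stop_words]


def count_pos_tags(tokens):
    """Count POS tags; requires the NLTK tagger model to be downloaded first."""
    return Counter(tag for _, tag in nltk.pos_tag(tokens))


# Read the (assumed) "text" column from the CSV data.
texts = [row["text"] for row in csv.DictReader(io.StringIO(CSV_DATA))]

# Stem each document, then flatten into one list of stems.
stemmed_docs = [preprocess(t, STOP_WORDS) for t in texts]
stems = [s for doc in stemmed_docs for s in doc]

# The 10 most frequent stems (fewer if the text is short).
top_stems = Counter(stems).most_common(10)

# List n-grams; n=2 gives bigrams. The flattened list spans document boundaries.
bigrams = list(ngrams(stems, 2))

# Score bigrams by raw frequency; other association measures are available.
finder = BigramCollocationFinder.from_words(stems)
scored = finder.score_ngrams(BigramAssocMeasures().raw_freq)

# Vectorize the stemmed documents into a bag-of-words count matrix.
vectorizer = CountVectorizer()
matrix = vectorizer.fit_transform(" ".join(doc) for doc in stemmed_docs)
```

To use NLTK's own stop word list, as the project does, `set(nltk.corpus.stopwords.words("english"))` could be passed to `preprocess` in place of `STOP_WORDS` once the `stopwords` corpus has been fetched with `nltk.download("stopwords")`. Likewise, `count_pos_tags` is defined but not called here because `nltk.pos_tag` needs the tagger model downloaded beforehand.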