/featureEngineering

Some important features for different kinds of NLP tasks.

Specific textual features, by task:

1. Sentiment Analysis

  • Bag-of-words or bag-of-n-grams with their frequencies.
  • Word/sentence embeddings.
  • Don't remove punctuation such as "!" and "?"; it carries part of the sentence's sentiment.
  • Stemming and lemmatization are OK unless we are using embeddings for words/sentences, since stemmed forms may not match the embedding vocabulary.
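
As a rough illustration of the first and third points, here is a minimal sketch of a bag-of-n-grams counter that keeps "!" and "?" as tokens. The names `tokenize` and `bag_of_ngrams` and the tokenization regex are assumptions made for this example, not part of any particular library; a real pipeline would likely use a proper tokenizer.

```python
import re
from collections import Counter

# Assumed tokenization: runs of letters/digits/apostrophes are words,
# and each "!" or "?" is kept as its own token instead of being dropped.
TOKEN_RE = re.compile(r"[a-z0-9']+|[!?]")


def tokenize(text):
    """Lowercase the text and split it into word tokens plus "!" and "?"."""
    return TOKEN_RE.findall(text.lower())


def bag_of_ngrams(text, n_max=2):
    """Count every n-gram of length 1..n_max over the token sequence."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = bag_of_ngrams("Not good! Not good at all!")
print(features.most_common(5))
```

Because punctuation survives tokenization, n-grams such as "good !" become features of their own, which lets a classifier pick up on emphasis that a punctuation-stripping preprocessor would discard.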

2. Language Identification

  • Don't remove stop words; they are strong cues for the language (and how would you know which language's stop words to remove in the first place? :P)
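
To make the stop-word point concrete, here is a minimal sketch that uses stop words as the identifying feature: it counts how many tokens fall in each language's stop-word list and picks the language with the most hits. The `STOP_WORDS` lists are tiny, hand-picked samples and the function names are hypothetical; real language identifiers typically rely on much larger lists or character n-gram models.

```python
import re

# Tiny illustrative stop-word samples (assumption: real lists are far longer).
STOP_WORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "that"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "french": {"le", "la", "les", "et", "est", "un", "une", "dans"},
}


def stop_word_counts(text):
    """Return, per language, how many tokens of the text are stop words."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return {
        lang: sum(1 for tok in tokens if tok in words)
        for lang, words in STOP_WORDS.items()
    }


def guess_language(text):
    """Pick the language with the most stop-word hits, or None if no hits."""
    counts = stop_word_counts(text)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


print(guess_language("Der Hund ist nicht in dem Haus und die Katze schläft"))
```

Had the stop words been stripped during preprocessing, every count here would be zero and the function would have nothing to work with.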

3. Hate speech detection

4. Intent Detection

5. Topic Labeling