- bag-of-words or bag-of-ngrams with their frequency.
- word/sentence embeddings.
- Don't remove punctuation such as "!" or "?"; these marks carry part of the sentence's sentiment.
- Stemming and lemmatization are OK unless we are using word/sentence embeddings, which generally expect the original word forms.
- Don't remove stop words (and how would you know which language's stop words to remove in the first place? :P)
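
The notes above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not a recommended pipeline: the names `TOKEN_RE`, `tokenize`, and `bag_of_ngrams` are made up for this example, and the regex-based tokenizer is an assumption that will not suit every language. It counts unigrams and bigrams, keeps "!" and "?" as their own tokens, and leaves stop words in place.

```python
import re
from collections import Counter

# Hypothetical tokenizer pattern: a word (optionally with an apostrophe part),
# or a single "!" / "?" kept as its own token.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[!?]")


def tokenize(text):
    """Lowercase the text and split it into tokens, keeping "!" and "?"."""
    return TOKEN_RE.findall(text.lower())


def bag_of_ngrams(text, n_max=2):
    """Count every n-gram for n = 1..n_max. Stop words are not removed."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = bag_of_ngrams("This is not good! Not good at all!")
print(features.most_common(5))
```

Keeping the stop word "not" matters here: the bigram "not good" is a much stronger sentiment signal than "good" alone, and it would disappear if stop words were stripped first.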