A generic solution for preprocessing data and producing a final DataFrame (df)
Python
dataPreprocessing
Parses a tweet object, then lowercases and cleans its text. Entities (the RT marker, hashtags, mentions, URLs) are removed from the text; hashtags can optionally be kept, and removed entities can be replaced with placeholders.
Performs basic retweet detection and returns whether or not the tweet is a retweet.
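
The cleaning and retweet-detection steps described above could look roughly like the sketch below. It is not the repository's actual code. It assumes the tweet is a dict with a `text` field and, for retweets, a `retweeted_status` field (as in the Twitter v1.1 JSON format). The function names, option names, and placeholder tokens (`<url>`, `<mention>`, `<hashtag>`) are hypothetical and chosen only for illustration.

```python
import re

# Assumed patterns; real tweets may need more careful handling.
RT_RE = re.compile(r"^\s*RT\s+(?=@\w)")          # leading "RT " before a mention
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+:?")               # also swallows the ':' after "RT @user:"
HASHTAG_RE = re.compile(r"#(\w+)")


def is_retweet(tweet):
    """Basic heuristic: a 'retweeted_status' field or a leading 'RT @user'."""
    if "retweeted_status" in tweet:
        return True
    return bool(RT_RE.match(tweet.get("text", "")))


def clean_text(text, keep_hashtags=True, use_placeholders=False):
    """Lowercase the text and remove (or replace) RT markers, URLs, mentions, hashtags."""
    text = RT_RE.sub("", text)
    text = URL_RE.sub(" <url> " if use_placeholders else " ", text)
    text = MENTION_RE.sub(" <mention> " if use_placeholders else " ", text)
    if keep_hashtags:
        # Keep the hashtag word but drop the '#' symbol.
        text = HASHTAG_RE.sub(r"\1", text)
    else:
        text = HASHTAG_RE.sub(" <hashtag> " if use_placeholders else " ", text)
    # Lowercase and collapse repeated whitespace.
    return " ".join(text.lower().split())


def preprocess_tweet(tweet, **options):
    """Return one row (a dict) with the cleaned text and the retweet flag."""
    return {
        "clean_text": clean_text(tweet.get("text", ""), **options),
        "is_retweet": is_retweet(tweet),
    }
```

A list of such rows, one per tweet, can then be passed to `pandas.DataFrame(rows)` to build the final df.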