Data Science Projects
Project 1 is on SQL
Project 2 is on entity resolution and data cleaning
Project 3 is on data extraction, data cleaning, and how to use the Twitter API. Once the data is preprocessed, the scikit-learn package is used to build vectorizers, which are then used to cluster tweets by similarity and to show the top 10 tweets most similar to a particular query.
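
The Project 3 pipeline can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes TF-IDF vectors, k-means clustering, and cosine similarity, which may differ from what the project uses. The sample tweets and the helper names (build_index, cluster_tweets, top_similar) are hypothetical, and the tweets stand in for data that would really come from the Twitter API.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical, already-cleaned tweets (placeholders for Twitter API data).
tweets = [
    "loving the new python release",
    "python makes data cleaning easy",
    "great football match last night",
    "that football goal was incredible",
    "sql joins explained with examples",
    "learning sql window functions today",
]


def build_index(texts):
    """Fit a TF-IDF vectorizer and return it with the document-term matrix."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(texts)
    return vectorizer, matrix


def cluster_tweets(matrix, n_clusters=3, seed=0):
    """Assign each tweet to a k-means cluster (n_clusters is an assumption)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(matrix)


def top_similar(query, texts, vectorizer, matrix, k=10):
    """Return up to k (tweet, score) pairs ranked by cosine similarity."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, matrix).ravel()
    ranked = scores.argsort()[::-1][:k]  # indices of the highest scores first
    return [(texts[i], float(scores[i])) for i in ranked]


vectorizer, matrix = build_index(tweets)
labels = cluster_tweets(matrix)
results = top_similar("python data", tweets, vectorizer, matrix, k=10)

for text, score in results:
    print(f"{score:.3f}  {text}")
```

The query is transformed with the same fitted vectorizer as the tweets so that both share one vocabulary. With fewer than 10 tweets, top_similar simply returns all of them in ranked order.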
Please read the README and Project.pdf in each project for a description.