This project is divided into two parts:
- Business understanding & Data Processing
- Data Analysis (Topic Modeling and Sentiment Analysis)
The first part involves understanding the requirements for the project/task, data collection and understanding, and data preparation.
The data source is Twitter. Data preparation involves the following: i. handling NA values, ii. handling missing values, iii. data standardization/scaling, iv. feature engineering, v. dimensionality reduction.
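
The preparation steps listed above can be illustrated with a small, dependency-free sketch. It is not the code in this repository: the records, the field names (`text`, `retweet_count`, `favorite_count`), and the `prepare` helper are assumptions made for illustration only. Dimensionality reduction is left out to keep the example short.

```python
from statistics import mean, pstdev

# Hypothetical tweet records; the field names are illustrative assumptions.
tweets = [
    {"text": "Great launch today!", "retweet_count": 10, "favorite_count": 4},
    {"text": None, "retweet_count": 3, "favorite_count": 1},
    {"text": "Not impressed", "retweet_count": None, "favorite_count": 0},
    {"text": "Service was okay", "retweet_count": 2, "favorite_count": None},
]


def prepare(records):
    """Hypothetical helper: clean, enrich, and standardize tweet records."""
    # Handling NA: drop records that have no text at all.
    rows = [dict(r) for r in records if r.get("text")]

    # Handling missing values: fill missing counts with 0.
    for row in rows:
        for key in ("retweet_count", "favorite_count"):
            if row.get(key) is None:
                row[key] = 0

    # Feature engineering: add the length of the tweet text.
    for row in rows:
        row["text_length"] = len(row["text"])

    # Standardization: z-score of retweet_count (population std. deviation).
    values = [row["retweet_count"] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    for row in rows:
        row["retweet_z"] = (row["retweet_count"] - mu) / sigma if sigma else 0.0

    return rows


prepared = prepare(tweets)
```

Filling missing counts with 0 is only one possible choice; depending on the analysis, dropping those rows or imputing a median may be more appropriate.
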
- The repository was forked from 10 Academy (https://github.com/10xac/Twitter-Data-Analysis).
- A `fix_bug` branch was created to fix the bugs in `fix_clean_tweets_dataframe.py` and `fix_extract_dataframe.py`.
- In branch `fix_bug`, the files `fix_clean_tweets_dataframe.py` and `fix_extract_dataframe.py` were renamed to `clean_tweets_dataframe.py` and `extract_dataframe.py`, respectively.
- The bugs in `clean_tweets_dataframe.py` and `extract_dataframe.py` were fixed.
- Multiple pushes were made to Git while fixing bugs and testing; once the fix was complete, the `fix_bug` branch was merged into the main and master branches.
- A new branch, `make_unittest`, was created to write a new unit test for the `extract_dataframe.py` code.
- After the unit test was written, the `make_unittest` branch was merged into the main branch.
- Travis CI was set up on the repository so that whenever code is pushed or a branch is merged into the main branch, the unit tests in `tests/*.py` run automatically.
- All tests passed.
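
For readers unfamiliar with the kind of unit test described above, here is a minimal sketch using Python's built-in `unittest` module. The `extract_hashtags` function is a hypothetical stand-in defined inside the example, not the actual API of `extract_dataframe.py`, and the tweet dictionary layout is an assumption.

```python
import unittest


def extract_hashtags(tweet):
    """Hypothetical stand-in for an extraction function; not the repo's API."""
    entities = tweet.get("entities") or {}
    return [tag["text"] for tag in entities.get("hashtags", [])]


class TestExtractHashtags(unittest.TestCase):
    def test_returns_hashtag_texts(self):
        # Assumed tweet layout: hashtags nested under "entities".
        tweet = {"entities": {"hashtags": [{"text": "python"}, {"text": "data"}]}}
        self.assertEqual(extract_hashtags(tweet), ["python", "data"])

    def test_missing_entities_gives_empty_list(self):
        # A tweet without entities should not raise an error.
        self.assertEqual(extract_hashtags({}), [])
```

Saved as a file under `tests/`, a test case like this can be discovered and run locally with `python -m unittest discover tests` before pushing.
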