This project was completed as part of the Bertelsmann Udacity Data Science Challenge. Details are available at: https://sites.google.com/udacity.com/bertelsmanndatascholarship/project-showcase
The project was submitted by a group of three learners:
- Jicksy John (slack: @jicksy) LinkedIn: https://www.linkedin.com/in/jicksy/
- Kristin Bässe (slack: kristin) LinkedIn: https://www.linkedin.com/in/kristin-b%C3%A4sse-9819a115b/
- Lakshmi Prasannakumar (slack: @lakshmi) LinkedIn: https://www.linkedin.com/in/lakshmiprasannakumar/
- Data was collected with NewsAPI [1] in Python (notebook: NewsAPI Python.ipynb)
- Articles were collected from 16 popular public publishing sources
- Total articles collected: 9,961
- Time period: 02/20/2018 to 07/14/2018 (MM/DD/YYYY)
- Articles were fetched for the topics Data Science, Data Analytics, Machine Learning, Business Analytics, and Artificial Intelligence
- Findings: General Article Publishing Trend, Top Publishers, Gender Ratio of Authors
- Adding Date column: CSV Generation From DataFrames.ipynb
- Visualization (Google Spreadsheet): https://docs.google.com/spreadsheets/d/1eMTNnWewMJzSqhjySWGRK7FQVayYiwe17-9ITIDJMxw/edit#gid=1231999941
- Using Wordcloud[2] in Python: Generate Title Word Cloud.ipynb
- Top 60 authors: Exporting top 60 authors to spreadsheet.ipynb
- Visualization (Google Spreadsheet): https://docs.google.com/spreadsheets/d/1zKRYH7TMkveScwEknkTDNdrpoxHGCoQUTze-2RNcpvI/edit#gid=0
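
The collection and top-author steps listed above could look roughly like the sketch below. It is not the code from the notebooks. It assumes the News API `/v2/everything` endpoint with its `q`, `from`, `to`, `language`, `pageSize`, `page` and `apiKey` parameters, an English-only filter, and a placeholder API key. The function names `build_url`, `fetch_articles` and `top_authors` are hypothetical. Only the topics and the date range come from this README.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Topics and date range come from the README; everything else is an assumption.
TOPICS = [
    "Data Science",
    "Data Analytics",
    "Machine Learning",
    "Business Analytics",
    "Artificial Intelligence",
]
DATE_FROM = "2018-02-20"
DATE_TO = "2018-07-14"
ENDPOINT = "https://newsapi.org/v2/everything"  # assumed endpoint


def build_url(topic, api_key, page=1):
    """Build a request URL for one topic (hypothetical helper)."""
    params = {
        "q": f'"{topic}"',   # quoted so the topic is searched as a phrase
        "from": DATE_FROM,
        "to": DATE_TO,
        "language": "en",    # assumption: English articles only
        "pageSize": 100,
        "page": page,
        "apiKey": api_key,
    }
    return f"{ENDPOINT}?{urlencode(params)}"


def fetch_articles(topic, api_key, page=1):
    """Download one page of results and return the list of article dicts."""
    with urlopen(build_url(topic, api_key, page)) as response:
        return json.load(response).get("articles", [])


def top_authors(articles, n=60):
    """Return the n most frequent non-empty author names with their counts."""
    counts = Counter(
        article["author"].strip()
        for article in articles
        if article.get("author") and article["author"].strip()
    )
    return counts.most_common(n)
```

With a valid key, calling `fetch_articles(topic, api_key)` for each entry in `TOPICS`, concatenating the results, and passing them to `top_authors` would give a ranked author list comparable to the one exported to the spreadsheet.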
[1] News API; accessed 07/14/18; https://newsapi.org/
[2] Adiljadoon (Kaggle), WordCloud with Python; accessed 07/19/18; https://www.kaggle.com/adiljadoon/word-cloud-with-python
Presentation (Google Slides): https://docs.google.com/presentation/d/1_FH0Ll_OME7WucoxowxVXkVh5pzAX3SoAV3ysNxCBJA/edit#slide=id.g3cb93ce6dc_0_19