I would like to explore the World Happiness Report data through visualisations, focusing on the following questions:
- Did the Happiness Score evolve over the 2015–2019 period?
- Is “Economy” the most decisive factor of the Happiness Score?
- Is the World Bank's "Income Share" index correlated with the Happiness Score?
- Create a database from the available data
- Use the database to produce visualisations and answer the questions above.
- Data: Kaggle, World Bank
- Database: MySQL
- Database editor: MySQL Workbench 8.0
- Python: Jupyter Notebook, Pandas, PyMySQL, SQLAlchemy
- Data visualisation: Plotly
- Web: Flask, Heroku
- Data fetch, cleaning & merge
- Create a database from the results
- Optimize the database:
  - Optimize table data types.
  - Set up the relations between the tables, and verify the constraints.
  - Secure the database.
  - Optimize the SQL queries.
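
As a rough illustration of the data type, relation and constraint steps above, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server. The table names, column names and the inserted row are hypothetical placeholders, not the project's actual schema or data.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server used in the project.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Hypothetical schema: one row per country, one row per country and year.
conn.executescript("""
CREATE TABLE country (
    country_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE happiness (
    country_id INTEGER NOT NULL REFERENCES country(country_id),
    year       INTEGER NOT NULL CHECK (year BETWEEN 2015 AND 2019),
    score      REAL    NOT NULL,
    PRIMARY KEY (country_id, year)
);
""")

# Placeholder row, not real report data.
conn.execute("INSERT INTO country (country_id, name) VALUES (1, 'Exampleland')")
conn.execute("INSERT INTO happiness VALUES (1, 2015, 5.0)")


def insert_is_rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Verify the constraints: an unknown country and an out-of-range year must both fail.
orphan_rejected = insert_is_rejected("INSERT INTO happiness VALUES (99, 2015, 5.0)")
bad_year_rejected = insert_is_rejected("INSERT INTO happiness VALUES (1, 2020, 5.0)")
print(orphan_rejected, bad_year_rejected)
```

In MySQL the same idea would be expressed with explicit column types (for example `SMALLINT` for the year) and `FOREIGN KEY` clauses. The exact types chosen in the project are not shown here.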
Now that the data are ready, I can start looking for answers to the questions. The steps are:
- Load the tables into DataFrames (PyMySQL & SQLAlchemy).
- Produce the charts with Plotly.
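
The two steps above can be sketched as follows. This is only an illustration under assumptions: an in-memory SQLite database replaces the MySQL server, the `happiness` table and its values are made-up placeholders, and `make_chart` is a hypothetical helper. With the real database, the connection would instead come from SQLAlchemy, for example `create_engine("mysql+pymysql://user:password@host/dbname")` with your own credentials.

```python
import sqlite3

import pandas as pd

# Stand-in database with placeholder values (not real report data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE happiness (country TEXT, year INTEGER, score REAL);
INSERT INTO happiness VALUES
    ('A', 2015, 4.0), ('B', 2015, 6.0),
    ('A', 2016, 5.0), ('B', 2016, 7.0);
""")

# Load the table into a DataFrame.
scores = pd.read_sql_query("SELECT country, year, score FROM happiness", conn)

# Aggregate to one mean score per year, ready for plotting.
mean_by_year = scores.groupby("year", as_index=False)["score"].mean()


def make_chart(df):
    """Hypothetical helper: line chart of the mean score per year."""
    import plotly.express as px  # imported lazily so the rest runs without Plotly

    return px.line(df, x="year", y="score", title="Mean Happiness Score per year")


print(mean_by_year)
```

Calling `make_chart(mean_by_year).show()` in a notebook would render the interactive chart.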
Note: GitHub renders notebooks statically and doesn't include the embedded HTML/JavaScript that makes up the graphs. Please use this link to see the WHR_Exploit file with the interactive Plotly charts.
- Did the Happiness Score evolve over the 2015–2019 period?
  - There were no remarkable developments over the period in question.
- Is “Economy” the most decisive factor of the Happiness Score?
  - The most decisive factors are Economy and Health.
- Is the World Bank's "Income Share" index correlated with the Happiness Score?
  - The index has no correlation with any of the factors making up the Happiness Score.
  - The happiest countries do not significantly redistribute their wealth to the less fortunate.
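
One common way to check statements like these is to rank candidate factors by their Pearson correlation with the score. The sketch below shows the idea with plain Python. Whether the notebook uses this exact measure is an assumption, and the numbers are synthetic, chosen only to give one perfectly correlated and one uncorrelated series.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_factors(score, factors):
    """Sort factor names by absolute correlation with the score, strongest first."""
    correlations = {name: pearson(values, score) for name, values in factors.items()}
    return sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic values, for illustration only.
score = [1.0, 2.0, 3.0, 4.0, 5.0]
factors = {
    "factor_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves exactly with the score
    "factor_b": [1.0, -1.0, 0.0, -1.0, 1.0],  # unrelated to the score
}
ranking = rank_factors(score, factors)
print(ranking)
```

A coefficient near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates none. Correlation alone does not show that a factor causes a higher score.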