Repository for the projects needed to complete the Data Analyst Nanodegree

Udacity-Data-Analyst-Nanodegree

Discover insights from Data via Python and SQL

Skills Acquired (Summary)

Prerequisites

You'll need to install:

  • Python 3.x
  • Jupyter Notebook
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn

Each project may also require additional libraries, which are specified in the project itself.

Recommended:

  • Anaconda
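As a quick sanity check before opening the notebooks, the short script below reports which of the listed libraries are not importable in the current environment. It is a minimal sketch: the `missing_packages` helper and the `REQUIRED` list are illustrative names, not part of this repository, and `notebook` is assumed to be the import name for Jupyter Notebook.

```python
from importlib.util import find_spec

# Import names of the libraries listed above.
# Assumption: Jupyter Notebook is importable as "notebook".
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "notebook"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

Anything reported as missing can then be installed with the package manager of your choice, for example through Anaconda.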

Project Overview

P1: Investigate a Dataset (Gapminder World Dataset)

This chapter covered the data analysis process as a whole: gathering, assessing, cleaning, and wrangling data, then exploring and visualizing it, along with the programming workflow and the communication of results.

The project therefore included every step of a typical data analysis process:

  • posing questions

  • gathering, wrangling, and cleaning data

  • communicating answers to the questions

  • supporting those answers with visualizations and statistics
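The steps above can be sketched in a few lines of pandas. This is only an illustration of the workflow, not the project's actual analysis: the `raw` table, its column names, and its values are hypothetical stand-ins for a real dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset; values are illustrative only.
raw = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B", "B"],
        "year": [2000, 2010, 2000, 2010, 2010],
        "life_expectancy": ["70.1", "72.4", None, "65.0", "65.0"],
    }
)

# 1. Pose a question: how does average life expectancy differ between the two years?

# 2. Wrangle and clean: drop duplicate rows, drop missing values, fix the dtype.
clean = (
    raw.drop_duplicates()
    .dropna(subset=["life_expectancy"])
    .assign(life_expectancy=lambda d: pd.to_numeric(d["life_expectancy"]))
)

# 3. Answer the question with a summary statistic per year.
mean_by_year = clean.groupby("year")["life_expectancy"].mean()
print(mean_by_year)

# 4. Communicate: a chart would usually follow, e.g. mean_by_year.plot(kind="bar").
```

Keeping the cleaning steps in one chained expression makes it easy to see, and to document, exactly what was changed between the raw and the cleaned data.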

Screenshot from 2022-11-09 10-33-01

According to the line chart of firearm sales from 1997 to 2016, firearm purchases show an upward trend, with sudden increases in 2015 and a decrease in 2016. The decrease is partly explained by the fact that only 9 months of data were collected for that year.
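The effect of an incomplete final year can be checked before reading too much into a drop in yearly totals. The sketch below uses made-up monthly counts (a constant 100 per month, with the last year covering only 9 months); the `monthly` table and its column names are assumptions for illustration, not the project's data.

```python
import pandas as pd

# Hypothetical monthly counts: 12 months for 2015, only 9 months for 2016.
months = pd.date_range("2015-01-01", "2016-09-01", freq="MS")
monthly = pd.DataFrame({"month": months, "sales": 100})

# Aggregate per calendar year and record how many months were reported.
year = monthly["month"].dt.year.rename("year")
yearly = monthly.groupby(year)["sales"].agg(total="sum", months_reported="count")

# Flag incomplete years and compute a per-month average that stays comparable.
yearly["complete_year"] = yearly["months_reported"] == 12
yearly["monthly_average"] = yearly["total"] / yearly["months_reported"]
print(yearly)
```

In this toy example the yearly total falls in the final year even though the monthly rate is unchanged, which is why comparing monthly averages (or excluding the partial year) gives a fairer picture of the trend.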

P2: Gather, Clean and Analyze Twitter Data (WeRateDogs™ (@dog_rates))

This chapter was a deep dive into the data wrangling part of the data analysis process. We learned the difference between messy and dirty data, what tidy data should look like, and how to work through the assess, define, clean, and test cycle. We also covered many different file types and methods of gathering data.

In this project we once again had to deal with the reality of dirty and messy data. We gathered data from different sources (for example, the Twitter API) and identified tidiness and quality issues in the dataset. We then resolved these issues, documenting each step. The final part of the project focused on exploring the data.
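To make the distinction between quality and tidiness issues concrete, here is a small pandas sketch that follows a define, clean, test pattern. The `tweets` table is hypothetical: its column names and values are assumptions chosen for illustration and may not match the gathered data.

```python
import pandas as pd

# Hypothetical rows; column names and values are illustrative assumptions.
tweets = pd.DataFrame(
    {
        "tweet_id": [1, 2, 3],
        "rating": ["13/10", "12/10", "11/10"],
        "doggo": ["doggo", "None", "None"],
        "pupper": ["None", "pupper", "None"],
    }
)

# Define (quality): the rating is stored as text; split it into numeric columns.
ratings = tweets["rating"].str.split("/", expand=True).astype(int)
tweets["rating_numerator"] = ratings[0]
tweets["rating_denominator"] = ratings[1]

# Define (tidiness): one variable (dog stage) is spread over several columns.
stage_cols = ["doggo", "pupper"]
long = tweets.melt(id_vars="tweet_id", value_vars=stage_cols, value_name="stage")
long = long[long["stage"] != "None"][["tweet_id", "stage"]]

# Clean: keep one "stage" column and drop the columns it replaces.
tidy = tweets.drop(columns=stage_cols + ["rating"]).merge(
    long, on="tweet_id", how="left"
)

# Test: confirm the cleaning did what was defined.
assert "stage" in tidy.columns
assert tidy["rating_denominator"].eq(10).all()
print(tidy)
```

Rows without a recorded stage end up with a missing value in `stage`, which is easier to handle later than the text placeholder `"None"`.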

Screenshot from 2022-11-09 10-36-25

P3: Communicate Data Findings

The final chapter focused on effective data visualization. We learned about chart junk; univariate, bivariate, and multivariate visualization; the use of color; the data-ink ratio; the lie factor; other encodings, [...].
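The lie factor can be made concrete with a short calculation. The sketch below follows the definition commonly attributed to Edward Tufte: the size of the effect shown in the graphic divided by the size of the effect in the data, where "size of effect" is the relative change between two values. The function names and the example numbers are illustrative.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, shown_first, shown_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return effect_size(shown_first, shown_second) / effect_size(data_first, data_second)


# Illustrative example: the data doubles (10 to 20), but the bars are drawn
# 1 unit and 4 units tall, so the graphic shows a 300% increase for a 100% change.
print(lie_factor(10, 20, 1, 4))  # 3.0
```

A value close to 1 means the graphic represents the change faithfully; values well above or below 1 indicate that the visual exaggerates or understates it.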

The task of the final project was to analyze and visualize real-world data. I chose the Ford GoBike dataset.

Screenshot from 2022-11-09 10-39-44