scpo-data-science-bootcamp

Data science bootcamp at Sciences Po

Primary language: Jupyter Notebook. License: MIT.

Day 1

Introduction (20 min)

  • lead : Sylvain ;
  • format : Xaringan ;
  • presentation : link

Practical session

Colab Notebook Python basics

Source : https://github.com/moreymat/scpo-data-science-bootcamp/blob/main/notebooks/1_python.ipynb

Sequence 1.1

Goals :

  • use a notebook
  • declare a variable
  • execute a statement
  • print a message
  • wrap-up
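
As a rough illustration of the goals above, here is a minimal sketch of what a first notebook cell could look like. The variable names and the message are made up for this example and are not taken from the course notebook.

```python
# Declare variables: each name on the left refers to the value on the right.
# (Hypothetical names, chosen for this example only.)
bootcamp = "Sciences Po"
n_days = 3

# Execute a statement: build a new value from the existing variables.
message = f"Welcome to the {n_days}-day data science bootcamp at {bootcamp}!"

# Print a message: in a notebook, the output appears right below the cell.
print(message)
```

In a notebook, each of these steps can also live in its own cell, which makes it easy to rerun one step at a time.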

Sequence 1.2

Goals :

  • import a package (NumPy)
  • manipulate single numbers
  • create a one-dimensional data structure
  • select elements from a one-dimensional data structure
  • perform operations on elements of a one-dimensional data structure
  • create a two-dimensional data structure
  • select elements from a two-dimensional data structure
  • perform operations on elements of a two-dimensional data structure
  • wrap-up
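
The goals of this sequence can be sketched in a few lines of NumPy. This is only an illustrative example with made-up values, not the content of the course notebook.

```python
import numpy as np

# Manipulate single numbers (scalars).
x = np.float64(2.5)
y = x * 2

# Create a one-dimensional data structure (a vector).
v = np.array([1, 2, 3, 4])

# Select elements: indexing starts at 0, and slices exclude the end index.
first = v[0]
middle = v[1:3]

# Perform operations: arithmetic is applied element by element.
doubled = v * 2
total = v.sum()

# Create a two-dimensional data structure (a matrix with 2 rows, 3 columns).
m = np.array([[1, 2, 3],
              [4, 5, 6]])

# Select elements: m[row, column]; a colon selects a whole axis.
element = m[1, 2]
first_column = m[:, 0]

# Perform operations: element-wise arithmetic, or reductions along an axis.
shifted = m + 10
column_sums = m.sum(axis=0)

print(v.shape, m.shape)
```

The same selection and arithmetic rules apply in one and two dimensions, so the two-dimensional case mostly adds a second index.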

Day 2

Introduction

  • lead : Sylvain ;
  • format : Xaringan ;
  • presentation : link

Program : ()

Practical session

Colab Notebook Tabular data analysis 1 : Loading Open Food Facts data with pandas

Source : https://github.com/moreymat/scpo-data-science-bootcamp/blob/main/notebooks/2_pandas.ipynb

Sequence 2.1

  • load the data from a tabular format
  • store data in a variable
  • print data from the variable
  • identify the dimensions of a dataset
  • identify variable names
  • select elements from a dataset
  • subset a dataset
  • write an object
  • wrap-up
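
The steps above might look roughly like this in pandas. To keep the example self-contained, it reads a tiny in-memory table instead of a real file. The product names, brands and values are invented, and the column names are assumptions that may not match the data used in the notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file on disk (invented products and values).
csv_text = """product_name,brands,energy_100g,sugars_100g
Plain yogurt,BrandA,250,4.5
Chocolate bar,BrandB,2200,48.0
Orange juice,BrandA,190,9.0
"""

# Load the data from a tabular format and store it in a variable.
df = pd.read_csv(io.StringIO(csv_text))

# Print data from the variable.
print(df)

# Identify the dimensions of the dataset: (number of rows, number of columns).
n_rows, n_cols = df.shape

# Identify variable names (the column labels).
column_names = list(df.columns)

# Select elements: one column by name, one row by position.
sugars = df["sugars_100g"]
first_row = df.iloc[0]

# Subset the dataset: rows labelled 0 to 1 (inclusive with .loc), two columns.
subset = df.loc[0:1, ["product_name", "sugars_100g"]]

# Write an object: here to an in-memory buffer; a file path would work too.
buffer = io.StringIO()
subset.to_csv(buffer, index=False)
```

With a real file, `pd.read_csv` takes a path or URL in place of the `io.StringIO` object, and its `sep` parameter handles tab-separated files.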

Sequence 2.2

  • load the OFF dataset
  • filter the data by a specific variable
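
Filtering by a specific variable is usually done with a boolean mask. The sketch below uses a small invented table rather than the actual OFF dataset, and its column names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical miniature of a product table (invented values).
off = pd.DataFrame({
    "product_name": ["Plain yogurt", "Chocolate bar", "Orange juice", "Mystery snack"],
    "brands": ["BrandA", "BrandB", "BrandA", None],
    "sugars_100g": [4.5, 48.0, 9.0, 30.0],
})

# A comparison on a column yields one True/False value per row (a mask);
# indexing the DataFrame with the mask keeps only the rows marked True.
brand_a = off[off["brands"] == "BrandA"]

# The same idea works for numeric variables.
sugary = off[off["sugars_100g"] > 20]

# Conditions can be combined with & (and) or | (or); parentheses are required.
brand_a_low_sugar = off[(off["brands"] == "BrandA") & (off["sugars_100g"] < 5)]

print(brand_a)
```

Rows with a missing brand never match the equality test, so they are dropped from `brand_a` without raising an error.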

Day 3

Introduction

  • lead : Sylvain ;
  • format : Xaringan ;
  • presentation : link

Program : (brands are fighting over how the nutrition score is calculated, so let's compute it ourselves)

Practical session

Colab Notebook Data visualization

Source : https://github.com/moreymat/scpo-data-science-bootcamp/blob/main/notebooks/3_dataviz.ipynb

Sequence 3.1 (make a function)

Sequence 3.2 (apply)
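
To give an idea of what "make a function" and "apply" can mean here, the sketch below defines a scoring function and applies it to each row of a small table. The formula is a toy invented for this example: it is not the official Nutri-Score and not the method used in the notebook. The data and column names are assumptions as well.

```python
import pandas as pd


def toy_score(row):
    """Hypothetical score, NOT the official Nutri-Score.

    Sugar and salt raise the score, fiber lowers it;
    the weights are arbitrary and chosen for illustration only.
    """
    return row["sugars_100g"] + 10 * row["salt_100g"] - 2 * row["fiber_100g"]


# Invented products and values.
products = pd.DataFrame({
    "product_name": ["Plain yogurt", "Chocolate bar", "Orange juice"],
    "sugars_100g": [4.5, 48.0, 9.0],
    "salt_100g": [0.1, 0.2, 0.0],
    "fiber_100g": [0.0, 3.0, 0.5],
})

# Sequence 3.1 (make a function): toy_score takes one row, returns one number.
# Sequence 3.2 (apply): axis=1 calls the function once per row.
products["toy_score"] = products.apply(toy_score, axis=1)

# Sort so that the products with the lowest toy score come first.
ranking = products.sort_values("toy_score")
print(ranking[["product_name", "toy_score"]])
```

Keeping the scoring rule in a single function makes it easy to change the weights and rerun the comparison.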

Resources

You should be able to access the Open Food Facts (OFF) data files from the notebook, but just in case, here are the direct links to the Google Drive (access restricted to Sciences Po) :

The first notebook is adapted from https://colab.research.google.com/github/data-psl/lectures2020/blob/master/notebooks/01_python_basics.ipynb