/datascience_101_tunapanda

Introduction to Python, Jupyter, and the scientific Python stack for the Tunapanda training course.

Class 1: (Sat 26th Jan)

  • Why learn Python for data analysis?
  • How to install Python?
  • How to install libraries?
  • Basic programming with Python (data structures, iterations, etc.)
  • How to work with Jupyter notebooks?
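
As a rough illustration of the "data structures, iterations" topic above, the sketch below touches the main built-in containers and two ways of looping. All names and values are made up for the example and are not taken from the course material.

```python
# Core built-in data structures (the sample values are made up for illustration).
scores = [72, 85, 90, 64]                 # list: ordered and mutable
point = (3, 4)                            # tuple: ordered and immutable
ages = {"amina": 31, "brian": 27}         # dict: maps keys to values
tags = {"python", "pandas", "python"}     # set: duplicates are dropped

# Iterate over a list with a for loop.
total = 0
for score in scores:
    total += score
average = total / len(scores)

# A list comprehension builds a new list from an existing one.
passed = [score for score in scores if score >= 70]

# Iterate over the key/value pairs of a dict.
for name, age in ages.items():
    print(f"{name} is {age} years old")

print(average, passed, sorted(tags))
```

The same code can be pasted into a Jupyter notebook cell or typed line by line into the Python console.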

Exercises:

  • Create a virtualenv on your machine.
  • Run Python code in the console.
  • Create your first IPython notebook.

Class 2, Class 3, Class 4: (Sat 2nd Feb, Sat 23rd Feb, Sat 9th March)

  • Introduction to NumPy & pandas.
  • NumPy arrays and operations.
  • Introduction to Series and DataFrames, and their operations.
  • Reading and writing data with pandas.
  • Working with incomplete data.
  • Merge/Join/Concat.
  • SQL databases and pandas.
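
A minimal sketch of three of the topics above (NumPy array operations, DataFrames, and merge/concat) might look like the following. The tables and column names are hypothetical and chosen only to show how the calls behave.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic is applied element by element, without explicit loops.
heights_cm = np.array([150.0, 160.0, 170.0])
heights_m = heights_cm / 100
mean_height = heights_m.mean()

# pandas: a DataFrame is a table whose columns are Series.
students = pd.DataFrame({"student_id": [1, 2, 3], "name": ["Amina", "Brian", "Chao"]})
grades = pd.DataFrame({"student_id": [1, 2, 4], "grade": [88, 92, 75]})

# Merge/join: combine columns from two tables on a shared key.
inner = students.merge(grades, on="student_id", how="inner")  # only matching ids
left = students.merge(grades, on="student_id", how="left")    # keep every student

# Concat: stack tables with the same columns on top of each other.
more_students = pd.DataFrame({"student_id": [5], "name": ["Dalia"]})
all_students = pd.concat([students, more_students], ignore_index=True)

print(inner)
print(left)
print(all_students)
```

In the left merge, the student without a grade keeps a row with a missing value (NaN) in the grade column, which connects to the "incomplete data" topic.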

Exercises:

  • Open a CSV file with pandas.
  • Remove incomplete values.
  • Save data to SQL database.
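
One possible way to approach these three exercises is sketched below. It assumes SQLite as the SQL database (the outline does not name one) and writes its own tiny CSV file so that it runs on its own; the file names, table name, and data are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

# Write a tiny CSV so the example is self-contained (the data is made up).
workdir = Path(tempfile.mkdtemp())
csv_path = workdir / "people.csv"
csv_path.write_text("name,age,city\nAmina,31,Nairobi\nBrian,,Kisumu\nChao,27,\n")

# Open the CSV file with pandas; empty fields are read as NaN.
df = pd.read_csv(csv_path)

# Remove every row that has at least one missing value.
clean = df.dropna()

# Save the cleaned table to a SQLite database and read it back to check.
conn = sqlite3.connect(workdir / "people.db")
clean.to_sql("people", conn, if_exists="replace", index=False)
stored = pd.read_sql_query("SELECT * FROM people", conn)
conn.close()

print(stored)
```

`dropna()` is the bluntest option; depending on the data, filling missing values with `fillna()` may lose less information.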

Class 5: (Sat 30th March)

  • Introduction to Visualization.
  • Libraries.
  • Different types of plots (line, scatter, bar, histograms).
  • Interactive plotting libraries.
  • Save figures.

Exercises:

  • Load data with pandas. Choose two values and plot them.
  • Count the values for each target and visualize the counts.
  • Save a figure as a PNG.
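
The exercises above could be sketched as follows, assuming Matplotlib as the plotting library (the outline does not say which one the class uses). The DataFrame is made up and stands in for the course dataset, and the `Agg` backend is selected only so the script also runs without a display; in a notebook that line can be dropped.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for the course dataset.
df = pd.DataFrame({
    "height": [150, 160, 165, 172, 180, 185],
    "weight": [50, 58, 63, 70, 80, 88],
    "target": ["a", "a", "b", "b", "b", "c"],
})

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Choose two columns and plot one against the other.
ax_scatter.scatter(df["height"], df["weight"])
ax_scatter.set_xlabel("height")
ax_scatter.set_ylabel("weight")

# Count the rows for each target value and show the counts as bars.
counts = df["target"].value_counts().sort_index()
ax_bar.bar(counts.index, counts.values)
ax_bar.set_xlabel("target")
ax_bar.set_ylabel("count")

# Save the figure as a PNG file.
png_path = Path(tempfile.mkdtemp()) / "figure.png"
fig.savefig(png_path, dpi=150)
plt.close(fig)
```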

Class 6, Class 7: (Sat 30th March, Sat 6th April)

  • Introduction to Scikit-learn.
  • Standardization and normalization of data.
  • Dimension reduction methods.
  • What is a predictive model? How can we measure the performance?
  • Create, train, and evaluate an SVM.

Exercises:

  • Load a dataset and scale its values to the range [0, 1].
  • Create a baseline model and an SVM for classification with the dataset provided.
  • Show the metrics of the model.
  • Save the model with pickle.
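
A rough end-to-end sketch of these exercises with scikit-learn is shown below. The dataset provided in class is not named in this outline, so the bundled Iris dataset is used as a stand-in; the choice of `DummyClassifier` as the baseline, the default `SVC` settings, and the output file name are likewise assumptions.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Iris is a stand-in; the course dataset is not named in this outline.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Scale every feature to [0, 1]; fit on the training split only to avoid leakage.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Baseline: always predict the most frequent class seen in training.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train_scaled, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test_scaled))

# SVM classifier with scikit-learn's default RBF kernel.
svm = SVC().fit(X_train_scaled, y_train)
svm_pred = svm.predict(X_test_scaled)
svm_acc = accuracy_score(y_test, svm_pred)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"SVM accuracy: {svm_acc:.2f}")
print(classification_report(y_test, svm_pred))

# Save the scaler and model together with pickle, then load them back.
model_path = Path(tempfile.mkdtemp()) / "svm_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump({"scaler": scaler, "model": svm}, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Because the scaler is fitted on the training split only, scaled test values can fall slightly outside [0, 1]. Pickling the scaler together with the model keeps the same preprocessing available when the model is loaded later.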

Class 8: (Sat 27th April)

  • Interesting resources, projects and other libraries.
  • Deep learning techniques and libraries.
  • Simple pipeline overview with Fashion MNIST (from loading the data to creating the model and visualizing it).