mwalughabura/datascience_101_tunapanda

Introduction to Python, Jupyter, and the scientific Python stack for the Tunapanda training course.

datascience_101_tunapanda

Class 1: (Sat 26th Jan)

Why learn Python for data analysis?
How to install Python?
How to install libraries?
Basic programming with Python (data structures, iterations, etc.)
How to work with Jupyter notebooks?

Exercises:

Create a virtualenv on your machine.
Run Python code in the console.
Create your first IPython notebook.
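
The "Basic programming with Python" topic above can be tried directly in the console or in a first notebook cell. The following is a minimal sketch of the core data structures and iteration patterns; all variable names, values, and the helper function `names_older_than` are illustrative assumptions, not course material.

```python
# Core built-in data structures (all values here are made up for illustration).
scores = [72, 85, 90]                          # list: ordered and mutable
point = (3, 4)                                 # tuple: ordered and immutable
ages = {"amina": 31, "brian": 27}              # dict: maps keys to values
unique_tags = {"python", "pandas", "python"}   # set: duplicates collapse

# Iterate over a list with a for loop to accumulate a total.
total = 0
for score in scores:
    total += score
average = total / len(scores)

# A list comprehension builds a new list in a single expression.
squares = [n * n for n in range(5)]


def names_older_than(people, limit):
    """Return the sorted names whose age is strictly greater than limit."""
    return sorted(name for name, age in people.items() if age > limit)


older = names_older_than(ages, 30)
```

Pasting the block into a notebook cell and then evaluating `average`, `squares`, or `older` in a following cell is a quick way to inspect each result.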

Class 2, Class 3, Class 4: (Sat 2nd Feb, Sat 23rd Feb, Sat 9th March)

Introduction to NumPy and pandas.
NumPy arrays and operations.
Introduction to Series and DataFrames, and their operations.
Reading and writing data with pandas.
Working with incomplete data.
Merge/Join/Concat.
SQL databases and pandas.
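
Several of the topics above (NumPy array operations, DataFrames, incomplete data, and merging) can be seen together in a few lines. This is a hedged sketch, not the course notebook: the `students` and `grades` tables and their column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# NumPy arrays support element-wise arithmetic and fast aggregations.
heights_cm = np.array([150.0, 160.0, 170.0])
heights_m = heights_cm / 100        # divides every element by 100
mean_height = heights_cm.mean()

# Two small, made-up tables that share a key column.
students = pd.DataFrame(
    {"student_id": [1, 2, 3], "name": ["Amina", "Brian", "Chao"]}
)
grades = pd.DataFrame({"student_id": [1, 2], "grade": [88.0, 92.0]})

# A left merge keeps every student; students without a grade get NaN.
merged = students.merge(grades, on="student_id", how="left")

# Working with incomplete data: count the gaps, then fill them with the mean.
missing_count = int(merged["grade"].isna().sum())
filled = merged["grade"].fillna(merged["grade"].mean())
```

Filling with the column mean is only one possible strategy; dropping the incomplete rows with `dropna()` is an equally valid choice depending on the analysis.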

Exercises:

Open a CSV file with pandas.
Remove rows with incomplete values.
Save the data to a SQL database.
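
The three exercises above can be chained into one short script. The sketch below makes several assumptions: the CSV content is inlined as a string so the example is self-contained (a real file path could be passed to `pd.read_csv` instead), the table name `people` is hypothetical, and an in-memory SQLite database stands in for whatever SQL database the course uses.

```python
import io
import sqlite3

import pandas as pd

# Inline CSV text standing in for a file on disk (assumed data).
csv_text = "name,age,city\nAmina,31,Nairobi\nBrian,,Mombasa\nChao,27,\n"

# 1. Open the CSV with pandas; empty fields are read as NaN.
df = pd.read_csv(io.StringIO(csv_text))

# 2. Remove every row that contains at least one missing value.
clean = df.dropna()

# 3. Save the cleaned data to a SQL table, then read it back to check.
conn = sqlite3.connect(":memory:")
clean.to_sql("people", conn, index=False, if_exists="replace")
loaded = pd.read_sql("SELECT * FROM people", conn)
conn.close()
```

Passing a file name instead of `":memory:"` to `sqlite3.connect` would persist the table to disk.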

Class 5: (Sat 30th March)

Introduction to Visualization.
Plotting libraries.
Different types of plots (line, scatter, bar, histogram).
Interactive plotting libraries.
Save figures.

Exercises:

Load data with pandas. Choose two columns and plot them against each other.
Count the values for each target and visualize the counts.
Save a PNG figure.
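
One possible way to approach the exercises above is sketched below with Matplotlib, which is an assumption since the outline does not name a plotting library. The DataFrame, its column names (`height`, `weight`, `target`), and the output file name are all invented for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for a dataset loaded with pd.read_csv.
df = pd.DataFrame({
    "height": [150, 160, 165, 172, 180, 158],
    "weight": [50, 58, 63, 70, 82, 55],
    "target": ["a", "b", "a", "b", "b", "a"],
})

# Count how many rows belong to each target value.
counts = df["target"].value_counts().sort_index()

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Scatter plot of two chosen columns.
ax_scatter.scatter(df["height"], df["weight"])
ax_scatter.set_xlabel("height")
ax_scatter.set_ylabel("weight")

# Bar plot of the per-target counts.
ax_bar.bar(counts.index, counts.values)
ax_bar.set_ylabel("count")
fig.tight_layout()

# Save the figure as a PNG in the system temporary directory.
png_path = os.path.join(tempfile.gettempdir(), "example_plot.png")
fig.savefig(png_path, dpi=100)
plt.close(fig)
```

Inside a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped so the figure renders inline.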

Class 6, Class 7: (Sat 30th March, Sat 6th April)

Introduction to Scikit-learn.
Standardization and normalization of data.
Dimension reduction methods.
What is a predictive model? How can we measure the performance?
Create, train and evaluate SVM.

Exercises:

Load a dataset and scale its values to the range [0, 1].
Create a baseline model and an SVM classifier with the dataset provided.
Show the metrics of the model.
Save the model to a pickle file.
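
The exercises above could be approached along the following lines. This is a sketch under stated assumptions: the bundled iris dataset stands in for the dataset provided in class, the baseline is a most-frequent-class `DummyClassifier`, and the split ratio and RBF kernel are arbitrary choices rather than course requirements.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Load a small bundled dataset (a stand-in for the course dataset).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: always predict the most frequent class seen in training.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

# MinMaxScaler maps each training feature to [0, 1]; the SVM trains on the result.
svm_model = make_pipeline(MinMaxScaler(), SVC(kernel="rbf"))
svm_model.fit(X_train, y_train)
svm_pred = svm_model.predict(X_test)

# Metrics: overall accuracy plus per-class precision, recall, and F1.
svm_acc = accuracy_score(y_test, svm_pred)
report = classification_report(y_test, svm_pred)

# Serialize the fitted pipeline with pickle and restore it again.
model_bytes = pickle.dumps(svm_model)
restored = pickle.loads(model_bytes)
```

Because the scaler is fitted on the training split only, test values can fall slightly outside [0, 1]; this keeps information from the test set out of training. To write the model to disk rather than to bytes, `pickle.dump(svm_model, f)` with a file opened in `"wb"` mode does the same job.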

Class 8: (Sat 27th April)

Interesting resources, projects and other libraries.
Deep learning techniques and libraries.
Simple pipeline overview with Fashion MNIST (from loading the data to creating the model and visualizing it).