This lecture is focused on the following concepts:
- Introduction to the Python programming language;
- Data wrangling using Pandas;
- Applied mathematics using NumPy;
- Understand linear models;
- Understand tree-based algorithms;
- Evaluate a machine-learning model;
- Manage mixed data types in a machine-learning pipeline;
- Fine-tune a model by hyper-parameter search.
Some intro slides: http://ogrisel.github.io/decks/2017_intro_sklearn
If you run into any issues, you can click on the binder link below, which will set up an online machine for you:
Alternatively, you can create a new conda environment, which will be called dsc by default and will contain all the packages required to run the notebooks:
conda env create -f environment.yml
conda activate dsc
cd path/to/datascience_started_course
jupyter notebook
You can also update an existing conda environment:
conda env update -f environment.yml
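Once the environment is created or updated and activated, a quick sanity check can confirm that the main libraries are importable before opening the notebooks. The sketch below is only an illustration: the package list is an assumption inferred from the topics covered in this lecture (NumPy, Pandas, and scikit-learn for the machine-learning parts), and the actual environment.yml may pin different or additional packages. The helper name missing_packages is hypothetical.

```python
import importlib.util

# Assumed package list, inferred from the lecture topics; the real
# environment.yml may contain different or additional packages.
ASSUMED_PACKAGES = ["numpy", "pandas", "sklearn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

If the script reports missing packages, re-running the conda env update command above is a reasonable first step.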
This material is inspired by, and reuses parts of, the following materials: