
Analytics labs notebooks for Statistics and Business School students

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

AdvancedAnalyticsLabs

Analytics labs notebooks, supporting analytics teaching for BSc and MSc courses. I've taught these at a business school and a statistics department, so I think they fit both reasonably well. Currently, 16 labs are uploaded, divided into five topics:

Intro to Python

  1. Introduction to Python: First few steps. A simple intro for people who may already be familiar with other languages; not meant for people with no programming experience!

  2. Functions and Revenue Management: Implementation of simple algorithms (Littlewood, EMSR-a and EMSR-b). Covers function creation and an introduction to PyPlot. Taught until 2019 at Southampton University as part of the Advanced Analytics course.
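
To give a flavour of the revenue management lab, Littlewood's rule for two fare classes can be sketched in a few lines. This is a minimal illustration rather than the notebook's code: it assumes normally distributed high-fare demand, and the function name, fares and capacity below are hypothetical.

```python
from statistics import NormalDist


def littlewood_protection_level(fare_high, fare_low, mean_high, sd_high):
    """Seats to protect for the high fare under Littlewood's rule.

    Assumption: high-fare demand is Normal(mean_high, sd_high).
    The rule protects y seats such that P(D_high > y) = fare_low / fare_high.
    """
    if not 0 < fare_low < fare_high:
        raise ValueError("expected 0 < fare_low < fare_high")
    critical_ratio = 1.0 - fare_low / fare_high
    return NormalDist(mean_high, sd_high).inv_cdf(critical_ratio)


# Hypothetical numbers: the low fare is half the high fare, so the
# protection level equals the median (here, the mean) of high-fare demand.
capacity = 120
protection = littlewood_protection_level(200.0, 100.0, 50.0, 10.0)
booking_limit = max(0.0, capacity - protection)  # seats open to the low fare
print(round(protection, 2), round(booking_limit, 2))
```

Lowering the low fare relative to the high fare raises the protection level, which is the behaviour the rule is meant to capture.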

Banking Regulation

  1. Basel Capital Requirements: Covers lambda functions and an introduction to Pandas in the context of the Basel capital requirements formulas.

  2. Bond Pricing: Teaches bond pricing, yields and clean/dirty prices. Taught from 2019 at Western University, as part of the Banking Analytics course I created. Replaces the Revenue Management lab above, and also covers function creation and an introduction to PyPlot.
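
The clean/dirty price distinction in the bond pricing lab can be illustrated with a short sketch. It is not taken from the notebook: it assumes annual coupons, annual compounding and a simple proportional accrual, and the function and parameter names are hypothetical.

```python
def dirty_price(face, coupon_rate, ytm, periods_remaining, elapsed_fraction=0.0):
    """Present value of all remaining cash flows (the 'dirty' price).

    Assumptions: one coupon per period, per-period compounding at `ytm`,
    and `elapsed_fraction` in [0, 1) is the share of the current coupon
    period that has already passed.
    """
    coupon = face * coupon_rate
    price = 0.0
    for k in range(1, periods_remaining + 1):
        # The final cash flow also repays the face value.
        cash_flow = coupon + (face if k == periods_remaining else 0.0)
        price += cash_flow / (1.0 + ytm) ** (k - elapsed_fraction)
    return price


def clean_price(face, coupon_rate, ytm, periods_remaining, elapsed_fraction=0.0):
    """Dirty price minus accrued interest (proportional accrual assumed)."""
    accrued = face * coupon_rate * elapsed_fraction
    dirty = dirty_price(face, coupon_rate, ytm, periods_remaining, elapsed_fraction)
    return dirty - accrued


# On a coupon date, a bond whose coupon rate equals its yield prices at par.
at_par = dirty_price(100.0, 0.05, 0.05, 10)
# Halfway through the coupon period the dirty price includes accrued interest.
mid_dirty = dirty_price(100.0, 0.05, 0.05, 10, elapsed_fraction=0.5)
mid_clean = clean_price(100.0, 0.05, 0.05, 10, elapsed_fraction=0.5)
print(round(at_par, 4), round(mid_dirty, 4), round(mid_clean, 4))
```

Real markets use day-count conventions and often semi-annual coupons, so the accrual here is only a simplification.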

Credit Scoring

  1. Data Preprocessing: Simple data preprocessing using pandas and scikit-learn.

  2. Weight of evidence transformation: How to calculate Weight of Evidence transformations in Python. Uses the excellent scorecardpy package by @ShichenXie.

  3. Logistic Regression and Scorecards: Intro to scikit-learn, how to run Lasso and Ridge regressions, and how to calculate a scorecard.

  4. Random Forest and XGBoosting: How to run a Random Forest, an XGBoost model, tune parameters over a grid, and compare ROC curves.

  5. PD / LGD Calibration: How to define ratings by segmenting the ROC curve and calibrate a long-run PD / downturn LGD adjusted by macroeconomic factors.
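
As a rough idea of what the weight of evidence lab computes, the sketch below derives WoE per bin and the information value using only the standard library. It does not use or reproduce the scorecardpy API. The sign convention (log of the share of goods over the share of bads) is an assumption, since conventions differ between packages, and the bin labels and data are made up.

```python
import math
from collections import Counter


def weight_of_evidence(bins, defaults):
    """Return ({bin: WoE}, information value) for a binned variable.

    Assumed convention: WoE = ln(share of goods / share of bads),
    where defaults[i] == 1 marks a 'bad' (defaulted) observation.
    """
    goods, bads = Counter(), Counter()
    for label, is_bad in zip(bins, defaults):
        (bads if is_bad else goods)[label] += 1
    total_good, total_bad = sum(goods.values()), sum(bads.values())

    woe, information_value = {}, 0.0
    for label in sorted(set(bins)):
        share_good = goods[label] / total_good
        share_bad = bads[label] / total_bad
        if share_good == 0 or share_bad == 0:
            # A bin with no goods or no bads has undefined WoE; merge bins first.
            raise ValueError(f"bin {label!r} has no goods or no bads")
        woe[label] = math.log(share_good / share_bad)
        information_value += (share_good - share_bad) * woe[label]
    return woe, information_value


# Made-up data: the 'low' income bin holds most of the defaults.
income_bin = ["low"] * 4 + ["high"] * 4
defaulted = [1, 1, 1, 0, 0, 0, 0, 1]
woe, iv = weight_of_evidence(income_bin, defaulted)
print(woe, iv)
```

In practice the binning itself is the hard part; here the bins are taken as given.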

Deep Learning

  1. Introduction to Keras and Shallow ANN: Gentle introduction to Keras and TensorFlow. Updated to TensorFlow 2.0.

  2. Embeddings: How to calculate embedding layers, and how to use pre-trained embeddings. Currently uses Facebook's fastText library. Updated to TensorFlow 2.0.

  3. 1D CNN and Keras' Model API: Intro to CNN, and how to use Keras' Model API. Also contains an implementation of the Kim (2014) CNN for text analytics. Updated to TensorFlow 2.0.

  4. 2D CNN and Gradient Backtracing: 2D Convolutions for image classification. Use of pre-trained models (VGG16), and gradient backtracing to visualize what is being used to discriminate.

  5. Multimodal Learning: Example shown at the 2019 Machine Learning Bootcamp at the University of Toronto (video recording to follow). Mixed content from all other labs applied to sentiment analysis. Shows how to use categorical embeddings, text embeddings, and traditional structured data to improve evaluations.
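
For readers who have not met 1D convolutions before, the sketch below shows in plain Python what a single convolutional filter followed by a ReLU and global max pooling computes. It is deliberately not Keras code and not the notebooks' implementation. It simplifies by using a scalar sequence, whereas the text labs slide filters over embedding vectors, and all names and values are hypothetical.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide `kernel` over `sequence` without padding ('valid' mode).

    As in most deep learning libraries, this is a cross-correlation:
    the kernel is not flipped.
    """
    width = len(kernel)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the kernel")
    return [
        sum(sequence[i + j] * kernel[j] for j in range(width)) + bias
        for i in range(len(sequence) - width + 1)
    ]


def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Keep only the strongest activation along the sequence."""
    return max(values)


# A toy filter that responds to increases two steps apart.
signal = [0.0, 1.0, 0.0, 3.0, 1.0]
rise_filter = [-1.0, 0.0, 1.0]
feature_map = relu(conv1d_valid(signal, rise_filter))
pooled = global_max_pool(feature_map)
print(feature_map, pooled)
```

In a trained network the filter weights are learned, and many filters of several widths run in parallel before pooling.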

Data Management and a Primer on Visualization

  1. SQL Refresher: Refresher on SQL, how to access it from Python, and a very light introduction to SQLAlchemy.

  2. Primer on Visualization: A few plots using pyplot, seaborn and plotly. Very introductory primer.
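
As a taste of the SQL refresher, the following sketch queries a database from Python using the standard library's sqlite3 module. It is an illustration under assumptions: the notebook may use a different database and driver, and the `loans` table and its contents are invented for the example.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (id INTEGER PRIMARY KEY, segment TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO loans (segment, amount) VALUES (?, ?)",
    [("retail", 1000.0), ("retail", 2500.0), ("corporate", 40000.0)],
)
conn.commit()

# Placeholders (?) pass values safely instead of formatting them into the SQL.
rows = conn.execute(
    """
    SELECT segment, COUNT(*) AS n_loans, SUM(amount) AS total_amount
    FROM loans
    WHERE amount >= ?
    GROUP BY segment
    ORDER BY segment
    """,
    (500.0,),
).fetchall()
conn.close()

for segment, n_loans, total_amount in rows:
    print(segment, n_loans, total_amount)
```

SQLAlchemy wraps the same idea behind an engine and connection objects, which makes it easier to switch database backends later.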

These labs are available under the GPL v3, so feel free to use them as you wish. I'd be grateful if you could link back to this GitHub repository, as I'll keep the notebooks updated in subsequent iterations of the modules where I teach them. As always, these notebooks are provided with no guarantees.

Comments are welcome!