/workshop-loss-reserv-fraud

Course material for a workshop on loss modelling, reserving and insurance fraud analytics

Workshop Analytics for Loss Modelling, Reserving and Insurance Fraud

by Katrien Antonio and Jonas Crevecoeur

Course materials for the online Loss Modelling, Reserving and Insurance Fraud Analytics course in June 2021.

📆 June 3, 10 & 17 2021
⏰ From 3pm to 6pm
📌 online via MS Teams

Course materials will be posted in the days before the workshop.

Overview

This three-half-day workshop introduces the essential concepts of building insurance loss models with R.

On the first half-day, you will gain insight into the foundations of handling loss data, including useful data wrangling and visualization steps. You will cover a variety of discrete and continuous loss distributions, as well as techniques to build more flexible distributions from standard ones (by mixing and splicing). You will learn how to fit these models to actual data and inspect their goodness-of-fit. Then, you will use the fitted model to estimate risk measures.

The second half-day focuses on reserving analytics. Starting from a granular data set with the development of individual claims, you will learn how to get useful insights from the data. Step by step, you will move from the granular data to aggregated data in a run-off triangle. You will fit the famous chain ladder method and examine its validity on the given data. Alternative modelling strategies based on recent research will be discussed.

The third half-day covers the challenges of building fraud detection models from tabular and networked data. You will gain insight into the foundations of these analytic methods, learn how to set up the model building process, and focus on building a good understanding of the resulting model output and predictions.

Leaving this workshop, you should have a firm grasp of the working principles of a variety of loss models for frequency and severity data and be able to explore their use in practical settings. Moreover, you should have acquired the fundamental insights to explore some other methods on your own.

Schedule and Course Material

The detailed schedule is subject to small changes.

| Session | Duration | Description | Lecture material |
| --- | --- | --- | --- |
| Day 0 | your own pace | Prework | sheets prework in html and in pdf |
| Day 1 | 3 - 3.15 | Prologue | sheets in html and in pdf |
|  | 3.15 - 3.50 | Data sets | sheets in html |
|  | 4 - 4.50 | Frequency models | sheets in html |
|  | 5 - 6 | Severity models | sheets in html |
| Day 2 | 3 - 3.20 | Motivation and strategies | sheets in html and pdf |
|  | 3.20 - 3.50 | Reserving data structures (part 1): daily and yearly data | sheets in html and pdf |
|  | 4 - 4.15 | Reserving data structures (part 2): individual and aggregated data | sheets in html and pdf |
|  | 4.15 - 4.50 | Claims reserving with triangles | sheets in html and pdf |
|  | 5 - 5.45 | When the chain ladder method fails | sheets in html and pdf |
|  | 5.45 - 6 | Research outlook | sheets in html and pdf |
| Day 3 | 3 - 3.30 | Motivation and fraud detection cycle | sheets in pdf |
|  | 3.30 - 3.50 | Network and tabular data |  |
|  | 4 - 4.30 | Ranking algorithm |  |
|  | 4.30 - 4.50 | Featurization |  |
|  | 5 - 5.30 | Supervised learning with imbalanced data |  |
|  | 5.30 - 5.50 | Evaluation metrics |  |
|  | 5.50 - 6 | Wrap-up |  |

Half-day 1: Loss modelling analytics

Topics include:

  • data sets used in the course: MTPL and SecuraRe losses
  • data handling and visualization tools with {ggplot2} and {dplyr}
  • building frequency models: Poisson, Negative Binomial, zero-inflated (ZI) and hurdle models, maximum likelihood estimation and goodness-of-fit
  • building severity models: simple to complex parametric distributions, splicing to construct a global fit, mixing, and estimating a risk measure.
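As an illustration of the frequency-modelling topic, the sketch below fits a Poisson and a Negative Binomial distribution by maximum likelihood and compares them. It is not taken from the lecture sheets: the claim counts are simulated with made-up parameters, and `fitdistr()` from {MASS} is just one convenient way to obtain the estimates.

```r
library(MASS)  # provides fitdistr(); {MASS} is in the workshop's package list

set.seed(2021)
# Simulated, overdispersed claim counts. The parameters are illustrative
# assumptions, not values estimated from the MTPL data.
counts <- rnbinom(5000, size = 0.5, mu = 0.5)

# Maximum likelihood fits of the two candidate frequency distributions
fit_pois <- fitdistr(counts, "Poisson")
fit_nb   <- fitdistr(counts, "negative binomial")

# Compare the fits: a lower AIC indicates a better trade-off
# between fit and number of parameters
aic <- c(poisson = AIC(fit_pois), negbin = AIC(fit_nb))
print(aic)

# Simple goodness-of-fit check: observed versus expected number of
# policies without a claim
zeros <- c(
  observed = sum(counts == 0),
  poisson  = length(counts) * dpois(0, fit_pois$estimate[["lambda"]]),
  negbin   = length(counts) * dnbinom(0, size = fit_nb$estimate[["size"]],
                                      mu = fit_nb$estimate[["mu"]])
)
print(round(zeros))
```

Because the simulated counts are overdispersed, the Poisson fit should underestimate the number of zeros, which is the kind of pattern that motivates the Negative Binomial, ZI and hurdle models.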

Download lecture sheets in html and pdf.
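For the severity side, a minimal sketch in base R of fitting a single parametric distribution and deriving risk measures from it. The lognormal is chosen here only because its maximum likelihood estimates and tail expectation have closed forms; the simulated losses and the 99% level are assumptions for illustration, and splicing or mixing is not shown.

```r
set.seed(2021)
# Simulated claim severities; the parameters are illustrative assumptions
losses <- rlnorm(5000, meanlog = 8, sdlog = 1.2)

# Maximum likelihood estimates of the lognormal parameters:
# sample mean and (biased) standard deviation of the log-losses
log_losses  <- log(losses)
meanlog_hat <- mean(log_losses)
sdlog_hat   <- sqrt(mean((log_losses - meanlog_hat)^2))

# Goodness-of-fit: Kolmogorov-Smirnov distance between the empirical
# and the fitted distribution function (smaller is better)
ks_stat <- unname(
  ks.test(losses, "plnorm", meanlog = meanlog_hat, sdlog = sdlog_hat)$statistic
)

# Risk measures implied by the fitted model
level  <- 0.99
var_99 <- qlnorm(level, meanlog = meanlog_hat, sdlog = sdlog_hat)
# Tail Value-at-Risk E[X | X > VaR], closed form for the lognormal
tvar_99 <- exp(meanlog_hat + sdlog_hat^2 / 2) *
  pnorm(sdlog_hat - qnorm(level)) / (1 - level)

print(c(meanlog = meanlog_hat, sdlog = sdlog_hat, KS = ks_stat))
print(c(VaR = var_99, TVaR = tvar_99))
```

The same pattern (fit, check, then evaluate the quantile function of the fitted model) carries over to other severity distributions, although the Tail Value-at-Risk then usually requires numerical integration.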

Half-day 2: Claims reserving analytics

Topics include:

  • data sets used in the course: individual claims, daily data
  • data handling and visualization tools with {ggplot2} and {dplyr}
  • from daily to yearly data, from individual claims to triangles
  • fitting well-known claims reserving methods
  • when the simple methods fail: what else is there?
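The step from individual claims to a run-off triangle can be sketched in base R as follows. The `payments` data frame and its column names are hypothetical, and the example assumes yearly periods with every occurrence year and development year present in the data.

```r
# Hypothetical granular data: one row per payment on an individual claim
payments <- data.frame(
  claim_id = c(1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6),
  occ_year = c(2018, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2020, 2020, 2021),
  pay_year = c(2018, 2019, 2018, 2019, 2021, 2019, 2020, 2021, 2020, 2021, 2021),
  amount   = c(100, 50, 200, 80, 30, 150, 60, 120, 90, 40, 110)
)

# Development year: 1 for payments made in the occurrence year
payments$dev_year <- payments$pay_year - payments$occ_year + 1

# Incremental triangle: total paid per occurrence year and development year
incremental <- tapply(payments$amount,
                      payments[c("occ_year", "dev_year")], sum)
incremental[is.na(incremental)] <- 0  # observed cells without payments

# Cells below the anti-diagonal lie in the future and are unobserved
future <- row(incremental) + col(incremental) > nrow(incremental) + 1
incremental[future] <- NA

# Cumulative triangle: running total along each occurrence year
cumulative <- t(apply(incremental, 1, cumsum))
print(cumulative)
```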

Download lecture sheets in html and pdf.
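To make the chain ladder idea concrete, here is a minimal base R sketch on a small, made-up cumulative triangle. It computes volume-weighted development factors, completes the triangle and derives the reserve per occurrence year. The numbers are hypothetical and the code is a plain illustration of the method, not the implementation used in the course.

```r
# Hypothetical cumulative run-off triangle (rows: occurrence years,
# columns: development years); NA marks future, unobserved cells
triangle <- matrix(c(
  1000, 1500, 1650, 1700,
  1100, 1700, 1870,   NA,
  1200, 1800,   NA,   NA,
  1300,   NA,   NA,   NA
), nrow = 4, byrow = TRUE,
dimnames = list(occ_year = 2018:2021, dev_year = 1:4))

n_dev <- ncol(triangle)

# Development factor j: ratio of column sums over the occurrence years
# for which both development year j and j + 1 are observed
dev_factors <- sapply(seq_len(n_dev - 1), function(j) {
  observed <- !is.na(triangle[, j + 1])
  sum(triangle[observed, j + 1]) / sum(triangle[observed, j])
})

# Complete the lower triangle by rolling the latest amounts forward
completed <- triangle
for (j in seq_len(n_dev - 1)) {
  to_fill <- is.na(completed[, j + 1])
  completed[to_fill, j + 1] <- completed[to_fill, j] * dev_factors[j]
}

# Reserve: projected ultimate minus the latest observed cumulative amount
latest  <- apply(triangle, 1, function(x) tail(x[!is.na(x)], 1))
reserve <- completed[, n_dev] - latest

print(round(dev_factors, 4))
print(round(reserve))
```

Plotting the individual link ratios per development year against these factors is one simple way to examine whether the chain ladder assumptions are plausible for a given triangle.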

Half-day 3: Insurance fraud analytics

Topics include:

  • insurance fraud detection strategies and numbers
  • network and tabular data
  • ranking algorithms, e.g. PageRank, BiRank
  • network featurization steps
  • supervised learning with imbalanced data, sampling methods like SMOTE
  • evaluation metrics, e.g. AUROC, AUPR, top decile lift
  • wrap up.
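The ranking idea can be illustrated with a small personalized PageRank computed by power iteration in base R. The network, the node names and the choice of `c1` as a known fraudulent claim are all hypothetical, and this sketch is plain PageRank on a one-mode network rather than BiRank.

```r
# Hypothetical undirected claim network: an edge means that two claims
# share a party (for example the same expert or garage)
edges <- matrix(c(
  "c1", "c2",
  "c1", "c3",
  "c2", "c3",
  "c3", "c4",
  "c4", "c5"
), ncol = 2, byrow = TRUE)

nodes <- sort(unique(as.vector(edges)))
adj <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
adj[edges] <- 1            # index by (from, to) node names
adj[edges[, 2:1]] <- 1     # symmetric: the network is undirected

# Column-stochastic transition matrix: each node spreads its score
# evenly over its neighbours
transition <- sweep(adj, 2, colSums(adj), "/")

# Personalization: restart the random walk at the known fraudulent claim
restart <- as.numeric(nodes == "c1")
damping <- 0.85

score <- rep(1 / length(nodes), length(nodes))
for (iter in seq_len(200)) {
  previous <- score
  score <- damping * as.vector(transition %*% previous) +
    (1 - damping) * restart
  if (sum(abs(score - previous)) < 1e-12) break
}
names(score) <- nodes

# Claims closer to the known fraud receive a higher score
print(round(sort(score, decreasing = TRUE), 3))
```

Scores like these can then serve as network features next to the tabular claim characteristics.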

Download lecture sheets in pdf.
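Two of the evaluation metrics listed above can be computed in a few lines of base R. The sketch below uses simulated labels with a 2% fraud rate and simulated model scores, both of which are assumptions for illustration. AUROC is obtained through its rank-based (Mann-Whitney) formulation.

```r
set.seed(2021)
n <- 10000
# Simulated, highly imbalanced labels (about 2% fraud)
is_fraud <- rbinom(n, 1, 0.02)
# Hypothetical model scores: fraudulent claims tend to score higher
score <- rnorm(n, mean = ifelse(is_fraud == 1, 1.5, 0))

# AUROC: probability that a random fraud case scores higher than
# a random non-fraud case, computed from the ranks of the scores
ranks <- rank(score)
n_pos <- sum(is_fraud == 1)
n_neg <- n - n_pos
auroc <- (sum(ranks[is_fraud == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)

# Top decile lift: fraud rate among the 10% highest scores,
# relative to the overall fraud rate
top_decile <- order(score, decreasing = TRUE)[seq_len(ceiling(0.1 * n))]
lift <- mean(is_fraud[top_decile]) / mean(is_fraud)

print(c(AUROC = auroc, top_decile_lift = lift))
```

With so few positives, a metric that concentrates on the highest-ranked cases, such as top decile lift, tends to say more about the practical use of the model than overall accuracy does.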

Prework

The workshop requires a basic understanding of R. We prepared some prework sheets that you can take a look at before the workshop (html or pdf).

Being familiar with statistical or machine learning methods is not required. The workshop gradually builds up these concepts, with an emphasis on hands-on demonstrations and exercises.

Software Requirements

There are two ways to take part in the coding exercises covered during the workshop. Either you join the RStudio Cloud workspace dedicated to the workshop and run R in the cloud from your browser, or you use your local installation of R and RStudio.

We kindly ask participants to use RStudio Cloud by default!

RStudio Cloud - default!

You will join our workspace on RStudio Cloud. This offers a very accessible set-up for working with R in the cloud, also for less experienced users!

https://rstudio.cloud/spaces/147546/join?access_code=YgLiHaQpPvZdcbh943QMO%2FqtF%2FKTct855m5tYtmH

Here are the steps you should take (before the workshop):

  • visit the above link
  • log in by creating an account for RStudio Cloud or by using your Google or GitHub login credentials
  • join the space
  • at the top of your screen you see ‘Projects’, click ‘Projects’
  • with the ‘copy’ button (on the right) you can make your own version of the ‘day 1 on loss modelling’ project; in this copy you can work on the exercises, add comments etc.
  • you should now be able to visit the project and see the ‘scripts’ and ‘data’ folders on the right. Open and run the ‘installation-instructions.R’ script from the scripts folder, to see if everything works fine.

We will have everything set up for you in the correct way. You only have to log in!

Local installation - optional (make sure you have access to the RStudio Cloud as sketched above)!

Alternatively, you can use a laptop with a recent version of R and RStudio installed. Make sure your laptop can connect to the internet (or download the course material one day before the start of the workshop).

In the prework folder you will find a step-by-step guide to installing R and RStudio (though it is a bit outdated).

Please run the following script in your R session to install the required packages:

# Packages used in the workshop
packages <- c("tidyverse", "here", "gridExtra", "caret", "rsample", "recipes",
              "grid", "rstudioapi", "MASS", "actuar", "statmod", "ReIns",
              "pscl", "zoo", "igraph", "expm", "pROC", "ChainLadder",
              "lubridate")

# Install only the packages that are not available yet
new_packages <- setdiff(packages, rownames(installed.packages()))
if (length(new_packages) > 0) install.packages(new_packages)

# Verify that every required package is now installed
missing_packages <- setdiff(packages, rownames(installed.packages()))
if (length(missing_packages) > 0) {
  stop("The following required packages are not installed:\n",
       paste(missing_packages, collapse = ", "))
} else {
  message("Everything is set up correctly. You are ready to go.")
}

Instructors

Katrien Antonio is a professor in insurance data science at KU Leuven and an associate professor at the University of Amsterdam. She teaches courses on data science for insurance, life and non-life insurance mathematics, and loss models. Her research focuses on pricing, reserving and fraud analytics, as well as mortality dynamics.

Jonas Crevecoeur is a post-doctoral researcher in biostatistics at KU Leuven. He recently obtained his PhD within the insurance research group at KU Leuven and holds the degrees of MSc in Mathematics, MSc in Insurance Studies and MSc in Financial and Actuarial Engineering (KU Leuven). Before starting the PhD program, he worked as an intern with QBE Re (Belgium office), where he studied multiline products and copulas. Jonas was a PhD fellow of the Research Foundation - Flanders (FWO, PhD fellowship fundamental research).

Happy learning!