Machine Learning for Credit Card Fraud Detection - Practical Handbook

Early access

Preliminary version available at https://fraud-detection-handbook.github.io/fraud-detection-handbook/Foreword.html.

Motivations

Machine learning for credit card fraud detection (ML for CCFD) has become an active research field, as illustrated by the remarkable number of publications on the topic in the last decade.

There is no doubt that integrating machine learning techniques into payment card fraud detection systems has greatly improved their ability to detect fraud efficiently. At the same time, a major issue in this new research field is the lack of reproducibility. There are no recognized benchmarks or methodologies for comparing and assessing the proposed techniques.

This book aims to take a first step in this direction. All the techniques and results provided in this book are reproducible. Sections that include code are Jupyter notebooks, which can be executed either locally or in the cloud using Google Colab or Binder.

The intended audience is students and professionals interested in the specific problem of credit card fraud detection from a practical point of view. More generally, we think the book is also of interest to data practitioners and data scientists dealing with machine learning problems that involve sequential data and/or imbalanced classification.

Provisional table of contents:

  • Chapter 1: Book overview
  • Chapter 2: Background
  • Chapter 3: Getting started
  • Chapter 4: Performance metrics
  • Chapter 5: Model selection
  • Chapter 6: Imbalanced learning*
  • Chapter 7: Feature engineering*
  • Chapter 8: Deep learning*
  • Chapter 9: Interpretability*

(*): Not yet published.

Current draft

The writing of the book is ongoing. This GitHub repository provides early access to it. As of May 2021, the first five chapters are available. They aim to provide state-of-the-art background on the topic and a baseline methodology for addressing the problem.

The online version of the current draft of this book is available here.

Comments and suggestions are welcome. We recommend using GitHub issues to start a discussion on a topic, and pull requests to fix typos.

Compiling the book

To read and/or execute this book on your computer, you will need to clone this repository and compile the book.

This book is built with Jupyter Book, so you will first need to install Jupyter Book.

The compilation was tested with the following package versions:

sphinxcontrib-bibtex==2.1.4
Sphinx==3.5.2
jupyter-book==0.10.2

Once done, this is a two-step process:

  1. Clone this repository:
git clone https://github.com/Fraud-Detection-Handbook/fraud-detection-handbook
  2. Compile the book:
jupyter-book build fraud-detection-handbook

The book will be available locally at fraud-detection-handbook/_build/html/index.html.
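
If the build fails, a mismatch between the installed packages and the tested versions listed above is one possible cause. The following is a minimal sketch, not part of the repository, that compares the two using the standard library's `importlib.metadata` (Python 3.8 or later). The names `TESTED_VERSIONS`, `installed_version` and `find_mismatches` are hypothetical and introduced here for illustration only.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions the compilation was reported to be tested with (see the list above).
TESTED_VERSIONS = {
    "sphinxcontrib-bibtex": "2.1.4",
    "Sphinx": "3.5.2",
    "jupyter-book": "0.10.2",
}


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package that differs from the tested version.
    for package, (wanted, found) in find_mismatches(TESTED_VERSIONS).items():
        print(f"{package}: tested with {wanted}, found {found or 'not installed'}")
```

A different version does not necessarily break the build; the check only reports where the local environment departs from the tested one.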

License

The code in the notebooks is released under a GNU GPL v3.0 license. The prose and pictures are released under a CC BY-SA 4.0 license.

If you wish to cite this book, you may use the following:

@book{leborgne2021fraud,
title={Machine Learning for Credit Card Fraud Detection - Practical Handbook},
author={Le Borgne, Yann-A{\"e}l and Bontempi, Gianluca},
url={https://github.com/Fraud-Detection-Handbook/fraud-detection-handbook},
year={2021},
publisher={Universit{\'e} Libre de Bruxelles}
}

Authors

  • Yann-Aël Le Borgne
  • Gianluca Bontempi

Acknowledgments

This book is the result of ten years of collaboration between the Machine Learning Group, University of Brussels, Belgium and Worldline.

  • ULB-MLG, Principal investigator: Gianluca Bontempi
  • Worldline, R&D Manager: Frédéric Oblé

We wish to thank all the colleagues who worked on this topic during this collaboration: Olivier Caelen (ULB-MLG/Worldline), Fabrizio Carcillo (ULB-MLG), Guillaume Coter (Worldline), Andrea Dal Pozzolo (ULB-MLG), Jacopo De Stefani (ULB-MLG), Rémy Fabry (Worldline), Liyun He-Guelton (Worldline), Bertrand Lebichot (ULB-MLG), Gian Marco Paldino (ULB-MLG), Wissam Siblini (Worldline), Théo Verhelst (ULB-MLG).

The collaboration was made possible by Innoviris, the Brussels Region Institute for Research and Innovation, through a series of grants that ran from 2012 to 2021.

  • 2018 to 2021. DefeatFraud: Assessment and validation of deep feature engineering and learning solutions for fraud detection. Innoviris Team Up Programme.
  • 2015 to 2018. BruFence: Scalable machine learning for automating defense system. Innoviris Bridge Programme.
  • 2012 to 2015. Adaptive real-time machine learning for credit card fraud detection. Innoviris Doctiris Programme.