EnronFraudProject

Using machine learning to predict who in the Enron fraud investigation will be labeled a Person of Interest (POI).

Primary Language: Jupyter Notebook

Purpose

This project, from the Udacity Data Analyst Nanodegree program, is an exercise in building machine learning and workflow pipeline skills. The goal is a model that can reasonably predict whether a person involved in the Enron fraud case of the early 2000s would eventually be labeled a Person of Interest (POI) by the SEC as part of its investigation.

The Data

The data combine individuals' financial records (e.g., how much their stock was worth) with email metadata (e.g., how many messages they received from someone who would eventually be named a POI). It's a relatively small data set, with only 146 samples and 21 features, including the POI label.
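
As a quick sanity check on those counts, the sketch below summarizes a data set stored as a dictionary that maps each person's name to a dictionary of features. That layout, the `load_dataset` helper, the feature names (`poi`, `total_stock_value`, `from_poi_to_this_person`), and the toy records are all assumptions made for illustration; none of them are taken from this project's notebook.

```python
import pickle


def load_dataset(path):
    """Load a pickled dict of {person_name: {feature_name: value}}.

    Hypothetical helper: the pickle format and the path are assumptions.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(data_dict):
    """Return (n_samples, n_features, n_poi) for a dict of feature dicts."""
    n_samples = len(data_dict)

    # Collect every feature name that appears for any person.
    feature_names = set()
    for features in data_dict.values():
        feature_names.update(features)

    # Count people whose (assumed) boolean "poi" label is True.
    n_poi = sum(1 for features in data_dict.values()
                if features.get("poi") is True)
    return n_samples, len(feature_names), n_poi


# Toy stand-in records, invented purely to show the expected shape.
toy = {
    "PERSON A": {"poi": False, "total_stock_value": 1000,
                 "from_poi_to_this_person": 3},
    "PERSON B": {"poi": True, "total_stock_value": 5000,
                 "from_poi_to_this_person": 12},
}

print(summarize(toy))  # (2, 3, 1)
```

Run against the real data, the first two numbers would be expected to match the 146 samples and 21 features quoted above, provided the data are stored in the assumed layout. The third number shows how many positive labels there are, which is worth knowing for a data set this small.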