Sample data for risk analysis

This repository contains some sample data that we'd like you to take a look at and analyze.

test_data.csv contains a sample of (made-up) data structured as follows.

Variable  Description
date      Date of the data breach
org_id    A numerical identifier for the organization associated with the breach
sector    A string indicating the industry sector the organization operates in
cause     The category of the cause of the breach
cost      The cost of the breach in $
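
As a starting point, the structure above could be read into typed records along these lines. This is only a minimal sketch: it assumes the file has a header row with exactly the five column names listed, that dates are in ISO format (YYYY-MM-DD), and that cost is a plain number. None of that is guaranteed by the description, and the names Breach, parse_breaches, and load_breaches are hypothetical.

```python
import csv
import datetime as dt
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Breach:
    """One row of test_data.csv (hypothetical record type)."""
    date: dt.date
    org_id: int
    sector: str
    cause: str
    cost: float


def parse_breaches(lines: Iterable[str]) -> List[Breach]:
    """Parse CSV lines, header row first, into Breach records."""
    reader = csv.DictReader(lines)
    return [
        Breach(
            # Assumption: dates are ISO formatted (YYYY-MM-DD).
            date=dt.date.fromisoformat(row["date"]),
            org_id=int(row["org_id"]),
            sector=row["sector"],
            cause=row["cause"],
            # Assumption: cost is a bare number with no currency symbol.
            cost=float(row["cost"]),
        )
        for row in reader
    ]


def load_breaches(path: str = "test_data.csv") -> List[Breach]:
    """Read the sample file from disk and parse every row."""
    with open(path, newline="") as f:
        return parse_breaches(f)
```

Calling load_breaches() from the repository root would then return a list of records to analyze. If the dates or costs turn out to be formatted differently, the two conversions marked as assumptions are the places to adjust.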

Questions

Loss Event Frequency

  1. What is a 'typical' number of breaches an organization will experience in a year?
  2. How many breaches would an organization in the Education sector expect to experience in a year?
  3. What is a reasonable range for the number of breaches an Education organization would experience in a year?
  4. Create a breakdown of the frequency of the breach causes by sector similar to Figure 51 in the 2020 DBIR.

Loss Magnitude

  1. What is a 'typical' cost of a breach?
  2. What is a typical cost for each of the different cause types?
  3. What is a reasonable range of costs for each cause type?

Loss Summary

How would you estimate the total losses an organization might accrue in a single year from multiple breaches? You don't need to code this one up if you'd rather not; just describe how you'd go about it.