
Our Submission for the BDSS x LV= 2023 Datathon

Primary language: Jupyter Notebook

datathon-2023


About the Datathon

This Datathon was hosted by the Bristol Data Science Society (BDSS) in association with LV=. The competition was a data-science-oriented hackathon in which we were given a real-world dataset and a prediction task.


Team

Team Name

Our team, called "work in progress", consists of two people from the University of Bristol.


Objectives

Predict whether a fatal/serious casualty occurs

Predict the casualty_severity column in the casualty_test.csv dataset, that is, whether each casualty is fatal/serious or slight, using road traffic data about the casualty, accident, and vehicles involved.

Note: Map targets to binary 0 (fatal, serious) and 1 (slight).


Dataset

1. Casualty (primary)

2. Vehicle (secondary)

3. Data Dictionary (reference)


Preprocessing

  1. Merged the two CSV files on their common field accident_reference to allow detailed statistical analysis of the features.
  2. Collapsed the values of the casualty_severity column into binary form, mapping fatal / serious to 1 and slight to 0.
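
The two preprocessing steps above can be sketched in pandas as follows. This is a minimal illustration, not the notebook's actual code: the toy rows, the age_of_casualty and vehicle_type columns, the helper binarise_severity, and the severity coding (1 = fatal, 2 = serious, 3 = slight) are all assumptions. Only accident_reference and casualty_severity come from the description above. Which class receives label 1 is exposed as a parameter (serious_label), since that direction is purely a convention.

```python
import pandas as pd

# Toy stand-ins for the casualty and vehicle CSVs; the real files would be
# loaded with pd.read_csv. Every column except accident_reference and
# casualty_severity is hypothetical.
casualty = pd.DataFrame({
    "accident_reference": ["A1", "A2", "A3"],
    "age_of_casualty": [34, 19, 61],
    # Assumed coding: 1 = fatal, 2 = serious, 3 = slight.
    "casualty_severity": [3, 1, 2],
})
vehicle = pd.DataFrame({
    "accident_reference": ["A1", "A2", "A3"],
    "vehicle_type": [9, 3, 9],
})

# A left join keeps every casualty row even when no vehicle record matches.
merged = casualty.merge(vehicle, on="accident_reference", how="left")


def binarise_severity(severity: pd.Series, serious_label: int = 1) -> pd.Series:
    """Collapse fatal (1) and serious (2) into one class, slight (3) into the other."""
    is_serious = severity.isin([1, 2])
    # serious_label chooses which of the two classes is encoded as 1.
    return (is_serious if serious_label == 1 else ~is_serious).astype(int)


merged["target"] = binarise_severity(merged["casualty_severity"], serious_label=1)
print(merged[["accident_reference", "casualty_severity", "target"]])
```

On real data a casualty row may match several vehicle rows sharing the same accident_reference, so the merged frame can contain more rows than the casualty file. Comparing row counts before and after the merge is a cheap sanity check.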

Modelling

  1. Dropped features that showed neither a Gaussian nor a monotonic (increasing or decreasing) correlation
  2. Used scikit-learn to train various machine learning models, including Naive Bayes and Support Vector Machines
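
The modelling steps above could look roughly like the sketch below. It is an assumption-laden illustration rather than the team's pipeline: the data is synthetic, the column names signal and noise are invented, the 0.2 cut-off (THRESHOLD) is arbitrary, and Spearman rank correlation is swapped in as one possible test for a monotonic relationship with the target. The Gaussian criterion is not covered here. GaussianNB and an RBF-kernel SVC are likewise assumed choices, since the text does not say which variants were used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: "signal" drives the target, "noise" does not.
rng = np.random.default_rng(0)
n = 400
signal = rng.normal(size=n)
noise = rng.normal(size=n)
X = pd.DataFrame({"signal": signal, "noise": noise})
y = (signal + 0.3 * rng.normal(size=n) > 0).astype(int)
target = pd.Series(y, index=X.index, name="target")

# Spearman correlation equals Pearson correlation computed on ranks,
# so ranking both sides first measures monotonic association.
rho = X.rank().corrwith(target.rank()).abs()

THRESHOLD = 0.2  # hypothetical cut-off, chosen only for this example
kept = rho[rho >= THRESHOLD].index.tolist()
print("kept features:", kept)

X_train, X_test, y_train, y_test = train_test_split(
    X[kept], y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "naive_bayes": GaussianNB(),
    # SVMs are sensitive to feature scale, so standardise before fitting.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out rows
    print(name, round(scores[name], 3))
```

Accuracy is used here only because the toy classes are balanced. If fatal/serious casualties are a small minority in the real data, a metric such as F1 or balanced accuracy would likely be more informative.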