This Datathon was hosted by the Bristol Data Science Society (BDSS) in association with LV=. The competition was a data-science-oriented hackathon in which we were given a real-world dataset and a prediction task.
Our team (called "work in progress") consists of 2 people from the University of Bristol:
- Nikhil Parimi (Computer Science, 2nd Year)
- Jadesola Bejide (Computer Science, 2nd Year)
Predict whether a fatal/serious casualty occurs
Predict the `casualty_severity` column in the casualty_test.csv dataset, i.e. whether an accident is fatal/serious or slight, using road traffic data about the casualty, accident, and vehicles involved.
Note: Map targets to binary `0` (fatal, serious) and `1` (slight).
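
The mapping in the note could be applied with pandas roughly as follows. This is a minimal sketch, not the team's code: the raw codes (1 = fatal, 2 = serious, 3 = slight) are an assumption that should be checked against DatathonReference.xlsx, and `RAW_TO_BINARY` and `to_binary_severity` are hypothetical names.

```python
import pandas as pd

# Assumed raw coding (not stated in this README): 1 = fatal, 2 = serious, 3 = slight.
# Task convention from the note: 0 = fatal/serious, 1 = slight.
RAW_TO_BINARY = {1: 0, 2: 0, 3: 1}


def to_binary_severity(severity: pd.Series) -> pd.Series:
    """Map raw severity codes to the binary target; fail loudly on unknown codes."""
    mapped = severity.map(RAW_TO_BINARY)
    if mapped.isna().any():
        unknown = sorted(severity[mapped.isna()].unique())
        raise ValueError(f"Unmapped severity codes: {unknown}")
    return mapped.astype(int)


# Toy values standing in for the real casualty_severity column.
toy = pd.Series([1, 3, 2, 3], name="casualty_severity")
binary = to_binary_severity(toy)
```

Raising on unmapped codes, rather than silently producing missing values, makes it obvious if the real data uses a different coding than the one assumed here.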
- Data on the person(s) involved in the casualty; casualty_train.csv
- Data on the vehicles involved in the casualty; vehicle_train.csv
- Field descriptions and value mappings; DatathonReference.xlsx
- Merged both csv files together on common field `accident_reference` to make detailed statistical analyses about the features
- Merged the values of the column `casualty_severity` into binary form; mapping `1` to fatal / serious and `0` to slight.
- Dropped features with no Gaussian or monotonically increasing/decreasing correlation
- Used scikit-learn to train various machine learning models, including Naive Bayes and Support Vector Machines
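
The merge and model-training steps above can be sketched as follows. This is an illustrative outline on synthetic data, not the team's actual pipeline: only `accident_reference` and `casualty_severity` come from this README, the other column names are hypothetical, the target is assumed to be binary already, and `GaussianNB` and `SVC` are just one plausible choice of Naive Bayes and SVM variants.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for casualty_train.csv and vehicle_train.csv.
# Feature columns below are hypothetical placeholders.
casualty = pd.DataFrame({
    "accident_reference": np.arange(n),
    "age_of_casualty": rng.integers(1, 90, size=n),
    "casualty_severity": rng.integers(0, 2, size=n),  # assumed already binary
})
vehicle = pd.DataFrame({
    "accident_reference": np.arange(n),
    "engine_capacity_cc": rng.integers(50, 3000, size=n),
})

# Join the two tables on the shared key.
merged = casualty.merge(vehicle, on="accident_reference", how="inner")

X = merged.drop(columns=["accident_reference", "casualty_severity"])
y = merged["casualty_severity"]
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "naive_bayes": GaussianNB(),
    # SVMs are sensitive to feature scale, so standardise first.
    "svm": make_pipeline(StandardScaler(), SVC()),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)  # accuracy on the held-out split
```

In the toy data each accident has exactly one vehicle row. If the real vehicle file lists several vehicles per accident, the merge would yield more than one row per casualty, so the merged row count is worth checking before training.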