Titanic-DataSet-Analysis

This short project takes the Titanic Survival Data Set and analyzes it.

Primary Language: Jupyter Notebook

Titanic Survival DataSet

This short project takes the Titanic Survival Data Set, analyzes the data, and tries to identify the factors likely to affect a person's chance of survival.
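As a rough illustration of what "factors likely to affect survival" can mean in practice, the sketch below computes a survival rate per group using only the standard library. The helper name `survival_rate_by` and the toy rows are hypothetical and invented for this example; they are not taken from the real data set, and the field names follow this README's data dictionary, so the actual CSV headers may differ.

```python
from collections import defaultdict


def survival_rate_by(rows, key):
    """Return {group value: fraction of passengers in that group who survived}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        survivors[group] += int(row["survival"])  # 0 = No, 1 = Yes
    return {group: survivors[group] / totals[group] for group in totals}


# Toy rows invented for illustration only; NOT real passenger records.
toy_rows = [
    {"survival": 1, "sex": "female", "pclass": 1},
    {"survival": 0, "sex": "male", "pclass": 3},
    {"survival": 1, "sex": "male", "pclass": 1},
    {"survival": 0, "sex": "male", "pclass": 3},
]

rates_by_sex = survival_rate_by(toy_rows, "sex")
rates_by_class = survival_rate_by(toy_rows, "pclass")
```

Comparing the per-group rates for different variables (sex, ticket class, port of embarkation, and so on) is one simple way to see which variables are associated with survival; it does not by itself show causation.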

A description of the data set follows.

Data Dictionary

Variable: Definition (Key)

  • survival: Survival (0 = No, 1 = Yes)
  • pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
  • sex: Sex
  • age: Age in years
  • sibsp: Number of siblings / spouses aboard the Titanic
  • parch: Number of parents / children aboard the Titanic
  • ticket: Ticket number
  • fare: Passenger fare
  • cabin: Cabin number
  • embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
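The keys in the data dictionary can be written down as lookup tables so that coded values read more naturally in summaries and plots. This is a minimal sketch: the names `SURVIVAL_KEY`, `PCLASS_KEY`, `EMBARKED_KEY` and `decode_passenger` are hypothetical, and leaving unknown or missing codes unchanged is an assumption made for the example, not something the data set prescribes.

```python
# Lookup tables copied from the data dictionary keys.
SURVIVAL_KEY = {0: "No", 1: "Yes"}
PCLASS_KEY = {1: "1st", 2: "2nd", 3: "3rd"}
EMBARKED_KEY = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def decode_passenger(row):
    """Return a copy of row with coded fields replaced by readable labels.

    Codes that are not in a lookup table (for example a missing value)
    are left unchanged; that fallback is an assumption of this sketch.
    """
    decoded = dict(row)
    decoded["survival"] = SURVIVAL_KEY.get(row["survival"], row["survival"])
    decoded["pclass"] = PCLASS_KEY.get(row["pclass"], row["pclass"])
    decoded["embarked"] = EMBARKED_KEY.get(row["embarked"], row["embarked"])
    return decoded


# Hypothetical record, for illustration only.
example = decode_passenger({"survival": 1, "pclass": 3, "embarked": "Q"})
```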

Variable Notes

  • pclass: A proxy for socio-economic status (SES)
      • 1st = Upper
      • 2nd = Middle
      • 3rd = Lower
  • age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.
  • sibsp: The dataset defines family relations in this way:
      • Sibling = brother, sister, stepbrother, stepsister
      • Spouse = husband, wife (mistresses and fiancés were ignored)
  • parch: The dataset defines family relations in this way:
      • Parent = mother, father
      • Child = daughter, son, stepdaughter, stepson
      • Some children travelled only with a nanny, therefore parch=0 for them.
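Two of the notes above translate directly into small helpers: the pclass-to-SES mapping and the xx.5 convention for estimated ages. The sketch below is one possible reading, and the names `PCLASS_TO_SES` and `age_is_estimated` are hypothetical. It assumes that ages under 1 are treated as fractional infant ages rather than estimates, and that a missing age is represented as `None`; neither detail is specified by the notes.

```python
# Socio-economic status proxy, as given in the variable notes.
PCLASS_TO_SES = {1: "Upper", 2: "Middle", 3: "Lower"}


def age_is_estimated(age):
    """Return True when age looks like an estimate (form xx.5).

    Assumptions of this sketch: ages below 1 are fractional infant ages,
    not estimates, and a missing age is passed in as None.
    """
    if age is None:
        return False
    return age >= 1 and age % 1 == 0.5


# Example checks on hypothetical values.
estimated = age_is_estimated(28.5)
exact = age_is_estimated(28.0)
infant = age_is_estimated(0.5)
```

Flagging estimated ages separately can be useful when deciding how much weight to give age-based comparisons, since those values are less precise than recorded ages.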