Primary language: Jupyter Notebook. License: MIT.

Death-Big-data-Analytics

This repo analyses data from Kaggle's "Death in the United States" dataset.

Data description

  • The data contains the log of deaths in the United States from 2005 through 2015. It has a long list of attributes that can be analysed and related to each other, such as race, year of death, gender, cause of death, education level and so on.
  • Each year has its own CSV file of around 500 MB, along with a schema of the legal attributes for that year. The total size of the dataset after decompression is 4 GB.
  • Some fields may be empty for a given record.
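
Because each yearly file is around 500 MB and some fields are empty, one option is to stream the records instead of loading a whole file at once. The following is a minimal sketch using only the Python standard library, not necessarily how the notebooks in this repo load the data. The function name `read_deaths` is hypothetical, and the actual column names must be taken from the dataset's schema.

```python
import csv
from typing import Iterator


def _clean(value):
    """Strip whitespace from strings and map empty values to None."""
    if isinstance(value, str):
        value = value.strip()
    return value or None


def read_deaths(path: str) -> Iterator[dict]:
    """Yield one dict per record from a yearly CSV file, one row at a time.

    Streaming keeps memory use low for large files. Empty or
    whitespace-only fields become None so later code can skip them.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            yield {key: _clean(value) for key, value in row.items()}
```

Calling `read_deaths` once per yearly file and chaining the results (for example with `itertools.chain`) would give a single stream covering every year.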

Tasks

In this section we examine the relations between multiple attributes of the dataset. The analyses are as follows:

  • The relation between education and cause of death. (Mousa)

    • The relation between work and death. (not sure if it will work)
    • Sports and cause of death.
    • How does education affect lifespan?
    • How does education affect cause of death?
  • Business report. (Each did his part)

  • The most frequent causes of death generally. (Moustafa)

    • The most frequent causes of death for each race. (done)
    • The most frequent causes of death for each gender. (done)
    • The day and month in which most people died. (done)
  • Time series analysis. (Khalid)

    • The causes of death for each year. (done)
    • The most dangerous causes of death for each season. (done)
    • Trend fitting (machine learning). (done)
  • Violence and death (Ahmed)

    • Gun vs. vehicle deaths. (done)
    • Guns and race. (done)
    • Homicide vs. other causes of death.
    • Correlation of suicide with age and education. (done)

    Interesting but not required:

    • Comparative analysis of men and women across attributes.
    • Race - age record - place of death.
    • Cardiovascular disease in men and women: it is a medical fact that men die of cardiovascular disease more often than women, so let's test it.
    • Heart disease analysis
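
Several of the tasks above, such as the most frequent causes of death for each race or gender, come down to counting causes within groups. The sketch below shows one way to do that with the standard library. It is an illustration under assumptions, not the code used in the notebooks: the function name `top_causes_by_group` and the field names passed to it are hypothetical, and records with an empty group or cause are skipped.

```python
from collections import Counter, defaultdict


def top_causes_by_group(records, group_field, cause_field, n=3):
    """Return {group value: [(cause, count), ...]} for the n most common causes.

    `records` is any iterable of dicts, so it can be a list or a stream.
    """
    counts = defaultdict(Counter)
    for record in records:
        group = record.get(group_field)
        cause = record.get(cause_field)
        if not group or not cause:
            continue  # skip records where either field was left empty
        counts[group][cause] += 1
    return {group: counter.most_common(n) for group, counter in counts.items()}


# Tiny made-up sample, only to show the call shape (not real data).
sample = [
    {"sex": "M", "cause": "heart disease"},
    {"sex": "M", "cause": "heart disease"},
    {"sex": "M", "cause": "cancer"},
    {"sex": "F", "cause": "cancer"},
    {"sex": "F", "cause": ""},
]
print(top_causes_by_group(sample, "sex", "cause", n=1))
```

Because the function only iterates over `records` once, it works on a stream of rows as well as on an in-memory list, which matters for files of this size.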