⭐ Farnaz and Racquelle ⭐
This dataset, compiled from four other datasets linked by time and place, was built to find signals correlated with increased suicide rates among different cohorts globally, across the socio-economic spectrum. The motivation for this study is suicide prevention. The dataset has 11 columns: country, year, sex, age group, count of suicides, population, suicide rate, country-year composite key, gdp_for_year, gdp_per_capita, and generation (based on age grouping average).
The references for this study are:
- United Nations Development Program. (2018). Human development index (HDI).
- World Bank. (2018). World development indicators: GDP (current US$) by country: 1985 to 2016.
Our exploratory data analysis can be found here.
(1) Ensure the following packages are installed:
tidyr
dplyr
ggplot2
here
tidyverse
docopt
glue
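One way to check for the packages listed above is sketched below. The helper name `missing_packages` is hypothetical (it is not part of this repository), and the install step only runs in an interactive R session so that sourcing the snippet from a script does not trigger downloads.

```r
# Packages named in the list above.
required <- c("tidyr", "dplyr", "ggplot2", "here", "tidyverse", "docopt", "glue")

# Hypothetical helper: returns the required packages that are not yet installed.
missing_packages <- function(required, installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

to_install <- missing_packages(required)

# Install only when run interactively and something is actually missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Running `missing_packages(required)` on its own reports what is missing without installing anything.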
(2) Our R scripts to load, process, and conduct exploratory data analysis can be found in the links below:
- load_data.R: Loads the raw data file and saves it as a CSV.
- process_data.R: Cleans the data by omitting NAs and dropping one of the columns.
- EDA_script.R: Performs exploratory data analysis on the cleaned dataset.
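The cleaning step described for process_data.R can be illustrated with a minimal base R sketch. This is not the script itself: the helper `clean_data`, the toy column names, and the choice to drop the column before omitting NAs are all assumptions made for illustration.

```r
# Toy data frame with made-up values, standing in for the raw data.
raw <- data.frame(
  country     = c("A", "B", "C"),
  suicides_no = c(10, NA, 5),
  extra       = c(NA, 1, NA),   # hypothetical column to be dropped
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop one column, then remove rows that still contain NAs.
clean_data <- function(df, drop_col) {
  kept <- df[, setdiff(names(df), drop_col), drop = FALSE]
  na.omit(kept)
}

cleaned <- clean_data(raw, "extra")
```

Dropping the column first keeps rows whose only missing value was in that column; reversing the order would remove them.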
To replicate this analysis, clone this repository, navigate to its root folder in your terminal (the commands refer to the scripts by their src/ path), and run the commands below:
Rscript src/load_data.R --url_to_read="https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/suiciderates.csv"
Rscript src/process_data.R --url_to_read="https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/suiciderates_clean.csv"
Rscript src/EDA_script.R --url_to_read="https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/suiciderates_clean.csv"
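Since docopt is among the required packages, the `--url_to_read` option in the commands above is presumably parsed with it. The sketch below shows how such parsing could look; the usage string, the helper `parse_args`, and the example URL are assumptions, not code taken from the scripts.

```r
library(docopt)

# Usage text in docopt format; the option name matches the commands above.
doc <- "Usage: load_data.R --url_to_read=<url_to_read>"

# Hypothetical helper: parses command-line arguments (or a supplied vector).
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  docopt(doc, args = args)
}

# Example call with a placeholder URL instead of real command-line input.
opt <- parse_args("--url_to_read=https://example.com/data.csv")
url <- opt$url_to_read
```

Inside a script, calling `parse_args()` with no argument would read the values passed after `Rscript src/load_data.R`.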
(1) Ensure the packages listed above are installed.
(2) Our R scripts run a linear regression on the dataset and knit the final report.
To replicate this analysis, clone this repository, navigate to its root folder in your terminal (the commands refer to the scripts by their src/ path), and run the commands below:
Rscript src/linear_regression.R --url_to_read="https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/suiciderates_clean.csv"
Rscript src/knit.R --final_report="docs/final_report.Rmd"
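As a rough illustration of the kind of model linear_regression.R fits (this README describes the regression as relating the number of suicides to GDP for year), here is a minimal sketch with `lm()`. The column names and the data values are synthetic placeholders chosen only to make the example runnable; they are not taken from the dataset or the script.

```r
# Synthetic values for illustration only; not real data.
toy <- data.frame(
  gdp_for_year = c(1, 2, 3, 4, 5),
  suicides_no  = c(3, 5, 7, 9, 11)
)

# Fit number of suicides as a linear function of GDP for year.
fit <- lm(suicides_no ~ gdp_for_year, data = toy)

# Intercept and slope of the fitted line.
coefs <- coef(fit)
```

With the real cleaned data, the same call would use the dataset's own column names in the formula.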
(3) Usage for GNU Make
Dashboard Proposal
This app shows how socio-economic factors relate to suicide rates in different countries and how those rates changed between 1985 and 2016. It has four plots. One bar chart shows how the number of suicides varies by generation. A second bar chart shows how suicide rates vary by sex. A line graph shows how the number of suicides changes across years (1985 to 2016). Our linear regression examines how the number of suicides correlates with GDP for year. A dropdown menu lets the user swap the variable shown on a plot. For example, the user can select country instead of generation in the first plot and see how it relates to the number of suicides.
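The summaries behind the bar chart and line graph could be computed along these lines. This is a base R sketch under stated assumptions: the column names and values are invented for illustration and the dashboard's actual code may aggregate differently.

```r
# Made-up rows standing in for the cleaned dataset.
toy <- data.frame(
  sex         = c("male", "female", "male", "female"),
  year        = c(1985, 1985, 1986, 1986),
  suicides_no = c(4, 1, 6, 2)
)

# Total suicides per sex (input for a bar chart).
by_sex <- aggregate(suicides_no ~ sex, data = toy, FUN = sum)

# Total suicides per year (input for a line graph).
by_year <- aggregate(suicides_no ~ year, data = toy, FUN = sum)
```

Swapping the grouping variable in the formula (for example, a country column instead of `sex`) mirrors what the dropdown menu is meant to do.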
David is a sociologist who is interested in the assessment and prevention of suicide as part of his major. He wants to know how different socio-economic variables affect suicide rates and whether those factors interact with one another. While searching the web one day, he finds our app, which shows how socio-economic variables relate to the number of suicides.
Below please find our dashboard sketches. Icons were taken from FLATICON:
Please click this link to view the R script for our final dashboard app, and here for its deployment on Heroku!