Murders

Data Analysis of the US murders dataset

Packages:

dslabs
tidyverse
statip
ggrepel
ggthemes

Data:

US Murders data
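
A minimal setup sketch for the packages and data listed above. It assumes the "US Murders data" is the `murders` data frame that ships with `dslabs`; if the project reads the data from another source, swap the `data()` call accordingly.

```r
# Load the packages listed above
library(dslabs)     # assumed source of the murders data frame
library(tidyverse)  # dplyr, tidyr, ggplot2, ...
library(statip)     # statistical helpers such as mfv()
library(ggrepel)    # non-overlapping text labels for ggplot2
library(ggthemes)   # extra ggplot2 themes

# Load the dataset into the session and look at the first rows
data(murders)
head(murders)
```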

ORDER OF SERVICE

Exploring the Data:

- identify the variables in the dataset
- count the variables and observations
- check the class and characteristics of each variable
- check data integrity
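
The exploration steps above could look roughly like this in base R. This is a sketch that assumes the data is `dslabs::murders`; the object names `n_missing` and `n_duplicated` are illustrative only.

```r
library(dslabs)
data(murders)

names(murders)           # which variables are present
dim(murders)             # number of observations and variables
str(murders)             # class and a preview of every variable
sapply(murders, class)   # class of each column
levels(murders$region)   # categories of the factor variable
summary(murders)         # quick per-variable overview

# Integrity checks: missing values per column and fully duplicated rows
n_missing <- colSums(is.na(murders))
n_duplicated <- sum(duplicated(murders))
n_missing
n_duplicated
```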

Cleaning the Data:

Fixing:

- duplicate data
- incomplete data
- inaccurate data
- inconsistent data
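
One way to cover these four fixes with `dplyr` and `tidyr` is sketched below. The plausibility rules (positive population, murders not exceeding population) and the formatting rules are assumptions chosen for illustration, not the project's actual checks.

```r
library(dslabs)
library(dplyr)
library(tidyr)
data(murders)

murders_clean <- murders %>%
  distinct() %>%                       # duplicate data: drop repeated rows
  drop_na() %>%                        # incomplete data: drop rows with NA
  filter(population > 0,               # inaccurate data: assumed sanity rules
         total >= 0,
         total <= population) %>%
  mutate(state = trimws(state),        # inconsistent data: stray whitespace
         abb = toupper(abb))           # and mixed-case abbreviations

# Compare row counts before and after cleaning
nrow(murders)
nrow(murders_clean)
```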

Manipulating the Data:

- renaming variables
- creating new columns:
    - status = whether the state is safe to live in or not
    - rate = the gun murder rate
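
A sketch of the renaming and the two new columns, assuming the `dslabs::murders` columns. The new names (`abbreviation`, `murders_total`), the per-100,000 scaling and the `safe_threshold` cutoff are all hypothetical; the README does not say how "safe" is defined.

```r
library(dslabs)
library(dplyr)
data(murders)

# Hypothetical cutoff: a state counts as "safe" below this rate
safe_threshold <- 3

murders_new <- murders %>%
  # rename(new_name = old_name); the new names are illustrative
  rename(abbreviation = abb, murders_total = total) %>%
  mutate(
    # gun murders per 100,000 inhabitants (assumed scaling)
    rate = murders_total / population * 1e5,
    # label each state using the assumed threshold
    status = if_else(rate < safe_threshold, "safe", "unsafe")
  )

head(murders_new)
```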

Describing and Summarising the data:

- central tendency
- spread
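
Central tendency and spread might be summarised as follows. This sketch assumes `dslabs::murders` and a rate per 100,000 inhabitants; `mfv()` from `statip` returns the most frequent value(s) and is used here for the mode of a categorical variable.

```r
library(dslabs)
library(statip)
data(murders)

# Gun murders per 100,000 inhabitants (assumed definition of the rate)
rate <- murders$total / murders$population * 1e5

# Central tendency
central <- c(mean = mean(rate), median = median(rate))
modal_region <- mfv(murders$region)  # most frequent region(s)

# Spread
spread <- c(sd = sd(rate), iqr = IQR(rate), min = min(rate), max = max(rate))

central
modal_region
spread
```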

Visualising the Data:

- Data component:
    variables picked based on the geometric component
- Geometric component:
    boxplot, scatterplot
- Aesthetic mapping:
    colors = different regions
    text identification
- Scale component:
    specific axes on a log10 scale
    axis ranges depend on the data
- Labels, titles, legends
- Facets
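
The components above can be combined in `ggplot2` roughly as shown below. This is a sketch assuming `dslabs::murders`; the choice of variables, the label text, the theme and the per-100,000 rate are illustrative rather than the project's actual plots.

```r
library(dslabs)
library(ggplot2)
library(ggrepel)
library(ggthemes)
data(murders)

# Scatterplot: data + geometry + aesthetic mapping + scales + labels
scatter <- ggplot(murders, aes(x = population / 1e6, y = total, label = abb)) +
  geom_point(aes(color = region), size = 3) +  # color = region
  geom_text_repel() +                          # text identification of states
  scale_x_log10() +                            # both axes on a log10 scale
  scale_y_log10() +
  labs(title = "US gun murders",
       x = "Population in millions (log scale)",
       y = "Total gun murders (log scale)",
       color = "Region") +
  theme_economist()

# Boxplot of the rate (assumed: murders per 100,000) by region
murders$rate <- murders$total / murders$population * 1e5
box <- ggplot(murders, aes(x = region, y = rate, fill = region)) +
  geom_boxplot() +
  labs(title = "Gun murder rate by region",
       x = "Region", y = "Murders per 100,000", fill = "Region")

# Facets: one scatterplot panel per region
faceted <- scatter + facet_wrap(~ region)

print(scatter)
print(box)
print(faceted)
```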