Charts are like pasta - Data visualization part 1 by CrashCourse Statistics
EmbraceLife opened this issue · 0 comments
key words
frequency table, relative frequency table, bar chart, pie chart, histogram, categorical and quantitative, binning, misleading
video links
Key questions
how to visualize data (categorical and quantitative)
why visualizing data
how to avoid being misled or lied to by data visualization
Interesting Points
Quantitative data
quantities; the numbers have both order and consistent spacing
Categorical data
having no order or consistent spacing
e.g., 4 types of pasta have no order or consistent spacing
Frequency Table
to visualize categorical data, focusing on frequency rather than order
Relative Frequency Table
makes categories easier to compare than raw frequencies
combined tables and contingency tables (with two variables) allow more complex relative frequency tables
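The difference between a frequency table and a relative frequency table can be sketched in a few lines of Python. This is only an illustration: the `answers` list and both helper function names are made up for this note and do not come from the video.

```python
from collections import Counter

# Hypothetical survey answers: each entry is one person's favorite pasta.
answers = ["penne", "spaghetti", "penne", "fusilli", "spaghetti",
           "penne", "farfalle", "penne", "spaghetti", "fusilli"]


def frequency_table(values):
    """Count how often each category appears."""
    return dict(Counter(values))


def relative_frequency_table(values):
    """Divide each count by the total so the shares sum to 1."""
    total = len(values)
    return {category: count / total
            for category, count in Counter(values).items()}


freq = frequency_table(answers)
rel = relative_frequency_table(answers)

# Print one row per category, most frequent first.
for category in sorted(freq, key=freq.get, reverse=True):
    print(f"{category:<10} {freq[category]:>3} {rel[category]:>6.0%}")
```

Because the relative frequencies are shares of the total, two surveys with different numbers of respondents can be compared directly, which raw counts do not allow.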
Bar chart
another way of visualizing categorical data
displays frequencies or relative frequencies as bars
can show one or more variables in a single bar chart
Pie chart
another way of visualizing categorical data
relative frequency is used
Pictograph
compares categories by the size or number of some shapes
misleading visualization
breaks common assumptions without telling the reader
- graphs of the same variable drawn with different Y-axis scales
- axes that do not start from 0
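To see why an axis that does not start from 0 can mislead, it helps to compute how tall two bars are actually drawn. The sketch below uses a hypothetical helper, `drawn_height_ratio`, and made-up values (52 vs 50); neither is from the video.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights when the Y axis starts at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)


# Hypothetical values: 52 vs 50, a real difference of 4%.
honest = drawn_height_ratio(52, 50)          # Y axis starts at 0
truncated = drawn_height_ratio(52, 50, 48)   # Y axis starts at 48

print(f"axis from 0:  bar A is drawn {honest:.2f}x as tall as bar B")
print(f"axis from 48: bar A is drawn {truncated:.2f}x as tall as bar B")
```

With the axis starting at 48, a 4% difference is drawn as one bar being twice as tall as the other, even though the underlying numbers are unchanged.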
Binning
divide a quantitative variable into several bins (categories)
bin sizes can follow pre-existing conventions (e.g., age groups)
or be made up (possibly to deceive?)
- unequal bins as opposed to equal bins (the common practice)
- unequal bins are usually used to deceive or hide something (e.g., unpopularity among the 30-year-old age group)
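A small sketch can show how the choice of bins changes the picture drawn from the same data. The `ages` list, both sets of bin edges, and the `bin_counts` helper are assumptions made up for illustration.

```python
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values in half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Values outside the outermost edges are ignored.
        if edges[0] <= v < edges[-1]:
            counts[bisect_right(edges, v) - 1] += 1
    return counts


# Hypothetical ages of 15 respondents.
ages = [18, 22, 25, 29, 31, 33, 34, 36, 38, 39, 41, 45, 52, 58, 63]

equal_edges = [10, 20, 30, 40, 50, 60, 70]   # six bins, each 10 years wide
unequal_edges = [10, 30, 35, 70]             # bins 20, 5, and 35 years wide

print("equal bins:  ", bin_counts(ages, equal_edges))
print("unequal bins:", bin_counts(ages, unequal_edges))
```

With equal bins the cluster of respondents in their 30s stands out clearly. With the unequal bins that cluster is split across two bins and the widest bin looks largest, so the same data suggests a different story.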
Histogram
bins are adjacent and continuous (no gaps between bars) and of equal width
bar height tells us the frequency or relative frequency on certain range
overall shape of all bins tells us how the data are distributed
it shows more detail than tables but still ignores much of the information about individual data points
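The histogram idea above can be sketched as a text chart: equal-width bins, with bar length standing in for frequency. The data values and the `histogram` and `render` helpers are hypothetical and chosen only to illustrate the shape.

```python
def histogram(values, start, width, n_bins):
    """Count values in n_bins equal-width bins beginning at start."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:   # ignore values outside the covered range
            counts[i] += 1
    return counts


def render(counts, start, width):
    """Return one text line per bin; the number of '#' is the frequency."""
    lines = []
    for i, c in enumerate(counts):
        lo = start + i * width
        lines.append(f"[{lo:>3}, {lo + width:>3}) {'#' * c} {c}")
    return lines


# Hypothetical quantitative data, e.g. cooking times in minutes.
values = [7, 8, 9, 9, 10, 10, 10, 11, 11, 12, 12, 13, 14, 16, 19]

counts = histogram(values, start=6, width=3, n_bins=5)
lines = render(counts, start=6, width=3)
print("\n".join(lines))
```

The overall shape shows where the data concentrate, but the exact values inside each bin are lost, which is the trade-off noted above.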
Takeaway
what the visuals actually tell us
what the visuals try to hide from us
they provide alternative ways of representing numbers
help to see the bigger picture of dataset