EmbraceLife/shendusuipian

Charts are like pasta - Data visualization part 1 by CrashCourse Statistics

Key words

frequency table, relative frequency table, bar chart, pie chart, histogram, categorical and quantitative, binning, misleading

video links

youtube

bilibili

Key questions

how to visualize data (categorical and quantitative)

why visualize data

how to avoid being misled or lied to by data visualization

Interesting Points

Quantitative data

quantities; the numbers have both order and consistent spacing

Categorical data

categories that have no inherent order or consistent spacing

e.g., 4 types of pasta have no order or consistent spacing

Frequency Table

a way to present categorical data: it lists how often each category occurs, focusing on frequency rather than order
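
As a minimal sketch of the idea, a frequency table can be built by counting how often each category appears. The survey answers and the name `frequency_table` are made up for illustration and do not come from the video.

```python
from collections import Counter

# Hypothetical survey answers (made-up data): one favorite pasta per person.
favorite_pasta = [
    "penne", "spaghetti", "penne", "farfalle",
    "macaroni", "spaghetti", "penne", "macaroni",
]

def frequency_table(values):
    """Count how many times each category appears."""
    return Counter(values)

table = frequency_table(favorite_pasta)

# Print the categories from most to least frequent.
for category, count in table.most_common():
    print(f"{category:<10} {count}")
```

The row order here is just a display choice (most frequent first); the categories themselves carry no order.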

Relative Frequency Table

makes categories easier to compare than raw frequencies do

tables can be combined, and a contingency table (with two variables) gives a more complex relative frequency table
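
A rough sketch of both tables, assuming made-up (pasta, age group) responses: each relative frequency is a count divided by the total, and the contingency table does the same for every pair of categories. The helper name `relative_frequencies` is hypothetical.

```python
from collections import Counter

def relative_frequencies(values):
    """Map each distinct value to its share of all observations."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

# Hypothetical responses (made-up data): (favorite pasta, age group).
responses = [
    ("penne", "adult"), ("penne", "child"), ("spaghetti", "child"),
    ("spaghetti", "adult"), ("penne", "adult"), ("macaroni", "child"),
    ("macaroni", "child"), ("farfalle", "adult"),
]

# One variable: share of responses per pasta type.
pasta_share = relative_frequencies(pasta for pasta, _ in responses)

# Two variables (contingency table): share of responses per (pasta, group) cell.
cell_share = relative_frequencies(responses)

for (pasta, group), share in sorted(cell_share.items()):
    print(f"{pasta:<10} {group:<6} {share:.3f}")
```

Shares always sum to 1, so groups of different sizes can be compared directly.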

Bar chart

another way of visualizing categorical data

displays frequencies or relative frequencies as bars

one or more variables can be shown in a single bar chart
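
To make the mapping from counts to bars concrete, here is a small text-only sketch in which each bar's length equals the category's frequency. The counts and the function name `text_bar_chart` are invented for illustration; a plotting library would normally draw the chart.

```python
def text_bar_chart(counts, symbol="#"):
    """Return one line per category; the bar length equals the count."""
    width = max(len(name) for name in counts)
    return [
        f"{name:<{width}} | {symbol * count} {count}"
        for name, count in counts.items()
    ]

# Hypothetical frequencies (made-up data).
pasta_counts = {"penne": 3, "spaghetti": 2, "macaroni": 2, "farfalle": 1}

chart_lines = text_bar_chart(pasta_counts)
print("\n".join(chart_lines))
```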

Pie chart

another way of visualizing categorical data

each slice shows a category's relative frequency (its share of the whole)
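
As an illustrative sketch, each slice's angle is just the category's relative frequency times 360 degrees. The counts and the name `pie_slice_angles` are assumptions made for this example.

```python
def pie_slice_angles(counts):
    """Convert category counts to pie-slice angles in degrees."""
    total = sum(counts.values())
    return {name: 360 * count / total for name, count in counts.items()}

# Hypothetical frequencies (made-up data).
pasta_counts = {"penne": 3, "spaghetti": 2, "macaroni": 2, "farfalle": 1}

angles = pie_slice_angles(pasta_counts)
for name, angle in angles.items():
    print(f"{name:<10} {angle:6.1f} degrees")
```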

Pictograph

compares categories by the size or number of small pictures (shapes)

Misleading visualization

a chart misleads when it goes against common assumptions without telling the reader

  • within one graph, the same variable on the same Y axis is drawn with different scales
  • axes that do not start from 0
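
A small worked example of the second point, using made-up values: when the axis starts above 0, the drawn bar heights no longer match the ratio of the underlying numbers. The function name `apparent_ratio` is hypothetical.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn bar heights when the axis begins at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)

# Made-up values: 50 versus 55.
true_ratio = apparent_ratio(50, 55)                      # axis starts at 0
truncated_ratio = apparent_ratio(50, 55, axis_start=45)  # axis starts at 45

print(f"axis from 0:  second bar looks {true_ratio:.1f}x as tall")
print(f"axis from 45: second bar looks {truncated_ratio:.1f}x as tall")
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as tall.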

Binning

divides a quantitative variable into several bins (categories)

bin sizes can be pre-existing (e.g., age groups)

or made up (possibly to deceive?)

  • unequal bins versus equal bins (equal bins are the common practice)
  • unequal bins are usually used to deceive or hide something (e.g., unpopularity among the 30-yr group)
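
The effect of bin choice can be sketched with made-up ages: equal ten-year bins expose an empty 30-39 group, while wider, unequal bins fold that gap into a neighbor. The data, the bin edges, and the name `bin_counts` are assumptions for illustration, and the labels assume integer values.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Count values in half-open bins [edges[i], edges[i+1]); edges are sorted."""
    labels = [f"{low}-{high - 1}" for low, high in zip(edges, edges[1:])]
    counts = {label: 0 for label in labels}
    for value in values:
        index = bisect_right(edges, value) - 1
        if 0 <= index < len(labels):  # ignore values outside all bins
            counts[labels[index]] += 1
    return counts

# Hypothetical ages of people who like a product (made-up data).
ages = [18, 22, 25, 27, 29, 41, 44, 45, 48, 52, 55, 58]

equal_bins = bin_counts(ages, [10, 20, 30, 40, 50, 60])
unequal_bins = bin_counts(ages, [10, 30, 50, 60])

print("equal bins:  ", equal_bins)
print("unequal bins:", unequal_bins)
```

In the unequal version the 30-49 bin looks healthy even though nobody aged 30-39 is in the data.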

Histogram

bins are adjacent and continuous (no gaps between bars) and equal in width

bar height tells us the frequency or relative frequency within each range

the overall shape of all the bins tells us how the data are distributed

it shows more detail than tables but still hides much of the information about individual data points
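
A minimal sketch of how a histogram's bar heights are computed, assuming equal-width bins and made-up measurements. The function name `histogram` and its parameters are hypothetical.

```python
def histogram(values, start, bin_width, n_bins):
    """Count values in n_bins adjacent, equal-width bins beginning at start."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // bin_width)
        if 0 <= index < n_bins:  # skip values outside the covered range
            counts[index] += 1
    return counts

# Hypothetical measurements (made-up data).
values = [2, 3, 5, 7, 8, 8, 9, 11, 12, 14, 15, 19]

counts = histogram(values, start=0, bin_width=5, n_bins=4)
relative = [count / len(values) for count in counts]

# Bars sit on adjacent ranges with no gaps between them.
for i, count in enumerate(counts):
    low, high = i * 5, (i + 1) * 5
    print(f"[{low:>2}, {high:>2}) {'#' * count} {relative[i]:.2f}")
```

The output shows the shape of the distribution, but the individual values inside each bin are no longer visible.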

Takeaway

what the visuals actually tell us

what the visuals try to hide from us

they provide alternative ways of representing numbers

they help us see the bigger picture of a dataset