To illustrate Simpson's paradox, I created a typical right-skewed salary distribution by combining four normal salary distributions:
- Female non-management (Number: 950 employees, Mean: 2,600 EUR, Standard deviation: 700 EUR)
- Female management (Number: 50 employees, Mean: 4,200 EUR, Standard deviation: 1,000 EUR)
- Male non-management (Number: 750 employees, Mean: 2,400 EUR, Standard deviation: 700 EUR)
- Male management (Number: 250 employees, Mean: 4,000 EUR, Standard deviation: 1,000 EUR)
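As a rough sketch of how such a dataset could be generated, the snippet below draws one normal sample per group and pools them. It uses only the Python standard library; the seed, the `GROUPS` table layout, and the helper names `simulate` and `skewness` are assumptions made for illustration, not the tooling used for the original analysis.

```python
import random
import statistics

# Group parameters copied from the list above:
# (gender, job family, number of employees, mean in EUR, standard deviation in EUR)
GROUPS = [
    ("female", "non-management", 950, 2600, 700),
    ("female", "management", 50, 4200, 1000),
    ("male", "non-management", 750, 2400, 700),
    ("male", "management", 250, 4000, 1000),
]


def simulate(seed=42):
    """Draw one normal sample per group and return (gender, family, salary) records."""
    rng = random.Random(seed)  # hypothetical seed, only there for reproducibility
    records = []
    for gender, family, n, mean, sd in GROUPS:
        for _ in range(n):
            records.append((gender, family, rng.gauss(mean, sd)))
    return records


def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


records = simulate()
salaries = [salary for _, _, salary in records]

print(f"employees: {len(salaries)}")
print(f"mean: {statistics.fmean(salaries):.0f} EUR")
print(f"median: {statistics.median(salaries):.0f} EUR")
print(f"skewness: {skewness(salaries):.2f}")  # a positive value indicates a right skew
```

The right skew comes from the mixture itself: most employees sit in the two non-management groups, while the smaller management groups stretch the upper tail.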
You can visualize the different distributions:
- The overall right-skewed salary distribution
- The salary distribution by gender
- The salary distribution by gender & job family
- The salary distribution by job family
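In the end, you can see that:
- overall, the mean salary for males is higher than the mean salary for females (2,800 EUR versus 2,680 EUR in expectation, given the group sizes and means above)
- within each job family, the mean salary for males is lower than the mean salary for females (by 200 EUR in both management and non-management)
- the reversal arises because a far larger share of males (250 of 1,000) than females (50 of 1,000) is in the higher-paid management family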
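The reversal can be checked directly from the group sizes and means, without any sampling. The following is a minimal sketch of that arithmetic; the `groups` dictionary and the `overall_mean` helper are illustrative names, and the numbers are the expected values of the four groups rather than results from a simulated sample.

```python
# (number of employees, mean salary in EUR) per gender and job family,
# taken from the four group definitions.
groups = {
    ("female", "non-management"): (950, 2600),
    ("female", "management"): (50, 4200),
    ("male", "non-management"): (750, 2400),
    ("male", "management"): (250, 4000),
}


def overall_mean(gender):
    """Headcount-weighted mean salary across both job families for one gender."""
    total_pay = sum(n * mean for (g, _), (n, mean) in groups.items() if g == gender)
    headcount = sum(n for (g, _), (n, _) in groups.items() if g == gender)
    return total_pay / headcount


# Aggregated view: males earn more on average.
print(overall_mean("female"))  # 2680.0
print(overall_mean("male"))    # 2800.0

# View by job family: females earn more in each family.
for family in ("non-management", "management"):
    gap = groups[("female", family)][1] - groups[("male", family)][1]
    print(family, gap)  # 200 in both families
```

The aggregated means are weighted by headcount, so the 25% share of males in management (against 5% of females) pulls the male average above the female average even though females earn more in each family.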