vineetparikh/ML-fairness

Midterm Peer Review

Your project is on fairness in machine learning. Your goal is to determine how to measure fairness and how robust different fairness metrics are. You use synthetic data and real datasets to test your hypothesis.

Things I like:

  1. Your project covers a very interesting topic. I like how it relates to the logistic loss we learned in class.
  2. Your analysis of the synthetic data set helped me understand your research topic. Associating each feature with a probability made your metrics more intuitive.
  3. Your visualization of opportunity versus predictive value was great. It provided an easy way to compare fairness between the two groups.
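
To make point 3 concrete, here is how I read the two axes of that plot. This is only a minimal sketch: I am assuming "opportunity" means the per-group true positive rate and "predictive value" means the per-group positive predictive value. The function name `group_rates` and the toy labels are hypothetical and not taken from your code.

```python
def group_rates(y_true, y_pred, group):
    """Return {group: (tpr, ppv)} for binary labels and predictions."""
    rates = {}
    for g in sorted(set(group)):
        idx = [i for i, gi in enumerate(group) if gi == g]
        tp = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 1)
        fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
        fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
        # Guard against groups with no positives or no positive predictions.
        tpr = tp / (tp + fn) if (tp + fn) else float("nan")
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        rates[g] = (tpr, ppv)
    return rates


# Toy labels (hypothetical), four samples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = [1, 1, 1, 1, 2, 2, 2, 2]

rates = group_rates(y_true, y_pred, group)
for g, (tpr, ppv) in rates.items():
    print(f"group {g}: TPR={tpr:.2f}, PPV={ppv:.2f}")
```

If your definitions differ from these, stating them next to the plot would help.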

Recommendations for Improvement:

  1. It would be helpful if you described how you cleaned your data. What forms of data imputation did you use?
  2. Please explain how you estimate the distribution of the features for both groups in the German Credit data.
  3. It would be helpful if you went more in depth on the results of the testing on the German Credit data. Since group 2 is less likely to have the "Not Foreign Worker" attribute, would decreasing the weight on that attribute make the model fairer? What is the tradeoff with accuracy?
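
One way to answer the question in point 3 would be a small sweep like the one below. It is a sketch under strong assumptions, not your method: the weights, the toy rows, and the names `W_ATTR`, `W_X`, `BIAS`, and `evaluate` are all hypothetical, and a thresholded linear score stands in for a fitted logistic model. None of the numbers come from the German Credit data.

```python
# Each row: (attr, x, label, group). attr = 1 stands in for "Not Foreign Worker".
ROWS = [
    (1, 0.0, 1, 1), (1, 1.0, 1, 1), (1, 0.0, 0, 1), (1, 2.0, 1, 1),
    (0, 1.0, 1, 2), (0, 2.0, 1, 2), (0, 0.0, 0, 2), (1, 0.0, 0, 2),
]

# Hypothetical fixed weights; a real experiment would use the fitted model.
W_ATTR, W_X, BIAS = 2.0, 1.0, -1.5


def evaluate(scale):
    """Scale the attribute weight, then return (accuracy, TPR gap between groups)."""
    # score > 0 is equivalent to a logistic probability above 0.5
    preds = [1 if scale * W_ATTR * a + W_X * x + BIAS > 0 else 0
             for a, x, _, _ in ROWS]
    labels = [y for _, _, y, _ in ROWS]
    groups = [g for _, _, _, g in ROWS]
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(ROWS)

    def tpr(g):
        # Predictions for the true positives in group g.
        pos = [p for p, y, gi in zip(preds, labels, groups) if y == 1 and gi == g]
        return sum(pos) / len(pos)

    return accuracy, abs(tpr(1) - tpr(2))


for scale in (1.0, 0.5, 0.0):
    acc, gap = evaluate(scale)
    print(f"scale={scale:.1f}  accuracy={acc:.3f}  TPR gap={gap:.3f}")
```

Reporting a table like this for the real model, with accuracy next to the fairness gap at each weight, would show the tradeoff directly.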