Responsible Data Science Workflows

A lightning talk presentation for Collaborations Workshop 2022 (CW22) https://www.software.ac.uk/cw22

Acknowledgements

This collaborative project was initiated with fellow participants Ben Marwick, Brandeis Marshall, Kirstie Whitaker, Sara Stoudt, Thibault Lestang, and Yacine Jernite at the workshop Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design, held at the PyData Global 2021 conference, 28–30 October 2021. I would also like to thank Tiffany Timbers, Emma Rand, Ben Marwick, Luc Rocher, the Turing Way Project, and Greg Wilson for helpful Twitter discussions and pointers to resources.

References

Barocas, Solon, Moritz Hardt, and Arvind Narayanan. “Fairness and Machine Learning: Limitations and Opportunities.” (2018). https://fairmlbook.org

Barocas, Solon, Kate Crawford, Aaron Shapiro, and Hanna Wallach. “The Problem with Bias: From Allocative to Representational Harms in Machine Learning.” Presented at the 9th Annual Conference of the Special Interest Group for Computing, Information and Society (SIGCIS), Philadelphia, PA, October 29, 2017.

Christensen, G., Freese, J. and Miguel, E., 2019. Transparent and reproducible social science research. University of California Press.

Kitzes, Justin, Daniel Turek, and Fatma Deniz. 2018. The practice of reproducible research: case studies and lessons from the data-intensive sciences. University of California Press.

Lipton, Z.C., 2018. The Mythos of Model Interpretability. ACM Queue, 16(3), pp.31–57.

Narayanan, A., 2018, February. Tutorial: 21 fairness definitions and their politics. In Proc. Conf. Fairness Accountability Transp., New York, USA (Vol. 1170, p. 3). https://www.youtube.com/watch?v=jIXIuYdnyyk

Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019 Oct 25;366(6464):447-453. doi: 10.1126/science.aax2342.

Ostblom, Joel. 2021. Data Science Ethics. BAIT 509 Business Applications of Machine Learning. https://bait509-ubc.github.io/BAIT509/lectures/lecture10a.html

Stoudt S, Vásquez VN, Martinez CC (2021) Principles for data analysis workflows. PLOS Computational Biology 17(3): e1008770. https://doi.org/10.1371/journal.pcbi.1008770

Srinivasan, R. and Chander, A., 2021. Biases in AI systems. Communications of the ACM, 64(8), pp.44-49. https://doi.org/10.1145/3464903

Suresh, H., Guttag, J., Kaiser, D., & Shah, J. (2021). Understanding Potential Sources of Harm throughout the Machine Learning Life Cycle. MIT Case Studies in Social and Ethical Responsibilities of Computing, (Summer 2021). https://doi.org/10.21428/2c646de5.c16a07bb

The Turing Way Community, Becky Arnold, Louise Bowler, Sarah Gibson, Patricia Herterich, Rosie Higman, … Kirstie Whitaker. (2019, March 25). The Turing Way: A Handbook for Reproducible Data Science (Version v0.0.4). Zenodo. http://doi.org/10.5281/zenodo.3233986

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
