Wiki to capture relevant information and developments in the field of Responsible AI

Primary language: Rich Text Format. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

Responsible AI repository

Repo to capture relevant information and developments in the field of Responsible AI

Responsible-AI-wiki

A repo for a curated, but incomplete, overview of what we're reading, listening to, using and creating related to Responsible AI.

Creating

Responsible AI webinar series

An in-depth webinar series about the responsible use and application of AI. In this series, invited speakers discuss a range of topics related to the ethical, accountable, and/or sustainable use of AI. Xomnia teams up with partners, experts, and guest speakers to cover a broad range of expertise and perspectives on the field. The series aims to help address issues identified at various levels within companies, from individual employees to data science and business teams to the boardroom.

The aim is to inform, educate, and support anyone interested in Responsible AI by offering an overview of cutting-edge tools, methods, best practices, and lessons learned, and to spread knowledge of how to apply AI more responsibly, starting now.

| Date | Topic | Description | Speakers | Link (incl. recap) |
| --- | --- | --- | --- | --- |
| 2020-11-18 | Responsible IT | An overview of what we actually mean by "Responsible AI" | Nanda Piersma | https://www.xomnia.com/event/responsible-it/ |
| 2020-12-10 | Sustainable & Climate AI | We will once again dive into the sustainability theme! With Lucia Loher, Product Lead at Jina AI, and Ricardo Vinuesa, who is an Associate Professor and Affiliated Researcher in the AI Sustainability Center at KTH Royal Institute of Technology. | Lucia Loher, Ricardo Vinuesa | https://www.xomnia.com/event/sustainable-climate-ai/ |
| 2021-01-21 | Meaningful human control of AI | About AI systems that play a role in tasks that clearly contain ethical challenges. | Marc Steen, Jurriaan van Diggelen | https://www.xomnia.com/event/ai-trough-meaningful-human-control/ |
| 2021-02-10 | Sustainable & Climate AI | An appeal for sustainable AI and how we can contribute | Angela Fan, Peter van de Putten, Jeroen van der Most | https://www.xomnia.com/event/sustainable-and-climate-ai/ |
| 2021-04-15 | Ethics in Data Projects | Ethics in the design of data products and intelligent technology | Chris Detweiler, Stan Guldemond | https://www.xomnia.com/event/ethics-in-data-projects/ |
| 2021-05-20 | AI risk assessment | AI risk assessment and management in high-risk applications | Alexander Boer, Mark Roboff | https://www.xomnia.com/event/ai-risk-assessment-and-management-in-high-risk-applications/ |
| 2021-06-10 | Fair and Explainable AI | Fair and explainable AI | Hilde Weerts, Jasper van der Waa | https://www.xomnia.com/event/fair-and-explainable-ai/ |
| 2021-07-08 | AI and systems regulations | An introduction to the proposal for an EU AI regulation and the challenges of AI optimization of new energy systems. | Joas van Ham, Pallas Agterberg | https://www.xomnia.com/event/responsible-ai-webinar-ai-and-systems-regulations/ |

Reading

Literature

Articles

Accountability

Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973-989. http://ananny.org/papers/anannyCrawford_seeingWithoutKnowing_2016.pdf

Bennett Moses, L., & Chan, J. (2018). Algorithmic prediction in policing: Assumptions, evaluation, and accountability. Policing and Society, 28(7), 806-822. https://www.tandfonline.com/doi/pdf/10.1080/10439463.2016.1253695

Chopra, A. K., & Singh, M. P. (2016, April). From social machines to social protocols: Software engineering foundations for sociotechnical systems. In Proceedings of the 25th International Conference on World Wide Web (pp. 903-914). https://eprints.lancs.ac.uk/id/eprint/78048/1/IOSE_akc_v123.pdf

Dignum, F., & Dignum, V. (2020). How to center AI on humans. In NeHuAI@ ECAI (pp. 59-62). http://ceur-ws.org/Vol-2659/dignum.pdf

Hayes, P., Van De Poel, I., & Steen, M. (2020). Algorithms and values in justice and security. AI & SOCIETY, 1-23. https://link.springer.com/content/pdf/10.1007/s00146-019-00932-9.pdf

Kroll, J. A. (2015). Accountable algorithms (Doctoral dissertation, Princeton University). http://scholarship.law.upenn.edu/cgi/viewcontent.cgi?article=9570&context=penn_law_review

Lepri, B., Oliver, N., Letouzé, E., Pentland, A., & Vinck, P. (2018). Fair, transparent, and accountable algorithmic decision-making processes. Philosophy & Technology, 31(4), 611-627. https://dspace.mit.edu/bitstream/handle/1721.1/122933/13347_2017_279_ReferencePDF.pdf?sequence=2

Lepri, B., Oliver, N., & Pentland, A. (2021). Ethical machines: the human-centric use of artificial intelligence. Iscience, 102249. https://www.sciencedirect.com/science/article/pii/S2589004221002170/pdf?md5=e9698938e71c25097265d5b3edd1a124&pid=1-s2.0-S2589004221002170-main.pdf

Sanders, C. B., Weston, C., & Schott, N. (2015). Police innovations, ‘secret squirrels’ and accountability: Empirically studying intelligence-led policing in Canada. British Journal of Criminology, 55(4), 711-729. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.818.331&rep=rep1&type=pdf

van de Poel, I. (2020). Embedding values in artificial intelligence (AI) systems. Minds and Machines, 30(3), 385-409. https://link.springer.com/article/10.1007/s11023-020-09537-4

Verdiesen, I., de Sio, F. S., & Dignum, V. (2021). Accountability and control over autonomous weapon systems: A framework for comprehensive human oversight. Minds and Machines, 31(1), 137-163. https://link.springer.com/article/10.1007/s11023-020-09532-9

Wachter, S., Mittelstadt, B., & Floridi, L. (2017). Transparent, explainable, and accountable AI for robotics. Science (Robotics), 2(6). https://philarchive.org/archive/WACTEA

Ethics

High-Level Expert Group on Artificial Intelligence (European Commission) - The Assessment List for Trustworthy Artificial Intelligence (ALTAI) for self assessment (2020) https://op.europa.eu/en/publication-detail/-/publication/73552fcd-f7c2-11ea-991b-01aa75ed71a1/language-en/format-PDF/source-search

High-Level Expert Group on Artificial Intelligence (European Commission) - Ethics Guidelines for Trustworthy AI (2019) https://op.europa.eu/en/publication-detail/-/publication/d3988569-0434-11ea-8c1f-01aa75ed71a1

Abel, D., MacGlashan, J., & Littman, M. L. (2016, March). Reinforcement learning as a framework for ethical decision making. In Workshops at the thirtieth AAAI conference on artificial intelligence. https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/viewPDFInterstitial/12582/12346

Anderson, J., & Kamphorst, B. (2014, March). Ethics of e-coaching: Implications of employing pervasive computing to promote healthy and sustainable lifestyles. In 2014 IEEE international conference on pervasive computing and communication workshops (PERCOM WORKSHOPS) (pp. 351-356). IEEE. https://www.academia.edu/download/50305277/SIPC_-_Anderson_-_Kamphorst_-_camera-ready-version.pdf

Anderson, M., & Anderson, S. L. (2007). Machine ethics: Creating an ethical intelligent agent. AI magazine, 28(4), 15-15. https://ojs.aaai.org/index.php/aimagazine/article/download/2065/2052/

Cowgill, B., Dell'Acqua, F., Deng, S., Hsu, D., Verma, N., & Chaintreau, A. (2020, July). Biased programmers? Or biased data? A field experiment in operationalizing AI ethics. In Proceedings of the 21st ACM Conference on Economics and Computation (pp. 679-681). https://arxiv.org/pdf/2012.02394

Utrecht Data School. De Ethische Data Assistent (DEDA). https://dataschool.nl/nieuws/challenging-citizenship-social-media-and-big-data/

Dignum, V. (2017). Responsible artificial intelligence: designing AI for human values. http://dspace.daffodilvarsity.edu.bd:8080/bitstream/handle/123456789/2181/itu2017-1.pdf?sequence=1&isAllowed=y

Dignum, V. (2021, June). The Myth of Complete AI-Fairness. In International Conference on Artificial Intelligence in Medicine (pp. 3-8). Springer, Cham. https://arxiv.org/pdf/2104.12544

Drosou, M., Jagadish, H. V., Pitoura, E., & Stoyanovich, J. (2017). Diversity in big data: A review. Big data, 5(2), 73-84. https://www.cs.drexel.edu/~julia/documents/big.2016.0054.pdf

Eitel-Porter, R. (2021). Beyond the promise: implementing ethical AI. AI and Ethics, 1(1), 73-80. https://link.springer.com/article/10.1007/s43681-020-00011-6

Fazelpour, S., & Lipton, Z. C. (2020, February). Algorithmic fairness from a non-ideal perspective. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 57-63). https://arxiv.org/pdf/2001.09773

Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., ... & Vayena, E. (2018). AI4People—an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds and Machines, 28(4), 689-707. https://link.springer.com/article/10.1007/s11023-018-9482-5

Franzke, A. S., Muis, I., & Schäfer, M. T. (2021). Data Ethics Decision Aid (DEDA): a dialogical framework for ethical inquiry of AI and data projects in the Netherlands. Ethics and Information Technology, 1-17. https://link.springer.com/article/10.1007/s10676-020-09577-5

Friedman, B., Harbers, M., Hendry, D. G., van den Hoven, J., Jonker, C., & Logler, N. (2021). Eight grand challenges for value sensitive design from the 2016 Lorentz workshop. Ethics and Information Technology, 23(1), 5-16. https://link.springer.com/article/10.1007/s10676-021-09586-y

Friedman, B., Harbers, M., Hendry, D. G., van den Hoven, J., Jonker, C., & Logler, N. (2021). Introduction to the special issue: value sensitive design: charting the next decade. Ethics and Information Technology, 23(1), 1-3. https://link.springer.com/article/10.1007/s10676-021-09585-z

Hagendorff, T. (2020). The ethics of AI ethics: An evaluation of guidelines. Minds and Machines, 30(1), 99-120. https://link.springer.com/article/10.1007/s11023-020-09517-8

Harbers, M., Peeters, M. M., & Neerincx, M. A. (2017). Perceived autonomy of robots: effects of appearance and context. In A World with Robots (pp. 19-33). Springer, Cham. http://mariekepeeters.com/wp-content/uploads/2013/11/Perc_auton_ICRE-16_camready.pdf

Harbers, M., de Greeff, J., Kruijff-Korbayová, I., Neerincx, M. A., & Hindriks, K. V. (2017). Exploring the ethical landscape of robot-assisted search and rescue. In A World with Robots (pp. 93-107). Springer, Cham. https://www.academia.edu/download/61488529/Exploring_the_Ethical_Landscape_of_Robot-Assisted_Search_and_Rescue20191211-45204-9fbxe9.pdf

Sosa Hidalgo, M. (2019). Design of an Ethical Toolkit for the Development of AI Applications. https://repository.tudelft.nl/islandora/object/uuid:b5679758-343d-4437-b202-86b3c5cef6aa/datastream/OBJ3/download

Köbis, N., Bonnefon, J. F., & Rahwan, I. (2021). Bad machines corrupt good morals. Nature Human Behaviour, 5(6), 679-685. http://publications.ut-capitole.fr/43547/1/wp_tse_1212.pdf

Lepri, B., Oliver, N., & Pentland, A. (2021). Ethical machines: the human-centric use of artificial intelligence. Iscience, 102249. https://www.sciencedirect.com/science/article/pii/S2589004221002170/pdf?md5=e9698938e71c25097265d5b3edd1a124&pid=1-s2.0-S2589004221002170-main.pdf

Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. Available at SSRN 3403301. https://arxiv.org/pdf/1906.05684

Madaio, M. A., Stark, L., Wortman Vaughan, J., & Wallach, H. (2020, April). Co-designing checklists to understand organizational challenges and opportunities around fairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14). https://dl.acm.org/doi/pdf/10.1145/3313831.3376445

Olteanu, A., Castillo, C., Diaz, F., & Kıcıman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2, 13. https://www.frontiersin.org/articles/10.3389/fdata.2019.00013/full

Peeters, M. M., van Diggelen, J., Van Den Bosch, K., Bronkhorst, A., Neerincx, M. A., Schraagen, J. M., & Raaijmakers, S. (2021). Hybrid collective intelligence in a human–AI society. AI & SOCIETY, 36(1), 217-238. https://www.karelvandenbosch.nl/documents/2020_Peeters_etal_AI&S_Hybrid_collective_intelligence_in_a_human%E2%80%93AI_society.pdf

Raji, I. D., Gebru, T., Mitchell, M., Buolamwini, J., Lee, J., & Denton, E. (2020, February). Saving face: Investigating the ethical concerns of facial recognition auditing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 145-151). https://dl.acm.org/doi/pdf/10.1145/3375627.3375820

Steuer, F. (2018). Machine Learning for Public Policy Making (Doctoral dissertation, School of Public Policy, Central European University). http://www.etd.ceu.edu/2018/steuer_fabian.pdf

van de Poel, I. (2020). Embedding values in artificial intelligence (AI) systems. Minds and Machines, 30(3), 385-409. https://link.springer.com/article/10.1007/s11023-020-09537-4

Van Wynsberghe, A., & Robbins, S. (2019). Critiquing the reasons for making artificial moral agents. Science and engineering ethics, 25(3), 719-735. https://link.springer.com/article/10.1007/s11948-018-0030-8

Zardiashvili, L., Bieger, J., Dechesne, F., & Dignum, V. (2019). AI Ethics for Law Enforcement: A Study into Requirements for Responsible Use of AI at the Dutch Police. Delphi, 2, 179. https://scholarlypublications.universiteitleiden.nl/access/item%3A2984332/view

Fairness

Agarwal, A., Beygelzimer, A., Dudík, M., Langford, J., & Wallach, H. (2018, July). A reductions approach to fair classification. In International Conference on Machine Learning (pp. 60-69). PMLR. http://proceedings.mlr.press/v80/agarwal18a/agarwal18a.pdf

Agarwal, A., Dudík, M., & Wu, Z. S. (2019, May). Fair regression: Quantitative definitions and reduction-based algorithms. In International Conference on Machine Learning (pp. 120-129). PMLR. http://proceedings.mlr.press/v97/agarwal19d/agarwal19d.pdf

Albarghouthi, A., D'Antoni, L., Drews, S., & Nori, A. V. (2017). Fairsquare: probabilistic verification of program fairness. Proceedings of the ACM on Programming Languages, 1(OOPSLA), 1-30. https://dl.acm.org/doi/pdf/10.1145/3133904

Barocas, S., & Selbst, A. D. (2016). Big data's disparate impact. Calif. L. Rev., 104, 671. http://www.datascienceassn.org/sites/default/files/Big%20Data%27s%20Disparate%20Impact.pdf

Baur, T., Mehlmann, G., Damian, I., Lingenfelser, F., Wagner, J., Lugrin, B., ... & Gebhard, P. (2015). Context-Aware Automated Analysis and Annotation of Social Human-Agent Interactions. ACM Transactions on Interactive Intelligent Systems (TiiS), 5(2), 1-33. https://www.researchgate.net/profile/Tobias-Baur-3/publication/279748956_Context-Aware_Automated_Analysis_and_Annotation_of_Social_Human-Agent_Interactions/links/56f3c58808ae95e8b6ccff71/Context-Aware-Automated-Analysis-and-Annotation-of-Social-Human-Agent-Interactions.pdf

Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in neural information processing systems, 29, 4349-4357. https://papers.nips.cc/paper/6228-estimating-the-size-of-a-large-network-and-its-communities-from-a-random-sample.pdf

Broesch, J., Barrett, H. C., & Henrich, J. (2014). Adaptive content biases in learning about animals across the life course. Human Nature, 25(2), 181-199. https://robobees.seas.harvard.edu/files/culture_cognition_coevol_lab/files/broesch_barrett_henrich_2014.pdf

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186. https://arxiv.org/pdf/1608.07187.pdf?ref=hackernoon.com

Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., & Elhadad, N. (2015, August). Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1721-1730). https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.700.1729&rep=rep1&type=pdf

Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2), 153-163. https://arxiv.org/pdf/1703.00056

Corbett-Davies, S., & Goel, S. (2018). The measure and mismeasure of fairness: A critical review of fair machine learning. arXiv preprint arXiv:1808.00023. https://arxiv.org/pdf/1808.00023.pdf

Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017, August). Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 797-806). https://arxiv.org/pdf/1701.08230.pdf?source=post_page

Cowgill, B., Dell'Acqua, F., Deng, S., Hsu, D., Verma, N., & Chaintreau, A. (2020, July). Biased programmers? Or biased data? A field experiment in operationalizing AI ethics. In Proceedings of the 21st ACM Conference on Economics and Computation (pp. 679-681). https://arxiv.org/pdf/2012.02394

Cronin, A. M., & Vickers, A. J. (2008). Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study. BMC Medical Research Methodology, 8(1), 1-9. https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-8-75

D'Amour, A., Srinivasan, H., Atwood, J., Baljekar, P., Sculley, D., & Halpern, Y. (2020, January). Fairness is not static: deeper understanding of long term fairness via simulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 525-534). https://dl.acm.org/doi/pdf/10.1145/3351095.3372878

Datta, A., Sen, S., & Zick, Y. (2016, May). Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In 2016 IEEE symposium on security and privacy (SP) (pp. 598-617). IEEE. https://www.frogheart.ca/wp-content/uploads/2016/06/CarnegieMellon_AlgorithmicTransparency.pdf

de Greeff, J., de Boer, M. H., Hillerström, F. H., Bomhof, F., Jorritsma, W., & Neerincx, M. A. (2021). The FATE System: FAir, Transparent and Explainable Decision Making. In AAAI Spring Symposium: Combining Machine Learning with Knowledge Engineering. http://ceur-ws.org/Vol-2846/paper35.pdf

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012, January). Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference (pp. 214-226). https://arxiv.org/pdf/1104.3913

Eitel-Porter, R. (2021). Beyond the promise: implementing ethical AI. AI and Ethics, 1(1), 73-80. https://link.springer.com/article/10.1007/s43681-020-00011-6

Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015, August). Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 259-268). https://arxiv.org/pdf/1412.3756.pdf

Friedler, S. A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E. P., & Roth, D. (2019, January). A comparative study of fairness-enhancing interventions in machine learning. In Proceedings of the conference on fairness, accountability, and transparency (pp. 329-338). https://arxiv.org/pdf/1802.04422

George, J. F., Duffy, K., & Ahuja, M. (2000). Countering the anchoring and adjustment bias with decision support systems. Decision Support Systems, 29(2), 195-206. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.3329&rep=rep1&type=pdf

Grgic-Hlaca, N., Zafar, M. B., Gummadi, K. P., & Weller, A. (2016, December). The case for process fairness in learning: Feature selection for fair decision making. In NIPS Symposium on Machine Learning and the Law (Vol. 1, p. 2). http://www.mlandthelaw.org/papers/grgic.pdf

Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. Advances in neural information processing systems, 29, 3315-3323. https://papers.nips.cc/paper/6374-interaction-screening-efficient-and-sample-optimal-learning-of-ising-models.pdf

Helwegen, R., Louizos, C., & Forré, P. (2020). Improving fair predictions using variational inference in causal models. arXiv preprint arXiv:2008.10880. https://arxiv.org/pdf/2008.10880

Holstein, K., Wortman Vaughan, J., Daumé III, H., Dudik, M., & Wallach, H. (2019, May). Improving fairness in machine learning systems: What do industry practitioners need?. In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1-16). https://arxiv.org/pdf/1812.05239

Hooker, S. (2021). Moving beyond “algorithmic bias is a data problem”. Patterns, 2(4), 100241. https://www.sciencedirect.com/science/article/pii/S2666389921000611

Hutchinson, B., & Mitchell, M. (2019, January). 50 years of test (un)fairness: Lessons for machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 49-58). https://arxiv.org/pdf/1811.10104

Jacobs, A. Z., & Wallach, H. (2021, March). Measurement and fairness. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 375-385). https://dl.acm.org/doi/pdf/10.1145/3442188.3445901

Kay, M., Matuszek, C., & Munson, S. A. (2015, April). Unequal representation and gender stereotypes in image search results for occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 3819-3828). https://mdsoar.org/bitstream/handle/11603/11254/KayMatuszekMunsonCHI2015GenderImageSearch.pdf?sequence=1

Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., & Schölkopf, B. (2017). Avoiding discrimination through causal reasoning. arXiv preprint arXiv:1706.02744. https://arxiv.org/pdf/1706.02744

Kleinberg, J., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807. https://arxiv.org/pdf/1609.05807

Kusner, M. J., Loftus, J. R., Russell, C., & Silva, R. (2017). Counterfactual fairness. arXiv preprint arXiv:1703.06856. https://arxiv.org/pdf/1703.06856

Lee, M. K. (2018). Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management. Big Data & Society, 5(1), 2053951718756684. https://journals.sagepub.com/doi/pdf/10.1177/2053951718756684

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), 1-35. https://arxiv.org/pdf/1908.09635

Olteanu, A., Castillo, C., Diaz, F., & Kıcıman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2, 13. https://www.frontiersin.org/articles/10.3389/fdata.2019.00013/full

Pessach, D., & Shmueli, E. (2020). Algorithmic fairness. arXiv preprint arXiv:2001.09784. https://arxiv.org/pdf/2001.09784

Raji, I. D., Gebru, T., Mitchell, M., Buolamwini, J., Lee, J., & Denton, E. (2020, February). Saving face: Investigating the ethical concerns of facial recognition auditing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 145-151). https://dl.acm.org/doi/pdf/10.1145/3375627.3375820

Selbst, A. D., Boyd, D., Friedler, S. A., Venkatasubramanian, S., & Vertesi, J. (2019, January). Fairness and abstraction in sociotechnical systems. In Proceedings of the conference on fairness, accountability, and transparency (pp. 59-68). https://dl.acm.org/doi/pdf/10.1145/3287560.3287598

Sweeney, L. (2013). Discrimination in online ad delivery. Communications of the ACM, 56(5), 44-54. https://arxiv.org/pdf/1301.6822

Torralba, A., & Efros, A. A. (2011, June). Unbiased look at dataset bias. In CVPR 2011 (pp. 1521-1528). IEEE. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.944.9518&rep=rep1&type=pdf

Verma, S., & Rubin, J. (2018, May). Fairness definitions explained. In 2018 IEEE/ACM International Workshop on Software Fairness (FairWare) (pp. 1-7). IEEE. https://fairware.cs.umass.edu/papers/Verma.pdf

Yang, K., Huang, B., Stoyanovich, J., & Schelter, S. (2020, January). Fairness-Aware Instrumentation of Preprocessing Pipelines for Machine Learning. In Workshop on Human-In-the-Loop Data Analytics (HILDA'20). https://par.nsf.gov/servlets/purl/10182459

Zemel, R., Wu, Y., Swersky, K., Pitassi, T., & Dwork, C. (2013, May). Learning fair representations. In International conference on machine learning (pp. 325-333). PMLR. http://proceedings.mlr.press/v28/zemel13.pdf

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457. https://arxiv.org/pdf/1707.09457

Interpretability, transparency, explainability

Adadi, A., & Berrada, M. (2018). Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE access, 6, 52138-52160. https://ieeexplore.ieee.org/iel7/6287639/6514899/08466590.pdf

Akata, Z., Balliet, D., De Rijke, M., Dignum, F., Dignum, V., Eiben, G., ... & Welling, M. (2020). A research agenda for hybrid intelligence: augmenting human intellect with collaborative, adaptive, responsible, and explainable artificial intelligence. Computer, 53(8), 18-28. https://vossen.info/wp-content/uploads/2020/08/akata-2020-research.pdf

Anderson, A., Dodge, J., Sadarangani, A., Juozapaitis, Z., Newman, E., Irvine, J., ... & Burnett, M. (2020). Mental models of mere mortals with explanations of reinforcement learning. ACM Transactions on Interactive Intelligent Systems (TiiS), 10(2), 1-37. https://dl.acm.org/doi/pdf/10.1145/3366485

Anjomshoae, S., Najjar, A., Calvaresi, D., & Främling, K. (2019). Explainable agents and robots: Results from a systematic literature review. In 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019), Montreal, Canada, May 13–17, 2019 (pp. 1078-1088). International Foundation for Autonomous Agents and Multiagent Systems. https://www.diva-portal.org/smash/get/diva2:1303810/FULLTEXT01.pdf

Barocas, S., Selbst, A. D., & Raghavan, M. (2020, January). The hidden assumptions behind counterfactual explanations and principal reasons. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 80-89). https://dl.acm.org/doi/pdf/10.1145/3351095.3372830

Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., ... & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82-115. https://arxiv.org/pdf/1910.10045

Broekens, J., Harbers, M., Hindriks, K., Van Den Bosch, K., Jonker, C., & Meyer, J. J. (2010, September). Do you get it? User-evaluated explainable BDI agents. In German Conference on Multiagent System Technologies (pp. 28-39). Springer, Berlin, Heidelberg. https://www.researchgate.net/profile/Maaike_Harbers/publication/221248771_Do_you_get_it_User-evaluated_explainable_BDI_agents/links/02e7e51a3698beeec1000000/Do-you-get-it-User-evaluated-explainable-BDI-agents.pdf

Byrne, R. M. (2019, August). Counterfactuals in Explainable Artificial Intelligence (XAI): Evidence from Human Reasoning. In IJCAI (pp. 6276-6282). https://www.ijcai.org/proceedings/2019/0876.pdf

Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. https://arxiv.org/pdf/1702.08608

Fox, M., Long, D., & Magazzeni, D. (2017). Explainable planning. arXiv preprint arXiv:1709.10256. https://arxiv.org/pdf/1709.10256

Gomez, O., Holter, S., Yuan, J., & Bertini, E. (2020, March). ViCE: visual counterfactual explanations for machine learning models. In Proceedings of the 25th International Conference on Intelligent User Interfaces (pp. 531-535). https://arxiv.org/pdf/2003.02428

Gregor, S., & Benbasat, I. (1999). Explanations from intelligent systems: Theoretical foundations and implications for practice. MIS quarterly, 497-530. https://dl.acm.org/doi/abs/10.2307/249487

Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A survey of methods for explaining black box models. ACM computing surveys (CSUR), 51(5), 1-42. https://dl.acm.org/doi/pdf/10.1145/3236009

Harbers, M., van den Bosch, K., & Meyer, J. J. (2010, August). Design and evaluation of explainable BDI agents. In 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (Vol. 2, pp. 125-132). IEEE. https://www.karelvandenbosch.nl/documents/2010_Harbers_etal_IAT_Design_and_Evaluation_of_Explainable_BDI_Agents.pdf

Hayes, B., & Shah, J. A. (2017, March). Improving robot controller transparency through autonomous policy explanation. In 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 303-312). IEEE. https://dspace.mit.edu/bitstream/handle/1721.1/116013/hri17.pdf?sequence=1&isAllowed=y

Holzinger, A., Langs, G., Denk, H., Zatloukal, K., & Müller, H. (2019). Causability and explainability of artificial intelligence in medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(4), e1312. https://wires.onlinelibrary.wiley.com/doi/pdf/10.1002/widm.1312

Janzing, D., Minorics, L., & Blöbaum, P. (2020, June). Feature relevance quantification in explainable AI: A causal problem. In International Conference on Artificial Intelligence and Statistics (pp. 2907-2916). PMLR. http://proceedings.mlr.press/v108/janzing20a/janzing20a.pdf

Kaptein, F., Broekens, J., Hindriks, K., & Neerincx, M. (2017, August). Personalised self-explanation by robots: The role of goals versus beliefs in robot-action explanation for children and adults. In 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) (pp. 676-682). IEEE. https://www.researchgate.net/profile/Frank-Kaptein/publication/321811208_Personalised_self-explanation_by_robots_The_role_of_goals_versus_beliefs_in_robot-action_explanation_for_children_and_adults/links/5a38de3f0f7e9b7c4870083e/Personalised-self-explanation-by-robots-The-role-of-goals-versus-beliefs-in-robot-action-explanation-for-children-and-adults.pdf

Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H., & Wortman Vaughan, J. (2020, April). Interpreting Interpretability: Understanding Data Scientists' Use of Interpretability Tools for Machine Learning. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14). http://www.jennwv.com/papers/interp-ds.pdf

Koh, P. W., & Liang, P. (2017, July). Understanding black-box predictions via influence functions. In International Conference on Machine Learning (pp. 1885-1894). PMLR. http://proceedings.mlr.press/v70/koh17a/koh17a.pdf

Kumar, I. E., Venkatasubramanian, S., Scheidegger, C., & Friedler, S. (2020, November). Problems with Shapley-value-based explanations as feature importance measures. In International Conference on Machine Learning (pp. 5491-5500). PMLR. http://proceedings.mlr.press/v119/kumar20e/kumar20e.pdf

Langley, P., Meadows, B., Sridharan, M., & Choi, D. (2017, February). Explainable agency for intelligent autonomous systems. In Twenty-Ninth IAAI Conference. https://www.aaai.org/ocs/index.php/IAAI/IAAI17/paper/viewPDFInterstitial/15046/13734

Lecue, F. (2020). On the role of knowledge graphs in explainable AI. Semantic Web, 11(1), 41-51. http://semantic-web-journal.org/system/files/swj2259.pdf

Letham, B., Rudin, C., McCormick, T. H., & Madigan, D. (2015). Interpretable classifiers using rules and bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3), 1350-1371. https://projecteuclid.org/journals/annals-of-applied-statistics/volume-9/issue-3/Interpretable-classifiers-using-rules-and-Bayesian-analysis--Building-a/10.1214/15-AOAS848.pdf

Lim, B. Y., Dey, A. K., & Avrahami, D. (2009, April). Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 2119-2128). https://www.academia.edu/download/6079173/lim_chi_09.pdf

Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., ... & Lee, S. I. (2020). From local explanations to global understanding with explainable AI for trees. Nature machine intelligence, 2(1), 56-67. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7326367/

Lyons, J. B., & Havig, P. R. (2014, June). Transparency in a human-machine context: approaches for fostering shared awareness/intent. In International conference on virtual, augmented and mixed reality (pp. 181-190). Springer, Cham. https://link.springer.com/content/pdf/10.1007/978-3-319-07458-0_18.pdf

Madumal, P., Miller, T., Sonenberg, L., & Vetere, F. (2020, April). Explainable reinforcement learning through a causal lens. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 03, pp. 2493-2500). https://ojs.aaai.org/index.php/AAAI/article/view/5631/5487

Miller, T. (2018). Contrastive explanation: A structural-model approach. arXiv preprint arXiv:1811.03163. https://arxiv.org/pdf/1811.03163

Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial intelligence, 267, 1-38. https://arxiv.org/pdf/1706.07269

Miller, T., Howe, P., & Sonenberg, L. (2017). Explainable AI: Beware of inmates running the asylum or: How I learnt to stop worrying and love the social and behavioural sciences. arXiv preprint arXiv:1712.00547. https://arxiv.org/pdf/1712.00547

Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., ... & Gebru, T. (2019, January). Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency (pp. 220-229). https://arxiv.org/pdf/1810.03993.pdf

Mittelstadt, B., Russell, C., & Wachter, S. (2019, January). Explaining explanations in AI. In Proceedings of the conference on fairness, accountability, and transparency (pp. 279-288). https://arxiv.org/pdf/1811.01439

Nori, H., Jenkins, S., Koch, P., & Caruana, R. (2019). Interpretml: A unified framework for machine learning interpretability. arXiv preprint arXiv:1909.09223. https://arxiv.org/pdf/1909.09223

Poyiadzi, R., Sokol, K., Santos-Rodriguez, R., De Bie, T., & Flach, P. (2020, February). FACE: Feasible and actionable counterfactual explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 344-350). https://arxiv.org/pdf/1909.09369

Preece, A. (2018). Asking ‘Why’ in AI: Explainability of intelligent systems–perspectives and challenges. Intelligent Systems in Accounting, Finance and Management, 25(2), 63-72. https://onlinelibrary.wiley.com/doi/am-pdf/10.1002/isaf.1422

Rathi, S. (2019). Generating counterfactual and contrastive explanations using SHAP. arXiv preprint arXiv:1906.09293. https://arxiv.org/pdf/1906.09293

Ribeiro, M. T., Singh, S., & Guestrin, C. (2018, April). Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1). https://ojs.aaai.org/index.php/AAAI/article/view/11491/11350

Roscher, R., Bohn, B., Duarte, M. F., & Garcke, J. (2020). Explainable machine learning for scientific insights and discoveries. Ieee Access, 8, 42200-42216. https://ieeexplore.ieee.org/iel7/6287639/8948470/09007737.pdf

Samek, W., & Müller, K. R. (2019). Towards explainable artificial intelligence. In Explainable AI: interpreting, explaining and visualizing deep learning (pp. 5-22). Springer, Cham. https://arxiv.org/pdf/1909.12072

Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision (pp. 618-626). http://openaccess.thecvf.com/content_ICCV_2017/papers/Selvaraju_Grad-CAM_Visual_Explanations_ICCV_2017_paper.pdf

Stepin, I., Alonso, J. M., Catala, A., & Pereira-Fariña, M. (2021). A survey of contrastive and counterfactual explanation generation methods for explainable artificial intelligence. IEEE Access, 9, 11974-12001. https://ieeexplore.ieee.org/iel7/6287639/9312710/09321372.pdf

van der Waa, J., van Diggelen, J., & Neerincx, M. (2018). The design and validation of an intuitive confidence measure. memory, 2, 1. https://www.researchgate.net/profile/Jasper_Waa/publication/323772259_The_design_and_validation_of_an_intuitive_confidence_measure/links/5aaa2ba0aca272d39cd64796/The-design-and-validation-of-an-intuitive-confidence-measure.pdf

Tintarev, N., & Masthoff, J. (2007, October). Effective explanations of recommendations: user-centered design. In Proceedings of the 2007 ACM conference on Recommender systems (pp. 153-156). https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.888.3437&rep=rep1&type=pdf

van der Waa, J., van Diggelen, J., Bosch, K. V. D., & Neerincx, M. (2018). Contrastive explanations for reinforcement learning in terms of expected consequences. arXiv preprint arXiv:1807.08706. https://arxiv.org/pdf/1807.08706

van der Waa, J., Robeer, M., van Diggelen, J., Brinkhuis, M., & Neerincx, M. (2018). Contrastive explanations with local foil trees. arXiv preprint arXiv:1806.07470. https://arxiv.org/pdf/1806.07470.pdf

van der Waa, J., Nieuwburg, E., Cremers, A., & Neerincx, M. (2021). Evaluating XAI: A comparison of rule-based and example-based explanations. Artificial Intelligence, 291, 103404. https://www.sciencedirect.com/science/article/pii/S0004370220301533

van der Waa, J., Schoonderwoerd, T., van Diggelen, J., & Neerincx, M. (2020). Interpretable confidence measures for decision support systems. International Journal of Human-Computer Studies, 144, 102493. https://www.sciencedirect.com/science/article/pii/S1071581920300951

Wachter, S., Mittelstadt, B., & Floridi, L. (2017). Transparent, explainable, and accountable AI for robotics. Science (Robotics), 2(6). https://philarchive.org/archive/WACTEA

Wachter, S., Mittelstadt, B., & Russell, C. (2017). Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., 31, 841. https://arxiv.org/pdf/1711.00399.pdf

Wang, N., Pynadath, D. V., & Hill, S. G. (2016, March). Trust calibration within a human-robot team: Comparing automatically generated explanations. In 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 109-116). IEEE. https://www.cs.utexas.edu/users/sniekum/classes/RLFD-F16/papers/Trust.pdf

Wang, D., Yang, Q., Abdul, A., & Lim, B. Y. (2019, May). Designing theory-driven user-centric explainable AI. In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1-15). http://www.brianlim.net/wordpress/wp-content/uploads/2019/01/chi2019-reasoned-xai-framework.pdf

Weitz, K., Schiller, D., Schlagowski, R., Huber, T., & André, E. (2019, July). "Do you trust me?" Increasing user-trust by integrating virtual agents in explainable AI interaction design. In Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents (pp. 7-9). https://opus.bibliothek.uni-augsburg.de/opus4/files/65191/X_Plane_IVA_2019_EA.pdf

Yang, X. J., Unhelkar, V. V., Li, K., & Shah, J. A. (2017, March). Evaluating effects of user experience and system transparency on trust in automation. In 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 408-416). IEEE. https://dspace.mit.edu/bitstream/handle/1721.1/116045/Yang_HRI_2017.pdf?sequence=1&isAllowed=y

Blogs

Using

Tooling

Accountable, robust, secure, reproducible, privacy-aware AI tooling

Also check out: https://github.com/EthicalML/awesome-production-machine-learning

Responsible ML (Azure)

https://azure.microsoft.com/en-us/services/machine-learning/responsibleml/#overview

TO EXPAND!

SageMaker Pipelines (Amazon)

https://aws.amazon.com/sagemaker/pipelines/

  • Purpose-built, easy-to-use continuous integration and continuous delivery (CI/CD) service for machine learning. Workflows can be shared and re-used between teams. Create, automate, and manage end-to-end ML workflows at scale.
  • Automate different steps of the ML workflow, including data loading, data transformation, training and tuning, and deployment. With SageMaker Pipelines, you can build dozens of ML models a week, manage massive volumes of data, thousands of training experiments, and hundreds of different model versions. You can share and re-use workflows to recreate or optimize models, helping you scale ML throughout your organization.
  • Create an audit trail of model components such as training data, platform configurations, model parameters, and learning gradients. Audit trails can be used to recreate models and help support compliance requirements.
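One way to picture such an audit trail is a record that fingerprints the training data and configuration, so that a later run can be checked against it. The sketch below uses only the Python standard library and is not the SageMaker Pipelines API; `fingerprint`, `audit_record` and the field names are hypothetical.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def audit_record(training_rows, config, parameters) -> dict:
    """Bundle fingerprints of the components needed to recreate a model."""
    return {
        "data_sha256": fingerprint(training_rows),
        "config_sha256": fingerprint(config),
        "parameters": parameters,
    }


rows = [[5.1, 3.5, 0], [6.2, 2.9, 1]]
config = {"instance": "example"}  # hypothetical platform configuration
params = {"learning_rate": 0.1}

record = audit_record(rows, config, params)
# Identical inputs give an identical record; a changed dataset changes the hash.
same = audit_record(rows, config, params)
changed = audit_record(rows + [[4.9, 3.0, 0]], config, params)
```

Comparing `record["data_sha256"]` with the hash of the data used in a later run is a cheap check that the model is being recreated from the same inputs.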

MLflow

https://mlflow.org/

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles four primary functions:

  • Tracking experiments to record and compare parameters and results (MLflow Tracking). The MLflow Tracking component is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code and for later visualizing the results. MLflow Tracking lets you log and query experiments using the Python, REST, R, and Java APIs.
  • Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects). An MLflow Project is a format for packaging data science code in a reusable and reproducible way, based primarily on conventions. In addition, the Projects component includes an API and command-line tools for running projects, making it possible to chain together projects into workflows.
  • Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models). An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools—for example, real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different “flavors” that can be understood by different downstream tools.
  • Providing a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations (MLflow Model Registry). The MLflow Model Registry component is a centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example from staging to production), and annotations.

MLflow is library-agnostic. You can use it with any machine learning library, and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API, R API, and Java API.

TensorBoard

https://www.tensorflow.org/tensorboard

TensorBoard: TensorFlow's visualization toolkit. TensorBoard provides the visualization and tooling needed for machine learning experimentation:

  • Tracking and visualizing metrics such as loss and accuracy
  • Visualizing the model graph (ops and layers)
  • Viewing histograms of weights, biases, or other tensors as they change over time
  • Projecting embeddings to a lower dimensional space
  • Displaying images, text, and audio data
  • Profiling TensorFlow programs

Adversarial robustness toolbox (IBM)

https://adversarial-robustness-toolbox.readthedocs.io/en/latest/

https://github.com/Trusted-AI/adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to defend and evaluate Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference. ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio, video, etc.) and machine learning tasks (classification, object detection, speech recognition, generation, certification, etc.).
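To make the Evasion threat concrete, here is a minimal sketch of a fast-gradient-sign style attack against a hand-written logistic regression model. It is pure Python and does not use ART; the weights, input and `eps` are made-up values chosen for illustration.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, eps):
    """Shift each feature by eps in the direction that increases the log-loss for label y."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, the gradient of the log-loss w.r.t. the input is (p - y) * w.
    grad = [(p - y) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


weights, bias = [2.0, -1.0], 0.0  # hypothetical trained model
x, y = [1.0, 0.5], 1              # an input that the model classifies correctly
x_adv = fgsm(weights, bias, x, y, eps=1.0)

p_clean = predict_proba(weights, bias, x)    # above 0.5: predicted class 1
p_adv = predict_proba(weights, bias, x_adv)  # below 0.5: prediction flipped
```

A defence is then evaluated by checking how often such perturbed inputs still change the prediction.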

Neptune.AI

https://neptune.ai/product

Metadata store for MLOps, built for research and production teams that run a lot of experiments. An ML metadata store is the part of the MLOps stack that manages model-building metadata. It makes it easy to log, store, display, organize, compare and query all metadata generated during the ML model lifecycle:

  • Experiment and model training metadata. You can log anything that happens during an ML run.
  • Artifact metadata. For datasets, predictions or models you can log things like the path to the dataset, the dataset hash, feature column names, and who created or modified it, with timestamps.
  • Model metadata. For trained models (production or not) you can log things such as the model binary or the location of the model asset, dataset version, links to recorded model training runs and experiments, model descriptions and so on.

You can use the metadata store to track your experiments, register your models, and more.

Weights and Biases (WANDB)

https://wandb.ai/

  1. Integrate quickly: Track, compare, and visualize ML experiments with 5 lines of code. Free for academic and open source projects.
  2. Visualize seamlessly: Add W&B's lightweight integration to your existing ML code and quickly get live metrics, terminal logs, and system stats streamed to the centralized dashboard.
  3. Collaborate in real time: Explain how your model works, show graphs of how model versions improved, discuss bugs, and demonstrate progress towards milestones.
  4. Track, compare, and visualize with 5 lines of code: Add a few lines to your script to start logging results. Our lightweight integration works with any Python script.
  5. Visualize everything you are doing: See model metrics stream live into interactive graphs and tables. It is easy to see how your latest model is performing compared to previous experiments, no matter where you are training your models.
  6. Quickly find and re-run previous model checkpoints. Save everything you need to reproduce models later— the latest git commit, hyperparameters, model weights, and even sample test predictions. You can save experiment files and datasets directly to W&B or store pointers to your own storage.
  7. Monitor your CPU and GPU usage. Visualize live metrics like GPU utilization to identify training bottlenecks and avoid wasting expensive resources.
  8. Debug performance in real time. See how your model is performing and identify problem areas during training. We support rich media including images, video, audio, and 3D objects.

Polyaxon

https://polyaxon.com/features/

Build, monitor, deliver and scale machine learning experiments. Polyaxon is a platform for reproducible and scalable machine learning and deep learning applications. It aims to take you from an idea to a fully deployable model without the infrastructure headaches, and bundles a suite of features and products for managing data science workflows.

ClearML

https://www.allegro.ai/

https://clear.ml/

Experiment, orchestrate, deploy, and build data stores, all in one place. Manage all your MLOps in a unified and robust platform providing collaborative experiment management, powerful orchestration, easy-to-build data stores, and one-click model deployment. Log, share, and version all experiments and instantly orchestrate pipelines.

  • built-in orchestration
  • artifact and model tracking
  • build, reuse and reproduce pipelines
  • automate and save time
  • track, log, compare, visualize, reproduce, collaborate and manage experiments.

Comet

https://www.comet.ml/site/data-scientists/

Track your datasets, code changes, experimentation history, and models. Comet provides insights and data to build better models, faster while improving productivity, collaboration and explainability.

  • compare experiments—code, hyperparameters, metrics, predictions, dependencies, system metrics, and more—to understand differences in model performance.
  • flexible experiments and visualization suite that allows you to record, transform, compare and visualize any artifact from your code, computer or environment.
  • View, analyze, and gain insights from your model predictions. Visualize samples with dedicated modules for vision, audio, text, and tabular data to detect over-fitting and easily identify issues with your dataset.
  • Register and keep track of all of your models, all from a single location.

Sacred

https://github.com/IDSIA/sacred

Sacred is a tool to help you configure, organize, log and reproduce experiments. It is designed to do all the tedious overhead work that you need to do around your actual experiment in order to:

  1. keep track of all the parameters of your experiment
  2. easily run your experiment for different settings
  3. save configurations for individual runs in a database
  4. reproduce your results

Sacred achieves this through the following main mechanisms:

  • ConfigScopes: a convenient way to use the local variables of a function to define the parameters your experiment uses.
  • Config Injection: You can access all parameters of your configuration from every function. They are automatically injected by name.
  • Command-line interface: You get a powerful command-line interface for each experiment that you can use to change parameters and run different variants.
  • Observers: Sacred provides Observers that log all kinds of information about your experiment, its dependencies, the configuration you used, the machine it is run on, and of course the result. These can be saved to a MongoDB, for easy access later.
  • Automatic seeding: helps control the randomness in your experiments, so that the results remain reproducible.
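The idea behind injecting parameters by name, combined with a fixed seed, can be sketched in a few lines of standard-library Python. This is an illustration of the concept only, not Sacred's implementation or API; the `inject` decorator and the config keys are hypothetical.

```python
import inspect
import random


def inject(config):
    """Decorator that fills any argument the caller omitted from `config`, matched by name."""
    def decorator(func):
        params = inspect.signature(func).parameters

        def wrapper(*args, **kwargs):
            positional = list(params)[: len(args)]  # names already bound positionally
            for name in params:
                if name not in positional and name not in kwargs and name in config:
                    kwargs[name] = config[name]
            return func(*args, **kwargs)

        return wrapper
    return decorator


config = {"learning_rate": 0.01, "epochs": 3, "seed": 42}


@inject(config)
def train(learning_rate, epochs, seed):
    rng = random.Random(seed)  # a seeded generator keeps runs reproducible
    noise = [rng.random() for _ in range(epochs)]
    return learning_rate, epochs, noise


first = train()             # all three arguments come from config
second = train()            # same seed, so the same "random" values
override = train(epochs=1)  # an explicit argument wins over the config
```

Because the seed is part of the configuration, saving the configuration of a run is enough to replay its randomness.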

CrypTen (Facebook)

https://github.com/facebookresearch/CrypTen

CrypTen is a framework for Privacy Preserving Machine Learning built on PyTorch. Its goal is to make secure computing techniques accessible to Machine Learning practitioners. It currently implements Secure Multiparty Computation as its secure computing backend and offers three main benefits to ML researchers:

  1. It is machine learning first. The framework presents the protocols via a CrypTensor object that looks and feels exactly like a PyTorch Tensor. This allows the user to use automatic differentiation and neural network modules akin to those in PyTorch.
  2. CrypTen is library-based. It implements a tensor library just as PyTorch does. This makes it easier for practitioners to debug, experiment on, and explore ML models.
  3. The framework is built with real-world challenges in mind. CrypTen does not scale back or oversimplify the implementation of the secure protocols.
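For readers new to Secure Multiparty Computation, additive secret sharing is one common building block: a value is split into random shares that reveal nothing individually, yet sums can be computed share by share. The sketch below is plain Python, not CrypTen code, and the text above does not say which protocols CrypTen uses internally; `MODULUS` and the function names are assumptions made for illustration.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo a fixed public prime


def share(value, n_parties=3):
    """Split an integer into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares; only possible when every party contributes."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no party sees a or b."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


a_shares = share(25)
b_shares = share(17)
total = reconstruct(add_shared(a_shares, b_shares))  # 42, computed without revealing 25 or 17
```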

PySyft

https://www.openmined.org/

OpenMined is an open-source community whose goal is to make the world more privacy-preserving by lowering the barrier-to-entry to private AI technologies. https://github.com/OpenMined/PySyft

Syft decouples private data from model training, using Federated Learning, Differential Privacy, and Encrypted Computation (like Multi-Party Computation (MPC) and Homomorphic Encryption (HE)) within the main Deep Learning frameworks like PyTorch and TensorFlow. Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place.

The Syft ecosystem seeks to change this system, allowing you to write software which can compute over information you do not own on machines you do not have (total) control over. This not only includes servers in the cloud, but also personal desktops, laptops, mobile phones, websites, and edge devices. Wherever your data wants to live in your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used privately for computation.

  • Federated learning: a type of remote execution wherein models are sent to remote data-holding machines (such as smart phones or IoT devices) for local training. This eliminates the need to store sensitive training data on a central server.
  • On-device prediction: a special case of remote execution wherein models are used within an application locally instead of moving a dataset to the cloud for classification.
  • Multi-party computation: When a model has multiple owners, multi-party computation allows for individuals to share control of a model without seeing its contents such that no sole owner can use or train it.
  • Homomorphic encryption: When a model has a single owner, homomorphic encryption allows an owner to encrypt their model so that untrusted 3rd parties can train or use the model without being able to steal it.
  • Differential Privacy: Eventually you must request the results of your remote (or encrypted) analysis to be revealed (i.e., statistical results, a trained model, prediction, or synthetic dataset). Differential Privacy helps us answer the question, “If I were to reveal this datapoint, what’s the maximum amount of private information I may leak?” and obfuscate the data appropriately. We extend PyTorch and TensorFlow with the ability to perform differential privacy automatically.
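As a rough illustration of the differential privacy idea in the last bullet, the sketch below releases a count with Laplace noise scaled to the privacy parameter `epsilon`. It is a textbook Laplace mechanism in standard-library Python, not PySyft's API; the dataset and the function names are made up.

```python
import random


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count (sensitivity 1) with Laplace(1 / epsilon) noise added."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # fixed seed so the example is repeatable
ages = [23, 35, 41, 29, 52, 38, 61, 19]

# Each release is noisy, but the noise averages out over many releases.
releases = [
    private_count(ages, lambda a: a >= 30, epsilon=1.0, rng=rng)
    for _ in range(2000)
]
mean_release = sum(releases) / len(releases)  # close to the true count of 5
```

A smaller `epsilon` means more noise per release and therefore a stronger privacy guarantee, at the cost of accuracy.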

Securiti

https://securiti.ai/

Identify any sensitive data across your organization in structured and unstructured systems. Automate data privacy, security & governance.

  • Collect and catalog all data assets
  • Discover hundreds of personal & sensitive data attributes such as name, address, credit card number, social security number, medical record number and many more in any structured & unstructured databases.
  • Detect special attributes that require disclosure under GDPR. For example: race, religion etc.
  • Detect personal data attributes specific to regions such as EU, Latin America and Asia Pacific. Ex: passport numbers, bank account numbers etc.
  • Automate privacy-by-design, DPIA, Article 30 reports, based on sensitive data intelligence.
  • Fulfill data subject rights automatically and maintain proof of compliance.
  • Collaborate and track all assessments in one place.
  • Use data aligned with cookie consent and universal consent.
  • Continuously monitor and remediate data asset security posture
  • Identify external and internal risks to data. Risk Engine for Scoring, Attribution & Visualization
  • Policy based protections and alerts for the distributed multicloud data

Privitar

https://www.privitar.com/

De-Identify data at massive scale. Safely drive data through your analytics and data science operations.

  • Data privacy platform
  • Centralized privacy policy management
  • Data masking, tokenization, generalization, perturbation, redaction, substitution and encryption
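Three of the techniques named above (masking, tokenization and generalization) can be sketched with the standard library. This is only a conceptual example, not Privitar's product or API; the key, the record and the function names are hypothetical.

```python
import hashlib
import hmac


def mask(value, visible=4):
    """Masking: hide all but the last `visible` characters."""
    return "*" * (len(value) - visible) + value[-visible:]


def tokenize(value, key):
    """Tokenization: replace a value with a keyed, consistent pseudonym."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def generalize_age(age, width=10):
    """Generalization: report an age band instead of the exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


key = b"example-key-not-for-production"  # illustrative secret only
row = {"name": "Alice Example", "card": "4111111111111111", "age": 37}

deidentified = {
    "name": tokenize(row["name"], key),  # same name always maps to the same token
    "card": mask(row["card"]),
    "age": generalize_age(row["age"]),
}
```

Because the token is keyed and deterministic, records belonging to the same person can still be joined after de-identification without exposing the name.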

Collibra

https://www.collibra.com/

  • Intuitive and contextual search, which uses modern technology to search across all locations, data sources and more, so users can find what they’re looking for in a matter of seconds. Built into the holistic platform, the contextual search provides visibility across all of Collibra’s products and includes filters, facets, and other methods of narrowing search criteria.
  • Access your organization’s trusted, governed data wherever your users happen to be working. Whether it’s through integrations with your existing tools and products or through native applications, you can ensure your users have access to the data they need, when they need it, in the easiest way possible.
  • Range of capabilities to manage all of your data stewardship needs. It is simple enough that every data citizen can find the data they need, evaluate its quality and use it effectively in their role. Collibra’s Data Stewardship enables your teams to join forces with subject matter experts and data owners across the organization through role-based dashboards and interactive views.
  • Data Catalog empowers business users to quickly discover and understand data that matters so they can generate impactful insights that drive business value
  • Data governance. Questions around data’s quality and relevance are increasingly difficult for modern enterprises to answer. Collibra Data Governance helps organizations understand their ever-growing amounts of data in a way that scales with growth and change, so that teams can trust and use their data to improve their business.
  • Data lineage. Data lineage reveals how data transforms through its life cycle across interactions with systems, applications, APIs and reports. Collibra Data Lineage automatically maps relationships between data to show how data flows from system to system and how data sets are built, aggregated, sourced and used, providing complete, end-to-end lineage visualization.
  • Data privacy. Collibra delivers privacy from a Data Intelligent foundation that centralizes, automates and guides privacy workflows. Privacy by design is embedded into a single platform, enabling teams across departments to collaborate and operationalize privacy. By awakening the value of data, Collibra accelerates an organization’s ability to address global regulatory requirements.
  • Data quality. Data teams are often constrained by manual rule writing and management, with limited data connectivity and a siloed view of data quality. With predictive, continuous and self-service data quality, organizations can centralize and automate data quality workflows to gain better control over their end-to-end data pipelines and streamline analytics processes across the enterprise.

Zama (beta)

https://zama.ai/

Open Source framework for securing AI applications in the cloud. Using homomorphic encryption, Zama enables any trained network, regardless of its architecture or training method, to run inference on encrypted user data.
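The core idea, computing on data that stays encrypted, can be illustrated with a toy additively homomorphic scheme. The sketch below uses a textbook Paillier construction with tiny primes to evaluate a single linear layer on encrypted inputs. It is an assumption-laden illustration: it is not Zama's library, it is not the scheme Zama uses, and keys this small are not secure.

```python
import math
import secrets

P, Q = 61, 53  # toy primes; real keys use primes that are thousands of bits long
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private key: lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # private key: inverse of LAM mod N


def encrypt(m):
    """Encrypt an integer 0 <= m < N using the public modulus N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt with the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def encrypted_dot(weights, ciphertexts):
    """Compute Enc(sum(w_i * x_i)) from plaintext weights and encrypted inputs."""
    acc = encrypt(0)
    for w, c in zip(weights, ciphertexts):
        # Multiplying ciphertexts adds plaintexts; raising to w multiplies the plaintext by w.
        acc = (acc * pow(c, w, N2)) % N2
    return acc


inputs = [3, 5, 2]   # user data, encrypted before it leaves the client
weights = [4, 1, 7]  # model weights, known to the server in the clear
encrypted_inputs = [encrypt(x) for x in inputs]
score = decrypt(encrypted_dot(weights, encrypted_inputs))  # 4*3 + 1*5 + 7*2 = 31
```

The server only ever handles `encrypted_inputs` and the encrypted result; only the holder of the private key can read the score.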

Differential privacy kit

https://medium.com/uber-security-privacy/differential-privacy-open-source-7892c82c42b6

Adversarial patch tricking models

https://www.youtube.com/watch?v=c_5EH3CBtD0 and https://securityintelligence.com/how-can-companies-defend-against-adversarial-machine-learning-attacks-in-the-age-of-ai/

Lifecycle analysis and governance

https://www.mimecast.com/blog/most-healthcare-data-breaches-now-caused-by-email/

The process and infrastructure to store the training data, accuracy, documentation, trained model, orchestration of the model, inference results and beyond.

Deeploy

https://www.deeploy.ml/

TO EXPAND

Explainability, interpretability, transparency

InterpretML (Azure)

https://interpret.ml/

https://github.com/interpretml/interpret

Open-source package that incorporates state-of-the-art machine learning interpretability techniques. With this package, you can train interpretable glassbox models and explain blackbox systems. InterpretML helps you understand your model's global behavior, or understand the reasons behind individual predictions.

  • Glass-box models are interpretable due to their structure. Examples include: Explainable Boosting Machines (EBM), Linear models, and decision trees. Glass-box models produce lossless explanations and are editable by domain experts.
  • Black-box models are challenging to understand, for example deep neural networks. Black-box explainers can analyze the relationship between input features and output predictions to interpret models. Examples include LIME and SHAP.
  • Supports global as well as local explanations, and things like subset prediction explanations as well as feature impact analysis.
  • Supported techniques:
    • Explainable Boosting glassbox model
    • Decision Tree glassbox model
    • Decision Rule List glassbox model
    • Linear/Logistic Regression glassbox model
    • SHAP Kernel Explainer blackbox explainer
    • LIME blackbox explainer
    • Morris Sensitivity Analysis blackbox explainer
    • Partial Dependence blackbox explainer
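To give a feel for what a blackbox explainer such as Partial Dependence computes, here is a minimal sketch: force one feature to each value on a grid, keep the other features as observed, and average the model's predictions. It is plain Python rather than the InterpretML API, and `black_box` is a made-up stand-in model.

```python
def partial_dependence(model, rows, feature, grid):
    """Average prediction over `rows` when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def black_box(row):
    # Stand-in model: linear in feature 0, plus an interaction of features 1 and 2.
    return 2.0 * row[0] + row[1] * row[2]


rows = [[1.0, 2.0, 3.0], [4.0, 0.0, 1.0], [0.5, 1.0, 1.0]]
curve = partial_dependence(black_box, rows, feature=0, grid=[0.0, 1.0, 2.0])
# Each step of 1.0 in feature 0 raises the averaged prediction by 2.0,
# which recovers the model's dependence on that feature without opening it.
```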

AI Explainability 360 (IBM)

https://aix360.mybluemix.net/

https://github.com/Trusted-AI/AIX360

Open-source library that supports interpretability and explainability of datasets and machine learning models. The AI Explainability 360 Python package includes a comprehensive set of algorithms that cover different dimensions of explanations along with proxy explainability metrics.

The algorithms in the toolkit are primarily intended for high-stakes applications of machine learning from data that support decision making with humans in the loop, either as the decision makers, the subjects of the decisions, or as regulators of the decision making processes. Other modes of AI such as knowledge graph induction or planning, and even other modes of machine learning such as reinforcement learning are not appropriate settings in which to use AIX360.

  • Data explanation (ProtoDash (Gurumoorthy et al., 2019), Disentangled Inferred Prior VAE (Kumar et al., 2018))
  • Local post-hoc explanation (ProtoDash (Gurumoorthy et al., 2019), Contrastive Explanations Method (Dhurandhar et al., 2018), Contrastive Explanations Method with Monotonic Attribute Functions (Luss et al., 2019), LIME (Ribeiro et al., 2016), SHAP (Lundberg et al., 2017))
  • Local direct explanation (Teaching AI to Explain its Decisions (Hind et al., 2019))
  • Global direct explanation (Boolean Decision Rules via Column Generation (Light Edition) (Dash et al., 2018), Generalized Linear Rule Models (Wei et al., 2019))
  • Global post-hoc explanation (ProfWeight (Dhurandhar et al., 2018))
  • Supported explainability metrics (Faithfulness (Alvarez-Melis and Jaakkola, 2018), Monotonicity (Luss et al., 2019))

What-If Tool (Google People and AI Research)

https://pair-code.github.io/what-if-tool/

https://github.com/pair-code/what-if-tool

Visually probe the behavior of trained machine learning models, with minimal coding. A key challenge in developing and deploying responsible Machine Learning (ML) systems is understanding their performance across a wide range of inputs. Using WIT, you can test performance in hypothetical situations, analyze the importance of different data features, and visualize model behavior across multiple models and subsets of input data, and for different ML fairness metrics. The What-If Tool can work with any Python-accessible model in Notebook environments, and will work with most models hosted by TF-serving in TensorBoard. The What-If Tool supports:

  • binary classification*
  • multi-class classification
  • regression tasks

* Fairness optimization strategies are available only with binary classification models due to the nature of the strategies themselves.

In the What-If Tool, counterfactuals are datapoints that are most similar to a selected datapoint, but are classified differently by a model:

  • For binary classification models, counterfactuals are the most similar datapoint to a selected datapoint that is predicted in the opposite class or label by a model.
  • For regression models, counterfactuals are calculated when the difference in prediction score between the selected datapoint and a candidate counterfactual is equal to or greater than the “counterfactual threshold”. The counterfactual threshold default is set to the standard deviation of the prediction values and can be adjusted by the user.
  • For multi-class models, the counterfactual is the most similar datapoint to a selected datapoint, but is classified as any class other than the selected datapoint’s class.
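The counterfactual definition for classification models can be sketched as a nearest-neighbour search restricted to points with a different predicted class. This is a plain-Python illustration, not the What-If Tool's code; the L1 distance, the toy classifier and the dataset are assumptions.

```python
def l1_distance(a, b):
    """Sum of absolute feature differences between two datapoints."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_counterfactual(selected, dataset, predict):
    """Return the closest datapoint whose predicted class differs from `selected`."""
    target = predict(selected)
    candidates = [p for p in dataset if predict(p) != target]
    if not candidates:
        return None
    return min(candidates, key=lambda p: l1_distance(selected, p))


def predict(point):
    # Hypothetical binary classifier: class 1 when income minus debt exceeds 10.
    return int(point[0] - point[1] > 10)


dataset = [[30, 5], [18, 9], [22, 10], [12, 1], [40, 35]]
selected = [20, 9]  # predicted class 1
counterfactual = nearest_counterfactual(selected, dataset, predict)
# [18, 9] is the most similar datapoint that the model puts in the other class.
```

Comparing `selected` with `counterfactual` feature by feature shows the smallest observed change that flips the prediction, here a drop in the first feature from 20 to 18.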

TensorWatch

https://github.com/microsoft/tensorwatch

TensorWatch is a debugging and visualization tool from Microsoft Research, designed for data science, deep learning and reinforcement learning. It works in Jupyter Notebook to show real-time visualizations of your machine learning training and to perform several other key analysis tasks for your models and data. TensorWatch is designed to be flexible and extensible, so you can also build your own custom visualizations, UIs, and dashboards. Besides the traditional "what-you-see-is-what-you-log" approach, it can also execute arbitrary queries against your live ML training process, return a stream as the result of the query, and view this stream using your choice of visualizer (this is called Lazy Logging Mode).

When you write to a TensorWatch stream, the values get serialized and sent to a TCP/IP socket as well as to the file you specified. From Jupyter Notebook, the previously logged values are loaded from the file, after which the notebook listens to that TCP/IP socket for any future values. The visualizer listens to the stream and renders the values as they arrive.

That is a simplified description. Almost everything in TensorWatch is a stream: files, sockets, consoles and even visualizers are streams themselves. TensorWatch streams can listen to any other streams, which allows TensorWatch to create a data flow graph. A visualizer can therefore listen to many streams simultaneously, each of which could be a file, a socket or some other stream, and you can recursively extend this to build arbitrary data flow graphs. TensorWatch decouples streams from how they get stored and how they get visualized.
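The "everything is a stream" idea can be illustrated with a small, library-free sketch. This is not TensorWatch's API: the classes `Stream`, `MemoryStore` and `TextVisualizer` are hypothetical names made up for this example, and a real setup would use files and sockets rather than in-memory lists.

```python
class Stream:
    """A minimal stream: forwards every value it receives to its listeners."""

    def __init__(self, name):
        self.name = name
        self._listeners = []

    def subscribe(self, listener):
        """Make `listener` (another stream) listen to this stream."""
        self._listeners.append(listener)
        return listener

    def write(self, value):
        self.on_value(value)            # handle the value locally
        for listener in self._listeners:
            listener.write(value)       # then push it downstream

    def on_value(self, value):
        """Hook for subclasses; the base stream only forwards."""


class MemoryStore(Stream):
    """Stands in for a file: keeps every value it sees."""

    def __init__(self, name):
        super().__init__(name)
        self.values = []

    def on_value(self, value):
        self.values.append(value)


class TextVisualizer(Stream):
    """Stands in for a visualizer: renders each value as a text line."""

    def __init__(self, name):
        super().__init__(name)
        self.rendered = []

    def on_value(self, value):
        self.rendered.append(f"{self.name}: {value}")


# Build a tiny data flow graph: one visualizer listens to two source streams,
# and storage is attached independently of visualization.
loss = Stream("loss")
accuracy = Stream("accuracy")
log = loss.subscribe(MemoryStore("log"))
view = TextVisualizer("view")
loss.subscribe(view)
accuracy.subscribe(view)

for step in range(3):
    loss.write((step, 1.0 / (step + 1)))
accuracy.write((0, 0.5))
```

Because storage and rendering are just more streams, either can be swapped or added without touching the code that produces the values.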

Rulex

https://www.rulex.ai/rulex-explainable-ai-xai/

Rulex's explainable AI produces transparent, easily understandable models. Using a series of if-then statements, Rulex automatically produces self-explanatory logic for all decisions. Rulex rulesets make it possible to explain a decision directly to the customer, or to give customer service agents the ability to look up the reason for a decision.

Why is eXplainable AI more transparent than a black box? According to Rulex, the problem with conventional AI is that it is unexplainable. Conventional AI relies on machine learning algorithms, such as neural networks, that have one key feature in common: they produce “black box” predictive models, that is, mathematical functions that people, even mathematicians, cannot readily interpret. Rulex’s core machine learning algorithm, the Logic Learning Machine (LLM), works differently. Rather than producing a mathematical function, it produces conditional logic rules that predict the best decision choice, in plain language that is immediately clear to process professionals, so every prediction is self-explanatory.

Unlike decision trees and other algorithms that produce rules, Rulex rules are stateless and overlapping: one rule can cover many cases, and many rules can cover a single case. This allows for fewer, simpler rules with broader coverage. Rulex calculates the coverage and accuracy of each rule, making it easy to select the most effective decision rules. Proven heuristic human rules can also be added to the predictive model, blending human and artificial intelligence. Human rules are rated for coverage and accuracy as well, which lets Rulex evaluate the quality of the decision rules in use and reduce false positives.
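To make "overlapping rules rated for coverage and accuracy" concrete, here is a minimal sketch. It does not use Rulex or the Logic Learning Machine; the `Rule` class, the toy data and the exact definitions of coverage and accuracy are assumptions chosen for illustration, and Rulex may define these quantities differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    description: str                      # human-readable if-then text
    condition: Callable[[Record], bool]   # the "if" part
    outcome: str                          # the "then" part


def rule_quality(rule: Rule, records: List[Record], labels: List[str]) -> Tuple[float, float]:
    """Return (coverage, accuracy) of one rule on labelled records.

    coverage: share of all records that satisfy the rule's condition
    accuracy: share of those matched records whose label equals the outcome
    """
    matched = [lab for rec, lab in zip(records, labels) if rule.condition(rec)]
    coverage = len(matched) / len(records)
    accuracy = sum(lab == rule.outcome for lab in matched) / len(matched) if matched else 0.0
    return coverage, accuracy


# Toy credit data (made up for this example).
records = [
    {"income": 20, "debt": 15},
    {"income": 60, "debt": 5},
    {"income": 35, "debt": 30},
    {"income": 80, "debt": 40},
]
labels = ["reject", "approve", "reject", "approve"]

# Two overlapping rules: the first and third records satisfy both.
rules = [
    Rule("IF income < 40 THEN reject", lambda r: r["income"] < 40, "reject"),
    Rule("IF debt > 10 THEN reject", lambda r: r["debt"] > 10, "reject"),
]

quality = {rule.description: rule_quality(rule, records, labels) for rule in rules}
```

On this data the income rule covers half the records and is always right, while the debt rule covers more records but is less accurate, which is the kind of trade-off the per-rule ratings are meant to expose.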

MindsDB

https://mindsdb.com/

https://github.com/mindsdb/mindsdb

MindsDB is an explainable AutoML framework for developers. It is an automated machine learning platform that aims to let anyone gain insights from their data, promising fast, accurate, and interpretable answers to data questions within minutes. It acts as a predictive layer for existing databases, enabling rapid prototyping and deployment of ML models directly from your database and reducing the time and cost of machine learning workflows.

Intuitive confidence measure

TO DO

Contrastive explanations

https://pythonrepo.com/repo/MarcelRobeer-ContrastiveExplanation-python-deep-learning-model-explanation

Contrastive Explanation provides an explanation for why an instance had the current outcome (fact) rather than a targeted outcome of interest (foil). These counterfactual explanations limit the explanation to the features relevant in distinguishing fact from foil, thereby disregarding irrelevant features. The idea of contrastive explanations is captured in the Python package ContrastiveExplanation. Example facts and foils are:

Machine Learning (ML) type | Problem | Explainable AI (XAI) question | Fact | Foil
Classification | Determine type of animal | Why is this instance a cat rather than a dog? | Cat | Dog
Regression analysis | Predict students' grade | Why is the predicted grade for this student 6.5 rather than higher? | 6.5 | More than 6.5
Clustering | Find similar flowers | Why is this flower in cluster 1 rather than cluster 4? | Cluster 1 | Cluster 4

Explainerdashboard

https://medium.com/value-stream-design/making-ml-transparent-and-explainable-with-explainerdashboard-49953ae743dd

https://github.com/oegedijk/explainerdashboard

https://www.youtube.com/watch?v=1nMlfrDvwc8

This package makes it convenient to quickly deploy a dashboard web app that explains the workings of a (scikit-learn compatible) machine learning model. The dashboard provides interactive plots on model performance, feature importances, feature contributions to individual predictions, "what if" analysis, partial dependence plots, SHAP (interaction) values, visualisation of individual decision trees, etc. You can also interactively explore components of the dashboard in a notebook/colab environment (or just launch a dashboard straight from there). Or design a dashboard with your own custom layout and explanations (thanks to the modular design of the library). And you can combine multiple dashboards into a single ExplainerHub. Dashboards can be exported to static html directly from a running dashboard, or programmatically as an artifact as part of an automated CI/CD deployment process.

Fairness, bias tooling

AI Fairness 360 (IBM)

http://aif360.mybluemix.net/

https://github.com/Trusted-AI/AIF360

Supported bias mitigation algorithms

  • Optimized Preprocessing (Calmon et al., 2017)
  • Disparate Impact Remover (Feldman et al., 2015)
  • Equalized Odds Postprocessing (Hardt et al., 2016)
  • Reweighing (Kamiran and Calders, 2012)
  • Reject Option Classification (Kamiran et al., 2012)
  • Prejudice Remover Regularizer (Kamishima et al., 2012)
  • Calibrated Equalized Odds Postprocessing (Pleiss et al., 2017)
  • Learning Fair Representations (Zemel et al., 2013)
  • Adversarial Debiasing (Zhang et al., 2018)
  • Meta-Algorithm for Fair Classification (Celis et al., 2018)
  • Rich Subgroup Fairness (Kearns, Neel, Roth, Wu, 2018)
  • Exponentiated Gradient Reduction (Agarwal et al., 2018)
  • Grid Search Reduction (Agarwal et al., 2018, Agarwal et al., 2019)

Supported fairness metrics

  • Comprehensive set of group fairness metrics derived from selection rates and error rates including rich subgroup fairness
  • Comprehensive set of sample distortion metrics
  • Generalized Entropy Index (Speicher et al., 2018)
  • Differential Fairness and Bias Amplification (Foulds et al., 2018)
  • Bias Scan with Multi-Dimensional Subset Scan (Zhang, Neill, 2017)
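As an illustration of one of the preprocessing algorithms listed above, the sketch below computes reweighing-style instance weights in plain Python. It follows the idea of Reweighing (Kamiran and Calders, 2012), where each (group, label) combination gets the weight P(group) · P(label) / P(group, label). It is not AIF360's implementation, and the function names and toy data are assumptions for this example.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) pair: expected share under independence
    divided by the observed share."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


def weighted_positive_rate(groups, labels, weights, group):
    """Share of positive labels in `group` after applying the weights."""
    total = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group)
    positive = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group and y == 1)
    return positive / total


# Toy data: group "a" receives the positive label far more often than group "b".
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")
```

Under-represented combinations (here a positive label in group "b") receive weights above 1, and after weighting both groups have the same positive rate. A model trained with these sample weights no longer sees group membership and label as correlated in the training data.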

Fairlearn

https://fairlearn.org/

https://github.com/fairlearn/fairlearn

Fairlearn is an open-source, community-driven project to help data scientists improve fairness of AI systems.

The Fairlearn toolkit can assist in assessing and mitigating unfairness in machine learning models. Fairness is a fundamentally sociotechnical challenge and cannot be solved with technical tools alone. Such tools may be helpful for certain tasks, such as assessing unfairness through various metrics or mitigating observed unfairness when training a model. Additionally, fairness has different definitions in different contexts, and it may not be possible to represent it quantitatively at all. The Fairlearn documentation notes that a quickstart cannot give a sufficient overview of fairness in ML and recommends starting with its User Guide.

Fairlearn provides:

  • A Python library for fairness assessment and improvement (fairness metrics, mitigation algorithms, plotting, etc.).
  • Metrics for assessing which groups are negatively impacted by a model, and for comparing multiple models in terms of various fairness and accuracy metrics.
  • Algorithms for mitigating unfairness in a variety of AI tasks and along a variety of fairness definitions.
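The kind of group-wise assessment described above can be sketched without any dependencies. This is a plain-Python illustration and not Fairlearn's API; the helper names and the toy data are assumptions. It computes accuracy and selection rate per group, plus the demographic parity difference, taken here as the gap between the largest and smallest group selection rate.

```python
def by_group(metric, y_true, y_pred, sensitive):
    """Evaluate `metric` separately for each value of the sensitive feature."""
    result = {}
    for group in sorted(set(sensitive)):
        idx = [i for i, s in enumerate(sensitive) if s == group]
        result[group] = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return result


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def selection_rate(y_true, y_pred):
    # Share of positive predictions; y_true is unused but keeps one signature.
    return sum(y_pred) / len(y_pred)


# Toy predictions for two groups (made up for this example).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
sensitive = ["f", "f", "f", "f", "m", "m", "m", "m"]

accuracy_by_group = by_group(accuracy, y_true, y_pred, sensitive)
selection_by_group = by_group(selection_rate, y_true, y_pred, sensitive)
demographic_parity_difference = max(selection_by_group.values()) - min(selection_by_group.values())
```

Looking at both tables side by side shows which group the model serves worse (lower accuracy) and which group is selected more often, two questions that a single overall score hides.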

What-If Tool (Google People and AI Research)

https://pair-code.github.io/what-if-tool/

https://github.com/pair-code/what-if-tool

Visually probe the behavior of trained machine learning models, with minimal coding. A key challenge in developing and deploying responsible Machine Learning (ML) systems is understanding their performance across a wide range of inputs. Using WIT, you can test performance in hypothetical situations, analyze the importance of different data features, and visualize model behavior across multiple models and subsets of input data, and for different ML fairness metrics. The What-If Tool can work with any Python-accessible model in notebook environments, and will work with most models hosted by TF-serving in TensorBoard. The What-If Tool supports:

  • binary classification*
  • multi-class classification
  • regression tasks

* Fairness optimization strategies are available only with binary classification models due to the nature of the strategies themselves.

In the What-If Tool, Counterfactuals are datapoints that are most similar to a selected datapoint, but are classified differently by a model.

  • For binary classification models, counterfactuals are the most similar datapoint to a selected datapoint that is predicted in the opposite class or label by a model.
  • For regression models, a candidate datapoint counts as a counterfactual when the difference in prediction score between the selected datapoint and the candidate is greater than or equal to the “counterfactual threshold”. The counterfactual threshold defaults to the standard deviation of the prediction values and can be adjusted by the user.
  • For multi-class models, the counterfactual is the most similar datapoint to a selected datapoint, but is classified as any class other than the selected datapoint’s class.
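The counterfactual rules above can be sketched as a nearest-neighbour search. This is an illustration of the described behaviour, not the What-If Tool's code: the function names are hypothetical, L1 distance is chosen here for simplicity, and using the population standard deviation as the default threshold is an assumption.

```python
import statistics


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_counterfactual(selected, points, predictions, is_regression=False, threshold=None):
    """Return the index of the closest datapoint that counts as a counterfactual
    for points[selected], or None if there is none."""
    if is_regression and threshold is None:
        # Assumed default: standard deviation of all prediction values.
        threshold = statistics.pstdev(predictions)

    best_index, best_distance = None, float("inf")
    for i, (point, pred) in enumerate(zip(points, predictions)):
        if i == selected:
            continue
        if is_regression:
            differs = abs(pred - predictions[selected]) >= threshold
        else:
            # Binary and multi-class: any class other than the selected one.
            differs = pred != predictions[selected]
        if not differs:
            continue
        distance = l1_distance(points[selected], point)
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index


# Toy datapoints with two features each (made up for this example).
points = [(1.0, 1.0), (1.2, 0.9), (3.0, 3.0), (1.5, 1.4)]
class_predictions = ["A", "A", "B", "B"]
score_predictions = [0.10, 0.15, 0.90, 0.30]

cf_class = nearest_counterfactual(0, points, class_predictions)
cf_score = nearest_counterfactual(0, points, score_predictions, is_regression=True)
```

For the classification case the closest point with a different class is index 3. In the regression case index 3 is still the closest point, but its score differs by less than the threshold, so the search falls through to index 2.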

SageMaker Clarify (Amazon)

https://aws.amazon.com/sagemaker/clarify/

Provides machine learning developers with greater visibility into their training data and models so they can identify and limit bias and explain predictions. Biases are imbalances in the training data or prediction behavior of the model across different groups, such as age or income bracket. Biases can result from the data or algorithm used to train your model. For instance, if an ML model is trained primarily on data from middle-aged individuals, it may be less accurate when making predictions involving younger and older people. The field of machine learning provides an opportunity to address biases by detecting them and measuring them in your data and model. You can also look at the importance of model inputs to explain why models make the predictions they do. Amazon SageMaker Clarify detects potential bias during data preparation, after model training, and in your deployed model by examining attributes you specify. For instance, you can check for bias related to age in your initial dataset or in your trained model and receive a detailed report that quantifies different types of possible bias. SageMaker Clarify also includes feature importance graphs that help you explain model predictions and produces reports which can be used to support internal presentations or to identify issues with your model that you can take steps to correct.

  • Identify imbalances in data
  • Check trained models for bias
  • Monitor model for bias
  • Understand model
  • Monitor model for changes in behavior
  • Explain individual model predictions
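The "identify imbalances in data" step can be illustrated with a simple class imbalance measure, (n_a - n_d) / (n_a + n_d), where n_d is the number of rows in the group of interest and n_a the number of remaining rows. The sketch below is plain Python rather than the SageMaker Clarify SDK. The age brackets, the data and the function names are assumptions for this example, which mirrors the middle-aged training data scenario described above.

```python
def class_imbalance(facet_values, disadvantaged):
    """(n_a - n_d) / (n_a + n_d): 0 means balanced, values near +1 mean the
    `disadvantaged` group is heavily under-represented."""
    n_d = sum(1 for value in facet_values if value == disadvantaged)
    n_a = len(facet_values) - n_d
    return (n_a - n_d) / (n_a + n_d)


def age_bracket(age):
    # Hypothetical bracketing chosen for this example.
    return "middle" if 35 <= age < 60 else "other"


# Toy training set that consists mostly of middle-aged individuals.
ages = [38, 41, 45, 52, 47, 55, 39, 23, 71, 44]
brackets = [age_bracket(age) for age in ages]

imbalance = class_imbalance(brackets, disadvantaged="other")
```

A value of 0.6 here signals that younger and older people are strongly under-represented, which is a prompt to collect more data or to check model accuracy for that group separately.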

ML Fairness gym (Google)

https://github.com/google/ml-fairness-gym

ML-fairness-gym is a set of components for building simple simulations that explore the potential long-run impacts of deploying machine learning-based decision systems in social environments. As the importance of machine learning fairness has become increasingly apparent, recent research has focused on potentially surprising long term behaviors of enforcing measures of fairness that were originally defined in a static setting. Key findings have shown that under specific assumptions in simplified dynamic simulations, long term effects may in fact counteract the desired goals. Achieving a deeper understanding of such long term effects is thus a critical direction for ML fairness research. ML-fairness-gym implements a generalized framework for studying and probing long term fairness effects in carefully constructed simulation scenarios where a learning agent interacts with an environment over time. This work fits into a larger push in the fair machine learning literature to design decision systems that induce fair outcomes in the long run, and to understand how these systems might differ from those designed to enforce fairness on a one-shot basis.

Using fairness-gym in your research

ML-fairness-gym brings reinforcement learning-style evaluations to fairness in machine learning research. Here is a suggested pattern for using the ML-fairness-gym as part of the research process. Others may be added here as we continue to grow.

Evaluating a proposed ML algorithm

Here are suggested steps when evaluating a proposed new fair ML algorithm:

  • Choose a simulation environment.
  • Decide on metrics that you would like to measure for that environment.
  • Choose baseline agents and choose what reward functions they will optimize.
  • Write an agent that uses your new algorithm.
  • Compare metrics between your baseline agents and your fair agent. Some utilities for building experiments are provided in run_util.py. For example, run_simulation is a simple function that runs an experiment and returns metric measurements.
  • Explore parameter settings in your simulation environment - are there different regimes?

We provide some implementations of environments, agents, and metrics, but they are by no means comprehensive. Feel free to implement your own and contribute to ML-fairness-gym!
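The evaluation steps above can be sketched as a toy, gym-style loop. Nothing here is ML-fairness-gym code: `LendingEnv`, `ThresholdAgent`, `approval_rates` and `run_toy_simulation` are hypothetical names, and the toy environment deliberately omits the feedback dynamics that the real environments are built to study.

```python
class LendingEnv:
    """Toy environment: each step presents one applicant as (group, score)."""

    def __init__(self, applicants):
        self.applicants = applicants
        self.t = 0

    def reset(self):
        self.t = 0
        return self.applicants[0]

    def step(self, approve):
        group, score = self.applicants[self.t]
        # Toy dynamics: an approved applicant repays iff score >= 0.5.
        repaid = approve and score >= 0.5
        reward = 1.0 if repaid else (-1.0 if approve else 0.0)
        self.t += 1
        done = self.t >= len(self.applicants)
        observation = None if done else self.applicants[self.t]
        return observation, reward, done, {"group": group, "approved": approve}


class ThresholdAgent:
    """Approves an applicant when the score reaches the group's threshold."""

    def __init__(self, thresholds):
        self.thresholds = thresholds

    def act(self, observation):
        group, score = observation
        return score >= self.thresholds[group]


def approval_rates(history):
    """Metric: share of approved applicants per group."""
    rates = {}
    for group in sorted({h["group"] for h in history}):
        decisions = [h["approved"] for h in history if h["group"] == group]
        rates[group] = sum(decisions) / len(decisions)
    return rates


def run_toy_simulation(env, agent, metric):
    """Run one episode and return (total reward, metric value)."""
    observation, done = env.reset(), False
    history, total_reward = [], 0.0
    while not done:
        action = agent.act(observation)
        observation, reward, done, info = env.step(action)
        history.append(info)
        total_reward += reward
    return total_reward, metric(history)


applicants = [("a", 0.8), ("a", 0.6), ("a", 0.4), ("b", 0.7), ("b", 0.45), ("b", 0.3)]
baseline = run_toy_simulation(LendingEnv(applicants), ThresholdAgent({"a": 0.5, "b": 0.5}), approval_rates)
candidate = run_toy_simulation(LendingEnv(applicants), ThresholdAgent({"a": 0.5, "b": 0.4}), approval_rates)
```

Comparing `baseline` and `candidate` shows the pattern the steps aim for: the candidate agent equalizes approval rates between groups but earns a lower total reward, a trade-off that a long-running simulation with feedback effects would examine further.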

Aequitas

https://dssg.github.io/aequitas/

Aequitas is an open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive tools. Aequitas will help you:

  • Understand where biases exist in your model(s)
  • Compare the level of bias between groups in your sample population (bias disparity)
  • Visualize absolute bias metrics and their related disparities for rapid comprehension and decision-making

Our goal is to support informed and equitable action for both machine learning practitioners and the decision-makers who rely on them.
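The notion of bias disparity between groups can be illustrated as the ratio of a group's metric to that of a reference group. The sketch below uses the false positive rate as the metric and plain Python rather than the Aequitas package; the function names and the toy audit data are assumptions for this example.

```python
def false_positive_rate(labels, scores):
    """Share of true negatives (label 0) that received a positive score (1)."""
    negative_scores = [s for l, s in zip(labels, scores) if l == 0]
    return sum(negative_scores) / len(negative_scores)


def fpr_disparities(groups, labels, scores, reference):
    """Return per-group FPR and each group's FPR relative to `reference`."""
    fpr = {}
    for group in sorted(set(groups)):
        rows = [(l, s) for g, l, s in zip(groups, labels, scores) if g == group]
        fpr[group] = false_positive_rate([l for l, _ in rows], [s for _, s in rows])
    disparity = {group: fpr[group] / fpr[reference] for group in fpr}
    return fpr, disparity


# Toy audit data: binary model decisions (scores) and true outcomes (labels).
groups = ["x", "x", "x", "x", "x", "y", "y", "y", "y"]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1]
scores = [1, 0, 0, 0, 1, 1, 0, 1, 0]

fpr, disparity = fpr_disparities(groups, labels, scores, reference="x")
```

A disparity of 2.0 for group "y" means its members are wrongly flagged twice as often as members of the reference group, which is the kind of figure an audit report would surface for decision-makers.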

Risk, value, and stakeholder modeling

These tools are more process- and project-related. They are potentially interesting for POs, ATs, PMs and the like.

Aanpak begeleidingsethiek (Guidance Ethics Approach)

https://ecp.nl/project/aanpak-begeleidingsethiek/

Sustainable AI

Jina AI

https://jina.ai/

https://github.com/jina-ai/jina

Jina allows you to build search-as-a-service powered by deep learning in just minutes.

All data types - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.

Fast & cloud-native - Distributed architecture from day one, scalable & cloud-native by design: enjoy containerizing, streaming, paralleling, sharding, async scheduling, HTTP/gRPC/WebSocket protocol.

Save time - The design pattern of neural search systems, from zero to a production-ready system in minutes.

Own your stack - Keep end-to-end stack ownership of your solution, avoid integration pitfalls you get with fragmented, multi-vendor, generic legacy tools.

Frameworks

Other

Podcasts