Code of ethics


Data for Democracy is partnering with Bloomberg and BrightHive to develop a code of ethics for data scientists. This code will aim to define values and priorities for overall ethical behavior, in order to guide a data scientist in being a thoughtful, responsible agent of change. The code of ethics is being developed through a community-driven approach.

By hosting discussions among data scientists, we hope to better capture the diverse interests, needs, and concerns that are at play in the community, and put together a code that is truly created by data scientists, for data scientists.

Read more here.

Ethics Principles


  • It's my job to understand, mitigate and communicate the presence of bias in algorithms.
  • Be responsible for maximizing social benefit and minimizing harm.
  • Practice humility and openness.
  • I will know my data and help future users know it as well.
  • Make reasonable efforts to know and document the data's origins and transformations.
  • Bias will exist. Measure it. Plan for it.
  • Thou shalt document transparently, accessibly, responsibly, reproducibly, and communicate.
  • Engaging the whole community. Do you have all relevant individuals engaged?
  • People before data - data scientists should use a question-driven approach rather than a data-driven or methods-driven approach. Consider personal safety and treat others the way they want to be treated.
  • Exercise ethical imagination.
  • Open by default - use of data should be transparent and fair.
  • I will not over/under represent findings.
  • You are part of an ecosystem: understand context and provenance.
  • Respecting human dignity.
  • Respect their data even more than your own. Understand its sources and think about the consequences of your actions.
  • Protecting individual and institutional privacy.
  • Diversity for inclusivity.
  • Attention to bias.
  • Respect for others/persons.
  • Be intentional as you work to create value.

What has been done so far?


We conducted a preliminary scan of the Data for Democracy community by posting discussion questions on Slack and Twitter and collecting feedback and input from our 2,000-plus members. We then identified recurring themes that our community members highlighted as important, and arranged these in a systematic framework. This framework was built with the help of a list of resources addressing the topic.

After that, seven working groups of volunteers began discussing each topic area in depth. Each group meets once every two weeks, for 1 to 2 hours each time, to carry out this analysis. Finally, a selected group of advisors will review the notes and give feedback.

The aim was to have a draft version by February 6th, to be presented at the Data for Good Exchange (D4GX) held in San Francisco.

The topic areas that the volunteers are discussing are:

  • Data Ownership and Provenance
  • Bias and Mitigation
  • Responsible Communications
  • Privacy and Security
  • Transparency and Openness
  • Questions and Answers
  • Thought Diversity

If you want to keep up to date with what is happening in each meeting, visit this other repo for meeting materials.

Working Group Roles and Leadership Team


  • Lilian Huang - Research and Analysis - Working Groups Lead and coordinates volunteers through the research and literature reviews.
  • Natalie Evans Harris - Advisors - Initiative Lead and coordinates volunteers focused on guiding the development of the Code of Ethics, including advising on additional research areas and questions to ask of the community.
  • Erin Stein - Moderators/Principles - Coordinates volunteers in drafting the principles and moderating the working groups.
  • Danny Buerkli - Process - Ensures an open and transparent process that is sustainable in the long run. Our primary interactions will be through GitHub and project dashboards.

How can you contribute ideas?


In order for this process to be as transparent and open as possible, we are making use of GitHub to collect ideas and suggestions from the community as a whole. This provides a quick and easy way for you to do the following:

  1. Submit titles or links of literature/resources that you have found useful in thinking about ethics and data science. You can do so by forking the repo, making the change in your text editor, and submitting a pull request, or by editing the resources.md file directly on GitHub.
If you are not familiar with GitHub: .md stands for Markdown, a way to format writing on the web easily (cheatsheet link below). If you click on the link, it’ll take you to a page that only loads the resources.md file. On the right side, above the beginning of the document, there is a button with a pencil icon. This allows you to edit the file and add your links without having to clone the repo to your computer, use the command line, or use anything other than the content management system built into GitHub.
  2. Submit links including comments on what matters to you when creating a data science code of ethics. Open a pull request from your forked repo.
If you click the “Commit changes” button when using GitHub's content management system, you’ll get a notification from GitHub saying that you can’t directly contribute to the code, so it has forked the repository for you and has made your changes on your branch. That might sound a bit confusing, but it only means that we aren’t allowed to directly change the existing/live code or document without going through the process that verifies any proposed changes. In order for this to happen, everything is copied to your GitHub account and you make the changes on your GitHub account. If you want the changes to appear in the main project page/repo, you need to submit a pull request by following the steps provided.
  3. Browse all the suggestions, comments and links submitted by your fellow community members in the Discussions section. All results of these contributions can be viewed by the public, including you, here and in the master version of the resources.md file.

  4. Indicate which suggestions and comments you agree with by voting with 👍 or 👎 emojis in the Discussions section located here.

The setup of GitHub presents you with all suggestions. However, please keep in mind that none of these suggestions are mutually exclusive; we are not pitting ideas against each other or using the number of votes to eliminate suggestions. We are simply using this as one convenient metric to determine which ideas have the most resonance in the community, or which resources/literature have been useful to a large number of people. If all suggestions presented are important to you, feel free to 👍 all of them. If you have a more detailed response or would like to express your thoughts on someone else's idea, you can submit this comment in the Ethics Team's Discussion area or comment directly on their open pull request.

This document is written and edited using GitHub Flavored Markdown. It's not scary; it's very simple, and there is even a cheatsheet, which you can find here.
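If you have never written Markdown before, a link is just the title in square brackets followed by the URL in parentheses. The sketch below shows how a single resource entry could be put together. It assumes entries are bullet-list links with an optional short note, which may not match the actual layout of resources.md; the function name `format_resource_entry` and the example title and URL are hypothetical.

```python
def format_resource_entry(title: str, url: str, note: str = "") -> str:
    """Build one Markdown bullet that links to a resource.

    The "- [title](url): note" layout is an assumption for illustration,
    not the confirmed format of resources.md.
    """
    title = title.strip()
    url = url.strip()
    note = note.strip()
    if not title or not url:
        raise ValueError("both a title and a URL are required")
    entry = f"- [{title}]({url})"
    if note:
        # Append the optional comment on why the resource is useful.
        entry += f": {note}"
    return entry


if __name__ == "__main__":
    # Hypothetical resource, used only to show the resulting line.
    print(format_resource_entry(
        "An example article on data ethics",
        "https://example.org/data-ethics",
        "helped me think about provenance",
    ))
```

You can paste the resulting line into resources.md using the pencil-icon editor described above, or simply type the same pattern by hand.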

Shout out to Ashley Blewer's blog post for breaking down the pull request so succinctly.