
Measuring the success of the proposed pilot


I’m Will, a developer at the Consumer Financial Protection Bureau, where our Design and Development team works as open source by default. This comment represents my views, not necessarily those of the CFPB.

What metrics should be used to determine the impact and effectiveness of the pilot proposed in this draft policy, and of an open source policy more generally?

As my colleagues Adam, Cat, and Kimberly have already discussed, transparency, re-use, and collaboration are central to our understanding of open source at CFPB.

As important as re-use and collaboration are, however, they won't make for useful metrics to determine the effectiveness of the pilot proposal. It's often difficult to judge in advance when something will be re-used by others, and when collaboration will be possible. The rate of contributions, bug reports, etc. from other agencies and from the general public, for example, would reflect more on the nature of individual open source software projects than on the overall impact of the policy. Re-use and collaboration often happen organically, not because of any intent on the part of the original authors. The value gained from open source is in the expansion of the entire ecosystem that permits that spontaneity.

This leaves transparency. There are two meaningful metrics that can indicate an increase in the transparency of source code (and, by extension, of the algorithms that are increasingly involved in day-to-day governance):

  • Percent increase in code that is open sourced by the federal government during the pilot
  • Rate at which code is open sourced by the federal government

The first follows naturally from the "20 percent" requirement in the pilot (see other issues for discussions of the problems with this number), and would indicate the overall success or failure of the pilot against a baseline of what already exists.

The second would require more real-time reporting, but it would give a greater sense of buy-in to the purpose of the pilot versus simple compliance. If agencies are increasingly open sourcing code throughout the pilot period, then the policy may have succeeded in instilling the value of open source. If the bulk of the 20 percent of newly developed code is released on the last day of the pilot, the requirements will have been met, but the pilot will have failed in its overall goal.
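
As a rough illustration, both metrics can be computed from simple counts of newly open sourced projects. The sketch below uses entirely hypothetical numbers; a real report would draw on an actual inventory of federal repositories:

```python
# A minimal sketch of the first two proposed metrics.
# All numbers here are hypothetical.

baseline = 400                          # projects open sourced before the pilot
released_per_quarter = [5, 12, 18, 25]  # projects open sourced in each pilot quarter

total_released = sum(released_per_quarter)

# Metric 1: percent increase over the pre-pilot baseline.
percent_increase = 100 * total_released / baseline

# Metric 2: the rate of release over time. A rising trend suggests growing
# buy-in; a spike in the final quarter suggests last-minute compliance.
for quarter, count in enumerate(released_per_quarter, start=1):
    print(f"Q{quarter}: {count} projects open sourced")

print(f"Percent increase over baseline: {percent_increase:.1f}%")
```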

There is one other metric that our experience at CFPB suggests. Software decays if it isn't regularly cared for. For would-be adopters, collaborators, and contributors, it is important to convey the current state of a particular open source project. Automated testing and continuous integration are software development practices that can give early warnings about the durability of software. These practices can also indicate the durability of projects open sourced during the pilot program, and the program's success long after the pilot has ended:

  • Overall number of federal open source projects that pass regularly scheduled builds

Of course, the thoroughness of automated builds is difficult to enforce even within a single organization, but given a large enough sample, this would be an important indicator of the future potential of the code open sourced during the pilot.
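
Tallying this metric could be as simple as the sketch below, assuming the most recent scheduled build status for each project has already been collected from its CI system (the project names and statuses here are hypothetical):

```python
# A minimal sketch of the third metric, assuming build statuses have been
# gathered for each federally open sourced project. All data is hypothetical.

build_statuses = {
    "agency-a/project-one": "passing",
    "agency-a/project-two": "failing",
    "agency-b/data-pipeline": "passing",
    "agency-c/web-portal": "passing",
}

passing = sum(1 for status in build_statuses.values() if status == "passing")

print(f"{passing} of {len(build_statuses)} federal open source projects "
      "pass their regularly scheduled builds")
```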

In summary, I'm proposing the following metrics:

  • Percent increase in code that is open sourced by the federal government during the pilot
  • Rate at which code is open sourced by the federal government
  • Overall number of federal open source projects that pass regularly scheduled builds