Boulder County RCV RLA Process

IMPORTANT DRAFT NOTICE
All content presented in this page and GitHub repository should be considered draft.
This content has not been reviewed or approved by authorized individuals at Boulder County.


Table of Contents

  1. Background, Theory, Contributors, and Approach Justification
  2. SHANGRLA Overview
  3. Risk-Limiting Audit Process Overview
  4. Tool and Repository Overview
  5. Detailed Process for Performing a Risk-Limiting Audit
    5.1. Prerequisites
    5.2. Activate the RLA Environment
    5.3. Create the RAIRE-Formatted CVR File
    5.4. Generate the Assertions to Test
    5.5. Generate the Manifest Card Count
    5.6. Generate the RLA Ballot Sample
    5.7. Review the Contest File Generated for the MVR Tool
    5.8. Run the Manual Vote Recorder Tool for Sample Comparison
    5.9. Complete the Audit
    5.10. Stop Container and Back Up the Data Generated from the RLA
  6. Testing Reproduction
  7. General Security Notes and Considerations
  8. Warranty

1. Background, Theory, Contributors, and Approach Justification

The Boulder County 2023 Ranked Choice Voting (RCV) Risk-Limiting Audit (RLA) process is based on published work and code from leading researchers in the fields of statistics and election auditing, including the authors of SHANGRLA and RAIRE (see the SHANGRLA Overview below).

The majority of the work product used to build this toolset is based on the efforts of these researchers. See also the following paper detailing the process used for the 2019 San Francisco District Attorney Instant Runoff Vote: https://arxiv.org/pdf/2004.00235.pdf

The County has received input and feedback from Dr. Blom and Dr. Teague during the County technology review and testing process. As a Colorado leader in using instant runoff voting (i.e., ranked choice voting), Boulder County needs a viable process to perform a risk-limiting audit in advance of broad RCV RLA tool availability in future elections, which is expected to be provided by the State.

Notably, the Colorado Secretary of State cites resources from Philip B. Stark's work on the Secretary of State’s Risk-Limiting Audit Resources page: https://www.sos.state.co.us/pubs/elections/VotingSystems/riskAuditResources.html. Additional resources provided by the Secretary of State, including Frequently Asked Questions (FAQs) can be found here: https://www.sos.state.co.us/pubs/elections/RLA/faqs.html.

Rule4 is a Boulder-based cybersecurity and infrastructure consulting company whose team members have partnered with the Boulder County Clerk & Recorder's Office for more than 15 years on various initiatives related to cybersecurity and the safe application of technology. Rule4 has provided support at the request of the Clerk & Recorder's office to operationalize the code necessary to use this collection of risk-limiting audit tools. Rule4 maintains the repository for this Boulder County RCV RLA process. Rule4's effort focused on aggregating those discrete tools into a manageable process, reducing the potential for human error when possible, and enabling the County to achieve relative independence when performing the RLA for RCV contests.

In summary, Boulder County has performed reasonable due diligence in ensuring that resources, both individuals and code, used to establish an interim process for performing risk-limiting audits for ranked choice voting are of high integrity, and consistent with the spirit and intent of risk-limiting audit objectives.

2. SHANGRLA Overview

The process summary from the SHANGRLA reference implementation used in 2019 for San Francisco is included here to provide context on terminology and the approach (with minor modifications based on URL accessibility and formatting).

Sets of Half-Average Nulls Generate Risk-Limiting Audits (SHANGRLA) by Michelle Blom, Andrew Conway, Philip B. Stark, Peter J. Stuckey and Vanessa Teague.

Risk-limiting audits (RLAs) offer a statistical guarantee: if a full manual tally of the paper ballots would show that the reported election outcome is wrong, an RLA has a known minimum chance of leading to a full manual tally. RLAs generally rely on random samples.

With SHANGRLA we introduce a very general method of auditing a variety of election types, by expressing an apparent election outcome as a series of assertions. Each assertion is of the form "the mean of a list of non-negative numbers is greater than 1/2." The lists of nonnegative numbers correspond to assorters, which assign a number to the selections made on each ballot (and to the cast vote record, for comparison audits).

Each assertion is tested using a sequential test of the null hypothesis that its complement holds. If all the null hypotheses are rejected, the election outcome is confirmed. If not, we proceed to a full manual recount.
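
To make the assertion and assorter terminology concrete, the following minimal Python sketch (illustrative only; it is not code from the SHANGRLA repository) shows a plurality-style assorter: each ballot scores 1 for the reported winner, 0 for the reported loser, and 1/2 otherwise, so the assertion "the winner beat the loser" holds exactly when the mean score exceeds 1/2.

    # Minimal illustration of an assorter (not the SHANGRLA implementation).
    # Each ballot scores 1 for the reported winner, 0 for the reported loser,
    # and 1/2 for anything else (other candidates, undervotes).
    def plurality_assorter(ballot: str, winner: str, loser: str) -> float:
        if ballot == winner:
            return 1.0
        if ballot == loser:
            return 0.0
        return 0.5

    # The assertion "winner beat loser" holds iff the mean assorter value
    # over all ballots exceeds 1/2.
    ballots = ["Alice", "Bob", "Alice", "Carol", "Alice", "Bob"]
    scores = [plurality_assorter(b, "Alice", "Bob") for b in ballots]
    print(sum(scores) / len(scores))  # 0.583... > 1/2, so Alice beat Bob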

SHANGRLA incorporates several different statistical risk-measurement algorithms and extends naturally to plurality and super-majority contests with various election types including Range and Approval voting and Borda count.

It can even incorporate Instant Runoff Voting (IRV) using the RAIRE assertion-generator (https://github.com/michelleblom/audit-irv-cp). This produces a set of assertions sufficient to prove that the announced winner truly won. Observed paper ballots can be entered using Dan King and Laurent Sandrolini's tool for the San Francisco Election board (https://github.com/dan-king/RLA-MVR).

We provide an open-source reference implementation and exemplar calculations in Jupyter notebooks.

3. Risk-Limiting Audit Process Overview

The general process for the risk-limiting audit is as follows, and assumes that contest, candidate, CVR, and supporting manifest information is available.

  1. The CVR and supporting manifests are exported from the Dominion Voting System.
  2. Exported files are transferred to an isolated workstation running the Boulder County RCV-RLA docker container, in a designated local folder that is mounted within the container when it is running.
  3. The CVR to RAIRE tool is used to convert a contest from CVR format to RAIRE format, and generate the JSON representing the specific contest of interest to be used by the Jupyter Notebook.
  4. The RAIRE format CVR from the contest being audited is processed using the IRV Audit tool to create the assertions JSON file.
  5. The Jupyter Notebook is used to process the input data, compute the sample size, and generate a sample from the CVR and an independently generated ballot manifest (generated independently of the Dominion system).
  6. The ballots identified in the sample are pulled (physical ballots).
  7. The MVR tool is used to record the perceived ballot impressions from manual review.
  8. The MVR tool generates a JSON file used for comparison with the Dominion CVR data.
  9. The measured risk based on the generated assertions and the CVR and MVR data is used in conjunction with defined error and risk-limit rates to determine whether the RLA is deemed satisfied, or whether recounts or expanded sampling are required. (A conceptual sketch of this comparison follows this list.)
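
As a conceptual illustration of Step 9 (illustrative only; the actual risk measurement is performed by the SHANGRLA notebook), the comparison between CVR and MVR data can be folded into another "mean greater than 1/2" assertion using the overstatement-assorter construction described in the SHANGRLA paper:

    # Illustrative only -- the SHANGRLA notebook performs the real risk measurement.
    # For an assorter A with upper bound u and reported assorter margin
    # v = 2 * (mean of A over the CVRs) - 1, the SHANGRLA paper defines the
    # overstatement error for a sampled ballot as omega = A(cvr) - A(mvr) and the
    # overstatement assorter B = (1 - omega/u) / (2 - v/u). The original assertion
    # holds if the mean of B over the paper ballots exceeds 1/2.
    def overstatement_assorter(a_cvr: float, a_mvr: float, u: float, v: float) -> float:
        omega = a_cvr - a_mvr              # overstatement error for this ballot
        return (1 - omega / u) / (2 - v / u)

    # Example with a plurality-style assorter (u = 1) and a reported margin v = 0.1:
    print(overstatement_assorter(1.0, 1.0, u=1.0, v=0.1))  # MVR agrees with CVR: ~0.53
    print(overstatement_assorter(1.0, 0.0, u=1.0, v=0.1))  # one-vote overstatement: 0.0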

4. Tool and Repository Overview

The following tools are used to perform the Boulder County RCV RLA process. They are aggregated into a single repository for ease of change control and management, and incorporate customizations to the operation of the tools, but not the underlying statistical methods or procedures. Each link is to the folder containing the respective tool in the main branch of the repository used by the County. The Readme at each folder level provides links to the source repositories from which this code was acquired. Test data generated by Boulder County to support independent review and RLA process performance has also been included in the repository.

  1. IRV Audit Assertion Generator
  2. Manual Vote Recorder
  3. SHANGRLA Notebook
  4. Test Data

5. Detailed Process for Performing a Risk-Limiting Audit

5.1. Prerequisites

  • This process requires that you have access to a service enabling you to run containers. Docker Desktop is suggested and is available here: https://www.docker.com/products/docker-desktop/
  • You should be comfortable running basic commands from a command line (following instructions)
  • These instructions are written primarily for Windows OS use. If you are using Linux or macOS, there will be subtle changes you might need to make for your specific platform.
  • You will need to designate a local folder to present data to the container. For example, create a folder under C:\ called rcv-data, and create a subfolder under c:\rcv-data called bccr. The subfolder is not required, but it is useful for storing data ready for processing (whereas the root rcv-data folder may be used to handle temporary data, or not used at all.)
  • You will need access to the following files. Place them in the c:\rcv-data\bccr folder on your local system.
    • Dominion Files (these are generated by the Dominion Voting System):
      • CountingGroupManifest.json
      • ContestManifest.json
      • CandidateManifest.json
      • CvrExport.json
    • Non-Dominion Files
      • A ballot manifest associated with the contest, also placed in the c:\rcv-data\bccr folder on your local system. This file should be in Excel (.xlsx) format, ideally named manifest.xlsx, and it should contain only the following columns (with case and spelling exactly as indicated):
        • Tray: Can be populated with 1 for each row, given the use of single-tray tabulators.
        • Tabulator Number: The scanning station. Must match the format used in the CVR (i.e. three-digit numeric identifiers)
        • Batch Number: The batch number
        • Total Ballots: Ballots in the associated batch number
        • VBMCart.Cart number: Storage location identifier
      • The manifest file should resemble the following when viewed:
        Tray | Tabulator Number | Batch Number | Total Ballots | VBMCart.Cart number
        1    | 103              | 1            | 150           | 3
        1    | 103              | 2            | 146           | 3
        1    | 105              | 4            | 150           | 2
        ...  | ...              | ...          | ...           | ...
  • If the manifest.xlsx file is to be created on this system, Excel will be required. The tool currently expects a .xlsx file as opposed to a .csv file for this input file.
  • The container image is built to support arm64 and amd64 architectures, and should work with modern Windows and macOS platforms.

5.2. Activate the RLA Environment

  • 5.2.1. From a command line (e.g. cmd.exe), pull the current Docker container. Unless there are known changes to the container, this only has to be performed once. This command pulls the container image that is tagged latest (i.e. the most recently updated image in the repository):

    docker pull public.ecr.aws/x3b0g6w0/rcv-rla:latest
    
  • 5.2.2. Start the container (adjusting the local path /c/rcv-data if necessary based on where the local folder was created). Note that this format is required for the docker volume mount (vs. native c:\ type paths):

    docker run -itd --name bc-rla -p 127.0.0.1:8888:8888 -p 127.0.0.1:8887:8887 -v /c/rcv-data:/rcv-data/ public.ecr.aws/x3b0g6w0/rcv-rla
    

    Command Explanation:

    • docker run -itd : Run the container interactively (-i) with a pseudo-TTY allocated (-t), and detach it (-d) so it runs in the background as a daemon
    • --name bc-rla : Set the name of the container to bc-rla
    • -p 127.0.0.1:8888:8888 : Accept connections only on the Docker host's local interface on port 8888, and map to port 8888 in the container (used for the Jupyter Notebook)
    • -p 127.0.0.1:8887:8887 : Accept connections only on the Docker host's local interface on port 8887, and map to port 8887 in the container (used for the MVR Tool node.js application and CVR-to-RAIRE conversion tool)
    • -v /c/rcv-data:/rcv-data/ : Mount the local folder /c/rcv-data (c:\rcv-data) to /rcv-data in the container.
    • public.ecr.aws/x3b0g6w0/rcv-rla : The container image to run (pulled in the previous step)
  • 5.2.3. Connect to the container's terminal/shell in preparation for using the irvaudit tool to create the assertions to test. If you named your container something other than bc-rla, change the name as appropriate in the following command:

    docker exec -it bc-rla /bin/bash
    
  • 5.2.4. Refresh the local copies of code in the container from the main repository in case the notebook or other files have been updated. Only do this once, when starting the RLA process; otherwise your notebook (and any work underway) may be overwritten.

    cd /opt/BoCo-RCV-RLA
    git stash push -m "<Enter a descriptive message, ideally including the date and time for future reference>"
    git pull
    

5.3. Create the RAIRE-Formatted CVR File

  • 5.3.1. Navigate to http://localhost:8887/html/ConvertCVRToRAIREwithJSON.html in a web browser.
  • 5.3.2. Load the four .json files from the Dominion Voting System that you placed in c:\rcv-data\bccr.
  • 5.3.3. Review the "Then choose options on how to deal with some issues" and "Next, choose the ballot types you want to audit" parameters and adjust if appropriate.
  • 5.3.4. Select the contest being audited by checking the appropriate checkbox.
  • 5.3.5. Copy the JSON text between the --------------- boundaries for the contest you are auditing. Place this in an empty notepad text file, or leave this browser tab open. You will insert this data into the notebook in a later step. Don't worry - if you lose this, you'll be able to regenerate it using Steps 5.3.1-5.3.4 above.
  • 5.3.6. Scroll to the bottom of the page, and click the Download RAIRE format link. Make a note of where this file is saved and move the file to c:\rcv-data\bccr -OR- choose to save it in c:\rcv-data\bccr if prompted. Name the file (or rename it if it automatically saves with an alternate name) as RAIRE.txt

5.4. Generate the Assertions to Test

  • 5.4.1. Navigate to the shell you opened in Step 5.2.3.
  • 5.4.2. Change to the bccr directory in the container shell, and run the irvaudit assertion generator to create the assertion file:
    cd /rcv-data/bccr
    irvaudit -rep_ballots RAIRE.txt -r 0.05 -agap 0.0 -alglog -simlog -json bc-assertions.json 2>&1 | tee /rcv-data/bccr/irvaudit_$(date +"%Y_%m_%d_%I_%M_%p").log
    
  • 5.4.3. Leave this shell open; you will use it later in the process to back up contest data.

5.5. Generate the Manifest Card Count

  • 5.5.1. Open the manifest.xlsx file that you should have placed in c:\rcv-data\bccr.
  • 5.5.2. Auto-sum all the populated cells in the fourth column, Total Ballots, excluding the header.
  • 5.5.3. Make a note of this value; it will be used in the following step. (An optional scripted cross-check is sketched below.)
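
This cross-check is not part of the official process; it assumes a Python environment with pandas and openpyxl is available (the container likely has these, but verify), and it simply re-verifies the manifest columns described in Section 5.1 and re-computes the Total Ballots sum:

    # Optional cross-check of the manifest card count (not part of the official
    # process). Assumes pandas and openpyxl are installed; adjust the path to
    # match your environment (e.g. c:\rcv-data\bccr\manifest.xlsx on Windows).
    import pandas as pd

    EXPECTED_COLUMNS = ["Tray", "Tabulator Number", "Batch Number",
                        "Total Ballots", "VBMCart.Cart number"]

    manifest = pd.read_excel("/rcv-data/bccr/manifest.xlsx")

    # Verify the column names match what the tools expect (case and spelling).
    missing = [c for c in EXPECTED_COLUMNS if c not in manifest.columns]
    if missing:
        raise SystemExit(f"Manifest is missing expected columns: {missing}")

    # This total should equal the Excel auto-sum from Step 5.5.2 and is the
    # MANIFEST_CARDS value entered in Step 5.6.3.
    print("Manifest card count:", int(manifest["Total Ballots"].sum()))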

5.6. Generate the RLA Ballot Sample

  • 5.6.1. Launch the Jupyter Notebook by visiting http://localhost:8888 in a web browser.
  • 5.6.2. Open the BC-RLA.ipynb file by double-clicking it in the left sidebar. The notebook is broken into cells, and each cell has a []: indicator to the left.
  • 5.6.3. Scroll to the second cell, following the header titled Boulder County RLA Setup Task #1: Populate required parameters. Adjust the following parameters:
    • SEED: This should be provided by the State.
    • MANIFEST_CARDS: Enter the value computed in Step 5.5.
  • 5.6.4. Review the other parameters in Cell #2 of the notebook, and adjust if appropriate (depending on your file and folder naming preferences if the defaults were not used).
  • 5.6.5. Navigate to Cell 3 under the header Boulder County RLA Setup Task #2: Paste contest JSON from CVR to RAIRE conversion tool.
  • 5.6.6. Copy the JSON contest data you acquired and stored in notepad in step 5.3.5 above (or revisit the ConvertCVRToRAIREwithJSON page to acquire it).
  • 5.6.7. Paste this JSON contest data and overwrite the block { 'PASTE OVER THIS BLOCK' } (including pasting over the curly braces).
  • 5.6.8. Navigate to the notebook section titled Read the audited sample data and click that cell.
  • 5.6.9. Select the Kernel menu from the Notebook frame in the browser, and select the Restart Kernel and Run up to Selected Cell... option.
  • 5.6.10. Monitor notebook execution. As each cell completes, the [ ]: box to its left will populate with a number indicating execution order, e.g. [7]:, to denote completion.
  • 5.6.11. Monitor execution until the cell beginning # write the sample completes under the "Draw the first sample" section of the notebook.
  • 5.6.12. If there are no errors, open the c:\rcv-data\bccr folder on your workstation. You should now have a file named sample_<date_time_in_UTC>.csv. Verify the presence of this file. If there are errors, you will need to review them, address them, and then restart from Step 5.6.9.
  • 5.6.13. Create a copy of this CSV file for later reference purposes if needed. It is suggested you navigate to the c:\rcv-data\bccr folder and copy the sample_<date_time_in_UTC>.csv file to a file named sample_<date_time_in_UTC>.csv.backup.
  • 5.6.14. In Step 17 under "Find initial sample size", there should be a sample_size= value. Make a note of the value of n; it will be used as a check in the MVR tool (and by the optional cross-check sketched after this list).
  • 5.6.15. Save the notebook and progress using the save button, and leave the window/tab for this notebook open. You will come back to this after completing the MVR process.
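
An optional sanity check of the sample file is sketched below; it is not part of the official process and assumes the sample CSV contains a header row followed by one row per sampled card (the file name shown is a hypothetical placeholder):

    # Optional sanity check (not part of the official process). Assumes the
    # sample CSV has a header row followed by one row per sampled card. Adjust
    # the file name to the sample_<date_time_in_UTC>.csv the notebook created
    # (the name below is a hypothetical placeholder).
    import csv

    SAMPLE_CSV = "/rcv-data/bccr/sample_2023-11-20-120000-UTC.csv"

    with open(SAMPLE_CSV, newline="") as f:
        row_count = sum(1 for _ in csv.reader(f))

    # This should match the sample_size (n) from Step 5.6.14 and the
    # "remaining list of n" shown by the MVR tool in Step 5.8.9.
    print("Cards in sample file:", row_count - 1)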

5.7. Review the Contest File Generated for the MVR Tool

  • 5.7.1. Following sample file creation, the MVR contest.json file will be created to structure the data in the format expected by the MVR tool. This occurs just prior to the header "Boulder County Distributed MVR Process" in the Notebook. Verify the output reads SUCCESS - Created MVR contest json file: /mvr_contest_data_<date_and_time>.json. This file will be loaded into the MVR tool.

5.8. Run the Manual Vote Recorder Tool for Sample Comparison

  • 5.8.1. This process requires that two reviewers participate: one to mark the ballot representations, and one to review and confirm the marks are representative of what is on the ballot. Gather the two reviewers before continuing this process.
  • 5.8.2. Navigate to http://localhost:8887/load-contest in a web browser.
  • 5.8.3. Click the Choose File button under Contest (JSON), and select the file you verified had been created in Step 5.7.1.
  • 5.8.4. Click the Choose File button under Ballots (CSV), and select the file that was created in Step 5.6.11.
  • 5.8.5. Click the Upload Contest Details button.
  • 5.8.6. Review the confirmation page, and recognize that if you create a new contest in the tool it will end any contest already being tested - i.e. this tool is intended to process one contest at a time!
  • 5.8.7. Click the Create Contest button.
  • 5.8.8. Verify you receive a "Success loading and creating contest!" message, and then click the Mark Ballot button.
  • 5.8.9. Locate the drop-down to the right in the green frame, prefixed by "Select Imprinted ID from remaining list of n". Compare this n value to the one recorded in Step 5.6.14. These should match. If not, further analysis will be required to determine why there is a discrepancy. Note that this value will start at the size of n, and decrease by one for each ballot that is marked in the tool.
  • 5.8.10. [Reviewer 1] Select an imprinted ID from the drop-down list, and request/retrieve the corresponding ballot.
  • 5.8.11. [Reviewer 1] Mark the ballot representation in the MVR tool based on your interpretation of the physical ballot, then click the "Submit for Verification" button.
  • 5.8.12. [Reviewer 2] Review and verify the selections presented based on Reviewer 1 input.
    • If there is consensus in marking, click the "Confirmed" button.
    • If there is disagreement, Reviewer 2 (in discussion with Reviewer 1) may elect to click the "Revise Selections" button. If consensus can be reached, the ballot should be re-marked and submitted. See the next bullet if consensus cannot be reached.
    • If consensus cannot be achieved, click the "No Consensus" button. Assuming this button was clicked intentionally, click the "Confirmed. No Consensus. Submit blank ballot." button.
  • 5.8.13. Reviewer 1 or 2 may click the "Continue" button after reviewing the comparison of the literal and processed ballots. This will return the tool to the Mark Ballot Imprinted ID selection page.
  • 5.8.14. Repeat Steps 5.8.10 through 5.8.13 until all ballots have been marked. After final ballot submission, you should see a screen that reads "Thank you. All ballots have been marked."
  • 5.8.15. Click the Export Contest link, or browse to http://localhost:8887/export-contest
  • 5.8.16. Use the buttons near the bottom of the page to download the mvr_output.json and mvr_ballots.csv files. Save these locally on the MVR station.
  • 5.8.17. Rename the files you just downloaded as follows, where x should be replaced with the current MVR station ID, and y should be replaced with the total number of MVR stations in use.
    • mvr_output.json should be renamed mvr_x_of_y.json
    • mvr_ballots.csv should be renamed mvr_ballots_x_of_y.csv
  • 5.8.18. Copy these files to sanitized USB media (per BCCR media sanitization processes), transfer them to the RLA RCV processing workstation, and then copy all files into the volume mount location (expected to be c:\rcv-data\bccr). Take this opportunity to sanity check file names and ensure there are no spelling mistakes, duplicates, or missing files. The critical files are the mvr_x_of_y.json files; an optional validation check is sketched after this list.
  • 5.8.19. Proceed with the Jupyter Notebook step-by-step process. Use of the MVR stations has concluded for now.
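
The following optional check (not part of the official process; it assumes Python 3 is available on the processing workstation) confirms that each renamed mvr_x_of_y.json file parses as valid JSON before the notebook consumes it:

    # Optional pre-processing check (not part of the official process): confirm
    # that each renamed MVR output file parses as valid JSON.
    import json
    from pathlib import Path

    MVR_DIR = Path(r"c:\rcv-data\bccr")  # adjust if the files were copied elsewhere

    for path in sorted(MVR_DIR.glob("mvr_*_of_*.json")):
        try:
            with path.open() as f:
                json.load(f)   # raises an error if the file is not valid JSON
            print(f"OK: {path.name}")
        except ValueError as err:
            print(f"Invalid JSON in {path.name}: {err}")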

5.9. Complete the Audit

  • 5.9.1. Return to the Jupyter Notebook tab/window in your browser (from Step 5.6.15), and navigate to the cell following the header "Read the audited sample data". This directly follows the cell that wrote the sample CSV file (Cell #22 in the notebook). The comment in the cell reads "# Read MVR data".
  • 5.9.2. Click the play button near the top of the notebook. Use this play button to progress through each successive cell (one cell at a time), up to and including the "Log the status of the audit" cell.
  • 5.9.3. Review the audit results in Cell #27. Specifically, review the contest audit status.
    • Audit COMPLETE indicates the audit completed and the measured risk for the contest is within the defined risk limit, confirming the reported outcome.
    • Audit INCOMPLETE indicates the measured risk still exceeds the risk limit; expanded sampling or a full hand count is required.
  • 5.9.4. Press the save button in the notebook to save the current state and output in the notebook, including the results.

5.10. Stop Container and Back Up the Data Generated from the RLA

  • 5.10.1 Return to the console you opened in Section 5.2. If that was closed, from a command prompt (cmd.exe) run the following command to connect to the container:
    docker exec -it bc-rla /bin/bash
    
  • 5.10.2 Run the following command sequence from the container shell. Replace the <contest> and <yyyy-mm-dd> placeholders (inclusive of the angle brackets) with relevant values in each command:
    cp /opt/BoCo-RCV-RLA/SHANGRLA/2023/BC-RLA.ipynb /rcv-data/bccr/BC-RLA_<contest>_<yyyy-mm-dd>.ipynb
    cd /rcv-data
    tar -cvf bc-rla_<contest>_<yyyy-mm-dd>.tar bccr
    
  • 5.10.3 Stop the docker container from a command prompt (opening a new one if necessary via cmd.exe):
    docker stop bc-rla
    
  • 5.10.4 On your primary workstation (not the container shell), navigate to c:\rcv-data.
  • 5.10.5 Rename the bccr folder to bccr_<contest>_<yyyy-mm-dd>, and then create a new empty bccr folder for the next RLA.
  • 5.10.6 Copy the backup file named bc-rla_<contest>_<yyyy-mm-dd>.tar to the approved RLA storage location.



6. Testing Reproduction

Test data has been provided in the root of the repository to support independent testing and validation of the process and tools used to perform the RLA.




7. General Security Notes and Considerations

  • Although access to the containerized resources is via HTTP as opposed to HTTPS, it is important to note that access is restricted to only the local system (i.e. only the local system can connect to exposed ports in the container). There is no network transmission of data to perform this process.
  • The MVR tool is an alpha version developed several years ago by Dan King. There are currently multiple vulnerable components used to support this node.js application. However, given the container access restrictions and the intentional lack of network access to this container, the risk of using known vulnerable components in the container was deemed acceptable.



8. Warranty

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

This permission notice shall be included in all copies or substantial portions of the Software.

The Software is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the Software.