Human Rights Considered - Data Science Backend

Human Rights First

Human Rights First is a 501(c)(3) international independent advocacy and action organization that challenges America to live up to its ideals. We believe American leadership is essential in the global struggle for human rights, so we press the U.S. government and private companies to respect human rights and the rule of law. When they fail, we step in to demand reform, accountability and justice. Around the world, we work where we can best harness American influence to secure core freedoms.

Human Rights Considered is a project that tracks incidents of police use of force on Americans for Human Rights First. Our initial goal was to develop a visualization that showcases instances of police use of force, along with a data science model that helps classify possible instances of brutality. We quickly realized that our highest-priority data science tasks, in addition to creating a model to assess use of force, were to source and process the relevant data, create a database, and host it behind an accessible API.

Disclaimer: This application is currently in Alpha (as of Sep 20, 2020) and is not ready for production. Please use at your own risk.

DS Contributors

Axel Corro	Michelle Hottinger	Miriam Ali

This project's front end repository can be found here.

Tech Stack

Python Packages

Pandas
Snorkel
GeoPy
NLTK
Scikit-learn
Psycopg2

DevOps

Docker
PostgreSQL
SQLAlchemy
AWS CloudWatch
AWS Lambda
AWS Elastic Beanstalk
FastAPI

Overview

Data

Currently we use data from Police Brutality 2020 (PB2020), which primarily sources its data from Reddit posts. The data as of August 2020 was used to train our model and seed our database. New incidents and evidence from PB2020 will also be added to the database via a cron job executed by AWS Lambda. One of our goals for future releases is to include more dynamic social media scraping, for example from Twitter.

Processing and Model

Incident data was cleaned, and location metadata was added to each incident with a geocoder. To build a model that predicts which type of force was deployed, we first created a training dataset using Snorkel, a recent approach to weakly supervised learning.

For more information on our data cleaning process, how we used Snorkel, and our model, see our machine learning readme.
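
The weak-supervision idea described above can be illustrated with a minimal sketch. It is written in plain Python and deliberately does not use Snorkel's API: several keyword-based labeling functions each vote for a force type or abstain, and a simple majority vote picks the training label. The keyword lists, function names, and the single-label majority vote are all assumptions made for illustration; Snorkel itself combines labeling functions with a learned label model rather than a plain vote, and the project's real labeling functions are documented in the machine learning readme.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def make_keyword_lf(keyword, label):
    """Build a labeling function that votes `label` if `keyword` appears."""
    def labeling_function(text):
        return label if keyword in text else ABSTAIN
    return labeling_function


# Hypothetical keyword rules; the tag names come from the README's tag list.
LABELING_FUNCTIONS = [
    make_keyword_lf("tear gas", "Chemical"),
    make_keyword_lf("pepper spray", "Chemical"),
    make_keyword_lf("rubber bullet", "Projectiles"),
    make_keyword_lf("baton", "Blunt Impact"),
]


def weak_label(description):
    """Return the majority label for a description, or ABSTAIN.

    Abstains when no labeling function fires or when the top vote is tied.
    """
    text = description.lower()
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [vote for vote in votes if vote is not ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes)
    top_label, top_count = counts.most_common(1)[0]
    tied = [label for label, count in counts.items() if count == top_count]
    return top_label if len(tied) == 1 else ABSTAIN
```

Descriptions that receive a label this way can seed a training set for a conventional classifier, while abstentions are simply left out.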

Database Schema

For more information, see our database readme.

API Endpoints

Endpoints

Route: `/incidents`

Method: `GET`

Description:

Read all incidents of police use of force. Each incident is identified by a unique id, e.g. `ca-sanfrancisco-1`.

Schema:

[
  {
    "id": "string",
    "place_id": 0,
    "descr": "string",
    "date": "string",
    "evidences": [
      {
        "incident_id": "string",
        "link": "string",
        "id": 0
      }
    ],
    "tags": [
      {
        "incident_id": "string",
        "tag": "string",
        "id": 0
      }
    ],
    "place": {
      "city": "string",
      "state_name": "string",
      "state_code": "string",
      "latitude": "string",
      "longitude": "string",
      "id": 0
    }
  }
]
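
As a rough illustration of consuming this response, the sketch below parses a JSON body shaped like the schema above into small dataclasses using only the standard library. The class names, the `parse_incidents` helper, and every value in `SAMPLE_BODY` (apart from the id format shown earlier) are placeholders chosen for the example, not real data or part of the project's code.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Place:
    id: int
    city: str
    state_name: str
    state_code: str
    latitude: str   # the schema lists coordinates as strings
    longitude: str


@dataclass
class Incident:
    id: str
    place_id: int
    descr: str
    date: str
    links: List[str]  # flattened from the "evidences" objects
    tags: List[str]   # flattened from the "tags" objects
    place: Place


def parse_incidents(body: str) -> List[Incident]:
    """Convert a JSON response body from /incidents into Incident objects."""
    incidents = []
    for raw in json.loads(body):
        incidents.append(Incident(
            id=raw["id"],
            place_id=raw["place_id"],
            descr=raw["descr"],
            date=raw["date"],
            links=[evidence["link"] for evidence in raw["evidences"]],
            tags=[tag["tag"] for tag in raw["tags"]],
            place=Place(**raw["place"]),
        ))
    return incidents


# Placeholder payload that follows the schema; the values are invented.
SAMPLE_BODY = """
[{"id": "ca-sanfrancisco-1", "place_id": 1, "descr": "placeholder description",
  "date": "2020-01-01",
  "evidences": [{"incident_id": "ca-sanfrancisco-1",
                 "link": "https://example.com/evidence", "id": 1}],
  "tags": [{"incident_id": "ca-sanfrancisco-1", "tag": "projectiles", "id": 1}],
  "place": {"city": "San Francisco", "state_name": "California",
            "state_code": "CA", "latitude": "0.0", "longitude": "0.0", "id": 1}}]
"""

incidents = parse_incidents(SAMPLE_BODY)
```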

Route: `/incidents/{tag}`

Method: `GET`

Description:

Read incidents by tag. For example: /incidents/projectiles

Sortable Tags:

Blunt Impact
Chemical
EHC Soft Technique
EHC Hard Technique
Projectiles

Schema:

See the `/incidents` endpoint above.
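
A small helper for building the request path from one of the sortable tags might look like the sketch below. The only form confirmed by this README is `/incidents/projectiles`; lowercasing every tag is inferred from that single example, and percent-encoding the spaces in multi-word tags is an assumption that should be checked against the running API. The function name is hypothetical.

```python
from urllib.parse import quote

# Tag names as listed under "Sortable Tags" above.
SORTABLE_TAGS = (
    "Blunt Impact",
    "Chemical",
    "EHC Soft Technique",
    "EHC Hard Technique",
    "Projectiles",
)


def incidents_by_tag_path(tag: str) -> str:
    """Return the path for GET /incidents/{tag}.

    Assumption: tags are lowercased and spaces are percent-encoded.
    """
    if tag not in SORTABLE_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    return "/incidents/" + quote(tag.lower(), safe="")
```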

Route: `/cron_update`

Method: `POST`

Description:

Endpoint for the cron job which updates the database with new incidents and evidence from PB2020.

Schema:

WIP

See the cron readme.
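
Until the schema is finalized, the following is only a sketch of how a scheduled AWS Lambda function could trigger this endpoint. It assumes the Lambda reaches the API over HTTP and sends an empty POST body; neither detail is confirmed here. `API_BASE_URL` is a placeholder, and the `send` parameter is a hypothetical hook added so the handler can be exercised without network access.

```python
import urllib.request

# Placeholder; replace with the deployed API's base URL.
API_BASE_URL = "https://api.example.com"


def build_cron_request(base_url: str = API_BASE_URL) -> urllib.request.Request:
    """Build an empty-bodied POST request to the /cron_update route."""
    url = base_url.rstrip("/") + "/cron_update"
    return urllib.request.Request(url=url, data=b"", method="POST")


def lambda_handler(event, context, send=urllib.request.urlopen):
    """Entry point in the (event, context) shape that AWS Lambda invokes.

    `send` defaults to urllib's opener and can be swapped out in tests.
    """
    request = build_cron_request()
    with send(request, timeout=30) as response:
        return {"statusCode": response.status}
```

Keeping the request construction separate from the handler makes it easy to check the method and URL without calling the real service.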

KAfable/Labs25-Human_Rights_First-TeamC-DS
