
PyData London 2019 Democracy Hackathon


Newspeak House    PyData London 2019

PyData 2019 Democracy Hackathon

Code and resources for the PyData 2019 Democracy Hackathon (Saturday 13:30-15:45 in the Mortimer Room). Hosted by Newspeak House Fellow John Sandall and Richard Chadwick.


Description

Data and technology can be powerful tools for understanding and improving the democratic process. These tools can be weaponised to produce unfair advantages, driving mistrust and disenfranchisement, but the data community can also do the opposite. This hackathon isn't "The Great Hack", but it will be a hack, it will be great, and it will be using data for good.

Why do some people exercise their right to vote whilst others stay at home? In an era of contentious discourse and political scandals, how can we restore democratic faith and trust in our elected representatives?

In this hackathon, hosted by Newspeak House, we present a series of challenges and datasets compiled by civic tech organisations working to upgrade democracy for the digital age. We will provide working examples, open-ended challenges and a Kaggle-style prediction competition, plus plenty of support if this is your first data hackathon!

Newspeak House is an independent residential college founded in 2015 to study, nurture and inspire emerging communities of practice across the UK public sector and civil society. Find out more @nwspk or come to one of our upcoming events in Shoreditch.

Themes & Prize Categories

  • Machine learning competition. There will be a Kaggle-style machine learning competition for predicting the turnout of UK general elections. SixFifty has been working hard to source and produce model-ready datasets for solving this problem. All that remains is for someone to solve it!
  • Voter engagement. For the hack most likely to get more people to turn out.
  • Open data for democracy. Help improve discoverability and accessibility of open datasets and streamline getting them from raw to model-ready by contributing to Maven. Maven aims to reduce the time data scientists spend on data cleaning and preparation by providing easy access to open datasets in both raw and processed formats.
  • Painless parsing of political PDFs. A huge amount of civic data is published as tables trapped in PDF prisons. Work towards liberating this information and set it free!
  • Fake news, misinformation & public sentiment. It's becoming harder to distinguish legitimate news from demonstrably false news, and with a few taps we can instantly share the stories we consume on our phones to our social networks. More news doesn't mean better news, and big tech companies are increasingly having to moderate and filter the content they host. How can we use the vast quantity of information at our fingertips to create tools or insights into improving the quality of the information we receive online?
  • Wildcard prize. The theme is democracy. The goal is a better world. You define how we get there. Should Parliament move to another city? What would be the perfect voting system? Perhaps we should go back to the wapentake or the Thing? Should Parliament delegate constitutionally contentious issues to a citizens' assembly? Should we replace all branches of Government with a Superintelligent AI?

Datasets

You don't have to use these, but they're a good start.

General

Turnout modelling competition

Voter engagement

  • Start with the README in voter_engagement.
  • To understand what's been tried before, take a look at the tools and projects that were developed for the 2017 General Election. In the GE2017 Tech Initiatives Handbook you'll find a collection of resources, datasets, volunteers, existing projects and proposed projects.

Open data for democracy

  • Start with Maven. How can we make commonly used datasets (e.g. those listed in Newspeak's Politics Datasets or under the "General" heading above) easier to discover, download and process?

PDF parsing

Fake news, misinformation & public sentiment

Wildcard prize

Schedule

  • 13:30 – Introduction to challenges & datasets.
  • 13:50 – Hacking time!
  • 15:25 – Presentations & wrap-up.
  • 15:45 – Event ends.

Code of Conduct

All attendees are expected to abide by the NumFOCUS Code of Conduct. Please take this opportunity to review it.

Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes and language are not appropriate for PyData. All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery is not appropriate. PyData is dedicated to providing a harassment-free event experience for everyone, regardless of gender, sexual orientation, gender identity and expression, disability, physical appearance, body size, race, or religion. We do not tolerate harassment of participants in any form. Thank you for helping make this a welcoming, friendly community for all.

The full Code of Conduct and additional information can be found here.

If you wish to submit a Code of Conduct report click here.