/concordia

Crowdsourcing platform for full text transcription and tagging. https://crowd.loc.gov



Welcome to Concordia

Concordia is a platform developed by the Library of Congress (LOC) for crowdsourcing transcription and tagging of text in digitized images. The first iteration of Concordia was launched as crowd.loc.gov in the autumn of 2018.

The application asks volunteers to transcribe and tag digitized images of manuscripts and typed materials from the Library’s collections that cannot be read accurately by optical character recognition (OCR). All transcriptions are made and reviewed by volunteers. The completed transcriptions will be returned to loc.gov to improve search, readability, and access to handwritten and typed documents.

Concordia is a user-centered project built on the principles of trust and approachability. Read our full design principles here.

Concordia leverages the LOC’s API to pull materials from the Library's catalog. In future developments, completed transcriptions will be exported as a single document, in bulk by item, project or campaign, or as BagIt bags.
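
As a rough illustration of what pulling materials through the loc.gov API can look like, the sketch below builds a JSON request URL for a collection page. It is not Concordia's importer code. It assumes the public loc.gov JSON interface, where `fo=json` requests JSON output, `c` sets the number of results per page, and `sp` selects the page. The helper name `loc_json_url` and the collection URL are hypothetical choices made for this example.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def loc_json_url(page_url, per_page=25, page=1):
    """Rewrite a loc.gov page URL so that it requests JSON output.

    Hypothetical helper for illustration only; Concordia's own
    import code may work differently.
    """
    parts = urlsplit(page_url)
    # fo=json asks for JSON, c is results per page, sp is the page number.
    query = urlencode({"fo": "json", "c": per_page, "sp": page})
    # Drop any existing query string and fragment, keep the path.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# Illustrative collection URL; substitute the collection you want to import.
print(loc_json_url("https://www.loc.gov/collections/clara-barton-papers/"))
```

The resulting URL can be fetched with any HTTP client. Building the URL separately from fetching it keeps the example runnable without network access.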

Concordia and crowd.loc.gov are supported by the National Digital Library Trust Fund.

Want to help?

We are so excited that you want to jump right in. To get started:

  1. Check out our CONTRIBUTING page and see the different ways you can help out.
  2. Next, take a look at How we work. There you'll learn more about how we use GitHub and what we are looking for if you are contributing code.
  3. To learn how to set up Concordia on your computer, check out the For Developers page.