Sources for EngDraCor

epdracor-sources

This repository maintains the selection of documents from the EarlyPrint Project that serve as the sources of the EPDraCor corpus.

The EarlyPrint IDs of the selected documents are maintained in the ids.txt file.

To update the selection, either from a locally cloned Bitbucket repository or directly from Bitbucket, use the update script:

./scripts/update --help
./scripts/update --download
./scripts/update --all --copy /path/to/local/repos

How to add or remove plays

  1. Edit ids.txt to add or remove the EarlyPrint IDs of the texts in question.
  2. Run ./scripts/update --download to download new documents from the EarlyPrint Bitbucket repository and/or remove existing documents from the xml directory.
  3. Commit the changes.
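
Step 1 can also be scripted. The following is only a minimal sketch, not part of this repository: the helper names add_id and remove_id are hypothetical, and it assumes that ids.txt holds exactly one EarlyPrint ID per line and ends with a newline.

```shell
# Hypothetical helpers (not shipped with this repository).
# Assumption: ids.txt contains one EarlyPrint ID per line.
IDS_FILE="${IDS_FILE:-ids.txt}"

# Append an ID unless an identical line is already present.
add_id() {
  touch "$IDS_FILE"
  grep -qxF "$1" "$IDS_FILE" || printf '%s\n' "$1" >> "$IDS_FILE"
}

# Drop every line that matches the ID exactly.
remove_id() {
  touch "$IDS_FILE"
  # grep -v exits non-zero when no lines remain, so ignore its status.
  grep -vxF "$1" "$IDS_FILE" > "$IDS_FILE.tmp" || true
  mv "$IDS_FILE.tmp" "$IDS_FILE"
}

# Possible usage, followed by steps 2 and 3:
#   add_id A36645
#   ./scripts/update --download
#   git add ids.txt xml && git commit -m "Add A36645"
```

Matching whole lines with fixed strings (-x and -F) keeps an ID from accidentally matching a longer ID that merely contains it.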

eXist DB integration

For development purposes, this repository provides an eXist DB integration that makes it easy to upload the TEI files into a local eXist database, where they are available to any XQuery queries you want to run for analysis.

To set this up, copy the .env.sample file to .env and adjust the environment variables to your local eXist setup. (The defaults should work with a vanilla eXist DB installation on most systems.) Then run the init script to create and configure the database collection and upload the XQuery files:

cp .env.sample .env
# adjust .env
./scripts/init

Now you can upload either individual TEI files or the entire xml directory using the load script:

# load all files in xml/
./scripts/load
# load individual files
./scripts/load xml/A015*.xml
# usage
./scripts/load --help

Finally, an uploaded query can be executed with the query script:

./scripts/query plays.xq
./scripts/query 'speeches.xq?id=A36645'
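
To run the same query for every selected play, the query script can be wrapped in a loop. This is a rough sketch under assumptions: the function name speeches_for_all and the QUERY_CMD variable are hypothetical, ids.txt is assumed to list one ID per line, and speeches.xq is assumed to accept the id parameter in the form shown above.

```shell
# Hypothetical wrapper (not shipped with this repository).
# Assumption: the IDs file lists one EarlyPrint ID per line.
QUERY_CMD="${QUERY_CMD:-./scripts/query}"

speeches_for_all() {
  ids_file="${1:-ids.txt}"
  # Read IDs on file descriptor 3 so the query command cannot consume them
  # from stdin; the extra test keeps a last line without a trailing newline.
  while IFS= read -r id <&3 || [ -n "$id" ]; do
    [ -n "$id" ] || continue   # skip blank lines
    "$QUERY_CMD" "speeches.xq?id=$id"
  done 3< "$ids_file"
}

# Possible usage from the repository root:
#   speeches_for_all ids.txt
```

Setting QUERY_CMD=echo prints the query strings instead of executing them, which is a cheap way to check the loop before contacting the database.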

.existdb.json

To support the integration with editor plugins for Atom or Visual Studio Code, we also provide a template for an .existdb.json configuration file. The .existdb.json file is created when the init script is run with the -j option:

./scripts/init -j