Code and example data related to the Public Knowledge Project Private LOCKSS Network.
More info is available on this repo's wiki.
A sample database schema is provided in pkppln.sql. Load it into a MySQL database and edit config_test.cfg to point at the test database.
The automated tests require a running WSGI instance, which bottle.py makes easy to provide. Start a test server from a terminal window:
python server.py config_test.cfg
Then, in a second terminal window, run the unit tests with automatic discovery:
python -m unittest discover tests
The following is a minimal Apache configuration for running the server.py script as a WSGI application (the WSGI directives require mod_wsgi).
<VirtualHost 127.0.0.1>
    ServerName pkppln.dvh
    WSGIDaemonProcess pkppln.dvh processes=2 threads=15
    WSGIProcessGroup pkppln.dvh
    WSGIScriptAlias / /path/to/pkppln/server.py
</VirtualHost>
This will run server.py with configuration data from the config.cfg file in the same directory. You will need to update config.cfg with your actual configuration data.
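For context, mod_wsgi looks for a module-level callable named `application` in the script given to WSGIScriptAlias. The block below is a minimal stand-in that shows the shape of that callable using only the standard library. It is not the contents of server.py; the response body is a placeholder. In a bottle-based script the same name is usually bound to the bottle application object (for example via `bottle.default_app()`), but whether server.py does exactly that is an assumption.

```python
def application(environ, start_response):
    """Minimal WSGI callable of the kind mod_wsgi loads from the script file.

    Placeholder only: the real server.py handles SWORD deposits.
    """
    body = b"pkppln staging server placeholder\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```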
The server accepts SWORD deposits, each of which contains a link to a BagIt file along with some metadata. The staging server runs each deposit through a number of "microservices" that validate the data in different ways and prepare it for deposit to a LOCKSSOMatic instance. The services are:
- harvest: Download the deposit BagIt file.
- validate_payload: Check the file size and checksum of the BagIt file against the metadata in the SWORD deposit.
- validate_bag: Extract the contents of the BagIt file and validate it.
- virus_check: Check the content of the deposit with ClamAV's clamd.
- validate_export: Validate the OJS export XML.
- reserialize_bag: Add the results of validation and virus checking to the BagIt data, and serialize it into a new BagIt file.
- stage_bag: Move the new BagIt file to the staging location.
- deposit_to_pln: Create a SWORD deposit on a LOCKSSOMatic instance for the staged BagIt file.
- check_status: Check the status of the deposit on the LOCKSSOMatic instance.
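As an illustration of what the validate_payload step involves, here is a rough sketch of a size and checksum comparison. It is not the project's implementation: the function name `payload_matches`, its parameters, and the default SHA-1 algorithm are all assumptions, and the real service reads the expected values from the SWORD deposit metadata.

```python
import hashlib
import os


def payload_matches(path, expected_size, expected_checksum, algorithm="sha1"):
    """Return True if the file at `path` has the expected size and checksum.

    Hypothetical helper: names and the default algorithm are assumptions.
    """
    # Cheap check first: a size mismatch means the checksum cannot match.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large BagIt files are not loaded into memory.
        for chunk in iter(lambda: handle.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest().lower() == expected_checksum.lower()
```

Checking the size before hashing avoids reading a large file when the download is obviously truncated.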
Services are run via pln-service.py.
usage: pln-service.py [-h] [-v | -q] [-n | -f] [-d DEPOSIT] service
Run a staging service
positional arguments:
service Name of the service to run
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase output verbosity
-q, --quiet Silence most output
-n, --dry-run Do not update the deposit states
-f, --force Force updates to the deposit states.
-d DEPOSIT, --deposit DEPOSIT
Run the service on one or more deposits
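The usage line above implies two pairs of mutually exclusive flags (-v/-q and -n/-f). The following argparse sketch reproduces that interface for illustration. It is a reconstruction from the help text, not the actual pln-service.py source, and treating -d as repeatable (`action="append"`) is an assumption about how "one or more deposits" are supplied.

```python
import argparse


def build_parser():
    """Build a parser matching the pln-service.py usage shown above."""
    parser = argparse.ArgumentParser(
        prog="pln-service.py", description="Run a staging service")

    # -v and -q cannot be combined.
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-v", "--verbose", action="store_true",
                           help="Increase output verbosity")
    verbosity.add_argument("-q", "--quiet", action="store_true",
                           help="Silence most output")

    # -n and -f cannot be combined.
    update = parser.add_mutually_exclusive_group()
    update.add_argument("-n", "--dry-run", action="store_true",
                        help="Do not update the deposit states")
    update.add_argument("-f", "--force", action="store_true",
                        help="Force updates to the deposit states.")

    # Assumption: -d may be repeated to select several deposits.
    parser.add_argument("-d", "--deposit", action="append",
                        help="Run the service on one or more deposits")
    parser.add_argument("service", help="Name of the service to run")
    return parser
```

Going by the usage line, a dry run of the harvest service against a single deposit would look like `python pln-service.py -n -d DEPOSIT harvest`, where DEPOSIT stands for whatever deposit identifier the tool expects.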
There are a number of convenience commands for querying the list of deposits. They are:
- journal_history: Show all deposits for a journal.
- journal_info: Show metadata for a journal.
- list_commands: List the available commands.
- list_deposits: List all deposits.
- list_journals: List all the journals that have ever made a deposit.
- list_services: List all the services in the order they are applied to a deposit.
- process: Process one deposit through all the services in the appropriate order.
- reset_deposit: Reset a deposit to a processing stage.
- service_log: Show all service actions against a deposit.
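To make the idea behind the process command concrete, here is a small sketch of running one deposit through the services in sequence and stopping at the first failure. It assumes the services are applied in the order they are listed earlier in this document. The `handlers` mapping, the `start_after` parameter, and the boolean return convention are hypothetical and do not describe the project's actual code.

```python
# Assumption: this matches the order reported by list_services.
SERVICE_ORDER = [
    "harvest",
    "validate_payload",
    "validate_bag",
    "virus_check",
    "validate_export",
    "reserialize_bag",
    "stage_bag",
    "deposit_to_pln",
    "check_status",
]


def process(deposit, handlers, start_after=None):
    """Run `deposit` through each service in order.

    `handlers` maps a service name to a callable returning True on success.
    Processing stops at the first failure. Returns the services completed.
    """
    start = 0 if start_after is None else SERVICE_ORDER.index(start_after) + 1
    completed = []
    for name in SERVICE_ORDER[start:]:
        if not handlers[name](deposit):
            break
        completed.append(name)
    return completed
```

The optional `start_after` argument mimics resuming a deposit from a given stage, which is roughly what resetting a deposit to a processing stage would allow.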
usage: pln-command.py [-h] [-v | -q] command ...
Run a staging command
positional arguments:
command Name of the command to run
subargs Arugments to subcommand
optional arguments:
-h, --help show this help message and exit
-v, --verbose Increase output verbosity
-q, --quiet Silence most output
Use pln-command.py list_commands for a list of available commands
All commands accept -h/--help as an argument:
$ ./pln-command.py journal_info --help
usage: pln-command.py [global options] journal_info [command options]
Report all known journal metadata.
positional arguments:
uuid Journal UUID
optional arguments:
-h, --help show this help message and exit
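One common way to get per-command help like this is argparse subparsers. The sketch below models only the journal_info command and is an illustration of that pattern, not the actual pln-command.py source, which may dispatch its commands differently.

```python
import argparse


def build_command_parser():
    """Parser with global -v/-q options and one example subcommand."""
    parser = argparse.ArgumentParser(
        prog="pln-command.py", description="Run a staging command")
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-v", "--verbose", action="store_true",
                           help="Increase output verbosity")
    verbosity.add_argument("-q", "--quiet", action="store_true",
                           help="Silence most output")

    # Each subcommand gets its own parser, and therefore its own --help.
    commands = parser.add_subparsers(dest="command")
    journal_info = commands.add_parser(
        "journal_info", description="Report all known journal metadata.")
    journal_info.add_argument("uuid", help="Journal UUID")
    return parser
```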