BagIt

The bagit-python library can be found here: https://github.com/LibraryOfCongress/bagit-python

I have taken bagit.py out of it and included it here in this project. To see an example of using bagit-python, I have created a couple of scripts you can run. In a terminal run each of the following commands.

# Create a folder called bag-in-place. We will bag it in the next step.
./prepare-test.sh

# Take a look inside the bag-in-place folder now. See what is inside.

# This script runs bagit-python on the bag-in-place folder and zips it.
./bag.sh

Take a look inside the bag-in-place folder to see what bagit-python did.

Have a look inside bag.sh to see details of what it does.

Validate the bag

The following runs a python script in this folder called validate-bag.py.

python validate-bag.py

If the bag-in-place folder is there and unmodified after bagging, this should echo a message to say whether it is ok. Try modifying the README.adoc inside bag-in-place/data. Run the validate script again. It should tell you that the bag is not valid.

Have a look inside validate-bag.py to see what it does. It’s a very simple Python script that uses bagit-python to validate a bag.

BagIt Profiles

With bagit-python, the profile isn’t used to generate the bag-info.txt. However, you can define one through the arguments to it. For example:

--bagit-profile-identifier 'https://cdn.jsdelivr.net/gh/ResearchObject/bagit-ro@0.2.20160422/profile.json'

You can take a look at that JSON file in a web browser. The idea with profiles is that you can declare what profile the bag meets. That URL gets added to bagit.txt. As part of a workflow, you can then validate whether the bag meets the profile.

There is a Python BagIt profiles validator here: https://github.com/bagit-profiles/bagit-profiles-validator

Usage of bagit.py

Using bagit.py on the command line, you get a variety of options, to define the metadata which sill be saved in the manifest. In the example above, I set the --contact-name to "Yvette". Similarly you can set the organization name and address, you can change the hash algorithm, and more.

usage: bagit.py [-h] [--processes PROCESSES] [--log LOG] [--quiet]
                [--validate] [--fast] [--completeness-only] [--sha3_256]
                [--sha3_224] [--shake_256] [--sha1] [--shake_128] [--sha3_512]
                [--sha512] [--sha224] [--sha384] [--md5] [--sha256]
                [--blake2s] [--blake2b] [--sha3_384]
                [--source-organization SOURCE_ORGANIZATION]
                [--organization-address ORGANIZATION_ADDRESS]
                [--contact-name CONTACT_NAME] [--contact-phone CONTACT_PHONE]
                [--contact-email CONTACT_EMAIL]
                [--external-description EXTERNAL_DESCRIPTION]
                [--external-identifier EXTERNAL_IDENTIFIER]
                [--bag-size BAG_SIZE]
                [--bag-group-identifier BAG_GROUP_IDENTIFIER]
                [--bag-count BAG_COUNT]
                [--internal-sender-identifier INTERNAL_SENDER_IDENTIFIER]
                [--internal-sender-description INTERNAL_SENDER_DESCRIPTION]
                [--bagit-profile-identifier BAGIT_PROFILE_IDENTIFIER]
                directory [directory ...]