A framework to support migration of data from Fedora 3 to Fedora 4 repositories.
The main class (`org.fcrepo.migration.Migrator`) iterates over all of the Fedora objects in a configured source (`org.fcrepo.migration.ObjectSource`) and handles them using the configured handler (`org.fcrepo.migration.StreamingFedoraObjectHandler`). The configuration is entirely contained within a Spring XML configuration file in `src/main/resources/spring/migration-bean.xml`.
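The iterate-and-handle flow described above can be pictured with a minimal sketch. The types below (`SketchObjectSource`, `SketchObjectHandler`, `migrate`) are hypothetical stand-ins invented for illustration; they are not the real `org.fcrepo.migration` interfaces, which carry considerably more detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the roles described above.
// These are NOT the real org.fcrepo.migration interfaces.
class MigrationLoopSketch {

    /** A source that can be iterated to yield Fedora 3 objects (here, just PIDs). */
    interface SketchObjectSource extends Iterable<String> { }

    /** A handler that is invoked once for every object found in the source. */
    interface SketchObjectHandler {
        void handle(String pid);
    }

    /** Walks the source, passes each object to the handler, and returns the count. */
    static int migrate(SketchObjectSource source, SketchObjectHandler handler) {
        int count = 0;
        for (String pid : source) {
            handler.handle(pid);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // An in-memory source with two example PIDs.
        List<String> pids = List.of("yul:328697", "demo:1");
        SketchObjectSource source = pids::iterator;

        // A handler that only records what it was given, similar in spirit to a dry run.
        List<String> log = new ArrayList<>();
        int handled = migrate(source, pid -> log.add("handled " + pid));

        System.out.println(handled + " objects: " + log);
    }
}
```

Swapping the source or the handler changes where objects come from or what is done with them without touching the loop, which is the role the Spring configuration plays in the real tool.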
A basic migration scenario is implemented that may serve as a starting point for your own migration from Fedora 3.x to Fedora 4.x.
Background work
- Determine the disposition of your FOXML files:
  - Will you be migrating from exported (archive or migration context) FOXML?
    - If so, you will need all of the exported FOXML in a known directory.
  - Will you be migrating from a native fcrepo3 filesystem?
    - If so, fcrepo3 should not be running, and you will need to determine whether you're using legacy or akubra storage.
- Determine your fcrepo4 URL (e.g. http://localhost:8080/rest/ or http://yourHostName.ca:8080/fcrepo/rest/) (line 140).
- There is currently only one implemented pid-mapping strategy, but you can configure it to put all of your migrated content under a given path (line 93, sets that value to "migrated-fedora3").
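To make the idea of placing migrated content under a configured path concrete, here is a small sketch. The method `targetUrl` and the resulting URL layout (repository URL, then root path, then PID) are assumptions made for illustration only; the actual pid-mapping strategy may build its paths differently.

```java
// Hypothetical illustration of mapping a Fedora 3 PID to a location
// under a configured root path. The URL layout is an assumption, not
// the documented behavior of the real pid-mapping strategy.
class PidPathSketch {

    /** Joins the fcrepo4 URL, the configured root path, and the PID. */
    static String targetUrl(String fcrepo4Url, String rootPath, String pid) {
        // Tolerate a base URL with or without a trailing slash.
        String base = fcrepo4Url.endsWith("/") ? fcrepo4Url : fcrepo4Url + "/";
        return base + rootPath + "/" + pid;
    }

    public static void main(String[] args) {
        System.out.println(
            targetUrl("http://localhost:8080/rest/", "migrated-fedora3", "yul:328697"));
    }
}
```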
Getting started:
- Download the executable jar file
- Create a local copy of the example configuration file and update as described below:
  - If you are migrating from exported FOXML, you will leave line 9 as is.
  - If you are migrating from a native fcrepo3 file system, you will need to change `exportedFoxmlDirectoryObjectSource` to `nativeFoxmlDirectoryObjectSource` in line 9.
  - If you are migrating from a native fcrepo3 file system, you will need to set the paths to the `objectStore` and `datastreamStore` (Lines 143-139).
  - If you are migrating from exported FOXML, you will need to set the path to the directory you have them stored in (Lines 151-153).
  - Set your fcrepo4 URL (Line 140).
  - If you would like to run the migration in test mode (console logging), you will leave lines 11-16 as is.
  - If you would like to run the migration, you will need to comment out or remove line 9, and uncomment line 15.
To run the migration scenario you have configured in the Spring XML configuration file:
java -jar migration-utils-{version}-driver.jar <relative-or-absolute-path-to-configuration-file>
Object properties
fcrepo3 | fcrepo4 | Example |
---|---|---|
PID | dcterms:identifier | yul:328697 |
state | fedoraaccess:objState | Active |
label | fedora3model:label† | Elvis Presley |
createDate | premis:hasDateCreatedByApplication | 2015-03-16T20:11:06.683Z |
lastModifiedDate | metadataModification | 2015-03-16T20:11:06.683Z |
ownerId | fedora3model:ownerId† | nruest |
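The object property table above can be read as a simple lookup from fcrepo3 property names to fcrepo4 predicates. The sketch below encodes exactly those rows; the class and method names are hypothetical and are not part of the migration utilities.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup built from the object property table above.
// The class and method names are illustrative, not part of the tool.
class ObjectPropertyMappingSketch {

    static final Map<String, String> OBJECT_PROPERTIES = new LinkedHashMap<>();

    static {
        OBJECT_PROPERTIES.put("PID", "dcterms:identifier");
        OBJECT_PROPERTIES.put("state", "fedoraaccess:objState");
        // fedora3model is not a published namespace (see the † footnote).
        OBJECT_PROPERTIES.put("label", "fedora3model:label");
        OBJECT_PROPERTIES.put("createDate", "premis:hasDateCreatedByApplication");
        OBJECT_PROPERTIES.put("lastModifiedDate", "metadataModification");
        OBJECT_PROPERTIES.put("ownerId", "fedora3model:ownerId");
    }

    /** Returns the fcrepo4 predicate for an fcrepo3 object property name. */
    static String toFcrepo4(String fcrepo3Name) {
        String predicate = OBJECT_PROPERTIES.get(fcrepo3Name);
        if (predicate == null) {
            throw new IllegalArgumentException(
                "No mapping for fcrepo3 property: " + fcrepo3Name);
        }
        return predicate;
    }

    public static void main(String[] args) {
        // Print every row of the table in its original order.
        OBJECT_PROPERTIES.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Failing loudly on an unknown property name is a design choice of this sketch; it surfaces unmapped properties instead of silently dropping them.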
Datastream properties
fcrepo3 | fcrepo4 | Example |
---|---|---|
DSID | dcterms:identifier | OBJ |
Label | dcterms:title‡ | ASC19109.tif |
MIME Type | ebucore:hasMimeType† | image/tiff |
State | fedoraaccess:objState | Active |
Created | premis:hasDateCreatedByApplication | 2015-03-16T20:11:06.683Z |
Versionable | fedora:hasVersions‡ | true |
Format URI | premis:formatDesignation‡ | info:pronom/fmt/156 |
Alternate IDs | dcterms:identifier‡ | |
Access URL | dcterms:identifier‡ | |
Checksum | premis:hasMessageDigestAlgorithm + premis:hasMessageDigest‡ | SHA1, c91342b705b15cb4f6ac5362cc6a47d9425aec86 |
Audit events
fcrepo3 event | fcrepo4 Event Type |
---|---|
addDatastream | premis:create‡ |
modifyDatastreamByReference | audit:contentModification/metadataModification‡ |
modifyObject | audit:resourceModification‡ |
modifyObject (checksum validation) | premis:validation‡ |
modifyDatastreamByValue | audit:contentModification/metadataModification‡ |
purgeDatastream | audit:contentRemoval‡ |
† The `fedora3model` namespace is not a published namespace. It is a representation of the fcrepo3 namespace `info:fedora/fedora-system:def/model`.
‡ Not yet implemented
Note: All fcrepo3 DC (Dublin Core) datastream values are mapped as dcterms properties on the Object in fcrepo4. The same goes for any properties in the RELS-EXT and RELS-INT datastreams.