/ckanext-oaipmh

OAI PMH extension to CKAN

Primary LanguagePythonGNU Affero General Public License v3.0AGPL-3.0

OAI-PMH harvester for CKAN. This extends CKAN harvester to parse OAI-PMH metadata sources and import datasets. Supported metadata schemas are oai_dc (Dublin Core), CMDI (Component MetaData Infrastructure).

The list of supported verbs consists of:

  • GetRecord: fetches a single dataset.
  • Identify: when creating the client object ('harvest source'), displays information about this OAI-PMH server.
  • ListIdentifiers: fetches individual datasets' identifiers.
  • ListSets: fetches identifiers of sets.

Harvester configuration

Please make sure you have ckanext-harvest installed. You can add a harvest source from CKAN UI and set the harvest source to use OAI-PMH harvester.

Configuration options:

  • from: Harvest datasets starting from date YYYY-MM-DD.
  • limit: Import only first 'limit' number of XML files.
  • set: Harvest only from certain sets.
  • type: Harvest only certain type.
  • until: Harvest datasets before date YYYY-MM-DD.

Here is an example of a configuration object (the one that must be entered in the configuration field):

{
 "limit": 10,
 "set": ["hdl_10138_135703"],
 "from": "2014-03-03"
}