OAI-PMH integration
skasberger commented
Integrate OAI-PMH endpoint and data conversion.
Requirements
- Mapping of data from OAI-PMH endpoint (DDI XML and/or DC)
- Import of data
- Export of data
- XML schema
- Validate against schema
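As a rough illustration of the mapping requirement, the sketch below turns a Dublin Core (`oai_dc`) metadata fragment into a plain Python dict using only the standard library. The sample record, the `map_oai_dc` helper, and the output shape are assumptions made up for this example; they are not part of pyDataverse or of any endpoint's actual output, and a DDI mapping would need its own element paths.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample record, invented for illustration only.
SAMPLE_OAI_DC = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example Survey 2018</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:identifier>doi:10.0000/EXAMPLE</dc:identifier>
</oai_dc:dc>
"""


def map_oai_dc(xml_text):
    """Map an oai_dc fragment to a dict of element name -> list of values.

    Hypothetical helper: Dublin Core elements are repeatable, so every
    value is collected into a list.
    """
    root = ET.fromstring(xml_text)
    prefix = "{" + NS["dc"] + "}"
    mapped = {}
    for element in root:
        # Skip children that are not in the Dublin Core namespace.
        if not element.tag.startswith(prefix):
            continue
        name = element.tag[len(prefix):]
        value = (element.text or "").strip()
        if value:
            mapped.setdefault(name, []).append(value)
    return mapped


record = map_oai_dc(SAMPLE_OAI_DC)
print(record["title"], record["creator"])
```

Keeping every value in a list avoids silently dropping repeated elements such as multiple creators; a later import step could decide which fields to flatten.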
ACTIONS
0. Prerequisites
1. Research
- Check Python modules
- Get in touch with Carsten Thiel about the progress of the tool for validating Metadata Server output and creating the CESSDA metadata model from it
pyoai
from oaipmh.client import Client
from oaipmh.metadata import MetadataRegistry, oai_dc_reader
url = "https://data.aussda.at/oai"
# Register a reader so pyoai can parse oai_dc metadata
registry = MetadataRegistry()
registry.registerReader('oai_dc', oai_dc_reader)
client = Client(url, registry)
for record in client.listRecords(metadataPrefix='oai_dc'):
    # Each record is a (header, metadata, about) tuple
    print(record)
oai-harvest
oai-harvest --set "all_published" --metadataPrefix "oai_ddi" https://data.aussda.at/oai
sickle
from sickle import Sickle
sickle = Sickle('https://data.aussda.at/oai')
records = sickle.ListRecords(metadataPrefix='oai_ddi')
record = next(records)
record.header
record.header.identifier
record.metadata
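All three tools above end up issuing OAI-PMH `ListRecords` requests against the endpoint. The sketch below shows roughly what such a request URL looks like, using only the standard library and no network access. The `build_list_records_url` helper and the token value are hypothetical; the one protocol rule it encodes is that `resumptionToken` is an exclusive argument, so follow-up requests carry only `verb` and the token.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix=None, set_spec=None,
                           resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        # resumptionToken is exclusive: no metadataPrefix or set alongside it.
        params["resumptionToken"] = resumption_token
    else:
        if metadata_prefix is None:
            raise ValueError("metadata_prefix is required for the first request")
        params["metadataPrefix"] = metadata_prefix
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# First request: same prefix and set as the oai-harvest call above.
first_url = build_list_records_url(
    "https://data.aussda.at/oai",
    metadata_prefix="oai_ddi",
    set_spec="all_published",
)

# Follow-up request: the token value here is a made-up placeholder.
next_url = build_list_records_url(
    "https://data.aussda.at/oai",
    resumption_token="hypothetical-token",
)

print(first_url)
print(next_url)
```

Libraries such as sickle and pyoai handle resumption tokens internally, so a helper like this would mainly be useful for debugging or for tests that check request construction.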
2. Plan
- Define requirements
3. Implement
- Write tests
- Write code
- Write and update Docs
- Write Docstrings
- Run pytest
- Run tox
- Run pylint
- Run mypy
4. Follow-ups
- Review
- Code
- Tests
- Docs
pdurbin commented
As discussed during the 2024-02-14 meeting of the pyDataverse working group, we are closing old milestones in favor of a new project board at https://github.com/orgs/gdcc/projects/1 and removing issues (like this one) from those old milestones. Please feel free to join the working group! You can find us at https://py.gdcc.io and https://dataverse.zulipchat.com/#narrow/stream/377090-python