Develop external services (separate from OSF code base) to export OSF business objects and associated resources to DC Packages. (Import from DC packages to OSF is anticipated as a next step.)
If we envision DC as providing curation services for data in OSF, package import/export provides a basis for transferring archivally relevant materials between OSF and DC services.
Data Conservancy develops an RDF-based OSF data model, which could be leveraged for future use cases. OSF would be able to include preservation and curation events in the OSF UI activity feeds.
- OSF is intended to be a part of a researcher's workflow
- Many data curation activities are not a part of a researcher's workflow
- Individual institutions may have data curation requirements not satisfied by the general OSF framework or individual OSF storage providers chosen by a researcher (e.g. figshare).
- The OSF UI and model allow collaborators with different roles (e.g. curators) to contribute to a project in OSF
Package export/import would be the basis for incorporating curation activities that do not occur through the OSF user interface, including:
- Automated activities (e.g. archiving, format migration, content type detection, etc.)
- Activities that occur through specialized tooling (e.g. Package Tool GUI)
- The BagIt spec provides a mechanism for packaging and verifying digital content for transfer or archiving
- The DC Packaging spec builds on BagIt, and provides:
- Mechanisms for distinguishing domain/business objects from binary content
- Mechanisms for handling link resolution between objects
- The DC Packaging spec can be used to package data files in OSF as well as the OSF business object(s) that describe the component that contains them, the provider they came from, metadata that may establish provenance, etc.
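To make the BagIt layer concrete, here is a minimal sketch of writing and verifying a plain BagIt bag using only the Python standard library. It covers only the base BagIt layout (a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-sha256.txt` checksum file); the DC Packaging additions for domain objects and link resolution are not shown. The function names `write_bag` and `verify_bag`, the default version string, and the sample file names are assumptions made for illustration, not part of any DC or OSF tooling.

```python
import hashlib
from pathlib import Path


def write_bag(bag_dir, payload, version="1.0"):
    """Write a minimal BagIt bag.

    payload maps relative file paths to bytes. The version default is an
    assumption; a profile built on BagIt may pin a different value.
    """
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for rel_path, content in sorted(payload.items()):
        target = data / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Manifest entries are "<checksum> <path relative to the bag root>".
        manifest_lines.append(f"{digest}  data/{rel_path}")

    # The bag declaration is the tag file that marks the directory as a bag.
    (bag / "bagit.txt").write_text(
        f"BagIt-Version: {version}\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag


def verify_bag(bag_dir):
    """Return True if every payload file matches its manifest checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True
```

An exporter along these lines would call `write_bag` with the files fetched from an OSF component, and a receiving service would call `verify_bag` before accepting the transfer.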
Import/export of packages from OSF could be a building block of several kinds of use cases:
- Pulling content into a tool for manual curation (e.g. the Package Tool GUI)
- Separately archiving or preserving content in OSF, regardless of its native provider (e.g. figshare, GitHub, Dropbox, etc.)
- Synchronizing locally modified (manual or automated) content with OSF content
- Importing locally generated content into the OSF.
- Specialized (local) indexing of content stored in OSF
- Exposing OSF content as linked data (via the Package Ingest Service)
- In the DC Packaging spec, domain objects must have an RDF representation. In theory, OSF uses JSON to represent “its business objects”. There may be JSON schemas available, but it is unclear where. We’d need to ask.
- It’s unclear whether OSF has a defined object model at all
- We would need to confirm that the Django API supports everything that we need to do.
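As a rough illustration of the JSON-to-RDF gap described above, the sketch below maps a JSON object to N-Triples using only the standard library. Everything specific in it is an assumption: the `id`/`attributes` shape of the JSON, the `http://example.org/osf/` namespace, the `Node` class, and the function names are all hypothetical, since the actual OSF object model and any RDF vocabulary for it are still open questions.

```python
import json

# Hypothetical namespace; no real OSF vocabulary is implied.
OSF_NS = "http://example.org/osf/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Render a value as an N-Triples plain string literal."""
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def node_to_ntriples(node_json):
    """Convert one JSON object (assumed shape) into a list of N-Triples lines."""
    node = json.loads(node_json)
    subject = f"<{OSF_NS}node/{node['id']}>"
    triples = [f"{subject} <{RDF_TYPE}> <{OSF_NS}Node> ."]
    # Each attribute becomes one triple with a string literal object.
    for key, value in sorted(node.get("attributes", {}).items()):
        triples.append(f"{subject} <{OSF_NS}{key}> {literal(value)} .")
    return triples
```

A real mapping would need typed literals, links between objects, and a stable vocabulary; this only shows that a flat JSON object can be given an RDF representation once its fields are known.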