ESA-EarthCODE/portal

Write a Technote to describe the current EarthCODE Workflow and Potential Enhancements


Hi @silvester-pari,

I hope this captures what we discussed this morning. This should be a useful reference when we learn more about the actual platforms we will use with EarthCODE.

As discussed this will:

Describe the current EarthCODE GitHub Workflow (or user journey)
Describe the current EarthCODE GitHub Workflow by role (or user journey by role)
Describe any current Workflow issues with support from @edobrowolska
Describe any potential GUI enhancements to the portal to duplicate the GitHub Workflow functionality (if they provide a user benefit)

The workflow can be described using diagrams or text as appropriate.
The workflow can refer to existing OSC artifacts if helpful.

Note that this tech note informs the following stories (and may indeed resolve them):

https://github.com/orgs/ESA-EarthCODE/projects/5/views/3?pane=issue&itemId=59760877
https://github.com/orgs/ESA-EarthCODE/projects/5/views/3?pane=issue&itemId=59761187
https://github.com/orgs/ESA-EarthCODE/projects/5/views/3?pane=issue&itemId=59761826
https://github.com/orgs/ESA-EarthCODE/projects/5/views/3?pane=issue&itemId=59761923

Describe the current EarthCODE GitHub Workflow by role (or user journey by role)

User/scientist/editor

The current OSC workflow is described in https://github.com/ESA-EarthCODE/open-science-catalog-metadata/wiki/User-Guide%E2%80%90v.1.0.0#add-metadata-of-a-single-product-item-to-the-catalogue and https://github.com/ESA-EarthCODE/open-science-catalog-metadata/wiki/User-Guide%E2%80%90v.1.0.0#add-multiple-assets-at-once-with-github

Within EarthCODE, we started experimenting with open-source tools that automate parts of this workflow for the user (scientist, editor):

  1. The user logs into the EarthCODE portal using GitHub as the identity provider
  2. Via the GUI, the user creates a fork of the OSC metadata repository in the name of the user
  3. Via the GUI, the user browses the files located in the OSC metadata repository
  4. Via the GUI, the user adds/edits one file and saves it (creates a branch + a commit in the user's fork)
  5. Via the GUI, the user marks the file as "ready for review" (creates a PR from the user's fork branch to the main repository)
  6. Once the user's request has been approved, the "review" status disappears and the merged file behaves like any other file in the metadata repository.
    Encountered limitations: Only one file per "session" (branch/PR)
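
For reference, steps 2, 4 and 5 above roughly correspond to the following GitHub REST API calls. This is only a sketch to make the mapping explicit, not a description of how the portal tooling is actually implemented: it builds the planned requests without sending anything. `PlannedRequest` and `plan_single_file_session` are hypothetical names, the upstream repository name is taken from the wiki links above, and the base branch name `main` is an assumption.

```python
import base64
import json
from dataclasses import dataclass, field

# Assumption: upstream repository as named in the wiki links in this thread.
UPSTREAM = "ESA-EarthCODE/open-science-catalog-metadata"


@dataclass
class PlannedRequest:
    """One GitHub REST call that the GUI would have to trigger (hypothetical helper)."""
    method: str
    path: str
    body: dict = field(default_factory=dict)


def plan_single_file_session(user, branch, base_sha, file_path, content, message):
    """Return the REST calls behind steps 2, 4 and 5. Nothing is sent."""
    repo = UPSTREAM.split("/")[1]
    # The contents endpoint expects the file body as base64 text.
    encoded = base64.b64encode(json.dumps(content, indent=2).encode()).decode()
    return [
        # Step 2: fork the metadata repository into the user's account.
        PlannedRequest("POST", f"/repos/{UPSTREAM}/forks"),
        # Step 4a: create a branch in the fork, starting from a known commit.
        PlannedRequest("POST", f"/repos/{user}/{repo}/git/refs",
                       {"ref": f"refs/heads/{branch}", "sha": base_sha}),
        # Step 4b: save the file as a commit on that branch.
        PlannedRequest("PUT", f"/repos/{user}/{repo}/contents/{file_path}",
                       {"message": message, "content": encoded, "branch": branch}),
        # Step 5: "ready for review" opens a PR from the fork branch to upstream.
        # Assumption: the upstream base branch is called "main".
        PlannedRequest("POST", f"/repos/{UPSTREAM}/pulls",
                       {"title": message, "head": f"{user}:{branch}", "base": "main"}),
    ]
```

Seen this way, the "one file per session" limitation comes from step 4b being issued only once per branch before the PR is opened.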

Administrator

  1. The administrator logs into GitHub and sees a list of Pull Requests (PRs)
  2. The administrator reviews the contents of the PR
  3. The administrator approves and merges the PR
    Identified limitations: The administrator can't see a "live preview" of the proposed changes

Describe any current Workflow issues with support from @edobrowolska

See the technical note on the current data publication process in OSC, provided by @edobrowolska.

Issues identified with Users (Editors) are listed below:

  1. STAC Item/Catalog complexity through GitHub:
  • Metadata ingestion is complex for users unfamiliar with GitHub.
  • Users name files incorrectly instead of creating a collection.json file with the product id.
  • Manual creation of catalog or collection.json files is time-consuming and prone to human error (typos).
  • Users new to GitHub find it difficult to create and switch branches in the GitHub web editor when editing metadata.
  2. Product metadata description:
  • Need for guidelines on constructing unique product IDs and specifying ‘standard name’ fields.
  • Difficulty in creating correct file links.
  • Complexity in updating the products/catalog.json file to update the product list.
  • Manual writing of JSON files is challenging, especially for users unfamiliar with the STAC format.
  3. User Guide:
  • Need for simplification, for example through a short video or demo.
  • Absence of specific instructions for adding new variables.
  • Unclear explanations of field contents.
  4. Repository Access:
  • Access to the asset repository is limited to OSC administrators, which hinders uploads by external users.
  • Further discussion is needed in another sprint to align with APEx.
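
To illustrate the point about updating products/catalog.json by hand, here is a minimal sketch of how that step could be automated. It assumes each product lives at `<product_id>/collection.json` relative to products/catalog.json (an assumption, to be checked against the actual repository layout), and `add_product_link` is a hypothetical helper. The link shape (`rel`, `href`, `type`, `title`) follows the generic STAC link object.

```python
def add_product_link(catalog: dict, product_id: str, title: str) -> dict:
    """Return a copy of the catalog with a child link to the product's collection.json."""
    # Assumed layout: <product_id>/collection.json next to products/catalog.json.
    href = f"./{product_id}/collection.json"
    links = list(catalog.get("links", []))
    # Avoid duplicate entries if the product is already listed.
    if not any(link.get("href") == href for link in links):
        links.append({
            "rel": "child",
            "href": href,
            "type": "application/json",
            "title": title,
        })
    # The input catalog is left untouched; callers decide when to write the file.
    return {**catalog, "links": links}
```

Generating the link from the product id removes one place where typos and broken file links can creep in.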

Describe any potential GUI enhancements to the portal to duplicate the GitHub Workflow functionality (if they provide a user benefit)

The ideal outcome for metadata editing would improve the following points:

  • unique id generation: we need a scheme for creating unique ids, either based on random identifiers (less readable for users) or by combining metadata properties into an id (harder to check that the id is actually unique). It is currently unclear whether there are technical means of checking whether an id already exists in the catalog
  • multiple STAC files at once should be manageable within one "session" (branch/PR), so they can reference each other. This would mean that the GUI keeps committing changes to the same branch in the user's fork until the user decides to mark it as ready for review. It is acceptable for users to follow a specific order of operations (e.g. first create a variable, then create a product referencing that variable), but not for them to go through multiple review-approval loops when adding multiple files
  • the existing files should be referable through the GUI (e.g. existing variables should be selectable when creating/editing a product) in order to avoid typos etc.
  • the input form should provide a basic set of validation functionalities
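
As a rough sketch of what the id generation, referencing and validation points above could look like in practice: the example below builds an id from metadata properties, checks it against a set of ids already in the catalog, and checks that referenced variables exist. Everything here is an assumption for discussion. `make_product_id` and `validate_product` are hypothetical helpers, and the field names `title`, `description` and `variables` are placeholders rather than the actual OSC schema.

```python
import re


def make_product_id(*parts: str) -> str:
    """Combine metadata properties into a lowercase, hyphen-separated id."""
    joined = "-".join(parts).lower()
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", joined).strip("-")


def validate_product(product: dict, existing_ids: set, known_variables: set) -> list:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    pid = product.get("id", "")
    if not pid:
        problems.append("missing id")
    elif pid in existing_ids:
        # Uniqueness check against ids collected from the catalog beforehand.
        problems.append(f"id '{pid}' already exists in the catalog")
    # Placeholder required fields; the real list would come from the OSC schema.
    for field_name in ("title", "description"):
        if not product.get(field_name):
            problems.append(f"missing {field_name}")
    # Referenced variables must already exist, which is what a dropdown would enforce.
    for variable in product.get("variables", []):
        if variable not in known_variables:
            problems.append(f"unknown variable '{variable}'")
    return problems
```

For example, `make_product_id("Sea Surface Temperature", "ESA CCI", "v2.1")` gives `sea-surface-temperature-esa-cci-v2-1`. The set of existing ids and known variables would have to be collected from the metadata repository before the form is shown; how to do that efficiently is still open.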

With regard to data management (uploading and storing files, referencing them, etc.), the document provided by @edobrowolska gives more insight, but this is a bigger topic that requires more research and discussion.

A description of the current workflow issues and recommended future improvements is provided in this document:

Technical note of the current publication process in OSC.docx