/somisana

SOMISANA-related tooling

Primary LanguageMATLABMIT LicenseMIT

SOMISANA API

Documentation and source code for SAEON's Sustainable Ocean modelling Initiative (SOMISANA)

Contents

The workflow for running the models is currently being re-designed here, and this new repo will be used for running models operationally on MIMS instratructure. So this repo is fairly stagnant at the moment, but operational models continue to run from this repo on SAEON infrastructure. The new repo only runs the models - nothing website related - so we will need to move the web-related code to a separate web-centric repo, which looks for the latest model outputs on MIMS. Just as soon as we can get a web designer to do this!

Local development

Please install Node.js, then in the root of the repo, run the following commands to configure global settings (such as git pre-commit hooks):

npm install -g chomp
chomp init

Look in the chompfile.toml file to see available scripts, and feel free to add your own! The pre-commit hook is defined in .husky/pre-commit. This is just a shell script and can easily be extended.

Deployment

The following infrastructure is required:

Web application (visualizations)

The web application components (databases, and apps) are all deployed to a containerization platform. Currently this is Docker Swarm, but in the future will be some managed K8s installation (for example, maybe Rancher). The goal is to support multiple container formats - for example Docker, and Singularly.

An application server

  • 8GB memory
  • 4CPUs
  • Minimum configurable (see below for mount requirements)

MongoDB database server

  • 4GB memory
  • 2CPUs
  • Minimum configurable (see below for mount requirements)

PostgreSQL database server

  • 12GB memory
  • 16 CPUs
  • Minimum configurable (see below for mount requirements)

Total

  • 24GB memory
  • 22 CPUs
  • Minimum configurable (maybe 100GB to be safe, allow for SWAP, etc)

NOTE: There is still some uncertainty here depending on where various components are hosted. For example, if THREDDS or the somisana.ac.za website is hosted on SAEON infrastructure rather than on MIMS infrastructure, that effects these requirements. Also, on the SAEON infrastructure, similar resources are spread between 5 web applications (and the same resources can support a lot more web applications). These requirements for simply hosting somisana.ac.za are costly, but this is cost effective when the same infrastructure is shared.

Task server

This server runs GitHub Actions pipelines on a self-hosted actions runner - it executes models. This configuration currently supports executing 2 models in serial. To execute 2 models in parallel, increase the number of CPUs to tha required for the 2 largest models, plus 1. Storage on this server is allocated for temporary storage (currently 7 days) of all model/product files/data. This allows for any archiving steps to fail up to 7 days in a row before data is lost

  • 12GB memory
  • 13 CPUs
  • 1TB storage (does not have to be backed up)

Mounted storage

Most storage requirements are in the form of SAMBA mounts that can be mounted to multiple locations

  • Data archive: 5TB? (must be backed up. TODO - @giles to calculate)
  • PostgreSQL: 500GB (does not have to be backed up, but should be mounted if possible)
  • MongoDB: 100GB

Reporting

Contribution history can be accessed via interrogating the Git index. Here is a useful command for that:

git log --pretty=format:"%h%x09%an%x09%ad%x09%s"

Documentation