Stencila Hub

☸️ Hub for executable documents

👋 Introduction

Stencila is a platform for authoring, collaborating on, and sharing executable documents. This is the repository for the Stencila Hub which deploys these tools as a service and integrates them with other tools and services (e.g. Google Docs, GitHub).

⚙️ Services

Stencila Hub consists of several services, each with its own sub-folder. The README.md file for each service provides further details on the service, including its purpose, the current and potential alternative technical approaches, and tips for development.

  • router: An Nginx server that routes requests to the other services.
  • manager: A Django project containing most of the application logic.
  • assistant: A Celery worker that runs asynchronous tasks on behalf of the manager.
  • worker: A Celery process that runs jobs on behalf of users.
  • broker: A RabbitMQ instance that acts as a message queue broker for tasks and jobs.
  • scheduler: A Celery process that places periodic, scheduled tasks and jobs on the broker's queue.
  • overseer: A Celery process that monitors events associated with workers and job queues.
  • database: A PostgreSQL database used by the manager.
  • cache: A Redis store used as a cache by the manager and as a result backend for Celery.
  • steward: Manages access to cloud storage for the worker and other services.
  • monitor: A Prometheus instance that monitors the health of the other services.

🤝 Clients

The Hub exposes a public API at https://hub.stenci.la/api. API client packages, generated from the Hub's OpenAPI Schema, are available in this repository for the following languages:

  • Python (published on PyPI)
  • Node.js (published on NPM)

Client packages for other languages will be added based on demand. Please don't hesitate to ask for a client for your favorite language!
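For example, you can explore the public API from the command line with curl (just a quick sketch; see the API documentation for authentication and the available endpoints):

curl -L https://hub.stenci.la/api   # Fetch the API root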

📜 Documentation

  • User-focused documentation is available in the Hub collection of our help site.

  • As mentioned above, most of the individual services have their own README.md file.

  • The code generally has lots of documentation strings and comments, so "Use the source, Luke".

  • To get an overview of the functionality provided, check out the automatically generated page screenshots (see manager/README.md for more details on how and why these are generated).

🛠️ Develop

Prerequisites

The prerequisites for development vary somewhat by service (see each service's README.md for more details). However, most of the services will require you to have at least one of the following installed: Python 3 (the run recipes use a local virtual environment where possible) and/or Docker (used to build and run the service images).
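For example, on Ubuntu and other Debian-based Linux platforms you could install these with apt (a sketch only; check each service's README.md for its exact requirements):

sudo apt-get update
sudo apt-get install python3 python3-venv python3-pip   # For running Python services locally
sudo apt-get install docker.io docker-compose           # For building and running service images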

To run the service integration tests described below you will need:

  • docker-compose

and/or,

  • Minikube and Kompose

Getting started

The top level Makefile contains recipes for common development tasks e.g. make lint. To run all those recipes, culminating in building containers for each of the services (if you have docker-compose installed), simply do:

make

💬 Info: We use Makefiles throughout this repo because make is a ubiquitous and language-agnostic development tool. However, they are mostly there to guide and document the development workflow. You may prefer to bypass make in favour of using your favorite tools directly e.g. python, npx, PyCharm, pytest, VSCode or whatever.
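For example, to run a Python service's tests directly instead of via make, you might do something like the following (a sketch; the requirements file name is an assumption, so check the service's Makefile for what its recipes actually invoke):

cd manager                                    # Work inside one service's folder
python3 -m venv venv && . venv/bin/activate   # Local virtual environment, as the run recipes use
pip install -r requirements-dev.txt           # Hypothetical file name; see the service's Makefile
pytest                                        # Run the tests directly instead of `make test`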

The top level make recipes mostly just invoke the corresponding recipe in the Makefile for each service. You can run them individually by either cd-ing into the service's folder or by using the make -C option e.g. to just run the manager service,

make -C manager run

💁 Tip: The individual run recipes are useful for quickly iterating during development and, in the case of the manager, will hot-reload when source files are edited. Where possible, the run recipes will use a local Python virtual environment. In other cases, they will use the Docker image for the service. In both cases the run recipes define the necessary environment variables, set to their defaults.

💁 Tip: If you need to run a couple of the services together you can make run them in separate terminals, as in the example below. This can be handy if you want to do iterative development of one service while checking that it is talking correctly to one or more of the other services.
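For example, to iterate on the manager while checking that it can talk to the broker:

make -C broker run    # In one terminal
make -C manager run   # In another terminal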

Linting and formatting

Most of the services define code linting and formatting recipes. It is often useful to run them sequentially in one command i.e.

make format lint

Unit testing

Some services define unit tests. Run them all using,

make test

Or, with coverage,

make cover

💬 Info: Test coverage reports are generated on CI for each push and are available on Codecov.

Integration testing

Manually running each service

The most hands-on way of testing the integration between services is to run each of them locally.

First, create a seed SQLite development database,

make -C manager create-devdb-sqlite

Note that this will destroy any existing manager/dev.sqlite3 database. If you want to update your development database to a newer version of the database schema do this instead,

make -C manager migrate-devdb

Then in separate terminal consoles run the following make commands,

make -C manager run
make -C broker run
make -C cache run
make -C overseer run
make -C worker run

At the time of writing, those services cover most use cases, but you can of course also run other services locally, e.g. the router, if you want to test them.

With docker-compose

The docker-compose.yaml file provides an easier way of integration testing.

Building images

First, ensure that all of the Docker images are built:

make -C manager static    # Builds static assets to include in the `manager` image
docker-compose up --build # Builds all the images

Creating a seed development database

Create a seed development database within the database container by starting the service:

docker-compose start database

and then, in another console, sending the commands to the Postgres server to create the database,

make -C manager create-devdb-postgres

If you encounter errors related to the database already existing, it may be because you previously ran these commands. In that case, we recommend removing the existing container using,

docker-compose stop database
docker rm hub_database_1

and running the previous commands again.

💁 Tip: pgAdmin is useful for inspecting the development PostgreSQL database.
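Alternatively, you can open a psql shell inside the running container (assuming the default postgres user; adjust to your configuration):

docker-compose exec database psql -U postgres   # The user name here is an assumption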

Running services

Once you have done the above, to bring up the whole stack of services,

docker-compose up

The router service, which acts as the entry point, should be available at http://localhost:9000.
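A quick way to check that the router is up:

curl -I http://localhost:9000   # Should return HTTP response headers from the router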

Alternatively, to just bring up one or two of the services and their dependencies,

docker-compose up manager worker

With minikube and kompose

To test deployment within a Kubernetes cluster you can use Minikube and Kompose,

minikube start
make run-in-minikube

💬 Info: The run-in-minikube recipe sets up Minikube to be able to do local builds of images (rather than pulling them down from Docker Hub), then builds the images in Minikube and runs kompose up.

💁 Tip: The minikube dashboard is really useful for debugging. And don't forget to minikube stop when you're done!
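Both are one-liners:

minikube dashboard   # Open the Kubernetes dashboard in your browser
minikube stop        # Shut the local cluster down when you're done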

💁 Tip: Instead of using minikube you might want to consider lighter-weight alternatives such as kind, microk8s, or k3s to run a local Kubernetes cluster.

Committing

Commit messages should follow the conventional commits specification. This is important (but not required) because commit messages are used to determine the semantic version of releases (and thus deployments) and to generate the project's CHANGELOG.md. If appropriate, use the sentence-case service name as the scope (to help make both git log and the change log more readable). Some examples:

  • fix(Monitor): Fix endpoint for scraping metrics
  • feat(Director): Add pulling from Google Drive
  • docs(README): Add notes on commit messages
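For instance, committing a bug fix to the worker service might look like this (a hypothetical message):

git commit -m "fix(Worker): Handle job timeouts gracefully"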

Dependency updates

We use Renovate to keep track of updates to packages this repo depends upon. For more details see the list of currently open and scheduled Renovate PRs and the renovate configuration in package.json.

🚀 Continuous integration

We use Azure Pipelines as a continuous integration (CI) service. On each push, the CI lints the code, runs the tests, and makes a semantic release (if there are commits of type feat or fix since the last tag). On each tag, if there are commits in a service's directory since the last tag, the Docker image for that service is built and pushed to Docker Hub (which is why the Docker images do not necessarily all have the same tag).