/onyxia-web

A data science oriented container launcher

Primary language: TypeScript. License: MIT.

Onyxia @ INSEE - Community website - Storybook


Onyxia is a web app that aims to be the glue between multiple open source backend technologies in order to provide a state-of-the-art data analysis experience.
Onyxia is developed by the French National Institute of Statistics and Economic Studies (INSEE).

Core feature set:

  • A web GUI where users can upload/download files to/from S3 servers. (S3 as the open standard, not the AWS service.)
  • An interface for launching Docker images (e.g. Jupyter, RStudio) on demand on a Kubernetes cluster. The catalog of available images is not part of the app and is fully customizable. (You can check out here the catalog we offer to our staff on the instance of Onyxia hosted @ INSEE.)
  • Users can define the amount of RAM, CPU and GPU they would like to allocate to their containers.
  • When users log into their container (e.g. RStudio, Jupyter), the environment is preconfigured according to their profile, so they do not have to fill in their credentials. For example, they can easily access the files they previously uploaded from the GUI using the preconfigured MinIO client. They can also push to GitHub without having to type their password.
  • Users can provide a bash script to be executed at the start of a container (e.g. git clone ... && pip install).

Screenshots

[Screenshots: My Services, Launcher, Main Services, My Secrets]

Contributing

Development

onyxia-web relies on the following open source backend technologies:

  • Onyxia API: for starting containers (RStudio, Jupyter) on demand on a Kubernetes cluster.
  • Keycloak: for managing user authentication.
  • MinIO: for storing users' datasets.
  • Vault: for storing user preferences and custom environment variables to inject into the containers.

Setting up this infrastructure manually is not documented yet. As a result, if you want to contribute you will have to connect to the services hosted on the sspcloud (INSEE's data center in Paris). The app is configured using environment variables.

# Download the binary files (images, fonts, etc.); this requires Git LFS
git lfs install && git lfs pull
yarn install
# Set up the environment variables that tell the app to connect to INSEE's infrastructure
wget https://gist.githubusercontent.com/garronej/e0f7485fac23e8aa0ceda6ce82256df6/raw/0cb60c6759d4e3005c15c9ca9706316e08013fc2/.env.local
yarn start # Launch the app
yarn storybook # Test the React components in isolation
yarn keycloak # Spin up a Keycloak container to test the login/register pages. See https://github.com/InseeFrLab/keycloakify
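
Since the app is configured through environment variables, it can help to validate them once at startup rather than reading them ad hoc. The following is a minimal sketch of that idea, not the project's actual code: AppConfig, readRequired and readAppConfig are hypothetical names, and REACT_APP_AUTH_OIDC_URL is the only variable name taken from this README.

```typescript
// Hypothetical config shape: only REACT_APP_AUTH_OIDC_URL is named in this
// README; any other variable would be added the same way.
type AppConfig = {
    oidcUrl: string;
};

// The environment is passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

// Returns the value of a mandatory variable or fails loudly if it is missing.
function readRequired(env: Env, name: string): string {
    const value = env[name];
    if (value === undefined || value === "") {
        throw new Error(`Missing environment variable ${name}`);
    }
    return value;
}

// Builds the typed configuration object from the raw environment.
function readAppConfig(env: Env): AppConfig {
    return {
        oidcUrl: readRequired(env, "REACT_APP_AUTH_OIDC_URL"),
    };
}
```

In a Create React App build, this would typically be called once as readAppConfig(process.env), so that a missing variable is reported immediately instead of surfacing later as a failed login.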

Architecture

The source is split into the following directories:

  • src/lib/: Where the code for the logic of the application lives. In this directory there must be no reference to React, and importing anything from src/app is not allowed.
    src/app/setup.ts exposes a function that takes as argument all the params of the app: address of the Keycloak server, URL of onyxia-web, etc. The resulting store is to be provided at the root of the React application in src/app/index.tsx.
    The only way src/app (the UI) should interact with src/lib (the logic) is by dispatching the thunks exposed in src/app/setup.ts and by using selectors to access state. All access to src/lib from src/app has been gathered into a single directory: src/app/interfaceWithLib/hooks.
    The store has two very distinct states: when the user is authenticated and when the user is not. To test whether the user is authenticated, use appConstants.isUserLogin: if isUserLogin is true, then you have access to store.appConstants.logout(); otherwise store.appConstants.login() is defined. See example.
    We chose not to make appConstant a slice of the store but rather an object returned by a thunk, because it holds all the values and functions that never change during a given execution of the app. They can change between reloads of the app, though, so they are not constant in the way that the environment variables hard-coded in the bundle are.
  • src/app/: The React code.
  • src/app/assets: Small assets imported directly from the code go here.
    Bigger assets, such as videos, should be uploaded here, with the URL hard-coded in the code.
    To be able to import other kinds of files as URLs, like here for example with .md, you should declare the file extension as has been done here and here.
  • src/stories/: Storybook stories, for developing the React components in isolation.
  • */tools: All generic code: everything that could be externalized to standalone modules independent of the project.
  • src/js: Legacy code that hasn't been ported to the new architecture yet.
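
The two authentication states described for src/lib/ map naturally onto a TypeScript discriminated union. The sketch below is only an illustration under assumptions: the property names isUserLogin, login and logout come from the description above, while the exact signatures and the toggleAuth helper are hypothetical and not taken from the code base.

```typescript
// Hypothetical typing of appConstants: the property names come from the
// description above, the exact signatures are assumptions.
type AppConstants =
    | { isUserLogin: true; logout: () => void }
    | { isUserLogin: false; login: () => void };

// Calls whichever auth action exists for the current state and reports
// which one was called. Checking isUserLogin narrows the union, so the
// compiler rejects a call to login() when the user is already logged in.
function toggleAuth(appConstants: AppConstants): "logout" | "login" {
    if (appConstants.isUserLogin) {
        appConstants.logout();
        return "logout";
    }
    appConstants.login();
    return "login";
}
```

With this shape, UI code cannot call logout() without first checking isUserLogin, which matches the rule that only one of the two functions is defined at any time.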

OPS

To release a new version, do not create a tag manually. Simply bump the version in package.json and push to the default branch; the CI will take care of publishing to DockerHub and creating a GitHub release.

  • A Docker image with the tag :main is published on DockerHub for every new commit on the main branch.
  • When the commit corresponds to a new release (the version has changed), the image is also tagged :vX.Y.Z and :latest.
  • Every commit on a branch that has an open pull request on main triggers the creation of a Docker image tagged :<name-of-the-feature-branch>.
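
The tagging rules above can be summarized as a small pure function. This is a sketch of the rules as stated in this README, not the actual CI workflow: CommitContext and computeDockerTags are hypothetical names, and the inputs (whether the version changed, whether a pull request is open) are assumed to be computed elsewhere by the CI.

```typescript
// Hypothetical description of the commit being built by the CI.
type CommitContext = {
    branch: string;                    // branch the commit was pushed to
    version: string;                   // version field of package.json, e.g. "1.2.3"
    versionChanged: boolean;           // true if this commit bumped the version
    hasOpenPullRequestOnMain: boolean; // true if the branch has an open PR targeting main
};

// Returns the Docker tags (without the leading ":") to publish for a commit.
function computeDockerTags(ctx: CommitContext): string[] {
    if (ctx.branch === "main") {
        const tags = ["main"];
        if (ctx.versionChanged) {
            // A version bump on main is a release.
            tags.push(`v${ctx.version}`, "latest");
        }
        return tags;
    }
    // Feature branches are only published while a pull request is open.
    return ctx.hasOpenPullRequestOnMain ? [ctx.branch] : [];
}
```

For example, a commit on main that bumps the version to 1.2.3 yields the tags main, v1.2.3 and latest, while a commit on a branch without an open pull request yields none.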

You can find here the Helm chart we use to put the docker image of the app in production.

NOTE (for self)

To log in to a local Keycloak:

  • In .env.local, set: REACT_APP_AUTH_OIDC_URL=http://localhost:8080/auth
  • After launching Keycloak and logging in, create the realm: sspcloud
  • Root URL to use when adding the "onyxia" client in Keycloak: http://localhost:3000