A data science oriented container launcher
Onyxia @ INSEE - Community website - Storybook
Onyxia is a web app that aims to be the glue between multiple open source backend technologies to
provide a state-of-the-art data analysis experience.
Onyxia is developed by the French National Institute of Statistics and Economic Studies (INSEE).
Core feature set:
- A web GUI where users can upload/download files to/from an S3 server. (S3 as the open standard, not the AWS service)
- An interface for launching docker images (e.g. Jupyter, RStudio) on demand on a Kubernetes cluster. The catalog of available images is not part of the app and is fully customizable. (You can check out here the catalog we offer to our staff on the instance of Onyxia hosted @ INSEE)
- Users can define the amount of RAM, CPU and GPU they would like to allocate for their containers.
- When users log into their container (e.g. RStudio, Jupyter), the environment is preconfigured according to their profile, so they don't have to fill in their credentials. For example, they can easily access the files they previously uploaded from the GUI using the preconfigured Minio client. They can also push to GitHub without having to type their password.
- Users can provide a bash script to be executed at the start of a container (e.g. git clone ... && pip install).
onyxia-web relies on the following open source backend technologies:
- Onyxia API: For starting containers (RStudio, Jupyter) on demand on a Kubernetes cluster.
- Keycloak: For managing user authentication.
- Minio: For storing users' datasets.
- Vault: For storing user preferences and custom environment variables to inject into the containers.
Setting up this infrastructure manually is not documented yet. As a result, if you want to contribute you'll have to connect to the services hosted on the sspcloud (INSEE's data center in Paris). The app is configured using environment variables.
# Download the binary files (images, fonts, etc.; you need git LFS)
git lfs install && git lfs pull
yarn install
wget https://gist.githubusercontent.com/garronej/e0f7485fac23e8aa0ceda6ce82256df6/raw/0cb60c6759d4e3005c15c9ca9706316e08013fc2/.env.local # Set up the environment variables that tell the app to connect to INSEE's infra
yarn start # To launch the app
yarn storybook # To test the React components in isolation.
yarn keycloak # To spin up a Keycloak container and test the login/register page. See https://github.com/InseeFrLab/keycloakify
There are four source directories:
- src/lib/: The code for the logic of the application. In this directory there must be no reference to React, and importing anything from src/app is not allowed. src/app/setup.ts exposes a function that takes as argument all the parameters of the app: address of the Keycloak server, URL of onyxia-web, etc. The store it creates is to be provided at the root of the React application in src/app/index.tsx. The only way src/app (the UI) should interact with src/lib (the logic) is by dispatching thunks exposed in src/app/setup.ts and by using selectors to access state. All accesses to src/lib from src/app have been gathered into a single directory, src/app/interfaceWithLib/hooks. The store has two very distinct states: when the user is authenticated and when they are not. To test whether the user is authenticated, use appConstants.isUserLogin: if isUserLogin is true then you have access to store.appConstants.logout(), else store.appConstants.login() is defined. See example. We chose not to make appConstants a slice of the store but rather an object returned by a thunk, because it stores all the values and functions that never change during a specific execution of the app. They can change between reloads of the app, so they are not constant in the way the environment variables hard-coded in the bundle are.
- src/app/: The React code.
  - src/app/assets: Small assets imported directly from the code should be placed here. For bigger assets like videos, you should upload them here and hard-code the URL in the code. To be able to import other kinds of files as URLs, like here for example with .md, you should declare the file extension like it has been done here.
- src/stories/: Storybook stories, to develop the React components in isolation.
- */tools: All generic code, meaning everything that could be externalized to standalone modules independent from the project.
- src/js: Legacy code that hasn't been ported to the new architecture yet.
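The two authentication states described for src/lib/ can be modelled as a discriminated union, so that the compiler only allows logout() when the user is logged in and login() when they are not. The following is a minimal sketch, not the project's actual code: only isUserLogin, login() and logout() are taken from the text above, while the AppConstants type and the getAuthAction helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: only isUserLogin, login() and logout() come from the README.
type AppConstants =
    | { isUserLogin: true; logout: () => void }
    | { isUserLogin: false; login: () => void };

// Hypothetical helper: picks the label and callback for an auth button.
function getAuthAction(appConstants: AppConstants): { label: string; run: () => void } {
    if (appConstants.isUserLogin) {
        // The type is narrowed here: logout() exists, login() does not.
        return { label: "Logout", run: appConstants.logout };
    }
    // In the other branch only login() is defined.
    return { label: "Login", run: appConstants.login };
}
```

With this shape, calling appConstants.logout() without first checking isUserLogin is rejected at compile time.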
To release a new version, do not create a tag manually. Simply bump the version in package.json, then push to the default branch;
the CI will take care of publishing on DockerHub and creating a GitHub release.
- A docker image with the tag :main is published on DockerHub for every new commit on the main branch.
- When the commit corresponds to a new release (the version has changed), the image will also be tagged :vX.Y.Z and :latest.
- Every commit on branches that have an open pull request on main will trigger the creation of a docker image tagged :<name-of-the-feature-branch>.
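The tagging rules above can be summarized as a small pure function. This is only a sketch of the rules as stated, not the CI's actual implementation: the CommitInfo interface, its field names and the dockerTagsFor function are hypothetical, and a "new release" is assumed to mean that the package.json version differs from the one in the previous commit.

```typescript
// Hypothetical description of a commit as seen by the CI; field names are illustrative.
interface CommitInfo {
    branch: string;
    version: string; // version in package.json at this commit
    previousVersion: string; // version in package.json at the previous commit
    hasOpenPullRequestOnMain: boolean;
}

// Returns the docker tags (without the leading colon) published for a commit.
function dockerTagsFor(commit: CommitInfo): string[] {
    if (commit.branch === "main") {
        const tags = ["main"];
        // Assumption: a version bump is what marks a commit as a release.
        if (commit.version !== commit.previousVersion) {
            tags.push(`v${commit.version}`, "latest");
        }
        return tags;
    }
    // Feature branches only get an image while a pull request on main is open.
    return commit.hasOpenPullRequestOnMain ? [commit.branch] : [];
}
```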
You can find here the Helm chart we use to put the docker image of the app in production.
To log in to a local Keycloak:
- In .env.local set: REACT_APP_AUTH_OIDC_URL=http://localhost:8080/auth
- After launching and logging in to Keycloak, create the realm: sspcloud
- Root URL when you add the "onyxia" client in Keycloak: http://localhost:3000
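Since the app is configured through environment variables, a missing or malformed REACT_APP_AUTH_OIDC_URL is an easy mistake to make when switching to a local Keycloak. The sketch below shows one way such a value could be validated early. It is an illustration under stated assumptions: getOidcUrl is a hypothetical helper, not a function of onyxia-web, and only the variable name comes from the steps above.

```typescript
// Hypothetical helper; only the variable name REACT_APP_AUTH_OIDC_URL comes from the README.
function getOidcUrl(env: Record<string, string | undefined>): string {
    const value = env["REACT_APP_AUTH_OIDC_URL"];
    if (value === undefined || value === "") {
        throw new Error("REACT_APP_AUTH_OIDC_URL is not set (expected in .env.local)");
    }
    // Accept only http(s) URLs, e.g. http://localhost:8080/auth for a local Keycloak.
    if (!/^https?:\/\//.test(value)) {
        throw new Error(`REACT_APP_AUTH_OIDC_URL is not an http(s) URL: ${value}`);
    }
    // Drop a trailing slash so that paths can be appended consistently.
    return value.replace(/\/$/, "");
}
```

Passing the environment as an argument, rather than reading a global inside the function, keeps the helper easy to exercise with plain objects.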