The dashboard webapp provides an overview of the studies supported by the Kimel Lab. The app supports manual QC of MRI sessions as well as display of various QC metrics generated by our nightly pipelines.
- Technologies
- Dashboard Structure
- Database Schema
- Setting Up a Development Environment
- TIGRLab Dashboard
- TIGRLab Ansible Configuration
- Code Tips
The app is backed by a PostgreSQL database hosted on the srv-postgres.camhres.ca virtual server. The front end is written in Python using the Flask framework with SQLAlchemy (this tutorial is strongly recommended if you are trying to get familiar with Flask). The web server itself is NGINX with uWSGI. Authentication is handled with OAuth, and we currently support GitHub and GitLab for login.
dashboard/models.py defines the Object Relational Mapping (ORM) used by SQLAlchemy to map from the relational database to the python objects used in the code.
dashboard/views.py defines the entry points (i.e. valid URLs for the app).
HTML templates are in dashboard/templates/. Templates with _snip.html are embedded in other pages.
The web-app interacts with the filesystem (checklist.csv, blacklist.csv), and updates made through the web-app are automatically propagated to the filesystem before the database update is made. If the filesystem cannot be updated, the database update fails. Alterations made directly to the filesystem are propagated to the database by the nightly datman scripts.
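The filesystem-first ordering described above can be sketched roughly as follows. This is only an illustration written for Python 3, not the dashboard's actual code: the function name `update_blacklist`, the two-column CSV layout, and the `blacklist` table are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL.

```python
import os
import sqlite3
import tempfile


def update_blacklist(conn, csv_path, scan_name, comment):
    """Write the entry to the CSV first; touch the database only if that works."""
    try:
        with open(csv_path, "a") as handle:
            handle.write("{},{}\n".format(scan_name, comment))
    except (IOError, OSError):
        # The filesystem could not be updated, so the database update fails too.
        return False
    with conn:  # commits on success, rolls back if the INSERT raises
        conn.execute(
            "INSERT INTO blacklist (scan_name, comment) VALUES (?, ?)",
            (scan_name, comment),
        )
    return True


# Hypothetical stand-ins for the real database and study folder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blacklist (scan_name TEXT, comment TEXT)")
workdir = tempfile.mkdtemp()

ok = update_blacklist(
    conn, os.path.join(workdir, "blacklist.csv"), "SCAN_01", "motion")
# A path inside a directory that does not exist makes the file write fail.
failed = update_blacklist(
    conn, os.path.join(workdir, "missing", "blacklist.csv"), "SCAN_02", "ghosting")
rows = conn.execute("SELECT scan_name FROM blacklist").fetchall()
```

Doing the file write first means a failed write leaves the database untouched, so the two never disagree because of a web-app update.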
This section under construction. - @DESm1th, May 23 2018
Brace yourself, this is going to be a long one.
- Fork the dashboard
- Clone your new fork
- Clone datman
- Ensure that pidentd, postgresql-9.5, postgresql-client-9.5 are installed on your machine
- Set Up a Virtual Environment
- Configure postgres for the dashboard
- Set up the database schema
- Set up OAuth
- Set up your config files for datman
- Set up your shell environment
- Run your new dashboard!
Currently datman and the dashboard need Python 2.7. The examples below use virtualenv since that's my preference, but you can use any Python package for virtual environments (e.g. conda).
Create a new virtual environment
virtualenv --python=python2.7 <your path>/venv
Activate your new environment
source <your path>/venv/bin/activate
Install the required python packages.
pip install -r <path to your dashboard clone>/requirements.txt
There are a few things that may go wrong with installing these packages, all centered on the cryptography package.
- If you get a message about how pip failed to build or failed to install the cryptography package make sure you have libssl-dev installed.
sudo apt install libssl-dev
- Occasionally old versions of the cryptography package are made obsolete due to discovered security vulnerabilities. If pip refuses to install the version requested by requirements.txt it should be perfectly fine to just use the newest version of cryptography instead.
If you find any other problems or can't get the cryptography package working please let us know by creating an issue here
There are three main changes needed to postgres' config files on your machine to enable the dashboard to access the database. You will need sudo to make these changes or need to ask an admin with sudo. For Ubuntu and PostgreSQL version 9.5 the config files will be found at /etc/postgresql/9.5/main/.
- Update postgresql.conf to listen on your IP and localhost. Add the following line:
listen_addresses = '<YOUR IP HERE>, localhost'
- Update pg_hba.conf with the correct 'host' records. If this is a development setup that only you will access adding the following will be fine:
host dashboard all 127.0.0.1/32 ident map=default
host dashboard all <YOUR IP HERE>/32 ident map=default
If your setup will be used on an internet facing server DO NOT use the ident method shown here. It's really easily fooled by anyone with ill intentions and only safe to use for your own machine or a private, secure local network. For more info on postgres authentication methods see here.
- Update pg_ident.conf to give your user account access to the web_user role and to a role named after your account. If you're using the ident method shown in step two then the following should work. <your account name> should be replaced with whatever account name you use to log into your OS.
default <your account name> web_user
default /(.*) \1
The second line will match your account name to a role of the same name. So for example if my account name is john_doe and I add this in place of <your account name> in the example config, I will be able to log into the dashboard database as either web_user (matches first line) or john_doe (matches second line).
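To make the matching rules in that example concrete, here is a simplified Python 3 model of how the two map lines resolve. It is a sketch of the behaviour described above, not PostgreSQL's implementation: the helper `roles_for` is hypothetical, and it only handles the two forms used here (a literal system user name, or a leading `/` followed by a regular expression whose first group is substituted for `\1`).

```python
import re


def roles_for(system_user, map_name, ident_lines):
    """Return the database roles a system user may log in as under one map."""
    roles = []
    for line in ident_lines:
        name, pattern, role = line.split()
        if name != map_name:
            continue
        if pattern.startswith("/"):
            # A leading slash marks the rest of the field as a regular expression.
            match = re.search(pattern[1:], system_user)
            if match:
                if "\\1" in role:
                    # \1 in the role field stands for the first captured group.
                    role = role.replace("\\1", match.group(1))
                roles.append(role)
        elif pattern == system_user:
            roles.append(role)
    return roles


ident_lines = [
    "default john_doe web_user",
    r"default /(.*) \1",
]
print(roles_for("john_doe", "default", ident_lines))
```

For john_doe this yields web_user from the first line and john_doe from the second; any other account only matches the second line and gets a role of its own name.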
Once you've made these changes you may need to restart postgres to get it to recognize them. On Ubuntu 16.04 you can do this with
sudo systemctl restart postgresql
In addition, you may need to create your user account yourself with
sudo -u postgres createuser -s <your_username>
The easiest way to set up your test database is to just load in an old backup of the real database (ask @DESm1th for this). Once you have the backup you can set up your database by running the following as a user with database creation privileges (see this section for configuring these privileges):
createdb dashboard
# If no host is specified psql will connect to postgres on your machine
psql dashboard -f $path_to_your_backup
You'll have to build an empty database from scratch from the SQL files to be added later. Sorry - @DESm1th, May 24 2018
If this is a development setup you may also want to give your user account superuser access to the database for convenience. To do this you must connect to the database as a user that already has superuser permissions (on most new installations this would be the postgres user). Run the following in your terminal:
sudo su postgres
psql dashboard
Once connected to the database with psql you can grant superuser to your account with
ALTER USER <your account name here> WITH SUPERUSER;
Instructions for how to do this on GitHub are found here and instructions to do this with GitLab are here.
The name and description can be anything but the URL you use should be http://<your ip>:5000. The Authorisation callback URL should be http://<your computer name or ip>:5000/callback/<provider>. Replace <provider> with either github or gitlab depending on which you are configuring.
The base URL you set for the callback URL is what you should use when you access your dashboard. If you use your computer name (or another name) in the callback URL but attempt to access it with your IP in your browser (or vice versa) authentication may fail if DNS is not properly configured.
So for example, if I use a URL of http://1.1.1.1:5000 and a callback URL of http://borg-cube:5000/callback/github without DNS configured, accessing the dashboard with http://borg-cube:5000 should work but http://1.1.1.1:5000 may fail when GitHub tries to contact the callback URL. To minimize problems, either set the URL and the base URL of the callback to be the same thing or make sure that the base URL of your callback is exactly what you intend to use when you access your app.
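If you want to sanity check a pair of URLs before registering them, a small comparison like the one below may help. It is a sketch using only the Python 3 standard library; `same_base` is a hypothetical helper, and it only compares scheme, host and port as text, so it says nothing about whether DNS can resolve the name.

```python
from urllib.parse import urlsplit


def same_base(access_url, callback_url):
    """True when both URLs share the same scheme, host and port."""
    access, callback = urlsplit(access_url), urlsplit(callback_url)
    return (access.scheme, access.hostname, access.port) == (
        callback.scheme, callback.hostname, callback.port)


# The two cases from the example above.
print(same_base("http://borg-cube:5000", "http://borg-cube:5000/callback/github"))
print(same_base("http://1.1.1.1:5000", "http://borg-cube:5000/callback/github"))
```

The first call prints True and the second prints False, which mirrors the case where logging in through the IP address may fail.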
The Client ID and Client Secret that are generated by GitHub/GitLab are needed for the shell environment as described here.
The dashboard reads almost all of its information from the database, but uses the config files to locate metadata that might need to be updated (e.g. scans.csv, blacklist.csv, the study's README) and scans to be viewed in the papaya viewer.
If you're a TIGRLab member the easiest way to get set up is to copy tigrlab_config.yaml and any needed study config files from /archive/code/config.
You should then update the following config entries:
- Update SERVER_LOG_DIR and change LOGSERVER to point to your machine if you want to make use of datman's logging server
- In SystemSettings delete the existing entries and add a new block that points to your projects. Here's an example template to use:
SystemSettings:
YOUR_SYSTEM_NAME_HERE:
DATMAN_PROJECTSDIR: '/path/to/your/datman/archive'
DATMAN_ASSETSDIR: '/path/to/your/datman/assets'
CONFIG_DIR: '/path/to/your/config/files/folder'
This page contains an overview of the datman configuration files; this page has detailed instructions for setting up a site config file; and this page has detailed instructions for setting up study config files.
There are two main options for configuring your shell to run a development instance of the dashboard.
If you use environment modules (TIGRLab does) you can create your own module from the template provided in dashboard.module.template. Just follow the comments and fill in your own passwords and other secrets. See here for advice on setting up a module using environment modules.
Remember that in addition to loading your module you'll have to source your virtual environment before running the dashboard with
source <path to your virtual env>/venv/bin/activate
If you don't use environment modules, or are more comfortable just sourcing a script, you can build your environment setup script from the following template:
# Provide OAuth details
# You can delete either the github or gitlab entries. You only need one configured
# Github
export OAUTH_SECRET_GITHUB=YOUR-SECRET-HERE
export OAUTH_CLIENT_GITHUB=YOUR-SECRET-HERE
# GitLab
export OAUTH_SECRET_GITLAB=YOUR-SECRET-HERE
export OAUTH_CLIENT_GITLAB=YOUR-SECRET-HERE
# Enable github issue support. If the repo issues will be added to is private, owner
# must be the same as the owner of the dashboard app itself
export GITHUB_OWNER=GITHUB-REPO-OWNER-ACCOUNT-HERE
export GITHUB_REPO=GITHUB-REPO-NAME-HERE
# Provide a secret key for Flask
# This can be whatever you want, but you should keep it secret and
# make it something not easily guessed since it's used to encrypt sessions
export FLASK_SECRET_KEY=YOUR-SECRET-HERE
# Provide Postgres info
# You can change the postgres user or database name here, just make sure
# everything is configured correctly in postgres
export POSTGRES_USER=web_user
export POSTGRES_DATABASE=dashboard
export POSTGRES_PASS=YOUR-SECRET-HERE
export POSTGRES_SRVR=YOUR-POSTGRES-SERVERS-IP-HERE
export ADMINS=ADMIN-COMMA-SEPARATED-EMAIL-ADDRESSES-HERE
export DASHBOARD_SUPPORT_EMAIL=MAIN-CONTACT-EMAIL-HERE
# Provide a redcap token to enable Scan Completed forms to be
# pulled in. This part is optional but you may have to fill in a fake value
# to get the dashboard to start :(
export REDCAP_TOKEN=YOUR-REDCAP-TOKEN-HERE
# Configure datman
# The dashboard requires datman. You can install it wherever you like but
# the site wide configuration file, site name, and the location of the code
# must be provided.
export DM_CONFIG=PATH-TO-YOUR-SITE-CONFIG-HERE
export DM_SYSTEM=YOUR-SYSTEM-NAME-HERE
# If you installed datman + the dashboard inside your environment you do not
# need to modify the python path or add them to your path and can omit this
# section. If you just cloned them somewhere, you do need it.
# Add datman to your paths
export PATH=<path to datman/datman folder>:$PATH
export PATH=<path to datman/bin folder>:$PATH
export PYTHONPATH=<path to datman folder>:$PYTHONPATH
# Add dashboard scripts to your paths
export PATH=<path to dashboard here>:$PATH
export PYTHONPATH=<path to dashboard here>:$PYTHONPATH
# Source your virtual environment for convenience
source <path to your virtualenv>/venv/bin/activate
And then just source your script before running the dashboard
source <path to your script>
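After sourcing the script it can be handy to confirm that nothing was left unset. The check below is a sketch in Python 3: the variable names come from the template above, but which of them the dashboard truly insists on is an assumption, and `missing_settings` is a hypothetical helper rather than part of the dashboard.

```python
import os

# Assumed to be needed for a development instance (names from the template).
REQUIRED = [
    "FLASK_SECRET_KEY",
    "POSTGRES_USER",
    "POSTGRES_DATABASE",
    "POSTGRES_PASS",
    "POSTGRES_SRVR",
    "DM_CONFIG",
    "DM_SYSTEM",
]
# Only one OAuth provider needs to be configured.
OAUTH_PAIRS = [
    ("OAUTH_CLIENT_GITHUB", "OAUTH_SECRET_GITHUB"),
    ("OAUTH_CLIENT_GITLAB", "OAUTH_SECRET_GITLAB"),
]


def missing_settings(environ):
    """List the settings that are unset or empty in the given environment."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    has_provider = any(
        environ.get(client) and environ.get(secret)
        for client, secret in OAUTH_PAIRS
    )
    if not has_provider:
        missing.append("OAUTH_CLIENT_*/OAUTH_SECRET_* (GitHub or GitLab)")
    return missing


print(missing_settings(os.environ))
```

An empty list means every variable the sketch checks for is present in the current shell.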
Once you've completed all the other steps to set up your development environment, open a shell, load your shell environment by either loading your module and sourcing your virtual environment or sourcing your setup script, and then start up your server in one of two ways:
- Use Flask's built in Werkzeug server with
python dashboard/run.py
This will give you debugging output when an error occurs and is more than sufficient for a development instance. It should not be used for a production server though.
- Use a temporary uWSGI instance with
bash dashboard/srv_uwsgi.sh
This gives you a setup closer to the TIGRLab production server, so you can toy with uWSGI settings before trying them out on the real server.
This section describes the TIGRLab's current production setup for our dashboard and how to make modifications to it if needed. It's only relevant to TIGRLab members who will be working with the production server :)
Server: srv-dashboard.camhres.ca (172.26.216.66)
NGINX and uWSGI configuration are controlled by ansible. See Dashboard Ansible Role for more info.
The uwsgi app runs as user clevis with the web_user role to access the database. These are configured in /etc/uwsgi/apps-available/dashboard.ini
The codebase is expected to be located at /archive/code/dashboard. This means that any updates or bug fixes to the dashboard need to be pulled into the archive and uWSGI needs to be restarted with systemctl restart uwsgi on srv-dashboard before they'll take effect.
Important: Some secret information is required for uWSGI. These passwords should not be committed to GitHub. The relevant passwords are in passpack and, as mentioned in Dashboard Ansible Role, separated into a file that doesn't get committed to git. Please maintain this separation to avoid any security issues.
Server: srv-postgres.camhres.ca (172.26.216.68)
Database name: dashboard
Access requires that postgresql-client and pidentd packages are both installed. This should already be the case on all lab workstations. New admin users may need access to the postgres web_user role. See Postgres Ansible Role for more info.
Once authentication is correctly configured the database should be accessible with
psql -h srv-postgres -U web_user dashboard
In addition, three postgres roles have been defined with access to the dashboard database.
- admin: Manage databases, manage roles on all databases
- dashboard: Read, Write, Delete etc. on dashboard database
- dashboard_read: Read only on dashboard database
One special user role has been defined. web_user is a member of the dashboard role and is used by the webapp front end.
Clevis is defined as a superuser to enable backups.
For now, to add a new study to the dashboard you must manually insert some records into the database.
- If the PI of the study is not already in the table 'people', they must be added
- A record must be added to the table 'studies'
- Any sites unique to this study must be added to the table 'sites'
- For each site in the study make a record in study_sites
- Add any unique series tags to scantypes
- For each scantype tag that may appear in this study's data, add a record to study_scantypes
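The order of those inserts matters because later tables refer to earlier ones. The sketch below walks through the same order in Python 3. Only the table names are taken from the list above: every column name is HYPOTHETICAL, the study, PI and tag values are made up, and an in-memory SQLite database stands in for PostgreSQL, so check the real schema before adapting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# HYPOTHETICAL minimal schema: only the table names come from the steps above.
conn.executescript("""
CREATE TABLE people (name TEXT PRIMARY KEY);
CREATE TABLE studies (id TEXT PRIMARY KEY, pi TEXT REFERENCES people(name));
CREATE TABLE sites (name TEXT PRIMARY KEY);
CREATE TABLE study_sites (study TEXT REFERENCES studies(id),
                          site TEXT REFERENCES sites(name));
CREATE TABLE scantypes (tag TEXT PRIMARY KEY);
CREATE TABLE study_scantypes (study TEXT REFERENCES studies(id),
                              scantype TEXT REFERENCES scantypes(tag));
""")


def add_study(conn, study, pi, sites, tags):
    """Insert the records for one study in dependency order, in one transaction."""
    with conn:  # all inserts commit together or not at all
        # OR IGNORE skips rows that already exist (shared PIs, sites, tags).
        conn.execute("INSERT OR IGNORE INTO people VALUES (?)", (pi,))
        conn.execute("INSERT INTO studies VALUES (?, ?)", (study, pi))
        for site in sites:
            conn.execute("INSERT OR IGNORE INTO sites VALUES (?)", (site,))
            conn.execute("INSERT INTO study_sites VALUES (?, ?)", (study, site))
        for tag in tags:
            conn.execute("INSERT OR IGNORE INTO scantypes VALUES (?)", (tag,))
            conn.execute("INSERT INTO study_scantypes VALUES (?, ?)", (study, tag))


add_study(conn, "STUDY1", "Dr. Example", ["CMH"], ["T1", "DTI"])
add_study(conn, "STUDY2", "Dr. Example", ["CMH"], ["T1"])
```

`INSERT OR IGNORE` is SQLite syntax; on PostgreSQL 9.5 the equivalent is `INSERT ... ON CONFLICT DO NOTHING`.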
For our lab, the dashboard's configuration can be found in the role 'dashboard' and srv-postgres' configuration is in the role 'postgres'. This section is only relevant for TIGRLab dashboard admins :)
The most important thing this role does is add the uWSGI and NGINX configuration for the dashboard to srv-dashboard. The NGINX configuration is just a copy of dashboard/templates/nginx_dashboard.conf.j2 while the uWSGI configuration is generated from dashboard/templates/dashboard_ini_template.j2 and dashboard/templates/dashboard.ini.j2.
NOTE: If you make any configuration changes directly to srv-dashboard without adding these changes to ansible, your changes may be obliterated the next time ansible is run. Also, it's just generally bad practice and makes it harder to recreate a working server if anything catastrophic happens. Don't do this!
dashboard_ini_template.j2 is the main source for uWSGI configuration. It holds all of the non-sensitive settings for the server. New environment variables and changes to how uWSGI will run should be added here.
dashboard.ini.j2 holds sensitive information that gets filled in to the "dashboard_ini_template.j2" template. It is stored separately in a directory that we never commit to GitHub and linked into the templates folder. Any new passwords or secrets should be added here and a line added to "dashboard_ini_template.j2" where the new secret will be filled in.
The most important tasks this role performs are configuring postgresql for the dashboard and configuring database backups.
As with the dashboard ansible role (and anything else managed by ansible) configuration changes should be made in ansible and not directly to the server.
The postgres role has templates for postgresql.conf, pg_ident.conf and pg_hba.conf. Any changes to postgres' configuration should be made to these templates. To give a new user access to the web_user role their username must be added to the pg_admins list in postgres/vars/main.yml. Only admins might require this role.
Ansible configures a cron job (/etc/cron.d/dashboard_bkup) that will dump the entire database nightly and store the result in /mnt/backup on srv-postgres. We currently keep three weeks' worth of backups. TIGRsrv is configured to copy the backups and store them at /mnt/backup/dashboard, so we should have two copies at all times.
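As a rough illustration of what a three-week retention window means for nightly dumps, the Python 3 sketch below picks out the dumps that have aged past it. This is not how the cron job is implemented (that lives in the ansible role); the function `expired` and the sample dates are assumptions made for the example.

```python
from datetime import date, timedelta

RETENTION = timedelta(weeks=3)


def expired(dump_dates, today):
    """Return the dump dates that fall outside the retention window."""
    cutoff = today - RETENTION
    return sorted(d for d in dump_dates if d < cutoff)


# Thirty consecutive nightly dumps ending on an arbitrary example date.
today = date(2018, 5, 24)
dumps = [today - timedelta(days=n) for n in range(30)]
old = expired(dumps, today)
print(len(old), "dumps are older than three weeks")
```

With one dump per night, the dumps that remain inside the window are the ones taken in the last 21 days plus today's.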
This is just a collection of small things to be aware of if you're going to write code for the dashboard. To help with profiling and debugging consider using the Flask Debug Toolbar (Thanks Mike for finding that! - Dawn). The profiler can help identify what's generating most of the load time and the SQLAlchemy tab can let you know if your code is accidentally generating a huge number of queries.
Flask's HTML templates use Jinja templating to render pages (docs here). The most important thing to be aware of is that while you can break up your pages into smaller, more readable, chunks by saving html in another file and then importing it with something like
{% include 'my_other_file.html' %}
performance-wise this is not always a good idea. Each and every time the 'include' statement is read when the website's page is loaded, the included file has to be read from the filesystem. File reads are (relatively) slow, and if the include is inside a loop with a large number of iterations you can easily add extra seconds of load time for a minimal boost in HTML readability.
Some tips to get the most out of Jinja without adding too much overhead:
- If you have a loop and want to 'include' the body of the loop from another file, it's better to keep the loop inside the included file (so the file is opened and read once, rather than once per iteration)
- If you have an 'if' statement and the body of it is included from another snippet, it's better to keep the 'if' in your original file, so you don't need to open the snippet just to discover the if statement failed
Also note that if you're organizing your html snippets in a nested folder you always need to give the full path from the root of the template directory to the file you want to include. If you get an error about a missing template, make sure you quoted the name of the file to be included.
{# Good #}
{% include 'my_snippet.html' %} {# for templates/my_snippet.html #}
{% include 'session/modals/incidental_findings.html' %} {# for templates/session/modals/incidental_findings.html #}
{# Bad #}
{% include my_snippet.html %} {# Missing quotes on file name #}
{% include 'incidental_findings.html' %} {# This file won't be found without the full path #}
SQLAlchemy is awesome and very powerful BUT sometimes it makes really naive queries. If you try to work with objects from dashboard.models like they're normal python objects you can very easily end up generating thousands of queries without realizing. For instance if you tried doing something like
from dashboard.models import Site

cmh = Site.query.get('CMH')
for timepoint in cmh.timepoints:
    # Do some stuff with the timepoint record here
    pass
This loop would generate one query to the database for each timepoint that has the site 'CMH'. If code like that were embedded in a function called in a jinja template then those extra queries would hit every time a user loaded the page. In these sorts of cases it is often better to craft your own queries using SQLAlchemy's query API.
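The difference between the two access patterns is easy to see by counting statements. The sketch below uses Python 3's built-in sqlite3 module as a stand-in, not SQLAlchemy or the real dashboard schema: the `timepoints` and `sessions` tables, their columns, and the `run` helper are all hypothetical and exist only to make the query counts visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# HYPOTHETICAL tables standing in for the real dashboard schema.
conn.executescript("""
CREATE TABLE timepoints (name TEXT PRIMARY KEY, site TEXT);
CREATE TABLE sessions (timepoint TEXT, num INTEGER);
""")
for i in range(50):
    name = "TP{:03d}".format(i)
    conn.execute("INSERT INTO timepoints VALUES (?, 'CMH')", (name,))
    conn.execute("INSERT INTO sessions VALUES (?, 1)", (name,))
conn.commit()

statements = []


def run(sql, params=()):
    """Execute a query and record it so the round trips can be counted."""
    statements.append(sql)
    return conn.execute(sql, params).fetchall()


# Naive pattern: one query for the list, then one more per timepoint.
for (name,) in run("SELECT name FROM timepoints WHERE site = 'CMH'"):
    run("SELECT num FROM sessions WHERE timepoint = ?", (name,))
naive_count = len(statements)

del statements[:]
# Hand-written alternative: a single JOIN fetches the same information.
rows = run(
    "SELECT t.name, s.num FROM timepoints t "
    "JOIN sessions s ON s.timepoint = t.name WHERE t.site = 'CMH'"
)
joined_count = len(statements)

print(naive_count, joined_count)
```

With 50 timepoints the naive loop issues 51 statements while the join issues one. The Flask Debug Toolbar's SQLAlchemy tab shows the same kind of count for real dashboard pages.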