/growthcleanr-web

A simple web application, dockerized for easy use of growthcleanr

Primary LanguageJinjaApache License 2.0Apache-2.0

growthcleanr-web

A simple web interface to the growthcleanr R package for cleaning clinical height and weight measurements. Presents a simple web interface for uploading CSV data, cleaning that data with growthcleanr, and making results available.

This application is intended for use in a containerized setting using Docker.

Note: please exercise appropriate caution when cleaning sensitive data that may contain PHI. This application does not differentiate between users, so all data processed through it will be visible to every user. When running this application on a local desktop, data stays on a single system, and should only be available to the user of that system. If it were to be deployed to a remote server which multiple users could access, additional steps would be required to ensure proper security and privacy controls are followed.

Written in Python 3 with the Flask web framework.

Installation

An up-to-date installation of Docker is required.

To run the webapp using Docker, run the image directly, specifying a port mapping:

% docker run -it -p 5000:5000 mitre/growthcleanr-web:latest

After the image is downloaded and run, visit http://localhost:5000/ in any browser to use the application.

Usage

This application receives CSV data files over its web interface, and then cleans data in those files using growthcleanr's cleangrowth() function. With a data file in the format specified by growthcleanr, use the app's Upload function to post your CSV file, and click "Refresh this page" until you see your that result dataset is done. Click the result file to download!

Note that the result files will have the word "cleaned" and a date/time prepended. This helps to avoid name conflicts if two files with the same name are uploaded.

Development

This application was developed using Python 3.8. To develop locally, in a clean environment (e.g., a virtualenv), clone the repository, switch to the new directory, and install dependencies:

% https://github.com/mitre/growthcleanr-web.git
% cd growthcleanr-web
% pip install -r requirements.txt

The application assumes that R and the R package growthcleanr are installed, and gcdriver.R (from growthcleanr's exec dir) is executable as /usr/local/bin/gcdriver.R.

With those pieces in place, start the application:

% python app.py

You should now be able to visit the application at http://localhost:5000/.

Note that when it first starts, the application will create the dataset, status, and result directories if necessary. New uploads land in dataset with a date/time prefix. Once a gcdriver process starts, a status file is created in status to track its process id (larger datasets can result in long-running processes of several minutes or more). When a process completes successfully, a results file will be written to result with the cleaned- prefix.

On every load of the index page in a browser, these three directories are scanned for determine dataset and process state. Status files for completed or failed processes should be cleaned up at this time.

Notice

Copyright 2020-2021 The MITRE Corporation.

Approved for Public Release; Distribution Unlimited. Case Number 19-2008