Tidepool Data Science Project Template

Creating a new repository from this template

You can create a new repository from this template manually by going here.

Step-by-step directions from GitHub are located here.

Instructions for updating this readme are as follows:

  • Everything in ( ) is an instruction.
  • Everything in [ ] should be changed.
  • (Delete everything above the [Project Name])

[Project Name]

-- Project Status: [Active, On-Hold, Completed]

-- Project Disclaimer: This work is [Exploratory, Pre-Production, For-Production]

Project Objective

The purpose of this project is to [___].

Definition of Done

This phase of the project will be done when [___].

Project Description

(Add a short paragraph with some details, Why?, How?, Link to Jira and/or Confluence) In order to learn [], we did [].

Technologies (Update this list)

  • Python (99% of the time)
  • Anaconda for our virtual environments
  • Pandas for working with data (99% of the time)
  • Google Colab for sharing examples
  • Plotly for visualization
  • Pytest for testing
  • Travis for continuous integration testing
  • Black for code style
  • Flake8 for linting
  • Sphinx for documentation
  • Numpy docstring format
  • pre-commit for githooks

Getting Started with the Conda Virtual Environment

  1. Install Miniconda. CAUTION for Python virtual env users: Anaconda will automatically update your .bash_profile so that conda is launched automatically when you open a terminal. You can deactivate it with the command conda deactivate, or you can edit your .bash_profile.
  2. If you are new to Anaconda check out their getting started docs.
  3. If you want the pre-commit githooks to install automatically, then follow these directions.
  4. Clone this repo (for help see this tutorial).
  5. In a terminal, navigate to the directory where you cloned this repo.
  6. Run conda update -n base -c defaults conda to update to the latest version of conda.
  7. Run conda env create -f conda-environment.yml --name [input-your-env-name-here]. This will download all of the package dependencies and install them in a conda (python) virtual environment. (Insert your conda env name in the brackets. Do not include the brackets)
  8. Run conda env list to get a list of conda environments and select the environment that was created from the conda-environment.yml file (hint: the environment name is at the top of the file).
  9. Run conda activate <conda-env-name> or source activate <conda-env-name> to start the environment.
  10. If you did not set up your global git-template to automatically install the pre-commit githooks, then run pre-commit install to enable the githooks.
  11. Run conda deactivate to stop the environment.

Getting Started with this project

  1. Raw data is being kept [here](Repo folder containing raw data) within this repo. (If using offline data, mention that and explain how others may obtain the data from the group.)
  2. Data processing/transformation scripts are being kept [here](Repo folder containing data processing scripts/notebooks)
  3. (Finish filling out this list)

Contributing Guide

  1. All are welcome to contribute to this project.
  2. Naming convention for notebooks is [short_description]-[initials]-[date_created]-[version], e.g. initial_data_exploration-jqp-2020-04-25-v-0-1-0.ipynb. That is: a short _ delimited description, the creator's initials, the date of creation, and a version number.
  3. Naming convention for data files, figures, and tables is [PHI (if applicable)]-[short_description]-[date created or downloaded]-[code_version], e.g. raw_project_data_from_mnist-2020-04-25-v-0-1-0.csv, or project_data_figure-2020-04-25-v-0-1-0.png.
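
The naming conventions above can be sketched as a pair of small helpers. This is only an illustration, not part of the template: the function names (version_tag, notebook_name, data_file_name) are hypothetical, and the literal "PHI-" prefix is an assumption about how the [PHI (if applicable)] field is written.

```python
from datetime import date


def version_tag(version: str) -> str:
    """Convert a dotted version such as '0.1.0' into the 'v-0-1-0' form."""
    return "v-" + version.replace(".", "-")


def notebook_name(short_description: str, initials: str, created: date, version: str) -> str:
    """Build [short_description]-[initials]-[date_created]-[version].ipynb."""
    return f"{short_description}-{initials}-{created.isoformat()}-{version_tag(version)}.ipynb"


def data_file_name(
    short_description: str,
    when: date,
    version: str,
    extension: str,
    phi: bool = False,
) -> str:
    """Build a data file, figure, or table name.

    The 'PHI-' prefix is an assumed spelling of the optional PHI marker.
    """
    prefix = "PHI-" if phi else ""
    return f"{prefix}{short_description}-{when.isoformat()}-{version_tag(version)}.{extension}"


# Reproduce the examples given in the contributing guide.
print(notebook_name("initial_data_exploration", "jqp", date(2020, 4, 25), "0.1.0"))
# initial_data_exploration-jqp-2020-04-25-v-0-1-0.ipynb
print(data_file_name("raw_project_data_from_mnist", date(2020, 4, 25), "0.1.0", "csv"))
# raw_project_data_from_mnist-2020-04-25-v-0-1-0.csv
```

Generating names from one helper keeps the date and version formats consistent across notebooks, data files, and figures.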

NOTE: PHI data is never stored in GitHub, and the .gitignore file includes this requirement as well.

Featured Notebooks/Analysis/Deliverables

Tidepool Data Science Team

Name (with GitHub link)    Tidepool Slack
Ed Nykaza                  @ed
Jason Meno                 @jason
Cameron Summers            @Cameron Summers