- A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
- A Cookiecutter template for creating a Python 3 command-line utility package with a PySimpleGUI interface.
cookiecutter is a command-line utility that creates projects from cookiecutters (project templates). See cookiecutter.readthedocs.io.
A project template that promotes good practices for reproducible data science (immutability of raw data, separation of exploratory code and "final" analysis code), while giving options for more or less complex projects.
- Works on Python 3.7 (earlier Python versions are untested).
- Choice of tool for managing packages and virtualenvs:
- Pipenv
- pip
- conda
- Modern CLI with Typer (see the sketch after this list).
- Batteries included: pandas, NumPy, SciPy, seaborn, and JupyterLab already installed.
- Consistent code quality: black, isort, autoflake, and pylint already installed.
- Pytest for testing.
- Provides an operations research (OR) demo.
- Provides a machine learning (ML) demo for sales prediction.
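For illustration, here is a minimal sketch of the kind of Typer-based command the generated `*_main.py` entry point is built around; the `train` command and its options are hypothetical and not part of the template.

```python
# Hypothetical sketch of a Typer CLI in the style of the generated entry point.
import typer

app = typer.Typer(help="Example CLI for a generated project.")


@app.command()
def train(data_path: str = "data/3_processed", epochs: int = 10) -> None:
    """Train a model from processed data (placeholder logic)."""
    typer.echo(f"Training on {data_path} for {epochs} epochs...")


if __name__ == "__main__":
    app()
```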
- Python 3.7+
- Cookiecutter Python package >= 1.4.0: this can be installed with pip or conda, depending on how you manage your Python packages:
Install the latest Cookiecutter and Pipenv:
$ pip install -U pipenv cookiecutter
or
$ conda config --add channels conda-forge
$ conda install cookiecutter
cookiecutter gh:juforg/cookiecutter-ds-py3gui
- use_gui: whether to include GUI packages, such as PySimpleGUI
- use_ml: whether to include machine learning packages, such as lightgbm and wandb
- use_or: whether to include operations research (OR) packages, such as mip and pulp
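If you prefer to skip the interactive prompts, the same choices can be supplied through Cookiecutter's Python API. The `"yes"`/`"no"` values below are assumptions about how the template phrases its choices; check the prompts for the exact accepted answers.

```python
# Non-interactive project generation via Cookiecutter's Python API.
from cookiecutter.main import cookiecutter

cookiecutter(
    "gh:juforg/cookiecutter-ds-py3gui",
    no_input=True,                 # answer prompts from extra_context instead of stdin
    extra_context={                # assumed yes/no phrasing; verify against the prompts
        "use_gui": "yes",
        "use_ml": "yes",
        "use_or": "no",
    },
)
```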
Get inside the project:
cd <repo_name>
pipenv shell # activates virtualenv
The directory structure of your new project looks like this:
├── LICENSE <- Your project's license.
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── 0_raw <- The original, immutable data dump.
│ ├── 1_external <- Data from third party sources.
│ ├── 2_interim <- Intermediate data that has been transformed.
│ └── 3_processed <- The final, processed data sets for modeling.
├── docker <- Docker files
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│ ├── data_dictionaries <- Data dictionaries
│ └── references <- Papers, manuals, and all other explanatory materials.
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `01_sj_exploratory_data_analysis.ipynb`.
├── output
│ ├── features <- Fitted and serialized features
│ ├── models <- Trained and serialized models, model predictions, or model summaries
│ └── reports <- Generated analyses as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── tests <- Test scripts
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── setup.py <- Makes the project pip-installable (`pip install -e .`) so the source code can be imported
├── {{cookiecutter.repo_name}} <- Source code for use in this project.
│ ├── __init__.py <- Makes {{cookiecutter.repo_name}} a Python package
│ │
│ ├── data <- Scripts to download or generate data, or run ETL
│ │ └── make_dataset.py
│ │
│ ├── features <- Scripts to turn raw data into features for modeling
│ │ └── build_features.py
│ ├── gui <- Scripts supporting interaction through the GUI
│ │ └── locale <- I18N (internationalization) resources
│ ├── models <- Scripts to train models and then use trained models to make
│ │ │ predictions
│ │ ├── predict_model.py
│ │ └── train_model.py
│ │
│ ├── visualization <- Scripts to create exploratory and results oriented visualizations
│ │ └── visualize.py
│ ├── {{cookiecutter.repo_name}}_gui_main.py <- Entry point for the GUI application (see the sketch after this tree).
│ ├── {{cookiecutter.repo_name}}_main.py <- Entry point for the command-line application.
│ ├── Snakefile <- Snakemake file with options for running the final analysis.
│ └── Makefile <- Makefile with commands like `make data` or `make train`.
│
├── tox.ini <- tox file with settings for running tox; see tox.readthedocs.io
├── Pipfile <- The Pipfile for reproducing the analysis environment
└── .gitignore <- GitHub's excellent Python .gitignore customized for this project
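As a rough, illustrative sketch of the style of the generated `{{cookiecutter.repo_name}}_gui_main.py` (the real entry point is more elaborate), a minimal PySimpleGUI event loop looks like this; the window contents are placeholders, not the template's actual layout.

```python
# Minimal PySimpleGUI event loop; window contents here are illustrative only.
import PySimpleGUI as sg

layout = [
    [sg.Text("Select a data file:"), sg.Input(key="-FILE-"), sg.FileBrowse()],
    [sg.Button("Run"), sg.Button("Exit")],
]

window = sg.Window("Demo", layout)

while True:
    event, values = window.read()
    if event in (sg.WIN_CLOSED, "Exit"):
        break
    if event == "Run":
        sg.popup(f"Would run the analysis on {values['-FILE-']}")  # placeholder action

window.close()
```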
pip install -r requirements.txt
py.test tests
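For reference, a test under `tests/` can be as simple as the sketch below; the data frame is a stand-in for something loaded from `data/3_processed` and is not part of the template.

```python
# tests/test_example.py - hypothetical test; replace with tests for your own modules.
import pandas as pd


def test_processed_data_has_no_missing_values():
    # Stand-in for a data set loaded from data/3_processed.
    df = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "target": [0, 1, 0]})
    assert not df.isnull().values.any()
```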
This project is meant to adapt (and borrows liberally from) Driven Data's cookiecutter-data-science structure and philosophy to slightly different needs.
- https://github.com/drivendata/cookiecutter-data-science
- https://github.com/crmne/cookiecutter-modern-datascience
- https://github.com/gvoysey/cookiecutter-python-scientific
- https://github.com/Jswig/cookiecutter-flexible-ml
- https://github.com/docker-science/cookiecutter-docker-science
- https://github.com/PySimpleGUI/PySimpleGUI
- https://github.com/tirthajyoti/DS-with-PySimpleGUI