A template for Python projects made using cookiecutter.
.
├── LICENSE
├── environment.yml
├── .gitignore
├── README.md
├── setup.py
├── data
│   ├── external
│   ├── interim
│   ├── processed
│   └── raw
├── figures
├── notebooks
├── references
├── reports
└── src
    ├── __init__.py
    ├── analysis
    ├── etl
    ├── utils
    └── viz
LICENSE
- License file for the project.
- Available options include MIT and BSD-3-Clause.
environment.yml
- The requirements file to reproduce the analysis environment.
.gitignore
- Ignores Python user profile temporary files.
README.md
- Project-specific README.
setup.py
- Makes project pip installable with
pip install -e ../
data
- This is the directory used to store all of the project's data. All files should go into one of the following folders.
data/external
- Data from third party sources.
data/interim
- Intermediate data that has been transformed.
data/processed
- The final, canonical data sets for analysis.
data/raw
- The original, immutable data dump.
figures
- Generated graphics and figures to be used in reporting.
notebooks
- Any Jupyter Notebooks go here.
references
- Data dictionaries, manuals, and all other exploratory materials.
reports
- Generated analysis as HTML, PDF, LaTeX, etc.
src
- All the scripts in the project go here.
src/__init__.py
- Makes src a Python module.
src/analysis
- Code that involves analysis on already-cleaned data. Code for cleaning data should go in src/etl.
- Multiple analysis files are numbered sequentially.
src/etl
- ETL (extract, transform, load) scripts go here: code that reads in source data, then cleans and standardizes it to prepare for analysis.
- Joins are included in the ETL process.
- Multiple ETL files are numbered sequentially.
src/utils
- Miscellaneous code goes here.
src/viz
- Code specific to developing graphics and visualizations should go here.
- Multiple viz files are numbered sequentially.
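As a rough illustration of how the layout above might be used from code, the sketch below maps the documented data folders to paths and lists sequentially numbered scripts in order. The helper names data_dirs and numbered_scripts are hypothetical and not part of the template, and the ordering assumes file names carry a zero-padded numeric prefix such as 01_ or 02_, which the template does not prescribe.

```python
from pathlib import Path

# Hypothetical helpers, not shipped with the template.
# The stage names mirror the data/ subfolders described above.
DATA_STAGES = ("external", "interim", "processed", "raw")


def data_dirs(project_root):
    """Return a {stage: path} mapping for each documented data subfolder."""
    root = Path(project_root)
    return {stage: root / "data" / stage for stage in DATA_STAGES}


def numbered_scripts(folder):
    """List the .py files in a folder sorted by name.

    Assumes a zero-padded numeric prefix (01_, 02_, ...) so that
    alphabetical order matches the intended sequence.
    """
    return sorted(
        path for path in Path(folder).glob("*.py") if path.name != "__init__.py"
    )
```

For example, data_dirs(".")["raw"] points at data/raw relative to the current folder, and numbered_scripts("src/etl") would return the ETL scripts in name order.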
Cookiecutter can be installed using either
pip install cookiecutter
or
conda install -c conda-forge cookiecutter
In the folder where you want to generate the project, run:
cookiecutter https://github.com/camartinezbu/cookiecutter-python-project
If you wish to create a conda environment based on the environment.yml
file, run:
# Go to the project's directory:
cd DIRECTORY_NAME
# Create a conda environment:
conda env create --file environment.yml
# Activate said environment:
conda activate ENVIRONMENT_NAME
Both DIRECTORY_NAME and ENVIRONMENT_NAME correspond to the project's slug name, defined when creating the template.
In order to set up the project's module, in the terminal run:
pip install -e ../
Or if you are in a notebook, run in a code cell:
! pip install -e ../
To use the module inside the notebook, add the following to the first cell:
%load_ext autoreload
%autoreload 2
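Because the install command above uses a relative path (../), notebooks are presumably run from a subfolder such as notebooks. A minimal sketch of locating the project root from there, so that paths like data/raw can be built without hard-coding, might look like the following. The function find_project_root is a hypothetical helper, not part of the template, and it assumes setup.py sits at the project root as shown in the directory tree.

```python
from pathlib import Path


def find_project_root(start=None, marker="setup.py"):
    """Walk upward from `start` until a folder containing `marker` is found.

    Hypothetical helper: assumes the marker file lives at the project root.
    """
    current = Path(start if start is not None else Path.cwd()).resolve()
    # Check the starting folder first, then each parent in turn.
    for folder in (current, *current.parents):
        if (folder / marker).is_file():
            return folder
    raise FileNotFoundError(f"No {marker} found at or above {current}")
```

In a notebook cell, find_project_root() / "data" / "raw" would then resolve to the raw data folder regardless of which subfolder the notebook lives in.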
This template was designed based on jvelezmagic's Cookiecutter Conda Data Science.
Check out a similar template for R here.