Python data analysis project structure template

Introduction

This is a suggested setup for a data analysis project. It uses open-source tools to help you streamline your data analysis workflow, maintain a sensible folder structure, and follow best practices. Specifically, it uses:

Poetry for environment and dependency management
Nbdev and LineaPy for seamless transitions from messy, exploratory Jupyter notebooks to reusable code and packages with beautiful documentation
Kedro for datasource management via data catalogs and reproducible/visualizable pipelines
Prefect to orchestrate and schedule pipelines, with retries and complex error handling

... and other modern utilities, such as ruff for linting and slipcover for code coverage

Getting started

Prerequisites

You need Poetry installed globally and a Python version >=3.8,<3.11
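
As a quick sanity check before running the setup steps, the Python constraint above can be tested with a few lines of standard-library code. This is only a sketch: the function name python_version_supported is hypothetical and is not part of this template.

```python
import sys

# Bounds copied from the prerequisite: Python >=3.8,<3.11.
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 11)


def python_version_supported(version=None):
    """Return True if the (major, minor) version satisfies >=3.8,<3.11.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


# Report whether the interpreter running this script is in range.
print("Python version supported:", python_version_supported())
```

Poetry enforces the same constraint itself when it resolves the environment, so this check is only a convenience for catching a mismatch early.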

Setup steps

Click on the Use this template button to create your own repository

Clone your repo locally using git clone <repo url>
Edit the values in change-my-values.yaml; project_name should match your repository's name

Run the following commands:

python create-repository.py
make reset-project-with-install

Example repository structure when finished

.
├── conf
│   ├── base
│   ├── local
│   └── README.md
├── create-repository.py
├── data
│   ├── 01_raw
│   ├── 02_intermediate
│   ├── 03_primary
│   ├── 04_feature
│   ├── 05_model_input
│   ├── 06_models
│   ├── 07_model_output
│   └── 08_reporting
├── docs
│   └── source
├── kedro-answers.yml
├── LICENSE
├── logs
├── Makefile
├── MANIFEST.in
├── notebooks
│   ├── analyses
│   ├── exploratory
│   ├── generate_figures
│   └── package
├── poetry.lock
├── _proc
│   ├── 00_core.ipynb
│   ├── _docs
│   ├── index.ipynb
│   ├── nbdev.yml
│   ├── _quarto.yml
│   └── styles.css
├── _pyproject.toml
├── pyproject.toml
├── README_kedro.md
├── README.md
├── settings.ini
├── setup.py
└── src
    ├── requirements.txt
    ├── setup.py
    ├── test_package_project
    └── tests
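
To confirm that the setup produced the data folders shown in the tree above, a short standard-library check can list any that are missing. This is a sketch under the assumption that it is run from the repository root; the helper name missing_data_layers is hypothetical and not part of the template.

```python
from pathlib import Path

# Data sub-folders as they appear in the example repository structure.
DATA_LAYERS = [
    "01_raw",
    "02_intermediate",
    "03_primary",
    "04_feature",
    "05_model_input",
    "06_models",
    "07_model_output",
    "08_reporting",
]


def missing_data_layers(project_root):
    """Return the expected data sub-folders that do not exist under <project_root>/data."""
    data_dir = Path(project_root) / "data"
    return [name for name in DATA_LAYERS if not (data_dir / name).is_dir()]


# Assumes the current working directory is the repository root.
print("Missing data folders:", missing_data_layers("."))
```

An empty list means every folder from 01_raw to 08_reporting is present.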

hoangthienan95/test-package-project
