Apache License 2.0

DP Framework

About

Turning low-value inputs into high-value outputs: the data value chain describes the full data lifecycle, from collection to analysis and usage, and it is not all about data transformation. DP Framework is an open-source, dbt-based framework that aims to support the whole process in the spirit of data democratization, in a way that is portable across many infrastructure choices and clouds.

Key characteristics of DP Framework:

  • Single unified integration layer to stop "reinventing the wheel".
  • Readiness for diverse environments: components can be selected flexibly and used interchangeably.
  • Ability to work in any environment and with any data storage.
  • Standardization, simplification and unification across projects (through templating).
  • Hides complexity from Analytics Engineers by grouping most interactions with a data platform into one user interface.

Components

data-pipelines-CLI & Project Template Factory

data-pipelines-CLI: Project on GitHub (documentation)

data-pipelines-CLI:

  • Building and managing data pipelines
  • Interaction with the whole data environment
  • Abstraction layer hiding complexity from the end user
  • Handling deployments and publications, automation support

Project Template Factory:

  • Defining standardized templates for your organization’s data pipelines
  • Differentiating config for different environments
  • Creating projects out of templates with a handy cookie cutter
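
One common way to differentiate config per environment is to keep a base config and overlay environment-specific overrides on top of it. The sketch below illustrates that idea only; the function `merge_config` and the keys `target`, `airflow`, `schedule` and `retries` are hypothetical and are not taken from Project Template Factory or data-pipelines-CLI.

```python
import copy


def merge_config(base, override):
    """Return a new dict where `override` values win over `base` values.

    Nested dicts are merged recursively; any other value is replaced as a whole.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base config shared by all environments.
base = {"target": "local", "airflow": {"schedule": "@daily", "retries": 1}}
# Hypothetical override for a production environment.
prod = {"target": "prod", "airflow": {"retries": 3}}

config = merge_config(base, prod)
print(config)
```

Keeping only the differences in the environment file means a new environment needs just the handful of values that actually change.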

dbt-airflow-factory

dbt-airflow-factory: Project on GitHub (documentation)

  • Parses dbt manifest files and builds orchestrator jobs (Apache Airflow, GCP Workflows, Databricks Workflows)
  • Highly customizable, pluggable runtime
  • The DAG is built on the fly, without materialization
  • Task grouping, hiding ephemeral models, etc.
  • Sends DAG failure notifications to a Slack or Microsoft Teams channel
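
The idea of turning a dbt manifest into orchestrator tasks while hiding ephemeral models can be sketched as follows. This is a minimal illustration and not dbt-airflow-factory's actual implementation or API: the function `build_task_graph` and the sample project `shop` are hypothetical, and the sketch only assumes the manifest layout in which each entry under `nodes` has `resource_type`, `config.materialized` and `depends_on.nodes`.

```python
def build_task_graph(manifest):
    """Map each non-ephemeral model to the set of models it must run after.

    Ephemeral models get no task of their own; their upstream models are
    attached directly to whatever depends on them.
    """
    models = {
        node_id: node
        for node_id, node in manifest["nodes"].items()
        if node.get("resource_type") == "model"
    }

    def is_ephemeral(node_id):
        return models[node_id].get("config", {}).get("materialized") == "ephemeral"

    def visible_parents(node_id):
        parents = set()
        for parent in models[node_id].get("depends_on", {}).get("nodes", []):
            if parent not in models:
                continue  # sources, seeds, etc. are ignored in this sketch
            if is_ephemeral(parent):
                # Skip over the ephemeral model and use its parents instead.
                parents |= visible_parents(parent)
            else:
                parents.add(parent)
        return parents

    return {
        node_id: visible_parents(node_id)
        for node_id in models
        if not is_ephemeral(node_id)
    }


# Hypothetical, heavily trimmed manifest with one ephemeral model in the middle.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "config": {"materialized": "view"},
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.int_orders": {
            "resource_type": "model",
            "config": {"materialized": "ephemeral"},
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "model.shop.fct_orders": {
            "resource_type": "model",
            "config": {"materialized": "table"},
            "depends_on": {"nodes": ["model.shop.int_orders"]},
        },
    }
}

graph = build_task_graph(manifest)
for task, upstream in sorted(graph.items()):
    print(task, "runs after", sorted(upstream))
```

An orchestrator adapter could then create one task per key and one dependency edge per upstream entry.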

Workshops

At GetInData, we have delivered a number of workshops on deploying dbt pipelines to production using engineering best practices with DP Framework.

Short demo of our Modern Data Platform with DP Framework: Watch Modern Data Platform with DP Framework demo

Tutorials

First Steps With DP Framework: GitHub

Blog posts & whitepapers

List of our publications on data platform architectures leveraging DP Framework:

  • Modern Data Platform - the what's, why's and how's? Demystifying the buzzword link

  • Announcing the GetInData Modern Data Platform - a self-service solution for Analytics Engineers link

  • GetInData Modern Data Platform - features & tools link

  • How we built a Modern Data Platform in 4 months for Volt.io, a FinTech scale-up. link

Presentations

Presentations about DP Framework at various conferences:

Contributions

All the components of DP Framework are open-source. Pull requests are welcome. Please check the detailed contribution instructions in each project's repository.

Contact us

Contact us & sign up for a DP Framework demo!