great-expectations
There are 63 repositories under the great-expectations topic.
iusztinpaul/energy-forecasting
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials
adidas/lakehouse-engine
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
josephmachado/data_engineering_best_practices
Sample project to demonstrate data engineering best practices
trannhatnguyen2/NYC_Taxi_Data_Pipeline
Nyc_Taxi_Data_Pipeline - DE Project
GokuMohandas/testing-ml
Learn how to create reliable ML systems by testing code, data and models.
provectus/data-quality-gate
Data Quality Gate based on AWS
NatanMish/data_validation
Tutorial for implementing data validation in data science pipelines
MDS-BD/hands-on-great-expectations-with-spark
How to evaluate the Quality of your Data with Great Expectations and Spark.
PrefectHQ/prefect-great-expectations
Prefect integrations for interacting with Great Expectations
luatnc87/modern-data-warehouse-modeling-and-data-quality-with-dbt-openmetadata
This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source tools
moritzkoerber/covid-19-data-engineering-pipeline
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
ismaildawoodjee/GreatEx
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
BirdiD/BirdiDQ
BirdiDQ leverages the power of the Python Great Expectations open-source library and combines it with the simplicity of natural language queries to effortlessly identify and report data quality issues, all at the tip of your fingers.
josephmachado/data_engineering_best_practices_log
Code to demonstrate data engineering metadata & logging best practices
grillazz/fastapi-greatexpectations
Run greatexpectations.io on ANY SQL engine using a REST API. Built on FastAPI, Pydantic and SQLAlchemy as a data quality tool.
luchonaveiro/open-source-data-stack
Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.
serialbandicoot/great-assertions
This library is inspired by the Great Expectations library. It makes the various expectations found in Great Expectations available when using the built-in Python unittest assertions.
datarootsio/notion-dbs-data-quality
Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.
great-expectations/cloud
Source code for the gx cloud agent
adidas/lakehouse-engine-docs
The goal of this project is to provide documentation for the Lakehouse Engine framework.
piyush-an/NYC-Restaurant-Inspection
A data warehousing application on NYC Food Inspection
k3ai/plugins
A lightweight tool to get an AI infrastructure stack up in minutes, not days. K3ai takes care of setting up K8s for you, deploys the AI tool of your choice and even runs your code on it.
phatnguyen080401/Real-Estate-Sale-Analytics
Create a data pipeline using the Lambda architecture with Spark, Kafka, Airflow and Snowflake
JuanCampbsi/analytics_engineering_airbnb
In this project, dbt, Great Expectations, Python and Pandas were used to transform and validate the "Inside Airbnb" dataset. The tools ensure quality data, ready for analysis.
KoenvdBerg/csv-validator
Validates tabular CSV data using predefined validations, inspired by its Python counterpart "Great Expectations".
luchonaveiro/great-expectations-postgres-tutorial
Tutorial using the Great Expectations library to validate and profile data on a local PostgreSQL database.
xxl4tomxu98/data-engineering-python-great-expectations
Demo on Data Engineering using the Great Expectations API
anilkulkarni87/databricks_notebooks
A collection of Databricks notebooks for testing and learning
MagisterUnivers/undefined-Team-Project
A great project with top teammates. [...undefined] will break through the roof!!!
PbVrCt/time-series-pipeline
A pipeline to forecast the direction of stock prices using data from eodhistoricaldata.com
aravinthsci/great-expectations-site
Dockerizing Data Docs autogenerated by Great Expectations using FastAPI and Jinja templates.
brendajanuario/pipeline-bigdata-pyspark
Personal data engineering project whose objective is to create a Data Lakehouse for a B2B e-commerce business that must store its transactional and analytical data. The final system delivers structured, clean data for generating reports and finding opportunities.
ismaildawoodjee/Great-Expectations-for-CSV
Ensuring data quality in an e-commerce data set using Great Expectations.
J6Software/Jan6Coin
The $JAN6 Commemorative Coin. The Only MEME that shouts FREEDOM!
MagisterUnivers/Undefined-project-2
A second project with top teammates. [...undefined] will keep the quality, as always.
VuBacktracking/data-batch-processing
Batch Data Processing Pipeline using MinIO, Spark, PostgreSQL, Great Expectations, dbt and DBeaver