jakemongaya's Stars
apache/polaris
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
gptscript-ai/gptscript
Build AI assistants that interact with your systems
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
ivoalbrecht/data-content
A repository for the best data content, from data science to data engineering
streamdal/plumber
A Swiss Army knife CLI tool for interacting with Kafka, RabbitMQ and other messaging systems.
kanton-bern/hellodata-be
The Open-Source Enterprise Data Platform in a single Portal
getindata/dbt-flink-adapter
Adapter for dbt that executes dbt pipelines on Apache Flink
apache/iceberg
Apache Iceberg
datahub-project/datahub
The Metadata Platform for your Data Stack
vektra/mockery
A mock code autogenerator for Go
gorse-io/gorse
Gorse open source recommender system engine
notiz-dev/prisma-dbml-generator
Prisma DBML Generator
opentofu/manifesto
The OpenTF Manifesto expresses concern over HashiCorp's switch of the Terraform license from open-source to the Business Source License (BSL) and calls for the tool's return to a truly open-source license.
pingcap/tidb
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
acryldata/dbt-impact-action
OpenLineage/OpenLineage
An Open Standard for lineage metadata collection
apache/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
paypal/data-contract-template
Template for a data contract used in a data mesh.
paypal/hera
High Efficiency Reliable Access to data stores
open-metadata/OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
sourcery-ai/sourcery
Instant AI code reviews
dotansimha/graphql-code-generator
A tool for generating code based on a GraphQL schema and GraphQL operations (query/mutation/subscription), with flexible support for custom plugins.
uber/cadence
Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logic in a scalable and resilient way.
borjavb/slatraker
Track SLAs
sqlalchemy/sqlalchemy
The Database Toolkit for Python
cloudevents/spec
CloudEvents Specification
capitalone/DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
EqualExperts/dbt-unit-testing
This dbt package contains macros to support unit testing that can be (re)used across dbt projects.
sdv-dev/SDV
Synthetic data generation for tabular data
astronomer/astronomer-cosmos
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code