Pinned Repositories
blog-tpcds-dbt-duckdb
This repository contains the TPC-DS queries together with the code required to run the benchmark with dbt and DuckDB.
conveyor-roadmap
This is the public roadmap for Conveyor.
conveyor-samples
Samples on how to use Conveyor.
conveyor-templates
Cookiecutter templates used by Conveyor.
demo-elections2024-website
demo-llm-hackathon
lighthouse
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines and apply best practices.
python-and-spark-for-data-analysis
A four-day course on Python, the scientific Python stack, and PySpark, adapted from a training course given by Patrick Varilly to one of our clients in December 2015.
spark_on_azure_batch_demo
webinar-containers
Dataminded's Repositories
datamindedbe/lighthouse
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines and apply best practices.
datamindedbe/blog-tpcds-dbt-duckdb
This repository contains the TPC-DS queries together with the code required to run the benchmark with dbt and DuckDB.
datamindedbe/demo-elections2024-website
datamindedbe/demo-llm-hackathon
datamindedbe/incubator-sync-upgrade
datamindedbe/blog-platform-quack-quack-ka-ching
The duck escapes with the credits.
datamindedbe/conveyor-samples
Samples on how to use Conveyor.
datamindedbe/iceberg-ingestion
Public repository containing sample code that shows how to improve ETL ingestion processes with Apache Iceberg.
datamindedbe/homebrew-conveyor-formulas
Homebrew tap repository for Conveyor.
datamindedbe/terraform-provider-conveyor
datamindedbe/academy_git
datamindedbe/academy_linux
datamindedbe/conveyor-templates
Cookiecutter templates used by Conveyor.
datamindedbe/dbt-testing-hackathon
datamindedbe/playground-duckdb-wasm
datamindedbe/academy-capstone
datamindedbe/aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions.
datamindedbe/dbt-conveyor-snowflake
The Conveyor Snowflake adapter is a thin shell around the Snowflake adapter that lets users in Conveyor IDEs authenticate with Snowflake to run dbt projects.
datamindedbe/dbt-playground
Try out dbt in a Gitpod environment in one click, with a Postgres database preconfigured.
datamindedbe/ecr-mirror
Mirror public repositories to internal ECR repositories.
datamindedbe/eks-spark-benchmark
Performance optimization for Spark running on Kubernetes.
datamindedbe/git-credential-oauth
A Git credential helper that securely authenticates to GitHub, GitLab, and Bitbucket using OAuth.
datamindedbe/iris
Artifacts related to a training course on running stream processing pipelines.
datamindedbe/kubernetes_academy_course
datamindedbe/playground-engine-query
datamindedbe/snowflake-gitpod
datamindedbe/spark-sql-perf
datamindedbe/terraform-aws-eks
Terraform module to create an Elastic Kubernetes Service (EKS) cluster and associated resources 🇺🇦
datamindedbe/terraform-provider-dmcloud
datamindedbe/webinar-cross-dag-Airflow