Pinned Repositories
.vim
My vim configurations including syntax, colors and plugins
ai
airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
amazon-emr-on-eks-labs
Amazon EMR on EKS use case labs
ansible-hortonworks
Ansible playbooks for deploying Hortonworks Data Platform and DataFlow
data-platform-containers
Container images for a big data platform on k8s
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
jupyterlab
JupyterLab computational environment.
spark
Mirror of Apache Spark
khwj's Repositories
khwj/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
khwj/data-platform-containers
Container images for a big data platform on k8s
khwj/spark
Mirror of Apache Spark
khwj/ai
khwj/airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
khwj/amazon-emr-on-eks-labs
Amazon EMR on EKS use case labs
khwj/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
khwj/apache-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
khwj/arrow-rs
Official Rust implementation of Apache Arrow
khwj/aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions
khwj/brew
🍺 The missing package manager for macOS (or Linux)
khwj/connectors
Connectors for Delta Lake
khwj/cube
📊 Cube — Headless Business Intelligence for Building Data Applications
khwj/data-platform-charts
khwj/delta-rs
A native Rust library for Delta Lake, with bindings into Python and Ruby.
khwj/dremio-oss
Dremio - the missing link in modern data
khwj/druid
Apache Druid (Incubating) - Column-oriented distributed data store ideal for powering interactive applications
khwj/grafana
The tool for beautiful monitoring and metric analytics & dashboards for Graphite, InfluxDB & Prometheus & More
khwj/hive
Apache Hive
khwj/hive-driver
Driver for connection to Apache Hive via Thrift API
khwj/howtographql
The Fullstack Tutorial for GraphQL
khwj/jupyterlab-filesystem
khwj/jupyterlab-github
GitHub integration for JupyterLab
khwj/kaniko
Build Container Images In Kubernetes
khwj/personal-analytics
khwj/polynote
A better notebook for Scala (and more)
khwj/react-query-course
The project that is built over the course of the React Query course at ui.dev
khwj/rust-the-book
The Rust Programming Language
khwj/spark-on-k8s-ui-proxy
khwj/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.