kbendick
I work on distributed systems, mostly big data, as an open source dev on Apache Iceberg and friends. But mostly, I walk my dog a lot.
@tabular-io Los Angeles, CA
Pinned Repositories
iceberg
Apache Iceberg
spark
Apache Spark - A unified analytics engine for large-scale data processing
docker-spark-iceberg
docker-spark-iceberg
flink
Apache Flink
kafka
Mirror of Apache Kafka
MongoMart
Final course project for M101JS - MongoDB for NodeJS Developers at Mongo University
kbendick's Repositories
kbendick/docker-spark-iceberg
kbendick/nessie
Nessie provides Git-like capabilities for your Data Lake
kbendick/flink
Apache Flink
kbendick/academy
Ray tutorials from Anyscale
kbendick/arctic
Arctic is a streaming lake warehouse service open sourced by NetEase
kbendick/arrow-rs
Official Rust implementation of Apache Arrow
kbendick/avro
Apache Avro is a data serialization system.
kbendick/dbt-spark
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
kbendick/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
kbendick/docker-stacks
Ready-to-run Docker images containing Jupyter applications
kbendick/graviton2-workshop
kbendick/iceberg
Apache Iceberg (Incubating)
kbendick/iceberg-rs
kbendick/jnr-ffi
Java Abstracted Foreign Function Layer
kbendick/k8s-device-plugin
NVIDIA device plugin for Kubernetes
kbendick/mermaid
Generation of diagram and flowchart from text in a similar manner as markdown
kbendick/message-backend-ray
A Ray port of the message backend
kbendick/ngods-stocks
New Generation Opensource Data Stack Demo
kbendick/orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
kbendick/ozone
Scalable, redundant, and distributed object store for Apache Hadoop
kbendick/parquet-mr
Apache Parquet
kbendick/presto
The official home of the Presto distributed SQL query engine for big data
kbendick/python-zstandard
Python bindings to the Zstandard (zstd) compression library
kbendick/querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
kbendick/scalingpythonml
Scaling Python Machine Learning
kbendick/spark
Apache Spark
kbendick/spark-cassandra-connector
DataStax Spark Cassandra Connector
kbendick/superset
Apache Superset is a Data Visualization and Data Exploration Platform
kbendick/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
kbendick/zstd-jni
JNI binding for Zstd