Pinned Repositories
amoro
Amoro is a Lakehouse management system built on open data lake formats.
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
arrow-datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
arthas
Alibaba Java Diagnostic Tool Arthas
atlas
Apache Atlas
duckdb-journey
flink
Apache Flink
flink-cdc
Flink CDC is a streaming data integration tool
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
spark
Apache Spark - A unified analytics engine for large-scale data processing
beryllw's Repositories
beryllw/duckdb-journey
beryllw/amoro
Amoro is a Lakehouse management system built on open data lake formats.
beryllw/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
beryllw/arrow-datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
beryllw/beryllw.github.io
beryllw/flink
Apache Flink
beryllw/flink-cdc
Flink CDC is a streaming data integration tool
beryllw/incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
beryllw/spark
Apache Spark - A unified analytics engine for large-scale data processing
beryllw/BigDataNotes
beryllw/chunjun
A data integration framework
beryllw/compass
Compass is a task diagnosis platform for big data
beryllw/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
beryllw/flink-cdc-playground
Playground for flink-cdc
beryllw/fluss
Fluss is a streaming storage built for real-time analytics.
beryllw/gitbook
The open source frontend for GitBook doc sites
beryllw/gluten
beryllw/gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
beryllw/gravitino-playground
A playground to experience Gravitino
beryllw/incubator-paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
beryllw/kyuubi-shaded
Apache Kyuubi Shaded Dependencies.
beryllw/rust-by-practice
Learning Rust By Practice, narrowing the gap between beginner and skilled-dev through challenging examples, exercises and projects.
beryllw/rust-study
beryllw/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
beryllw/spark-benchmark
SQL benchmark for testing Spark + Gluten
beryllw/spark-eventlog-connector
Spark Eventlog Connector built on the DataSourceV2 API
beryllw/starrocks-connector-for-apache-flink
beryllw/temporary-work
beryllw/tugraph-analytics
TuGraph-analytics is a distributed streaming graph computing engine.
beryllw/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.