Pinned Repositories
amoro
Amoro is a Lakehouse management system built on open data lake formats.
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
arrow-datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
arthas
Alibaba Java Diagnostic Tool Arthas
atlas
Apache Atlas
beryllw.github.io
flink
Apache Flink
flink-cdc
Flink CDC is a streaming data integration tool
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
spark
Apache Spark - A unified analytics engine for large-scale data processing
beryllw's Repositories
beryllw/amoro
Amoro is a Lakehouse management system built on open data lake formats.
beryllw/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
beryllw/arrow-datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
beryllw/beryllw.github.io
beryllw/flink
Apache Flink
beryllw/flink-cdc
Flink CDC is a streaming data integration tool
beryllw/incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
beryllw/incubator-uniffle
Uniffle is a high-performance, general-purpose Remote Shuffle Service.
beryllw/spark
Apache Spark - A unified analytics engine for large-scale data processing
beryllw/spark-connector-eventlog
Spark EventLog Connector
beryllw/BigDataNotes
beryllw/compass
Compass is a task diagnosis platform for big data
beryllw/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
beryllw/gluten
beryllw/iceberg
Apache Iceberg
beryllw/incubator-paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
beryllw/kyuubi-shaded
Apache Kyuubi Shaded Dependencies.
beryllw/pyspark-ai
English SDK for Apache Spark
beryllw/quine
Quine • a streaming graph • https://quine.io • Slack: https://that.re/quine-slack
beryllw/rust-study
beryllw/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
beryllw/spark-benchmark
SQL benchmark for testing Spark + Gluten
beryllw/spark-clickhouse-connector
Spark ClickHouse Connector built on the DataSourceV2 API
beryllw/spark-eventlog-connector
Spark Eventlog Connector built on the DataSourceV2 API
beryllw/spark-website
Apache Spark Website
beryllw/sqlancer
Automated testing to find logic bugs in database systems
beryllw/street-fighter-ai
This is an AI agent for Street Fighter II Champion Edition.
beryllw/temporary-work
beryllw/tugraph-analytics
TuGraph-analytics is a distributed streaming graph computing engine.
beryllw/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.