Pinned Repositories
celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
spark
Apache Spark - A unified analytics engine for large-scale data processing
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
ClickHouse-Native-JDBC
ClickHouse Native Protocol JDBC implementation
compass
Compass is a task diagnosis platform for big data
Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers
gluten
incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
spark-tfrecord
Read and write TensorFlow TFRecord data from Apache Spark.
mcdull-zhang's Repositories
mcdull-zhang/spark-tfrecord
Read and write TensorFlow TFRecord data from Apache Spark.
mcdull-zhang/gluten
mcdull-zhang/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
mcdull-zhang/celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
mcdull-zhang/ClickHouse-Native-JDBC
ClickHouse Native Protocol JDBC implementation
mcdull-zhang/compass
Compass is a task diagnosis platform for big data
mcdull-zhang/Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers
mcdull-zhang/hadoop
Apache Hadoop
mcdull-zhang/hive
Apache Hive
mcdull-zhang/hue
Open source SQL Query Assistant service for Databases/Warehouses
mcdull-zhang/incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
mcdull-zhang/hudi
Upserts, Deletes And Incremental Processing on Big Data.
mcdull-zhang/incubator-seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
mcdull-zhang/mcdull-zhang.github.io
mcdull-zhang/netty
Netty project - an event-driven asynchronous network application framework
mcdull-zhang/orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
mcdull-zhang/RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
mcdull-zhang/spark
Apache Spark - A unified analytics engine for large-scale data processing
mcdull-zhang/spark-clickhouse-connector
Spark ClickHouse Connector built on the DataSourceV2 API and gRPC protocol.
mcdull-zhang/spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
mcdull-zhang/SparkNote
mcdull-zhang/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.