warrenzhu25

Thinking for a living

@Google

Pinned Repositories

spark
Apache Spark - A unified analytics engine for large-scale data processing
Language: Scala 39.4k 2k 0 28.2k
algorithm004-05
Language: Java 0 0
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Language: Rust 0 0
arthas
Alibaba Java Diagnostic Tool Arthas
Language: Java 0 0
awesome-scalability
High Scalability, High Availability, High Stability, High Performance, and High Intelligence Back-end Designs
0 0
Bargain-on-Azure
Real-time dynamic discount based on Azure's intelligent resource usage prediction
Language: Java 0 0
Knowledge
Record useful ideas and guides
Language: Python 1 1
sparkInsight
Spark auto performance tuning and failure analysis tool
Language: Scala 1 0

warrenzhu25's Repositories

warrenzhu25/sparkInsight
Spark auto performance tuning and failure analysis tool
Language: Scala 1 0
warrenzhu25/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Language: Rust 0 0
warrenzhu25/arthas
Alibaba Java Diagnostic Tool Arthas
Language: Java 0 0
warrenzhu25/ClickHouse
ClickHouse® is a free analytics DBMS for big data
Language: C++ 0 0
warrenzhu25/blaze
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
warrenzhu25/compass
Compass is a task diagnosis platform for big data
warrenzhu25/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
warrenzhu25/dolphinscheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available out of box.
warrenzhu25/dotfiles
warrenzhu25/flink
Apache Flink
Language: Java
warrenzhu25/fluent-bit-kubernetes-logging
Fluent Bit Kubernetes Daemonset
warrenzhu25/geektime_dl
Put Geektime into your Kindle; includes perks such as Kuaishou internal referrals
Language: Python
warrenzhu25/gluten
Gluten: Plugin to Double SparkSQL's Performance
Language: Scala
warrenzhu25/grokking-the-object-oriented-design-interview
warrenzhu25/hadoop
Apache Hadoop
warrenzhu25/hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
warrenzhu25/incubator-kyuubi
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
Language: Scala
warrenzhu25/incubator-livy
Mirror of Apache Livy (Incubating)
warrenzhu25/incubator-uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
warrenzhu25/leetCodeSolutions
Get the most-voted LeetCode solutions for a problem
warrenzhu25/pyspark-ai
English SDK for Apache Spark
Language: Python
warrenzhu25/RemoteShuffleService
Language: Java
warrenzhu25/spark
Apache Spark
Language: Scala 3
warrenzhu25/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
warrenzhu25/spark-rapids-tools
User tools for Spark RAPIDS
warrenzhu25/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
warrenzhu25/sql-server-samples
Azure Data SQL Samples - Official Microsoft GitHub Repository containing code samples for SQL Server, Azure SQL, Azure Synapse, and Azure SQL Edge
warrenzhu25/tidb
TiDB is an open source distributed HTAP database compatible with the MySQL protocol
warrenzhu25/Tpc-benchmark
Tool for generating TPC-DS data and running benchmarks
Language: Scala 1
warrenzhu25/uber-RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.