powerwu's Stars
oxnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Angel-ML/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
lw-lin/CoolplaySpark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
apache/paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
dokterdok/Continuity-Activation-Tool
An all-in-one tool to activate and diagnose macOS 10.10-12 Continuity on compatible Mac configurations.
linkedin/dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
apache/celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
entron/entity-embedding-rossmann
RevolutionAnalytics/RHadoop
RHadoop
ColZer/DigAndBuried
Digging pits and filling them in
jseidman/hadoop-R
Example code for running R on Hadoop
alibaba/SparkCube
SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
aliyun/aliyun-maxcompute-data-collectors
nexr/RHive
RHive is an R extension facilitating distributed computing via Apache Hive.
aliyun/aliyun-odps-java-sdk
ODPS SDK for Java Developers
ihainan/SparkInternals
Learning notes of Apache Spark source code
databricks/pig-on-spark
Proof-of-concept implementation of Pig-on-Spark integrated at the logical node level
yclim/spark-kafka-offset-monitoring
theopenlab/leveldbjni
A Java Native Interface to LevelDB