Pinned Repositories
gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
iceberg
Apache Iceberg
incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
spark
Apache Spark - A unified analytics engine for large-scale data processing
uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
jerryshao.github.com
My Jekyll web page, forked from Jekyll
spark-hive-streaming-sink
A sink to save Spark Structured Streaming DataFrame into Hive table
spark-kafka-0-8-sql
Spark Structured Streaming Kafka 0.8 Source Implementation
streaming-demo
A Spark Streaming demo framework that implements and improves the functions of Twitter Rainbird
tensorflow
An Open Source Machine Learning Framework for Everyone
jerryshao's Repositories
jerryshao/spark-kafka-0-8-sql
Spark Structured Streaming Kafka 0.8 Source Implementation
jerryshao/spark-hive-streaming-sink
A sink to save Spark Structured Streaming DataFrame into Hive table
jerryshao/streaming-demo
A Spark Streaming demo framework that implements and improves the functions of Twitter Rainbird
jerryshao/jerryshao.github.com
My Jekyll web page, forked from Jekyll
jerryshao/spark2-ambari-definition
Ambari definition to install Spark 2.0
jerryshao/spark-atlas-connector
A Spark Atlas connector to track data lineage in Apache Atlas
jerryshao/spark-streaming-kafka-0-10-connector
A Kafka 0.10 connector for Spark 1.x Streaming
jerryshao/spark-website
Mirror of Apache Spark Website
jerryshao/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
jerryshao/gravitino
A high-performance, geo-distributed and federated metadata lake
jerryshao/hudi
Upserts, Deletes And Incremental Processing on Big Data.
jerryshao/incubator-iceberg
Apache Iceberg (Incubating)
jerryshao/incubator-livy
Mirror of Apache Livy (Incubating)
jerryshao/incubator-spark
Mirror of Apache Spark
jerryshao/incubator-uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
jerryshao/kafka-input-format
A Kafka input format used in Hadoop or Spark for batch reading data from Kafka
jerryshao/spark
Scala framework for iterative and interactive cluster computing.
jerryshao/spark-terasort
Spark Terasort
jerryshao/apache-spark
Mirror of Apache Spark
jerryshao/gravitino-playground
A playground to experience Gravitino
jerryshao/gravitino-site
Apache Gravitino
jerryshao/HiBench
HiBench is a big data benchmark suite.
jerryshao/hive
Mirror of Apache Hive
jerryshao/incubator-livy-website
Mirror of Apache Livy (Incubating)
jerryshao/livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
jerryshao/Mastering-Machine-Learning-with-Spark
Errata for Mastering Machine Learning with Spark
jerryshao/shc
jerryshao/storm-test-framework
Storm performance test framework
jerryshao/tensorflow
Computation using data flow graphs for scalable machine learning
jerryshao/zeppelin
Mirror of Apache Zeppelin