Pinned Repositories
airflow
Apache Airflow
banking-faq-bot
A retrieval-based chatbot built on FAQs from a banking website.
docker-druid
Druid Docker
docker-stacks
Opinionated stacks of ready-to-run Jupyter applications in Docker.
docker-zeppelin
Docker image for Apache Zeppelin
druid
Column-oriented distributed data store ideal for powering interactive applications
druid-docker
Docker containers for Druid nodes
hive
Mirror of Apache Hive
tpch-kit
TPC-H kit for Hive and Spark
jerryjung's Repositories
jerryjung/tpch-kit
TPC-H kit for Hive and Spark
jerryjung/airflow
Apache Airflow
jerryjung/banking-faq-bot
A retrieval-based chatbot built on FAQs from a banking website.
jerryjung/docker-druid
Druid Docker
jerryjung/docker-stacks
Opinionated stacks of ready-to-run Jupyter applications in Docker.
jerryjung/docker-zeppelin
Docker image for Apache Zeppelin
jerryjung/druid
Column-oriented distributed data store ideal for powering interactive applications
jerryjung/druid-docker
Docker containers for Druid nodes
jerryjung/hive
Mirror of Apache Hive
jerryjung/incubator-spark
Mirror of Apache Spark
jerryjung/kube-yarn
Running YARN on Kubernetes with the PetSet controller.
jerryjung/logstash-output-jdbc
JDBC output for Logstash
jerryjung/mlflow
Open source platform for the complete machine learning lifecycle
jerryjung/presto
Distributed SQL query engine for big data
jerryjung/seldon-server
Enterprise machine learning platform for prediction and recommendation.
jerryjung/spark
Mirror of Apache Spark
jerryjung/spark-druid-olap
Sparkline BI Accelerator is a Spark-native Business Intelligence stack geared towards providing fast ad-hoc querying over a Logical Cube (aka Star-Schema).
jerryjung/spark-kubernetes
Apache Spark on Kubernetes
jerryjung/spark-pivot-examples
Spark pivot examples
jerryjung/SparkInternals
Notes on the design and implementation of Apache Spark
jerryjung/sun-java-formula
Flexible provisioning for JDK and JRE tarballs
jerryjung/tensorflow-recommendation-wals
An end-to-end solution for website article recommendations based on Google Analytics data. The model is based on matrix-factorization using the WALS algorithm, in TensorFlow, trained on Cloud ML Engine. Recommendations are served using App Engine Flex with Cloud Endpoints. Orchestration is performed using Airflow, running on Cloud Composer or GKE.
jerryjung/tpch-spark
TPC-H queries in Spark SQL using the native DataFrame API
jerryjung/tranquility
Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover, seamlessly and without downtime.