Pinned Repositories
apachecon-bigtop
Resources for the demo given during the Bigtop presentation at ApacheCon 2013
azkaban-jobs
A few example jobs for Azkaban
bdtc-hive
Presentation on Apache Hive at Big Data TechCon
cloudcon-hive
Data set and queries that I use in my Hive and Impala presentations. Slides are usually posted at slideshare.net/markgrover
hadoop-intro-fast
Intro to Hadoop tutorial
hive-translate
Translate UDF for Apache Hive
homebrew-cdh
Homebrew packages for CDH
oscon-bigtop
Presentation on Apache Bigtop at OSCON 2013
spark-kafka-app
spark-secure-kafka-app
Sample Spark Streaming application for secure consumption from Kafka
markgrover's Repositories
markgrover/spark-secure-kafka-app
Sample Spark Streaming application for secure consumption from Kafka
markgrover/hadoop-intro-fast
Intro to Hadoop tutorial
markgrover/spark-kafka-app
markgrover/spree
Live-updating Spark UI built with Meteor
markgrover/amundsen
Repository for the Amundsen project
markgrover/amundsen-1
LP for Amundsen
markgrover/amundsen-io.github.io
markgrover/amundsen.github.io
markgrover/amundsendatabuilder
Data ingestion library for Amundsen to build graph and search index
markgrover/amundsenfrontendlibrary
Front-end service library for Amundsen
markgrover/amundsenmetadatalibrary
Metadata service library for Amundsen
markgrover/awesome-data-catalogs
📙 Awesome Data Catalogs and Observability Platforms.
markgrover/egads
Extensible Generic Anomaly Detection System
markgrover/founder-dating-ritual
A ritual, or agreed-upon sequence of events, that anyone looking to work together on a startup or a long-term, high-commitment project can follow.
markgrover/frec
Financial recommender written in Go
markgrover/hallzhallz.github.io
Blog Articles
markgrover/incubator-airflow
Apache Airflow (Incubating)
markgrover/kafka
Mirror of Apache Kafka
markgrover/markgrover.github.io
markgrover/oni-setup
Creates schema and HDFS folder structure for Open Network Insight. The code in this repo is a prerequisite for all components.
markgrover/open-network-insight
Open Network Insight is an open source solution for packet and flow analytics on Hadoop. It provides ingestion and transformation of binary data, scalable machine learning, and interactive visualization for identifying threats in network flows and DNS packets. Open Network Insight uses the open source projects Jupyter, nfdump, Wireshark, and D3.
markgrover/Parameterized-Remote-Trigger-Plugin
A plugin to Jenkins CI which triggers parameterized builds on a remote Jenkins
markgrover/rfcs
RFCs for changes to Amundsen
markgrover/spark
Mirror of Apache Spark
markgrover/spark-20066
Test harness for SPARK-20066
markgrover/spark-app
markgrover/spark-avro
Avro support for Spark, SQL, and DataFrames
markgrover/spark-utils
Miscellaneous scripts and utilities related to Spark and its deployment
markgrover/Taxi360
Simple example of HBase, Solr, and Kudu for Entity 360 using NY taxi data
markgrover/test-app
Just testing to make sure particular versions of artifacts are available