theclaymethod's Stars
spotify/luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
deeplearning4j/deeplearning4j
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learn...
haifengl/smile
Statistical Machine Intelligence & Learning Engine
mesos/chronos
Fault-tolerant job scheduler for Mesos which handles dependencies and ISO 8601-based schedules
milessabin/shapeless
Generic programming for Scala
getguesstimate/guesstimate-app
Create Fermi Estimates and Perform Monte Carlo Estimates
sixpack/sixpack
Sixpack is a language-agnostic A/B testing framework
sryza/aas
Code to accompany Advanced Analytics with Spark from O'Reilly Media
yahoo/egads
A Java package to automatically detect anomalies in large-scale time-series data
datumbox/datumbox-framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
QubitProducts/bamboo
HAProxy auto configuration and auto service discovery for Mesos Marathon
sequenceiq/docker-spark
miguno/kafka-storm-starter
[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
dibbhatt/kafka-spark-consumer
High-performance Kafka connector for Spark Streaming. Supports multi-topic fetch and Kafka security. Reliable offset management in ZooKeeper. No data loss. No dependency on HDFS and WAL. Built-in PID rate controller. Supports message handler. Offset lag checker.
aws-samples/emr-bootstrap-actions
This repository holds the Amazon Elastic MapReduce sample bootstrap actions
everpeace/vagrant-mesos
Spin up your Mesos Cluster with Vagrant! (VirtualBox and AWS)
analytically/hadoop-ansible
Ansible playbook that installs a Hadoop cluster, with HBase, Hive, Presto for analytics, and Ganglia, Smokeping, Fluentd, Elasticsearch and Kibana for monitoring and centralized log indexing.
miguno/wirbelsturm
[PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
daynebatten/keras-wtte-rnn
Demo Weibull Time-to-event Recurrent Neural Network in Keras
intentmedia/mario
Functional, Typesafe, Declarative Data Pipelines
nathanmarz/kafka-deploy
Automated deploy for Kafka on AWS
HHammond/kcbo
A Bayesian testing framework written in Python.
eigengo/lift
...Do you even? Exercise in exercise analysis
kifi/ReactiveLDA
ReactiveLDA is a fast, lightweight implementation of the Latent Dirichlet Allocation (LDA) algorithm, using a parallel vanilla Gibbs sampling algorithm.
mesos/spark-ec2
[NOTE: Repository has moved to github.com/amplab/spark-ec2]
theclaymethod/Foundry-vagrant-mesos-kafka-cluster
Vagrant/Ansible provisioning for Kafka, Mesos (w/ Marathon/Docker), ZK, Hadoop, and Spark. Service discovery via HAProxy and Bamboo.
kanzure/docker-basenode
Docker service discovery where applications in each container route traffic through a localhost HAProxy to connect to other services in the cluster. Don't hardcode IP addresses.
AlpineNow/SparkML2
kadel/Dockerfiles
Dockerfiles
bobtfish/nerve-etcd
Nerve registration container (etcd backend)