Pinned Repositories
airflow-backfill-util
UI-based Airflow backfill plugin for existing or new Airflow environments
akka
Akka Project
akka-cluster-on-kubernetes
Sample project for deploying Akka Cluster to Kubernetes. Presented at Scala Up North on July 21, 2017.
akka-sharding-example
Simple demonstration of akka-cluster sharding functionality
docker-java
Java Docker API Client
spark
Mirror of Apache Spark
Spark-tf-idf
spark
SprayESSearch
HTTP search service that fetches different kinds of data from Elasticsearch and returns it to clients, implemented in Scala
kafka-spark-consumer
High-performance Kafka connector for Spark Streaming. Supports multi-topic fetch and Kafka security. Reliable offset management in ZooKeeper. No data loss. No dependency on HDFS or WAL. Built-in PID rate controller. Supports message handlers. Offset lag checker.
sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
AndyRao's Repositories
AndyRao/airflow-backfill-util
UI-based Airflow backfill plugin for existing or new Airflow environments
AndyRao/akka
Akka Project
AndyRao/akka-cluster-on-kubernetes
Sample project for deploying Akka Cluster to Kubernetes. Presented at Scala Up North on July 21, 2017.
AndyRao/docker-java
Java Docker API Client
AndyRao/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
AndyRao/awesome-ml-for-cybersecurity
Machine Learning for Cyber Security
AndyRao/ballista
Distributed compute platform implemented in Rust, using Apache Arrow memory model.
AndyRao/conduit
The ultralight service mesh for Kubernetes
AndyRao/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
AndyRao/fastText_java
Java port of the C++ version of Facebook's fastText
AndyRao/flink
Mirror of Apache Flink
AndyRao/GeekTime_Algorithm
AndyRao/incubator-mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
AndyRao/JavaGuide
[Java Learning + Interview Guide] A guide covering the core knowledge that most Java programmers need to master.
AndyRao/kubeflow
Machine Learning Toolkit for Kubernetes
AndyRao/leeml-notes
Notes on Hung-yi Lee's "Machine Learning" course; read online at: https://datawhalechina.github.io/leeml-notes
AndyRao/mleap
MLeap: Deploy Spark Pipelines to Production
AndyRao/n2
TOROS N2 - lightweight approximate Nearest Neighbor library which runs fast even with large datasets
AndyRao/pandora-docs
AndyRao/parquet-mr
Mirror of Apache Parquet
AndyRao/petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
AndyRao/pipeline
PipelineIO: End-to-End ML and AI Platform for Real-time Spark and Tensorflow Data Pipelines
AndyRao/presto
The official home of the Presto distributed SQL query engine for big data
AndyRao/pyspell
Python log parser using "Spell: Streaming Parsing of System Event Logs"
AndyRao/scheduler
A Scala library for scheduling arbitrary code to run at an arbitrary time.
AndyRao/seldon-server
Enterprise machine learning platform for prediction and recommendation.
AndyRao/Sparta
Real Time Aggregation based on Spark Streaming
AndyRao/tensorflow
Computation using data flow graphs for scalable machine learning
AndyRao/tutorials
The "REST With Spring" Course:
AndyRao/vearch
A distributed system for embedding-based vector retrieval