Pinned Repositories
cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows on a Hadoop cluster. See https://github.com/Cascading/cascading for the release repository.
cascading.avro
Cascading Scheme for the Apache Avro data serialization format
cascalog
Data processing on Hadoop without the hassle.
cascalog-cascading-test
1. JCascalog/Cascalog and Cascading performance test. 2. Creating a Maven project for JCascalog.
cluster-analysis
This project is for cluster analysis assignments.
competitive-data-science
Materials for "How to Win a Data Science Competition: Learn from Top Kagglers" course
cqengine
Ultra-fast SQL-like queries on Java collections
dyzie
Creates Oozie coordinator and workflow XML files dynamically using a Java API.
Hive-JSON-Serde
Read/write JSON SerDe for Apache Hive.
incubator-predictionio
PredictionIO, a machine learning server for developers and ML engineers. Built on Apache Spark, HBase and Spray.
sourabhchaki's Repositories
sourabhchaki/cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows on a Hadoop cluster. See https://github.com/Cascading/cascading for the release repository.
sourabhchaki/cascading.avro
Cascading Scheme for the Apache Avro data serialization format
sourabhchaki/cascalog
Data processing on Hadoop without the hassle.
sourabhchaki/cascalog-cascading-test
1. JCascalog/Cascalog and Cascading performance test. 2. Creating a Maven project for JCascalog.
sourabhchaki/cluster-analysis
This project is for cluster analysis assignments.
sourabhchaki/competitive-data-science
Materials for "How to Win a Data Science Competition: Learn from Top Kagglers" course
sourabhchaki/cqengine
Ultra-fast SQL-like queries on Java collections
sourabhchaki/dyzie
Creates Oozie coordinator and workflow XML files dynamically using a Java API.
sourabhchaki/Hive-JSON-Serde
Read/write JSON SerDe for Apache Hive.
sourabhchaki/incubator-predictionio
PredictionIO, a machine learning server for developers and ML engineers. Built on Apache Spark, HBase and Spray.
sourabhchaki/incubator-zeppelin
Mirror of Apache Zeppelin (Incubating)
sourabhchaki/intellij-scala
Scala plugin for IntelliJ IDEA
sourabhchaki/learnantlr
A project to learn ANTLR4 based on Maven
sourabhchaki/mapreducepatterns
Repository for MapReduce Design Patterns (O'Reilly 2012) example source code
sourabhchaki/oozie
Mirror of Apache Oozie
sourabhchaki/oozie-cloudera
Oozie - workflow engine for Hadoop
sourabhchaki/practice
sourabhchaki/scalding
A Scala API for Cascading
sourabhchaki/Shadowfax
sourabhchaki/spark
Mirror of Apache Spark
sourabhchaki/spark-sorted
Secondary sort and streaming reduce for Spark
sourabhchaki/spark-thrift-test
Example of reading and writing Thrift using Spark
sourabhchaki/spark-ts-examples
Spark TS Examples
sourabhchaki/sqlite-parser
An ANTLR4 grammar for SQLite statements.
sourabhchaki/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
sourabhchaki/utils-extra
sourabhchaki/zeppelin
DEPRECATED. Zeppelin has moved to Apache. Please make pull requests there.