Pinned Repositories
activator-scalding
An Activator template for the Scalding Big Data API
Akka-Essentials
BerkeleyX-CS100.1x-Big-Data-with-Apache-Spark
This repository contains code files, specifically IPython notebooks, for the assignments in the course "Introduction to Big Data with Apache Spark" by UC Berkeley and Databricks on edX.
BerkeleyX-CS190.1x-Scalable-Machine-Learning
This repository contains code files, specifically IPython notebooks, for the assignments in the course "Scalable Machine Learning" by UC Berkeley and Databricks on edX.
lambda_poc
Example lambda architecture using Kafka, Spark, Cassandra, and Hadoop
pig-programming
These samples let you extract useful statistics, such as the top 10 movies by average rating and genre-based filtering, from 2 million records using Pig Latin.
spark-kafka-app
SparkPOC
A POC to build a cache and aggregate using Spark features and configurations
SRGAN-Keras
wellsgitlab
KalyanKumarPichuka's Repositories
KalyanKumarPichuka/avro-hadoop-starter
Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
KalyanKumarPichuka/awktut
Chapter folders, sample files and sample code for "Effective Awk Programming" by Arnold Robbins (O'Reilly, 2015).
KalyanKumarPichuka/cluster-mapred
Simple cluster code based on MapReduce
KalyanKumarPichuka/data-scientists-guide-apache-spark
Best practices of using Spark for practicing data scientists in the context of a data scientist’s standard workflow.
KalyanKumarPichuka/enlighten-apply
Example code and materials that illustrate applications of SAS machine learning techniques.
KalyanKumarPichuka/final-exercise-bdtraining
Hive, Pig and MapReduce solution to Globant's final evaluation exercise for the Big Data Course
KalyanKumarPichuka/hadoop-tutorials
hadoop-tutorials
KalyanKumarPichuka/hadoop-utility-scripts
Set of utility scripts for Hadoop Development
KalyanKumarPichuka/hbase-book
Contains the code used in the HBase: The Definitive Guide book.
KalyanKumarPichuka/hipi
HIPI: Hadoop Image Processing Interface
KalyanKumarPichuka/incubator-datafu
Mirror of Apache DataFu
KalyanKumarPichuka/learning-spark
Example code from Learning Spark book
KalyanKumarPichuka/learning-spark-examples
Examples for learning Spark
KalyanKumarPichuka/MachineLearning
Literature Study
KalyanKumarPichuka/magellan
Geo Spatial Data Analytics on Spark
KalyanKumarPichuka/mapreduce-3
MapReduce projects in Hadoop
KalyanKumarPichuka/ml-workshop
KalyanKumarPichuka/mrunit-test-harness
MRUnit project test harness
KalyanKumarPichuka/NgramPredictor-MapReduceJob
Three MapReduce jobs that (1) extract n-grams from Wikipedia data, (2) predict the (N+1)-gram based on an N-gram, and (3) predict words based on a prefix.
KalyanKumarPichuka/pig-udf
Sample UDFs for Pig
KalyanKumarPichuka/ProgrammingWithScalding
Programming MapReduce with Scalding
KalyanKumarPichuka/realtimesystems
This repository consists of all the papers that are used as study material for Georgia Tech Graduate Real Time Systems Course CS6235
KalyanKumarPichuka/spark-hyperloglog
Interactive Audience Analytics with Spark and HyperLogLog
KalyanKumarPichuka/spark-movie-lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
KalyanKumarPichuka/Spark-POCS
Spark POCs
KalyanKumarPichuka/spark-sql-perf
KalyanKumarPichuka/spark-testfiles
KalyanKumarPichuka/spark-workshop
A Typesafe Activator tutorial for Apache Spark.
KalyanKumarPichuka/SparkETL
ETL script for Spark, Hive, Hadoop, etc.
KalyanKumarPichuka/trace-analysis
Scripts to analyze Spark's performance