Pinned Repositories
activejdbc
ActiveJDBC is a fast and lean Java ORM
advanced-scala-code
Companion code for the Mastering Advanced Scala book https://leanpub.com/mastering-advanced-scala
advanced-scala-code-1
Code examples for Underscore's Advanced Scala course
Cloudera-Hadoop-for-Developers
devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
kudu-summit-2016
scoozie
Scala DSL on top of Oozie XML
maheshsv's Repositories
maheshsv/Cloudera-Hadoop-for-Developers
maheshsv/kudu-summit-2016
maheshsv/activejdbc
ActiveJDBC is a fast and lean Java ORM
maheshsv/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
maheshsv/BestPractices
Demonstrates basic Java practices
maheshsv/big-data-mapreduce-course
Big Data, MapReduce, Spark, PySpark, Java @ Santa Clara University, Fall 2016
maheshsv/building-spark-applications-live-lessons
maheshsv/cc16-java-amqp
Codecamp 2016 Java AMQP lecture
maheshsv/cc16-java-aop-multithreading
Codecamp 2016 AOP/multithreading lecture
maheshsv/cc16-java-elasticsearch
Codecamp 2016 Java Elasticsearch lecture
maheshsv/cloudera-earthquake
maheshsv/Cloudera-manager-Installation-automation
This script automates the Cloudera Manager installation after ensuring all the Hadoop prerequisites are met. Ansible is employed as the automation tool.
maheshsv/Cloudera_Hands_on
maheshsv/confluent-kafka-python
Confluent's Apache Kafka Python client
maheshsv/data-algorithms-book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
maheshsv/Frequent-Itemset-Mining
A market basket analysis method for mining frequent sets of items
maheshsv/go-workshop
maheshsv/hadoop-mini-clusters
maheshsv/hadoop-pseudo-distributed
CDH 5 with YARN on a Single Linux Node in Pseudo-distributed mode
maheshsv/hdinsight-python-ooziebot
OozieBot helps generate Apache Oozie coordinators and workflows for Hive, Spark and Shell actions and run them on a Linux-based HDInsight cluster
maheshsv/hoidla
A set of real-time algorithms used by big data streaming platforms
maheshsv/Java-Introduction-And-Best-Practices
Introduction to Java and best practices
maheshsv/Oozie
This repo explains how to schedule different Oozie jobs and how to set up the properties file and workflow.xml
maheshsv/oozie-tool
maheshsv/Random-Forest-Algorithm-using-Hadoop
Hadoop MapReduce: association rules are implemented in the mapper function, and the random forest algorithm is used to implement action rules in the reducer function
maheshsv/Random-forest-using-Hadoop
maheshsv/Research_Paper_ETL_vs_ELT
This study focuses on a deep understanding, analysis, and appreciation of the relative effectiveness of the ETL and ELT data management techniques in data warehousing
maheshsv/spark-infotheoretic-feature-selection
This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
maheshsv/sqoop-on-spark
maheshsv/testify
A Java Testing Framework faithful to testing principles and best practices.