cloudera-hadoop
There are 37 repositories under the cloudera-hadoop topic.
sergevs/ansible-cloudera-hadoop
Ansible playbook to deploy Cloudera Hadoop components to a cluster
tilakpatidar/cdh5
Docker image for Cloudera Hadoop components (CDH5)
Ranjandas/Dirty-CDH-Docker
A quick and dirty CDH cluster skeleton using Docker for Testing
dengshaochun/cdh-tools
Automated installation of Cloudera Hadoop
achintya-kumar/BD2017
Otto von Guericke University Magdeburg - Big Data, summer semester 2017
haspdecrypted/OS-for-Big-Data-and-Hadoop
Getting Started with Hadoop and Big Data
kwartile/spark-benchmark
Spark Benchmark suite to evaluate cluster configuration and compare the performance with other big data frameworks.
rapsoulhaonan/graphic-theoretic-problems
Hadoop/MapReduce Streaming
arunkthomasuncc/Query_Search_Using_TF-IDF
This repository contains the TF-IDF score calculation for documents in the Canterbury dataset, given a user-supplied search query
dorianbg/cloudera-quickstart-installation-guide
How to install Cloudera quickstart
Ishuan/Page-Rank-Implementation
The goal of this programming assignment is to compute the PageRank scores of an input set of hyperlinked Wikipedia documents using Hadoop MapReduce. The PageRank score of a web page serves as an indicator of the importance of the page. Many web search engines (e.g., Google) use PageRank scores in some form to rank the results of user-submitted queries. The goals of this assignment are to: 1. Understand the PageRank algorithm and how it works in MapReduce. 2. Implement PageRank and execute it on a large corpus of data. 3. Examine the output from running PageRank on Simple English Wikipedia to measure the relative importance of pages in the corpus. To run your program on the full Simple English Wikipedia archive, you will need to run it on the dsba-hadoop cluster to which you have access.
JohnnyFoulds/local-hadoop
This project creates a small local Hadoop cluster using Cloudera CDH and CentOS.
SakhriHoussem/Apache-Hive-Tutorial
Learn how Hive works through a simple example
syscrest/cloudera-manager-hipchat-chatbot
Chatbot for HipChat (cloud or on-premises) that lets you talk to your Cloudera Manager
vodkolav/DataEngineerProject
This is my final project for Data Engineer Expert course at Naya College.
aastha-ghub/Airlines-Analysis-project-HADOOP
This project analyses airline datasets using Hadoop to solve a set of problem statements.
akshay-madar/MovieTycoon-gcp-based-BI-tool
GCP hosted product for over 1 million movie investors on HSX.com, aiding online movie trading and box-office investments by leveraging Big Data technologies like Hive and Hadoop, and Tableau dashboards
akshaydake123/Sentiment-Analysis-on-Twitter-Data
This repository shows how to perform sentiment analysis on tweets from Twitter using Hive. The tweets are collected from Twitter using Flume. Because the incoming tweets are in JSON format, they need to be loaded into Hive using a JSON input format; the Cloudera Hive JSON SerDe is used for this purpose.
bishalpaudel/HadoopProductPurchaseProbability
Anticipatory customer order prediction after the purchase of item(s).
marycboardman/Assessment-Attempts
Data processing using Docker containers, Kafka, Spark, and Hadoop
meetgajjarx07/Baseball-analysis-BigData
This project uses the Cloudera platform and Pig queries to analyze and retrieve information on specific baseball performance and statistics problems. By employing big data methods, the analysis offers insights into player performance, game trends, and strategic patterns.
VaishnavJois/CLOUDERA
Cloudera commands used for Big Data Analytics
guptasaumya/navigator-data-service
Navigator is a data service that prepares the content for travel agencies, ready for exploration in EWNS (East-West-North-South) direction and hence allows them to render content to the end-user based on their desire to travel.
Johnny1110/Hadoop_Note
Hadoop study notes
Mantej-Singh/Apache-Spark-Under-the-hood--WordCount
Running my first PySpark app in CDH5
nikitaeverywhere/hadoop-network-of-keywords
Keyword network builder based on TF-IDF, using the Hadoop platform
SakhriHoussem/Apache-Spark-Tutorial
A simple Apache Spark tutorial
SakhriHoussem/HBase-Tutorial
A simple HBase tutorial
SakhriHoussem/HBase-With-Hive
Learn how Hive works with HBase through a simple example
SakhriHoussem/SparkSQL-Tutorial
A simple SparkSQL tutorial
shubnimkar/Hadoop
This repository includes two versions of Hadoop management tools