hadoop-ecosystem
There are 40 repositories under the hadoop-ecosystem topic.
madd86/awesome-system-design
A curated list of awesome System Design (A.K.A. Distributed Systems) resources.
dhkdn9192/data_engineer_career
Everything you need for a data engineering (DE) role
ZuInnoTe/hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Cigna/ibis
IBIS is a workflow-creation engine that abstracts the Hadoop internals of ingesting RDBMS data.
Jayvardhan-Reddy/BigData-Ecosystem-Architecture
Life cycle: internal workings of HDFS, Sqoop, Hive, Spark, HBase, and Kafka, with code.
hyeonsangjeon/dataplatform
Hadoop 3.2 single/cluster mode with the gotty web terminal, Spark, Jupyter PySpark, Hive, eco, etc.
jodth07/hadoop-installation
Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04
pfisterer/apache-knox-docker
Dockerfile for running Apache Knox (http://knox.apache.org/) in Docker
SarahAyaz/YouTube_Data_Analysis
Analysis of YouTube data using the Hadoop MapReduce framework in Java.
satyajeetmaharana/floodprediction
The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
alex-ber/docker-hive
EMR 5.25.0 single-node cluster Hadoop Docker image, with Amazon Linux, Hadoop 2.8.5, and Hive 2.3.5
meliodaseren/spark-sql-demo
SparkSQL Quick Start Tutorial
pfisterer/apache-knox-helm
Helm chart for Apache Knox
saitejavishalj/Hotspot-analysis-of-Geospatial-data
A large-scale distributed data processing system for streaming analytics, built on the Hadoop ecosystem (Apache Spark and HDFS) in the cloud, for real-time spatial analytics.
AnkitaSinha98/Customer360-Data-Analysis
Stores and analyzes big data on various customers using Hadoop and related tools such as Hive, ZooKeeper, HBase, and Sqoop. All customer details are analyzed and the results are reported; these results are very useful for companies.
f2e-awesome/HadoopEcosystem
Hadoop ecosystem
mayankskb/Hadoop-Times
Practice programs in the Hadoop ecosystem, for reference
meliodaseren/avro-file-format
Avro File Format Quick Start Tutorial
meliodaseren/spark-streaming-kafka-demo
Spark Streaming & Kafka Quick Start Tutorial
nirmalagra/MovieLensDataAnalysis
MapReduce program developed in Java for analyzing a movie dataset
oykuyildirim/Flume-Service
Collecting tweets with the Flume service and analyzing them
rakeshdey0018/Weblog-Analysis-using-PIG
[BigData] One-year weblog analysis using Pig
simple-learning/Hadoop
Hadoop Projects
ArwaEiad/TMDB-Project
This project focuses on analyzing movie data using PySpark, tailored for efficient data processing on the Hadoop Distributed File System (HDFS)
PykaAlexandro/A-MapReduce-Vademecum-via-Hadoop
Some basic procedures for parallel computing in the Hadoop environment
Rohit-Jain-2801/HadoopInstallGuide
Apache Hadoop Components Installation Guide on Windows
tingjhenjiang/bigdata_docker_images
Containers used for parallel batch and stream data processing and for setting up machine learning environments
uncleislearning/learning-Hadoop
Principles and hands-on practice for HDFS, MapReduce, Hive, and ZooKeeper
vineetdcunha/Hadoop_Ecosystem
Processing and transforming data via Hadoop Ecosystem
DiegoBulhoes/hadoop-ansible-single-node
Environment for practicing the Ansible and Hadoop tools using a single instance
m-r-tanha/Hadoop-Ecosystem
This repository will be updated based on my challenges in installing and using Hadoop's tools (Spark)
meliodaseren/structure-streaming-demo
Structured Streaming Quick Start Tutorial
PrathameshNimkar/Big-Data-Analysis-using-the-Hadoop-Ecosystem
Learn and implement the Hadoop Ecosystem to drive Big Data Analytics.
reggert/cumulative
[Work in progress] Client library for simplified access to Apache Accumulo