rdd
There are 204 repositories under the rdd topic.
microsoft/Mobius
C# and F# language binding and extensions to Apache Spark
ondra-m/ruby-spark
Ruby wrapper for Apache Spark
mahmoudparsian/data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
zouzias/spark-lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Thomas-George-T/Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
mahmoudparsian/pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
dbis-ilm/stark
A framework for Spatio-Temporal Data Analytics on Spark
asifahmed90/pyspark-ML-in-Colab
PySpark in Google Colab: A simple machine learning (Linear Regression) model
Balajirvp/DE-Zoomcamp
Code/Notes for the Data Engineering Zoomcamp by DataTalksClub
fsanaulla/chronicler-spark
InfluxDB connector to Apache Spark on top of Chronicler
LeihuaYe/Causal-Inference-Using-Quasi-Experimental-Methods
Causal Inference Using Quasi-Experimental Methods
derrickoswald/CIMSpark
Spark access to Common Information Model (CIM) files
VinayChaudhari1996/pyspark-dataframe-made-easy
pyspark dataframe made easy
practicalli/doom-emacs
Guide to Clojure REPL Driven Development with Emacs Doom
marcosgambeta/sqlrddpp
SQLRDD for Harbour++ and Harbour
kimaina/openmrs-etl
openmrs - mysql - debezium - kafka - spark - scala
shre1000/Sentiment-Analysis-of-Twitter-Data-using-pySpark-and-Live-Graphs
Sentiment Analysis and Data Visualization
yuanqing/rdd
:pencil: Preview your Markdown locally as it would appear on GitHub, with live updating
changzhiwin/spark-core-analysis
Imitate and rewrite Spark's RDD (core)
PastorGL/OneRing
One Ring is a framework to unify, unite and bind Apache Spark-based computing modules, and run them in parametrized chains
xavierguihot/spark_helper
A bunch of low-level basic methods for data processing and monitoring with Scala Spark
felixthoemmes/rddapp
rddapp: Regression Discontinuity Design Application
g1thubhub/bdrecipes
Big Data Recipes
neerajkesav/SparkJavaExamples
Apache Spark Basics - Java Examples
CarolinaNicasio/APACHESPARK-PYSPARK-2023
PySpark is a distributed data processing library for Python that processes large volumes of data on clusters using the Apache Spark framework, offering high performance and an integrated set of tools for large-scale data analysis and management.
NashTech-Labs/Sparkathon
A library of Java and Scala examples for Spark 2.x
chen0040/spark-ml-genetic-programming
Package provides a Java implementation of big-data genetic programming for Apache Spark
MahsaShk/ApacheSpark
Apache Spark machine learning project using pyspark
rhinempi/sparkhit
sparkhit - analyzing large-scale genomic data on the cloud
gogundur/Pyspark-WordCount
Pyspark WordCount
tomerlieber/spark-on-hbase
Reading, writing and deleting from HBase with Spark RDD
Tritbool/MultipleTest4Spark
MT4S - Multiple Tests 4 Spark - a simple JUnit/ScalaTest testing framework for Apache Spark
JohannesSKunz/ReductionsInOut-of-PocketPrices
Replication files and simulations for Johansson et al 2023 JHE
amageh/replication-performance-standards
Replication of Lindo, Sanders & Oreopoulos (2010), Student Project