spark-structured-streaming
There are 102 repositories under the spark-structured-streaming topic.
jaceklaskowski/spark-workshop
Apache Spark™ and Scala Workshops
nama1arpit/reddit-streaming-pipeline
A real-time reddit data streaming pipeline for sentiment analysis of various subreddits
qubole/kinesis-sql
Kinesis Connector for Structured Streaming
chermenin/spark-states
Custom state store providers for Apache Spark
streamnative/awesome-pulsar
A curated list of Pulsar tools, integrations and resources.
tomaztk/Azure-Databricks
Azure Databricks - Advent of 2020 Blogposts
kaantas/spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
AbsaOSS/hyperdrive
Extensible streaming ingestion pipeline on top of Apache Spark
garystafford/streaming-sales-generator
Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python
AndrewKuzmin/spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.5.1
bluejoe2008/spark-http-stream
Spark Structured Streaming via HTTP communication
falaybeg/SparkStreaming-Network-Anomaly-Detection
This repository includes supervised and unsupervised machine learning methods which are used to detect anomalies on network datasets. Decision Tree, Random Forest, Gradient Boost Tree, Naive Bayes, and Logistic Regression were used for supervised learning. K-Means was used for unsupervised learning.
CloudComputingProject-2022/Data_visualization_and_analysis_tool_for_telemetry_data
A naive anomaly detection and data visualization tool for F1 onboard telemetry data.
CodeRayZhang/Spark-Example
Examples for Spark 1.6 and Spark 2.2, covering Kafka, Flume, Structured Streaming, Jedis, Elasticsearch, MySQL, and DataFrame
guidok91/spark-structured-streaming-kafka
Spark Structured Streaming data pipeline that processes movie ratings data in real-time.
rvilla87/TwitterTrends
Get Twitter trends with twitter4j, stream them to a Kafka topic, save them to MongoDB, and visualize them in Google Maps
artem0/kafka-scala-api
Samples for using Kafka within Spark Streaming and Akka Actors, Akka Streams
stephen29xie/tweet-streaming-data-pipeline
Real-time streaming data pipeline for Twitter Tweets
NashTech-Labs/Sparkathon
A library of Java and Scala examples for Spark 2.x
AndrewKuzmin/Analytics-For-IoT-Devices-Using-Spark
Analytics for IoT devices using Apache Spark Structured Streaming 2.4.0
hoseinlook/cpu-anomaly-detection-with-spark
CPU anomaly detection with Spark
ozancicek/artan
Online latent state estimation with Spark
renardeinside/spark-streaming-state-store-example
Spark Structured Streaming with State Store
AmadeusITGroup/Elastic-Scaling
Elastic Scaling is a library that controls the number of resources (executors or workers) instantiated by a Spark Structured Streaming job in order to optimize the effective micro-batch duration.
Pedro-Manoel/iot-analytics-solution-tcc
🎓 Repository with the IoT Analytics solution developed as part of the final-year project (Trabalho de Conclusão de Curso, TCC) for the Computer Science degree at the Universidade Federal de Campina Grande (UFCG)
AlexRogalskiy/spark-patterns
🏆 Spark4You Design patterns
datacircus/pyspark-streaming-base
This project provides an opinionated way to go about crafting Spark Structured Streaming applications with PySpark
haozhang-x/log-analysis-spark
Structured Streaming Log Analysis
JulienPeloton/mini_spark_broker
Design and proof-of-concept for a Broker for astronomy using Apache Spark
anmollp/Zootopia
A distributed streaming data processing pipeline.
ArmanShakeri/Pyspark-upsert-oracle
PySpark sample for upserting data into an Oracle table
fermat01/Building-streaming-ETL-Data-pipeline
Streaming data pipeline using Apache Airflow, Kafka, and MinIO object storage
hadiezatpanah/Spark_Java_MostValuableCustomers
This Spark Java project serves as a demonstration of Gradle Spark configuration, specifically focusing on utilizing the MemoryStream class as the streaming source.
iomete/kafka-streaming-job
Kafka streaming job from iomete. This streaming job copies data from Kafka to Iceberg.
pprzetacznik/datalake
Simple datalake
vvittis/CCFD-RF
Credit Card Fraud Detection with Random Forest