Pinned Repositories
hadiezatpanah
PySpark-AWS-Postgres-ETL
An end-to-end solution that reads TSV files from AWS S3, processes them, and loads them into a PostgreSQL relational database.
Spark-Scala-Data-Pipeline
An end-to-end solution that reads XML data from an FTP server, processes it, and loads it into a PostgreSQL relational database.
Spark_Java_MostValuableCustomers
This Spark Java project demonstrates a Gradle-based Spark configuration, focusing on the MemoryStream class as the streaming source.
Spark_Java_Stateful_Processing
This project presents a distributable Spark Java solution that connects start and end session events in a stateful manner. It uses `flatMapGroupsWithState`, a Spark feature for stateful stream processing that lets you maintain and update state across batches.
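The idea of connecting start and end events across batches can be illustrated without Spark. The following is a minimal plain-Python sketch, not the repository's code and not the Spark API: the function name `process_batch`, the event tuple layout, and the dictionary used as state are all assumptions made for illustration. The dictionary stands in for the per-key state that `flatMapGroupsWithState` would keep between micro-batches.

```python
def process_batch(state, events):
    """Pair start/end events per session id.

    state:  dict mapping session id -> start timestamp of a still-open session
            (hypothetical stand-in for per-key state kept across batches).
    events: iterable of (session_id, kind, timestamp), kind is "start" or "end".
    Returns the sessions completed in this batch as (session_id, start, end).
    """
    completed = []
    for session_id, kind, timestamp in events:
        if kind == "start":
            # Remember the open session until its end event arrives.
            state[session_id] = timestamp
        elif kind == "end" and session_id in state:
            # Emit the completed session and drop it from the state.
            completed.append((session_id, state.pop(session_id), timestamp))
        # An "end" with no known start is ignored in this sketch.
    return completed


# The same state object is passed to every batch, so a session that
# starts in one batch can be closed by an event in a later one.
state = {}
first = process_batch(state, [("a", "start", 1), ("b", "start", 2), ("b", "end", 5)])
second = process_batch(state, [("a", "end", 9)])
```

A real streaming job would also need a timeout policy for sessions whose end event never arrives; that is left out here.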
Spark_Structured_Streaming_Java
This solution addresses the problem of tables being created with case-sensitive columns in Oracle (when the table doesn't exist or is written in overwrite mode) by developing and registering a custom Oracle dialect.
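The description does not spell out what the custom dialect changes. As background, Oracle keeps the exact case of a double-quoted identifier and folds an unquoted one to upper case, so whether column names are quoted in the generated DDL decides whether they end up case-sensitive. The plain-Python sketch below only illustrates that effect; `column_ddl` and `stored_name` are hypothetical helpers, not part of Spark or of the repository.

```python
def column_ddl(columns, quote):
    """Build the column list of a CREATE TABLE statement.

    columns: list of (name, sql_type) pairs.
    quote:   if True, wrap each name in double quotes.
    """
    parts = []
    for name, sql_type in columns:
        ident = f'"{name}"' if quote else name
        parts.append(f"{ident} {sql_type}")
    return ", ".join(parts)


def stored_name(identifier):
    """Name Oracle would record for an identifier as written in DDL."""
    if identifier.startswith('"') and identifier.endswith('"'):
        # Quoted: the case is preserved, so later queries must quote it too.
        return identifier[1:-1]
    # Unquoted: folded to upper case, so it can be referenced in any case.
    return identifier.upper()


quoted_ddl = column_ddl([("userId", "NUMBER")], quote=True)
plain_ddl = column_ddl([("userId", "NUMBER")], quote=False)
```

With quoting, the column is stored as `userId` and must always be referenced as `"userId"`; without it, the column is stored as `USERID` and matches any unquoted spelling.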
Trending_Topic_Spark_Streaming_Scala
An end-to-end solution that reads data from a streaming source (Kafka), extracts the topics from the data in each time window, calculates hot topics using a modified Z-score algorithm, and stores the final trending topics in a PostgreSQL database.
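The description does not say which modification of the Z-score the project uses. As one possible reading, the sketch below uses the median/MAD-based variant, where a topic's count in the current window is compared with its counts in earlier windows. It is plain Python rather than Spark code, and the function names, the 3.5 threshold, and the input layout are assumptions for illustration only.

```python
from statistics import median


def modified_z_score(history, current):
    """Median/MAD-based z-score of `current` against past window counts."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # No spread in the history: any deviation is treated as unbounded.
        if current == med:
            return 0.0
        return float("inf") if current > med else float("-inf")
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return 0.6745 * (current - med) / mad


def hot_topics(history_by_topic, current_counts, threshold=3.5):
    """Return topics whose current count scores above `threshold`, hottest first."""
    scores = {}
    for topic, count in current_counts.items():
        history = history_by_topic.get(topic)
        if not history:
            # Topics with no history are skipped in this sketch.
            continue
        score = modified_z_score(history, count)
        if score > threshold:
            scores[topic] = score
    return sorted(scores, key=scores.get, reverse=True)
```

In a streaming job, the per-topic history would be updated after each window; here it is passed in as a plain dictionary to keep the example self-contained.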
hadiezatpanah's Repositories
hadiezatpanah/Spark_Java_MostValuableCustomers
hadiezatpanah/hadiezatpanah
hadiezatpanah/PySpark-AWS-Postgres-ETL
hadiezatpanah/Spark-Scala-Data-Pipeline
hadiezatpanah/Spark_Java_Stateful_Processing
hadiezatpanah/Spark_Structured_Streaming_Java
hadiezatpanah/Trending_Topic_Spark_Streaming_Scala