rajeshsantha

Azure Data Engineer

Pinned Repositories

Analyze-sentiment-FLUME-HIVE
Twitter sentiment analysis using Flume
awesome-github-wiki
:neckbeard: Awesome list GitHub Wikis
databricks-crt020-notes
Docs, code, and resources to prepare for the CRT020: Databricks Certified Associate Developer for Apache Spark 2.4 with Python 3 certification
Language: Jupyter Notebook
DataStructuresAndAlgorithmsInScala
This project contains Scala code snippets for various problems from LeetCode and HackerRank, along with implementations of the data structures and algorithms required to solve those problems.
Language: Scala
leetcode-patterns
A pattern-based approach for learning technical interview questions
Language: JavaScript
LeetCode-SQL-50
MonitoredStructuredStreaming
Repository for Spark structured streaming use case implementations.
Language: Scala
PythonLearningSnippets
Language: Python
Spark_Incremental_Load_Automated_POC
This repository contains a proof of concept (POC) for automated incremental data ingestion with Spark, from the file system to HDFS. The inbound folder contains the input CSV files. When you trigger the Spark job, Spark automatically picks the most recently arrived file in the inbound folder, then validates, processes, and ingests it into HDFS. If validation finds that the file has already been loaded to HDFS, you can request a new load through optional spark-submit parameters, which are parsed with Scala's scopt library. When the new-load flag is set, the Scala script fetches a new file from an external location into the inbound folder (because this is a POC, the external location is simulated as a different directory on the same file system) and loads that file into the HDFS table. Once the data has been read and validated, it is inserted into the Avro table given as a parameter, or overwrites that table if it already exists.
Language: Scala
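The file-selection logic described for Spark_Incremental_Load_Automated_POC can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's Scala/Spark code: the function names, the `already_loaded` set of file names, and the `new_load` flag are all hypothetical stand-ins, and the Spark validation and Avro write steps are left out entirely.

```python
import shutil
from pathlib import Path
from typing import Optional, Set


def latest_inbound_file(inbound: Path) -> Optional[Path]:
    """Return the most recently modified CSV in the inbound folder, or None if empty."""
    csv_files = [p for p in inbound.glob("*.csv") if p.is_file()]
    return max(csv_files, key=lambda p: p.stat().st_mtime, default=None)


def choose_file_to_load(
    inbound: Path,
    external: Path,
    already_loaded: Set[str],  # hypothetical: names of files already ingested
    new_load: bool,            # hypothetical stand-in for the optional new-load flag
) -> Optional[Path]:
    """Pick the file to ingest, fetching a fresh one only when requested."""
    candidate = latest_inbound_file(inbound)
    if candidate is not None and candidate.name not in already_loaded:
        return candidate

    # The latest file was already loaded (or nothing has arrived yet).
    if not new_load:
        return None

    # Simulate the external location as another directory on the same file system.
    for source in sorted(external.glob("*.csv")):
        if source.name not in already_loaded:
            target = inbound / source.name
            shutil.copy2(source, target)
            return target
    return None
```

A caller would pass the returned path to whatever performs validation and the write; a `None` result means there is nothing new to ingest.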

rajeshsantha's Repositories

rajeshsantha/DataStructuresAndAlgorithmsInScala
This project contains Scala code snippets for various problems from LeetCode and HackerRank, along with implementations of the data structures and algorithms required to solve those problems.
Language: Scala
rajeshsantha/Spark_Incremental_Load_Automated_POC
This repository contains a proof of concept (POC) for automated incremental data ingestion with Spark, from the file system to HDFS. The inbound folder contains the input CSV files. When you trigger the Spark job, Spark automatically picks the most recently arrived file in the inbound folder, then validates, processes, and ingests it into HDFS. If validation finds that the file has already been loaded to HDFS, you can request a new load through optional spark-submit parameters, which are parsed with Scala's scopt library. When the new-load flag is set, the Scala script fetches a new file from an external location into the inbound folder (because this is a POC, the external location is simulated as a different directory on the same file system) and loads that file into the HDFS table. Once the data has been read and validated, it is inserted into the Avro table given as a parameter, or overwrites that table if it already exists.
Language: Scala
rajeshsantha/databricks-crt020-notes
Docs, code, and resources to prepare for the CRT020: Databricks Certified Associate Developer for Apache Spark 2.4 with Python 3 certification
Language: Jupyter Notebook
rajeshsantha/leetcode-patterns
A pattern-based approach for learning technical interview questions
Language: JavaScript
rajeshsantha/MonitoredStructuredStreaming
Repository for Spark structured streaming use case implementations.
Language: Scala
rajeshsantha/PythonLearningSnippets
Language: Python
rajeshsantha/Analyze-sentiment-FLUME-HIVE
Twitter sentiment analysis using Flume
rajeshsantha/awesome-github-wiki
:neckbeard: Awesome list GitHub Wikis
rajeshsantha/awesome-guidelines
A curated list of high quality coding style conventions and standards.
Language: JavaScript
rajeshsantha/KafkaSparkStructuredStreaming
Language: Java
rajeshsantha/LeetCode-SQL-50
rajeshsantha/Scala-for-the-Impatient-my-solutions
Language: Scala
rajeshsantha/coding-interview-university
A complete computer science study plan to become a software engineer.
rajeshsantha/computer-science
:mortar_board: Path to a free self-taught education in Computer Science!
rajeshsantha/covid19
Resources for the Udemy Course - Azure Data Factory For Data Engineers - Project on Covid19 by Ramesh Retnasamy
rajeshsantha/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
rajeshsantha/Data-Science--Cheat-Sheet
Cheat Sheets
rajeshsantha/DatabricksIntegration
This repo is integrated with Databricks course learning
Language: Scala
rajeshsantha/fpinscala
Code, exercises, answers, and hints to go along with the book "Functional Programming in Scala"
Language: Scala
rajeshsantha/free-programming-books
:books: Freely available programming books
rajeshsantha/githubDemo
It's a demo repository for practicing advanced git commands.
rajeshsantha/gitPracticeDir
delete later
Language: Scala
rajeshsantha/Hive_Poc
POC created from labs; consists of Hive data loading with UDFs and UDAFs
Language: Scala
rajeshsantha/InputData
This repo consists of sample datasets and data required for POCs and assignments
rajeshsantha/IntellijAssignments
Language: Scala
rajeshsantha/JupyterRepo
Language: Jupyter Notebook
rajeshsantha/LearningScala
My journey to learn Scala.
Language: Scala
rajeshsantha/Notes
This repo contains preparation notes and logs for POC programs.
Language: Scala
rajeshsantha/professional-programming
A collection of full-stack resources for programmers.
rajeshsantha/terraform-azure
terraform - azure - build
Language: HCL