Pinned Repositories
airbyte
Airbyte is an open-source data integration engine that helps you consolidate your data in your warehouses.
awesome
😎 Curated list of awesome lists
awesome-algorithms
A curated list of awesome places to learn and/or practice algorithms.
awesome-awesome-awesome
An awesome-awesome list.
awesome-awesomeness
A curated list of awesome awesomeness
awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
flink-registry-avro-row-schema
Flink Avro Format schema that supports schema registry.
spark_streaming_core_example
vikasyadav15's Repositories
vikasyadav15/flink-registry-avro-row-schema
Flink Avro Format schema that supports schema registry.
vikasyadav15/airbyte
Airbyte is an open-source data integration engine that helps you consolidate your data in your warehouses.
vikasyadav15/babar
Profiler for large-scale distributed Java applications (Spark, Scalding, MapReduce, Hive, ...) on YARN.
vikasyadav15/Big-Data
vikasyadav15/bigdata-ecosystem
BigData Ecosystem Dataset
vikasyadav15/chombo
Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm
vikasyadav15/cm_api
Cloudera Manager API Client
vikasyadav15/crossdata
Easy access to big things. Library for Apache Spark extending and improving its capabilities
vikasyadav15/data-engineering-gcp
Data Engineering on Google Cloud Platform
vikasyadav15/developer-roadmap
Roadmap to becoming a web developer in 2018
vikasyadav15/file_format_stack_overflow_pdc
Contains all the code related to the analysis of different file formats
vikasyadav15/hello-github-actions
vikasyadav15/Interview-Revision
vikasyadav15/JSON-Linting
This application provides simple, secure, 100% client-side, network-free JSON linting that ensures that no one else is seeing the data you are testing. This is worry-free JSON linting.
vikasyadav15/metorikku
A simplified, lightweight ELT Framework based on Apache Spark
vikasyadav15/ndscheduler
A flexible Python library for building your own cron-like system, with REST APIs and a Web UI.
vikasyadav15/Optimus
🚀 Optimus is the missing framework for cleansing (cleaning and much more), pre-processing and exploratory data analysis in a distributed fashion with Apache Spark.
vikasyadav15/pyspark-example-project
Example project and best practices for Python-based Spark ETL jobs and applications.
vikasyadav15/scala-spark-4
vikasyadav15/spark
Apache Spark - A unified analytics engine for large-scale data processing
vikasyadav15/spark-data-ingestion
vikasyadav15/spark-json-schema
JSON schema parser for Apache Spark
vikasyadav15/spark-redshift
Performant Redshift data source for Apache Spark
vikasyadav15/Spark_Streaming_KafkaWordCount
Spark Streaming Kafka Example
vikasyadav15/SqlShift
MySQL to Redshift data transfer using Apache Spark.
vikasyadav15/Udacity_Nanodegree_Project
Udacity Nanodegree AWS Cloud Architect Project Work
vikasyadav15/vikasyadav15.github.io
vikasyadav15/workshops
workshops
vikasyadav15/wos-spark-manager-api
A Flask-based application that lets IBM Watson OpenScale read and write files on a remote HDFS, and run jobs and get details about them on a remote Spark cluster.
vikasyadav15/yfin-etl
Yahoo Finance ETL script