Pinned Repositories
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
flink-statefun
Apache Flink Stateful Functions
Learn-Vim
Learning Vim and Vimscript doesn't have to be hard. This is the guide that you're looking for 📖
mapr-spark-kubernets-cluster
Build a Spark Standalone cluster on Kubernetes using MapR's packages
pyspark-logging
Logging framework for PySpark and ETL project template
Statistical-Inference-Coursera
Course Instructor(s): The primary instructor of this class is Brian Caffo. Brian is a professor at Johns Hopkins Biostatistics and co-directs the SMART working group. This class is co-taught by Roger Peng and Jeff Leek. In addition, Sean Kross and Nick Carchedi have been helping greatly.
rcpbayindir's Repositories
rcpbayindir/flink-statefun
Apache Flink Stateful Functions
rcpbayindir/airflow-docker
Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j)
rcpbayindir/airflow-on-k8s
rcpbayindir/Apache-Spark-Tutorials
This repo contains my learnings and practice notebooks on Spark using PySpark (Python Language API on Spark). All the notebooks in the repo can be used as template code for most of the ML algorithms and can be built upon it for more complex problems.
rcpbayindir/book-source-code
Accompanying source code for Istio in Action (Manning)
rcpbayindir/code
rcpbayindir/DataEngineeringProject
Example end-to-end data engineering project.
rcpbayindir/dbt-trino-incremental-hive
Test dbt project used to test `incremental` materialization with dbt-trino adapter.
rcpbayindir/delta-examples
Delta Lake examples
rcpbayindir/docker-development-youtube-series
rcpbayindir/examples
A repository to host extended examples and tutorials
rcpbayindir/finserv-application-blueprint
Example blueprint application for processing high-speed trading data.
rcpbayindir/flink-mobile-data-usage
rcpbayindir/flink-only-sql
Traditionally, engineers were needed to implement business logic via data pipelines before business users could start using it. This demo explains how data analysts and non-engineers can use only Flink SQL to explore and transform data into insights and actions, without writing any Java or Python code.
rcpbayindir/flinkforward21
Demo repository showing the Strangler Fig Pattern as discussed during our Flink Forward 2021 talk.
rcpbayindir/industry
This repository provides a holistic architecture design and reference implementation for industry cloud, based on the proven success of large-scale deployments and at-scale adoption with customers and partners.
rcpbayindir/kafka-plain-java
To explore how Kafka works
rcpbayindir/kafka-sparkStreaming
This repo sets up Kafka clusters locally and reads local CSV files into a topic. It performs batch queries, streaming queries, and also incremental queries.
rcpbayindir/kubeflow-spark
Orchestrate Spark Jobs from Kubeflow Pipelines and poll for the status.
rcpbayindir/kubeflow_ops_book_dev
Working repo for examples for the Kubeflow operations book
rcpbayindir/Machine-Learning-Engineering-with-MLflow
Machine Learning Engineering with MLflow, published by Packt
rcpbayindir/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
rcpbayindir/MLOps-Basics
rcpbayindir/ngods-stocks
New Generation Opensource Data Stack Demo
rcpbayindir/pyspark-example-project
Example project implementing best practices for PySpark ETL jobs and applications.
rcpbayindir/python-pyspark-framework
PySpark framework
rcpbayindir/snowplow-javascript-tracker
Snowplow event tracker for client-side and server-side JavaScript. Add analytics to your websites, web apps and servers.
rcpbayindir/trino-hive-superset-docker
Cloud-native Trino (prestosql) + Hive + Minio + Superset
rcpbayindir/trino-minio-iceberg-example
rcpbayindir/trino-the-definitive-guide
Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide)