Pinned Repositories
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
arroyo
Distributed stream processing engine in Rust
ashrae-energy-prediction
awesome-distributed-systems
Awesome list of distributed systems resources
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
bayesian-methods-for-ml
Materials for "Bayesian Methods for Machine Learning" Coursera MOOC
budget-tracker
Ready-to-go Google Sheets + Forms Setup: Track expense/income/investments, credit card usage, budget with an intuitive app-like interface
competitive-data-science
Materials for "How to Win a Data Science Competition: Learn from Top Kagglers" course
cruise-control
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
cumulative-table-design
This repository helps teach people how to correctly define and create cumulative tables!
rohankrao's Repositories
rohankrao/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
rohankrao/arroyo
Distributed stream processing engine in Rust
rohankrao/budget-tracker
Ready-to-go Google Sheets + Forms Setup: Track expense/income/investments, credit card usage, budget with an intuitive app-like interface
rohankrao/cumulative-table-design
This repository helps teach people how to correctly define and create cumulative tables!
rohankrao/Daft
The Python DataFrame for Complex Data
rohankrao/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
rohankrao/ds2
DS2 is an auto-scaling controller for distributed streaming dataflows
rohankrao/engineering-blogs
A curated list of engineering blogs
rohankrao/fann
Approx nearest neighbor search in Rust
rohankrao/feast
Feature Store for Machine Learning
rohankrao/featury
Friendly ML feature store
rohankrao/flink
Apache Flink
rohankrao/flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink
rohankrao/flink-kubernetes-operator
Apache Flink Kubernetes Operator
rohankrao/flink-native-k8s-operator
Flink native Kubernetes Operator is a Java-based control plane for running Apache Flink native applications on Kubernetes.
rohankrao/flink-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
rohankrao/flink_1.15
rohankrao/flink_1.15.3
rohankrao/flytelab
Machine Learning Projects with Flytekit
rohankrao/ibis
The flexibility of Python with the scale and performance of modern SQL.
rohankrao/ibis-flink-example
rohankrao/ibis-substrait
Ibis Substrait Compiler
rohankrao/kafka-connect-iceberg-sink
rohankrao/natural-language-processing
Resources for "Natural Language Processing" Coursera course.
rohankrao/noether
Scala Aggregators used for ML Model metrics monitoring
rohankrao/risingwave
The distributed streaming database. Engineered to offer the simplest and most cost-efficient way for stream processing and management.
rohankrao/sample-kafka-producer
rohankrao/Soduto
Soduto is a KDE Connect compatible client for macOS. It allows better integration between your phones, desktops and tablets.
rohankrao/spline
Data Lineage Tracking And Visualization Solution
rohankrao/starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.