roomanidzee's Stars
seaweedfs/seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
casey/just
🤖 Just a command runner
NewTendermint/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
lightdash/lightdash
Self-serve BI to 10x your data team ⚡️
apache/kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
mito-ds/monorepo
The mitosheet package, trymito.io, and other public Mito code.
bytewax/bytewax
Python Stream Processing
dbt-labs/dbt-utils
Utility functions for dbt projects.
ploomber/jupysql
Better SQL in Jupyter. 📊
databrickslabs/pyspark-ai
English SDK for Apache Spark
LineaLabs/lineapy
Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
Lancetnik/Propan
Propan is a powerful and easy-to-use Python framework for building event-driven applications that interact with any MQ Broker
LucaCanali/Miscellaneous
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter notebook examples for Spark, and examples for Oracle and other DB systems.
zsvoboda/ngods-stocks
New Generation Opensource Data Stack Demo
andreax79/airflow-code-editor
A plugin for Apache Airflow that allows you to edit DAGs in browser
Miksus/red-mail
Advanced email sending for Python
microsoft/sql-spark-connector
Apache Spark Connector for SQL Server and Azure SQL
ClickHouse/clickhouse-kafka-connect
ClickHouse Kafka Connector
sbrunk/storch
GPU accelerated deep learning and numeric computing for Scala 3.
confluentinc/parallel-consumer
Parallel Apache Kafka client wrapper with per message ACK, client side queueing, a simpler consumer/producer API with key concurrency and extendable non-blocking IO processing.
AugustNagro/java-async-await
Async-Await support for Java
Miksus/red-box
Next generation email box manager
typelevel/toolkit
Quickstart your next app with the Typelevel Toolkit!
LamaAni/KubernetesJobOperator
An Airflow operator that executes a task in a Kubernetes cluster, given a Kubernetes YAML configuration or an image reference.
astronomer/airflow-provider-kafka
A provider package for Kafka
yaooqinn/spark-postgres
PostgreSQL and GreenPlum Data Source for Apache Spark
bakdata/streams-bootstrap
Utility functions and base classes for Kafka Streams applications
oracle/spark-oracle
On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.
10xfuturetechnologies/kafka-connect-iceberg
Kafka Connector for Iceberg tables
jitapichab/apache-nifi-kafka
https://medium.com/@jitapichab/apache-nifi-integrate-kafka-to-consume-and-produce-387968b8bd6b