ethany21

ethany21's Stars

awesome-spark/awesome-spark
A curated list of awesome Apache Spark packages and resources.
Language: Shell, 1.7k stars, 329 forks
apache/ignite
Apache Ignite
Language: Java, 4.8k stars, 1.9k forks
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
Language: Java, 5.3k stars, 2.4k forks
apache/shardingsphere
Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
Language: Java, 19.8k stars, 6.7k forks
apache/iceberg-rust
Apache Iceberg
Language: Rust, 600 stars, 134 forks
apache/hudi-rs
A native Rust library for Apache Hudi, with bindings into Python
Language:Rust13728
apache/doris
Apache Doris is an easy-to-use, high performance and unified analytics database.
Language: Java, 12.3k stars, 3.2k forks
apache/fury
A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
Language: Java, 3k stars, 220 forks
apache/kvrocks
Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
Language: C++, 3.5k stars, 452 forks
apache/opendal
Apache OpenDAL: access data freely.
Language: Rust, 3.2k stars, 451 forks
apache/polaris
The interoperable, open source catalog for Apache Iceberg
Language: Python, 1k stars, 99 forks
apache/paimon-trino
Trino Connector for Apache Paimon.
Language:Java2527
facebook/rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
Language: C++, 28.3k stars, 6.3k forks
debezium/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Language: Java, 10.4k stars, 2.5k forks
apache/pulsar
Apache Pulsar - distributed pub-sub messaging system
Language: Java, 14.1k stars, 3.6k forks
redpanda-data/redpanda
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Language: C++, 9.4k stars, 579 forks
apache/kafka
Mirror of Apache Kafka
Language: Java, 28.4k stars, 13.8k forks
apache/arrow-adbc
Database connectivity API standard and libraries for Apache Arrow
Language: C#, 360 stars, 87 forks
apache/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Language: C++, 14.3k stars, 3.5k forks
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
Language: C++, 22.7k stars, 1.8k forks
trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Language: Java, 10.2k stars, 2.9k forks
facebookincubator/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
Language: C++, 3.4k stars, 1.1k forks
sushant2019/bustub-private
My repository for the code for CMU-DB Intro to Database Course by Andy Pavlo
Language:C++1
paradedb/paradedb
Postgres for Search and Analytics
Language: Rust, 5.8k stars, 165 forks
apache/incubator-xtable
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Language: Java, 842 stars, 139 forks
PacktPublishing/In-Memory-Analytics-with-Apache-Arrow-
In-Memory Analytics with Apache Arrow, published by Packt
Language:C++8426
apache/amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Language: Java, 839 stars, 276 forks
Eventual-Inc/Daft
Distributed DataFrame for Python designed for the cloud, powered by Rust
Language: Rust, 2.1k stars, 141 forks
voltrondata/sqlflite
An example Flight SQL Server implementation - with DuckDB and SQLite back-ends.
Language:C++19122
apache/arrow-rs
Official Rust implementation of Apache Arrow
Language: Rust, 2.5k stars, 736 forks