Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
AthenaX
SQL-based streaming analytics platform at scale
bahir-flink
Mirror of Apache Bahir Flink
beam
Apache Beam is a unified programming model for batch and streaming data processing
calcite
Mirror of Apache Calcite
ceshi
fucking-algorithm
Tear through LeetCode problems step by step and expose the patterns behind all kinds of algorithms. English version supported! Crack LeetCode, not only how, but also why.
realtime-technology
Real-time data, real-time compute engines, real-time storage engines
spark
Mirror of Apache Spark
lsyldliu's Repositories
lsyldliu/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
lsyldliu/arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
lsyldliu/differential-dataflow
An implementation of differential dataflow using timely dataflow in Rust.
lsyldliu/dolphinscheduler
Apache DolphinScheduler is a modern data orchestration platform for agile, low-code creation of high-performance workflows
lsyldliu/duckdb
DuckDB is an in-process SQL OLAP Database Management System
lsyldliu/flink
Mirror of Apache Flink
lsyldliu/flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink
lsyldliu/flink-docker
Docker packaging for Apache Flink
lsyldliu/flink-kubernetes-operator
Apache Flink Kubernetes Operator
lsyldliu/flink-remote-shuffle
Remote Shuffle Service for Flink
lsyldliu/fluss
Fluss is a streaming storage system built for real-time analytics.
lsyldliu/gazelle_plugin
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
lsyldliu/hudi
Upserts, Deletes And Incremental Processing on Big Data.
lsyldliu/incubator-paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
lsyldliu/jdk
JDK main-line development
lsyldliu/leveldb
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
lsyldliu/lsyldliu
lsyldliu/mlsql
The Programming Language Designed For Big Data and AI
lsyldliu/nexmark
Benchmarks for queries over continuous data streams.
lsyldliu/papers-we-love
Papers from the computer science community to read and discuss.
lsyldliu/ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
lsyldliu/RxJava
RxJava – Reactive Extensions for the JVM – a library for composing asynchronous and event-based programs using observable sequences for the Java VM.
lsyldliu/schema-registry
Confluent Schema Registry for Kafka
lsyldliu/spring-framework
Spring Framework
lsyldliu/streaming-sql
Kubernetes deployments and examples for various streaming SQL implementations
lsyldliu/streamx
Make Flink|Spark easier!!! The original intention of StreamX is to make the development of Flink easier. StreamX focuses on the management of development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse, and data lake.
lsyldliu/tiflash
The analytical engine for TiDB
lsyldliu/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL
lsyldliu/useful-scripts
🐌 Useful scripts for making developers' everyday lives easier and happier, covering Java, shell, etc.
lsyldliu/velox
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.