saLeox
Specializing in big data platforms. Apache Linkis committer and Apache Flink/Streampark/Zeppelin contributor.
Sea Limited, Singapore
saLeox's Stars
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
zhisheng17/flink-learning
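flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale Flink production projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). You are welcome to support my column "Flink, the Big Data Real-Time Computing Engine: Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》).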
apache/dolphinscheduler
Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
Netflix/conductor
Conductor is a microservices orchestration engine.
debezium/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
datahub-project/datahub
The Metadata Platform for your Data and AI Stack
oauth2-proxy/oauth2-proxy
A reverse proxy that provides authentication with Google, Azure, OpenID Connect and many more identity providers.
apache/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
apache/flink-cdc
Flink CDC is a streaming data integration tool
open-metadata/OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
apache/incubator-streampark
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
apache/opendal
Apache OpenDAL: One Layer, All Storage.
apache/linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
DataLinkDC/dinky
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
running-elephant/datart
Datart is a next generation Data Visualization Open Platform
OpenLineage/OpenLineage
An Open Standard for lineage metadata collection
bytedance/bitsail
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
cnych/qikqiak.com
Covers technologies such as ChatGPT, containers, Kubernetes, DevOps, Python, Golang, and microservices 🎉🎉🎉
jertel/elastalert2
ElastAlert 2 is a continuation of the original yelp/elastalert project. Pull requests are appreciated!
apache/amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
threeknowbigdata/flink_second_understand
This repository focuses on helping readers understand Flink components in seconds. It contains hands-on Flink code and documentation and 200 Flink tutorial topics, covering Flink Datastream, Flink Table, Flink Window, Flink State, Flink Checkpoint, Flink Metrics, Flink Memory, Flink on standalone/yarn/k8s, Flink SQL, Flink CEP, Flink CDC, Flink UDF, PyFlink, new Flink features, Flink Partition, and more. For details, see: https://mp.weixin.qq.com/mp/appmsgalbum?__biz=Mzg5NDY3NzIwMA==&action=getalbum&album_id=2038088622687469575#wechat_redirect
cartershanklin/pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Tencent/Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shuffle data on remote servers
cubefs/shuttle
Shuttle: Highly Available, High Performance Remote Shuffle Service
WeBankFinTech/Streamis
Streaming application development and management system, based on Linkis and DSS, planning to provide the workflow-like graphical drag-and-drop development capability.
Karql/elastalert-kibana-plugin
ElastAlert Kibana Plugin
apache/linkis-website
Apache Linkis documents
saLeox/flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink