Pinned Repositories
aigc
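"Building Large Language Model Applications: Application Development and Architecture Design" is an open-source e-book about real-world applications of LLMs. It covers the fundamentals and applications of large language models and how to build your own model, including writing, developing, and managing prompts, exploring what the best large language models can offer, and the patterns and architecture design of LLM application development.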
airbyte
Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes.
airbyte-platform
The platform that powers Airbyte. Please file issues in https://github.com/airbytehq/airbyte
Favorites
fluss
Fluss is a streaming storage built for real-time analytics.
kafka
Mirror of Apache Kafka
paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
trino-rocketmq
Plugin that supports reading RocketMQ data in Trino
sunxiaojian's Repositories
sunxiaojian/debezium
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
sunxiaojian/fluss
Fluss is a streaming storage built for real-time analytics.
sunxiaojian/iceberg
Apache Iceberg
sunxiaojian/kafka
Mirror of Apache Kafka
sunxiaojian/paimon
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
sunxiaojian/seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
sunxiaojian/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
sunxiaojian/alldata
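🔥🔥 AllData is a definable data middle platform for big data. It uses the data platform as its foundation, the data middle platform as its bridge, the machine learning platform as its middle-layer framework, and large-model applications as its upstream products, providing an end-to-end digitalization solution. All-new membership commercial edition X WeChat group: https://docs.qq.com/doc/DVHlkSEtvVXVCdEFo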
sunxiaojian/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
sunxiaojian/caffeine
A high performance caching library for Java
sunxiaojian/datafusion
Apache DataFusion SQL Query Engine
sunxiaojian/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
sunxiaojian/elasticsearch
Free and Open, Distributed, RESTful Search Engine
sunxiaojian/gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
sunxiaojian/hudi
Upserts, Deletes And Incremental Processing on Big Data.
sunxiaojian/hudi-rs
A native Rust library for Apache Hudi, with bindings into Python
sunxiaojian/iceberg-rust
Apache Iceberg
sunxiaojian/ignite
Apache Ignite
sunxiaojian/kafka-connect-file-pulse
🔗 A multipurpose Kafka Connect connector that makes it easy to parse, transform and stream any file, in any format, into Apache Kafka
sunxiaojian/kyverno
Cloud Native Policy Management
sunxiaojian/lucene
Apache Lucene open-source search software
sunxiaojian/milvus
A cloud-native vector database, storage for next generation AI applications
sunxiaojian/nessie
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
sunxiaojian/paimon-trino
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
sunxiaojian/parquet-java
Apache Parquet
sunxiaojian/polaris
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
sunxiaojian/risingwave
Scalable Postgres for stream processing, analytics, and management. KsqlDB and Apache Flink alternative. 🚀 10x more productive. 🚀 10x more cost-efficient.
sunxiaojian/rust
Empowering everyone to build reliable and efficient software.
sunxiaojian/starrocks
StarRocks is a next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query.
sunxiaojian/tiered-storage-for-apache-kafka
RemoteStorageManager for Apache Kafka® Tiered Storage