xxzuo's Stars
alibaba/fluss
Fluss is a streaming storage built for real-time analytics.
apache/iceberg
Apache Iceberg
aliyun/aliyun-maxcompute-data-collectors
apache/incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
aliyun/alibabacloud-maxcompute-tool-migrate
alibabacloud-maxcompute-tool-migrate
StarRocks/starrocks
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
apache/griffin
Mirror of Apache Griffin
youngfish42/Awesome-FL
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
apache/opendal
Apache OpenDAL: One Layer, All Storage.
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
mermaid-js/mermaid
Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as Markdown
apache/commons-vfs
Apache Commons VFS
tobymao/sqlglot
Python SQL Parser and Transpiler
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
bytedance/bitsail
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
lakesoul-io/LakeSoul
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
tencentmusic/cube-studio
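Cube Studio is an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform. It supports SSO login, multi-tenancy, big data platform integration, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference services with vGPU, edge computing, serverless, a labeling platform, automated labeling, dataset management, large-model fine-tuning, vLLM large-model inference, LLMOps, private knowledge bases, and an AI model app store. It also supports one-click model development/inference/fine-tuning, domestic (Chinese) CPU/GPU/NPU chips, RDMA, and distributed pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/spark/ray/volcano.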
apache/gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
anylineorg/anyline
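Dynamically registers and switches data sources at runtime, automatically generates SQL (DDL/DML/DQL), reads and writes metadata, and compares database schema differences. Adapts to 100+ relational and non-relational databases. Commonly used as low-level support for dynamic scenarios such as data middle platforms, visualization, low-code backends, workflows, custom forms, heterogeneous database migration and synchronization, IoT and Internet-of-Vehicles data processing, data cleansing, runtime-customized reports/query conditions/data structures, and crawler data parsing.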
WeBankFinTech/Schedulis
Schedulis is a high-performance workflow task scheduling system that supports high availability, multi-tenant financial-grade features, and the Linkis computing middleware, and has been integrated into the data application development portal DataSphere Studio.
lks-ai/anynode
A Node for ComfyUI that does what you ask it to do
apache/iotdb
Apache IoTDB
solidglue/Recommender_System
An introductory guide to recommender systems. It comprehensively covers the theory of industrial recommender systems (Wang Shusen's open course on recommender systems, which explains real-world industrial recommender systems using Xiaohongshu scenarios), how to train models with TensorFlow2, and how to implement high-performance, high-concurrency, highly available inference microservices in Golang.
solidglue/Recommender_System_Inference_Services
Large-scale recommender system inference microservices and APIs (Dubbo, gRPC, and REST) with Golang.
byzer-org/byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
apache/pinot
Apache Pinot - A realtime distributed OLAP datastore
flowerfine/scaleph
Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink, and Doris, backed by the SeaTunnel-on-Flink engine, the Flink Kubernetes Operator, and the Doris operator.
linktimecloud/kubernetes-data-platform
KDP (Kubernetes Data Platform) delivers a modern, hybrid and cloud-native data platform based on Kubernetes.
wanghaisheng/healthcaredatastandard
healthcare data standard in China
matter-labs/zksync-era
zkSync era